75 Commits (main)

Author SHA1 Message Date
mrq 0f3261e071 you should have migrated by now, if anything breaks it's on (You) 2023-03-05 14:03:18 +07:00
mrq de46cf7831 adding magically deleted files back (might have a hunch on what happened) 2023-02-24 19:30:04 +07:00
mrq 2c7c02eb5c moved the old readme back, to align with how DLAS is setup, sorta 2023-02-19 17:37:36 +07:00
mrq d8c6739820 added constructor argument and function to load a user-specified autoregressive model 2023-02-18 14:08:45 +07:00
mrq 413703b572 fixed colab to use the new repo, reorder loading tortoise before the web UI for people who don't wait 2023-02-16 22:12:13 +07:00
mrq ec80ca632b added setting "device-override", less naively decide the number to use for results, some other thing 2023-02-15 21:51:22 +07:00
mrq ea1bc770aa added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation 2023-02-15 05:01:40 +07:00
mrq b721e395b5 modified conversion scripts to not give a shit about bitrate and formats since torchaudio.load handles all of that anyways, and it all gets resampled anyways 2023-02-15 04:44:14 +07:00
mrq 2e777e8a67 done away with kludgy shit code, just have the user decide how many chunks to slice concat'd samples to (since it actually does improve voice replicability) 2023-02-15 04:39:31 +07:00
mrq 314feaeea1 added reset generation settings to default button, revamped utilities tab to double as plain jane voice importer (and runs through voicefixer despite it not really doing anything if your voice samples are already of decent quality anyways), ditched load_wav_to_torch or whatever it was called because it literally exists as torchaudio.load, sample voice is now a combined waveform of all your samples and will always return even if using a latents file 2023-02-14 21:20:04 +07:00
mrq 48275899e8 added flag to enable/disable voicefixer using CUDA because I'll OOM on my 2060, changed from naively subdividing evenly (2,4,8,16 pieces) to just incrementing by 1 (1,2,3,4) when trying to subdivide within constraints of the max chunk size for computing voice latents 2023-02-14 16:47:34 +07:00
Armored1065 d458e932be Update 'README.md': Updated text to reflect the download and playback options 2023-02-13 06:19:42 +07:00
mrq 5b5e32338c DirectML: fixed redaction/aligner by forcing it to stay on CPU 2023-02-12 20:52:04 +07:00
mrq 824ad38cca fixed voicefixing not working as intended, load TTS before Gradio in the webui due to how long it takes to initialize tortoise (instead of just having a block to preload it) 2023-02-12 20:05:59 +07:00
mrq 4d01bbd429 added button to recalculate voice latents, added experimental switch for computing voice latents 2023-02-12 18:11:40 +07:00
mrq 88529fda43 fixed regression with computing conditional latents outside of the CPU 2023-02-12 17:44:39 +07:00
mrq 65f74692a0 fixed silently crashing from enabling kv_cache-ing if using the DirectML backend, throw an error when reading a generated audio file that does not have any embedded metadata in it, cleaned up the blocks of code that would DMA/transfer tensors/models between GPU and CPU 2023-02-12 14:46:21 +07:00
mrq 94757f5b41 install python3.9, wrapped try/catch when parsing args.listen in case you somehow manage to insert garbage into that field and fuck up your config, removed a very redundant setup.py install call since that only is required if you're just going to install it for using outside of the tortoise-tts folder 2023-02-12 04:35:21 +07:00
mrq 3a8ce5a110 Moved experimental settings to main tab, hidden under a check box 2023-02-11 17:21:08 +07:00
mrq a7330164ab Added integration for "voicefixer", fixed issue where candidates>1 and lines>1 only outputs the last combined candidate, numbered step for each generation in progress, output time per generation step 2023-02-11 15:02:11 +07:00
mrq 58e2b22b0e History tab (3/10 it works) 2023-02-11 01:45:25 +07:00
mrq 52a9ed7858 Moved voices out of the tortoise folder because it kept being processed for setup.py 2023-02-10 20:11:56 +07:00
mrq 8b83c9083d Cleanup 2023-02-10 19:55:33 +07:00
mrq 7baf9e3f79 Added a link to the colab notebook 2023-02-10 16:26:13 +07:00
mrq efa556b793 Added new options: "Output Sample Rate", "Output Volume", and documentation 2023-02-10 03:02:09 +07:00
mrq 504db0d1ac Added 'Only Load Models Locally' setting 2023-02-09 22:06:55 +07:00
mrq 460f5d6e32 Added and documented 2023-02-09 21:07:51 +07:00
mrq 3f8302a680 I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way) 2023-02-09 05:05:21 +07:00
mrq b23d6b4b4c owari da... 2023-02-09 01:53:25 +07:00
mrq 494f3c84a1 beginning to add DirectML support 2023-02-08 23:03:52 +07:00
mrq 81e4d261b7 Added two flags/settings: embed output settings, slimmer computed voice latents 2023-02-08 14:14:28 +07:00
mrq e45e4431d1 (finally) added the CVVP model weight slider, latents export more data too for weighing against CVVP 2023-02-07 20:55:56 +07:00
mrq 5d76d47a49 added shell scripts for linux, wrapped sorted() for voice list, I guess 2023-02-06 21:54:31 +07:00
mrq a3c077ba13 added setting to adjust autoregressive sample batch size 2023-02-06 22:31:06 +07:00
mrq d8c88078f3 Added settings page, added checking for updates (disabled by default), some other things that I don't remember 2023-02-06 21:43:01 +07:00
mrq edb6a173d3 added another (somewhat adequate) example, added metadata storage to generated files (need to add in a viewer later) 2023-02-06 14:17:41 +07:00
mrq 3c0648beaf updated README (before I go mad trying to nitpick and edit it while getting distracted from an iToddler sperging) 2023-02-06 00:56:17 +07:00
mrq b23f583c4e Forgot to rename the cached latents to the new filename 2023-02-05 23:51:52 +07:00
mrq c2c9b1b683 modified how conditional latents are computed (before, it just happened to only bother reading the first 102400/24000=4.26 seconds per audio input, now it will chunk it all to compute latents) 2023-02-05 23:25:41 +07:00
mrq daebc6c21c added button to refresh voice list, enabling KV caching for a bonerific speed increase (credit to https://github.com/152334H/tortoise-tts-fast/) 2023-02-05 17:59:13 +07:00
mrq 7b767e1442 New tunable: pause size/breathing room (governs pause at the end of clips) 2023-02-05 14:45:51 +07:00
mrq 98dbf56d44 Skip combining if not splitting, also avoids reading back the audio files to combine them by keeping them in memory 2023-02-05 06:35:32 +07:00
mrq d2aeadd754 cleaned up element order with Blocks, also added preset updating the samples/iterations counts 2023-02-05 03:53:46 +07:00
mrq 4274cce218 Added small optimization with caching latents, dropped Anaconda for just a py3.9 + pip + venv setup, added helper install scripts for such, cleaned up app.py, added flag '--low-vram' to disable minor optimizations 2023-02-04 01:50:57 +07:00
mrq aafef3a140 Cleaned up the good-morning-sirs-dialect labels, fixed seed=0 not being a random seed, show seed on output 2023-02-03 01:25:03 +07:00
mrq 1eb92a1236 QoL fixes 2023-02-02 21:13:28 +07:00
James Betker aad67d0e78 Merge pull request #233 from kianmeng/fix-typos: Fix typos 2023-01-17 18:24:24 +07:00
chris 0793800526 add explicit requirements.txt usage for dep installation 2023-01-11 10:50:18 +07:00
원빈 정 092b15eded Add reference of univnet implementation 2023-01-06 15:57:02 +07:00
Kian-Meng Ang 49bbdd597e Fix typos: Found via `codespell -S *.json -L splitted,nd,ser,broadcat` 2023-01-06 11:04:36 +07:00