Commit Graph

72 Commits

Author SHA1 Message Date
mrq
06bdf72b89 load the model on CPU because torch doesn't like loading models directly to GPU (it just follows the default vocoder loading behavior) 2023-03-03 13:53:21 +00:00
mrq
2ba0e056cd attribution 2023-03-03 06:45:35 +00:00
mrq
aca32a71f7 added BigVGAN in place of default vocoder (credit to https://github.com/deviandice/tortoise-tts-BigVGAN) 2023-03-03 06:30:58 +00:00
mrq
a9de016230 added storing the loaded model's hash to the TTS object instead of relying on jerryrig injecting it (although I still have to for the weirdos who refuse to update the right way), added a parameter when loading voices to load a latent tagged with a model's hash so latents are per-model now 2023-03-02 00:44:42 +00:00
mrq
7b839a4263 applied the bitsandbytes wrapper to tortoise inference (not sure if it matters) 2023-02-28 01:42:10 +00:00
mrq
7cc0250a1a added more kill checks, since it only actually did it for the first iteration of a loop 2023-02-24 23:10:04 +00:00
mrq
34b232927e Oops 2023-02-19 01:54:21 +00:00
mrq
d8c6739820 added constructor argument and function to load a user-specified autoregressive model 2023-02-18 14:08:45 +00:00
mrq
6ad3477bfd one more update 2023-02-16 23:18:02 +00:00
mrq
30298b9ca3 fixing brain worms 2023-02-16 21:36:49 +00:00
mrq
ea1bc770aa added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation 2023-02-15 05:01:40 +00:00
mrq
2e777e8a67 done away with kludgy shit code, just have the user decide how many chunks to slice concat'd samples to (since it actually does improve voice replicability) 2023-02-15 04:39:31 +00:00
mrq
0bc2c1f540 updates chunk size to the chunked tensor length, just in case 2023-02-14 17:13:34 +00:00
mrq
48275899e8 added flag to enable/disable voicefixer using CUDA because I'll OOM on my 2060, changed from naively subdividing evenly (2,4,8,16 pieces) to just incrementing by 1 (1,2,3,4) when trying to subdivide within constraints of the max chunk size for computing voice latents 2023-02-14 16:47:34 +00:00
mrq
b648186691 history tab doesn't naively reuse the voice dir for results, experimental "divide total sound size until it fits under requested max chunk size" doesn't have a +1 to mess things up (need to re-evaluate how I want to calculate sizes of best fits eventually) 2023-02-14 16:23:04 +00:00
mrq
5b5e32338c DirectML: fixed redaction/aligner by forcing it to stay on CPU 2023-02-12 20:52:04 +00:00
mrq
4d01bbd429 added button to recalculate voice latents, added experimental switch for computing voice latents 2023-02-12 18:11:40 +00:00
mrq
88529fda43 fixed regression with computing conditional latents outside of the CPU 2023-02-12 17:44:39 +00:00
mrq
65f74692a0 fixed silently crashing from enabling kv_cache-ing if using the DirectML backend, throw an error when reading a generated audio file that does not have any embedded metadata in it, cleaned up the blocks of code that would DMA/transfer tensors/models between GPU and CPU 2023-02-12 14:46:21 +00:00
mrq
1b55730e67 fixed regression where the auto_conds do not move to the GPU and cause a problem during CVVP compare pass 2023-02-11 20:34:12 +00:00
mrq
a7330164ab Added integration for "voicefixer", fixed issue where candidates>1 and lines>1 only outputs the last combined candidate, numbered step for each generation in progress, output time per generation step 2023-02-11 15:02:11 +00:00
mrq
4f903159ee revamped result formatting, added "kludgy" stop button 2023-02-10 22:12:37 +00:00
mrq
efa556b793 Added new options: "Output Sample Rate", "Output Volume", and documentation 2023-02-10 03:02:09 +00:00
mrq
57af25c6c0 oops 2023-02-09 22:17:57 +00:00
mrq
504db0d1ac Added 'Only Load Models Locally' setting 2023-02-09 22:06:55 +00:00
mrq
3f8302a680 I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way) 2023-02-09 05:05:21 +00:00
mrq
b23d6b4b4c owari da... 2023-02-09 01:53:25 +00:00
mrq
494f3c84a1 beginning to add DirectML support 2023-02-08 23:03:52 +00:00
mrq
e45e4431d1 (finally) added the CVVP model weight slider, latents export more data too for weighing against CVVP 2023-02-07 20:55:56 +00:00
mrq
f7274112c3 un-hardcoded input output sampling rates (changing them "works" but leads to wrong audio, naturally) 2023-02-07 18:34:29 +00:00
mrq
55058675d2 (maybe) fixed an issue with using prompt redactions (emotions) on CPU causing a crash, because for some reason the wav2vec_alignment assumed CUDA was always available 2023-02-07 07:51:05 -06:00
mrq
328deeddae forgot to auto compute batch size again if set to 0 2023-02-06 23:14:17 -06:00
mrq
a3c077ba13 added setting to adjust autoregressive sample batch size 2023-02-06 22:31:06 +00:00
mrq
b8b15d827d added flag (--cond-latent-max-chunk-size) that should restrict the maximum chunk size when chunking for calculating conditional latents, to avoid OOMing on VRAM 2023-02-06 05:10:07 +00:00
mrq
319e7ec0a6 fixed up the computing conditional latents 2023-02-06 03:44:34 +00:00
mrq
c2c9b1b683 modified how conditional latents are computed (before, it just happened to only bother reading the first 102400/24000=4.26 seconds per audio input, now it will chunk it all to compute latents) 2023-02-05 23:25:41 +00:00
mrq
4ea997106e oops 2023-02-05 20:10:40 +00:00
mrq
daebc6c21c added button to refresh voice list, enabling KV caching for a bonerific speed increase (credit to https://github.com/152334H/tortoise-tts-fast/) 2023-02-05 17:59:13 +00:00
mrq
7b767e1442 New tunable: pause size/breathing room (governs pause at the end of clips) 2023-02-05 14:45:51 +00:00
mrq
f38c479e9b Added multi-line parsing 2023-02-05 06:17:51 +00:00
mrq
111c45b181 Set transformer and model folder to local './models/' instead of for the user profile, because I'm sick of more bloat polluting my C:\ 2023-02-05 04:18:35 +00:00
mrq
078dc0c6e2 Added choices to choose between diffusion samplers (p, ddim) 2023-02-05 01:28:31 +00:00
mrq
4274cce218 Added small optimization with caching latents, dropped Anaconda for just a py3.9 + pip + venv setup, added helper install scripts for such, cleaned up app.py, added flag '--low-vram' to disable minor optimizations 2023-02-04 01:50:57 +00:00
mrq
061aa65ac4 Reverted slight improvement patch, as it's just enough to OOM on GPUs with low VRAM 2023-02-03 21:45:06 +00:00
mrq
4f359bffa4 Added progress for transforming to audio, changed number inputs to sliders instead 2023-02-03 04:56:30 +00:00
mrq
ef237c70d0 forgot to copy the alleged slight performance improvement patch, added detailed progress information with passing gr.Progress, save a little more info with output 2023-02-03 04:20:01 +00:00
Johan Nordberg
dba14650cb Typofix 2022-06-11 21:19:07 +09:00
Johan Nordberg
5c7a50820c Allow running on CPU 2022-06-11 20:03:14 +09:00
Johan Nordberg
a641d8f29b Add tortoise_cli.py 2022-05-28 05:25:23 +00:00
Johan Nordberg
f396dcc023 Skip CLVP if cvvp_amount is 1; also fixes formatting bug in log message 2022-05-25 11:12:53 +00:00