deviandice
e650800447
Update 'tortoise/utils/device.py'
...
Noticed that the autoregressive batch size was being set off of VRAM size. Adjusted to scale for the VRAM capacity of 90 series GPUs. In this case, 16 -> 32 batches.
Using the standard preset with ChungusVGAN, I went from 16 steps to 8.
Averaged over 3 runs, I went from 294 seconds with 16 batches to 234 seconds with 32. Can't complain at a ~1.25x speed increase from functionally 2 lines of code.
I restarted tortoise for each run, and executed ```torch.cuda.empty_cache()``` just before loading the autoregressive model to clear the memory cache each time.
2023-03-07 14:05:27 +00:00
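The commit above describes scaling the autoregressive batch size with VRAM capacity (doubling it on 24 GB-class 90-series cards). A minimal sketch of that heuristic; the function name and threshold are illustrative, not the repo's actual code:

```python
# Hypothetical sketch of the VRAM-based batch-size heuristic: double the
# default autoregressive batch size on 24 GB-class (90 series) cards,
# i.e. 16 -> 32 batches as described in the commit.

DEFAULT_BATCH_SIZE = 16

def autoregressive_batch_size(vram_gb: float) -> int:
    """Pick the autoregressive batch size from available VRAM (in GB)."""
    if vram_gb >= 24:  # 3090/4090-class cards
        return DEFAULT_BATCH_SIZE * 2
    return DEFAULT_BATCH_SIZE

# In practice the VRAM figure would come from something like
#   torch.cuda.get_device_properties(0).total_memory / (1024 ** 3)
# and torch.cuda.empty_cache() would be called just before loading the
# autoregressive model, as the commit notes.
```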
26133c2031
do not reload AR/vocoder if already loaded
2023-03-07 04:33:49 +00:00
e2db36af60
added loading vocoders on the fly
2023-03-07 02:44:09 +00:00
7b2aa51abc
oops
2023-03-06 21:32:20 +00:00
7f98727ad5
added option to specify autoregressive model at tts generation time (for a spicy feature later)
2023-03-06 20:31:19 +00:00
6fcd8c604f
moved bigvgan model to a huggingspace repo
2023-03-05 19:47:22 +00:00
0f3261e071
you should have migrated by now, if anything breaks it's on (You)
2023-03-05 14:03:18 +00:00
06bdf72b89
load the model on CPU because torch doesn't like loading models directly to GPU (it just follows the default vocoder loading behavior)
2023-03-03 13:53:21 +00:00
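The commit above follows the usual PyTorch pattern of deserializing a checkpoint onto the CPU first (via `map_location`) and only then moving the model to the target device. A minimal sketch; the tiny `Linear` model and temp file stand in for the real vocoder checkpoint:

```python
import os
import tempfile

import torch
import torch.nn as nn

# CPU-first checkpoint loading: land the state dict on the CPU with
# map_location, then transfer the model to the target device afterwards.
model = nn.Linear(4, 4)
path = os.path.join(tempfile.mkdtemp(), "vocoder.pth")
torch.save(model.state_dict(), path)

device = "cuda" if torch.cuda.is_available() else "cpu"
loaded = nn.Linear(4, 4)
state = torch.load(path, map_location="cpu")  # always lands on CPU first
loaded.load_state_dict(state)
loaded.to(device)  # move to GPU (or stay on CPU) only after loading
```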
2ba0e056cd
attribution
2023-03-03 06:45:35 +00:00
aca32a71f7
added BigVGAN in place of default vocoder (credit to https://github.com/deviandice/tortoise-tts-BigVGAN )
2023-03-03 06:30:58 +00:00
a9de016230
added storing the loaded model's hash in the TTS object instead of relying on jerry-rigged injection (although I still have to for the weirdos who refuse to update the right way); added a parameter when loading voices to load a latent tagged with a model's hash, so latents are per-model now
2023-03-02 00:44:42 +00:00
7b839a4263
applied the bitsandbytes wrapper to tortoise inference (not sure if it matters)
2023-02-28 01:42:10 +00:00
7cc0250a1a
added more kill checks, since the check only actually happened on the first iteration of a loop
2023-02-24 23:10:04 +00:00
de46cf7831
adding magically deleted files back (might have a hunch on what happened)
2023-02-24 19:30:04 +00:00
2c7c02eb5c
moved the old readme back, to align with how DLAS is setup, sorta
2023-02-19 17:37:36 +00:00
34b232927e
Oops
2023-02-19 01:54:21 +00:00
d8c6739820
added constructor argument and function to load a user-specified autoregressive model
2023-02-18 14:08:45 +00:00
00cb19b6cf
arg to skip voice latents for grabbing voice lists (for preparing datasets)
2023-02-17 04:50:02 +00:00
b255a77a05
updated notebooks to use the new "main" setup
2023-02-17 03:31:19 +00:00
150138860c
oops
2023-02-17 01:46:38 +00:00
6ad3477bfd
one more update
2023-02-16 23:18:02 +00:00
413703b572
fixed colab to use the new repo, reorder loading tortoise before the web UI for people who don't wait
2023-02-16 22:12:13 +00:00
30298b9ca3
fixing brain worms
2023-02-16 21:36:49 +00:00
d53edf540e
pip-ifying things
2023-02-16 19:48:06 +00:00
d159346572
oops
2023-02-16 13:23:07 +00:00
eca61af016
actually for real fixed incrementing filenames, because I had a regex that only worked if candidates or lines > 1; CUDA now takes priority over DML if you're a nut with both of them installed, because you can just specify an override anyways
2023-02-16 01:06:32 +00:00
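The filename fix above replaced a regex that assumed multiple candidates/lines. A hedged sketch of a robust alternative: scan the results directory for any `<index>_<voice>.wav`-style file and take max + 1 (the naming scheme and function name are illustrative, not the repo's actual code):

```python
import os
import re
import tempfile

def next_result_index(result_dir: str, voice: str) -> int:
    """Next free index for '<index>_<voice>.wav' files, regardless of count."""
    pattern = re.compile(rf"^(\d+)_{re.escape(voice)}\.wav$")
    indices = [
        int(m.group(1))
        for f in os.listdir(result_dir)
        if (m := pattern.match(f))
    ]
    return max(indices, default=0) + 1

# Demo directory with a few existing results plus an unrelated file.
d = tempfile.mkdtemp()
for name in ("1_mouse.wav", "2_mouse.wav", "10_mouse.wav", "notes.txt"):
    open(os.path.join(d, name), "w").close()
```

Because it takes the max over all matches (defaulting to 0 for none), this works with a single existing candidate, many, or an empty directory alike.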
ec80ca632b
added setting "device-override", less naively decide the number to use for results, some other thing
2023-02-15 21:51:22 +00:00
dcc5c140e6
fixes
2023-02-15 15:33:08 +00:00
729b292515
oops x2
2023-02-15 05:57:42 +00:00
5bf98de301
oops
2023-02-15 05:55:01 +00:00
3e8365fdec
voicefixed files no longer overwrite the originals, as my autism wants to hear the difference between them; incrementing file format fixed for real
2023-02-15 05:49:28 +00:00
ea1bc770aa
added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation
2023-02-15 05:01:40 +00:00
b721e395b5
modified conversion scripts to not give a shit about bitrate and formats, since torchaudio.load handles all of that and everything gets resampled anyways
2023-02-15 04:44:14 +00:00
2e777e8a67
done away with kludgy shit code, just have the user decide how many chunks to slice concat'd samples into (since it actually does improve voice replicability)
2023-02-15 04:39:31 +00:00
314feaeea1
added a reset-generation-settings-to-default button; revamped utilities tab to double as a plain jane voice importer (and it runs through voicefixer, despite that not really doing anything if your voice samples are already of decent quality); ditched load_wav_to_torch or whatever it was called, because it literally exists as torchaudio.load; sample voice is now a combined waveform of all your samples and will always return, even if using a latents file
2023-02-14 21:20:04 +00:00
0bc2c1f540
updates chunk size to the chunked tensor length, just in case
2023-02-14 17:13:34 +00:00
48275899e8
added flag to enable/disable voicefixer using CUDA because I'll OOM on my 2060; changed from naively subdividing evenly (2, 4, 8, 16 pieces) to just incrementing by 1 (1, 2, 3, 4) when trying to subdivide within the constraints of the max chunk size for computing voice latents
2023-02-14 16:47:34 +00:00
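The subdivision change above can be sketched as a simple divisor search: instead of doubling the piece count (2, 4, 8, 16) and potentially overshooting, increment it by 1 until each chunk fits under the max chunk size. The function name is illustrative:

```python
import math

def chunks_for(total_samples: int, max_chunk: int) -> int:
    """Smallest piece count whose chunks are each <= max_chunk samples."""
    pieces = 1
    while math.ceil(total_samples / pieces) > max_chunk:
        pieces += 1
    return pieces
```

For example, 100 samples with a max chunk of 40 needs 3 pieces; the old power-of-2 scheme would have jumped from 2 straight to 4, splitting more finely than necessary.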
b648186691
history tab no longer naively reuses the voice dir for results; the experimental "divide total sound size until it fits under the requested max chunk size" logic no longer has a stray +1 to mess things up (need to re-evaluate how I want to calculate best-fit sizes eventually)
2023-02-14 16:23:04 +00:00
47f4b5bf81
voicefixer uses CUDA if exposed
2023-02-13 15:30:49 +00:00
8250a79b23
Implemented kv_cache "fix" (from 1f3c1b5f4a); guess I should find out why it's crashing the DirectML backend
2023-02-13 13:48:31 +00:00
mrq
80eeef01fb
Merge pull request 'Download from Gradio' ( #31 ) from Armored1065/tortoise-tts:main into main
...
Reviewed-on: mrq/tortoise-tts#31
2023-02-13 13:30:09 +00:00
Armored1065
8c96aa02c5
Merge pull request 'Update 'README.md'' ( #1 ) from armored1065-patch-1 into main
...
Reviewed-on: Armored1065/tortoise-tts#1
2023-02-13 06:21:37 +00:00
Armored1065
d458e932be
Update 'README.md'
...
Updated text to reflect the download and playback options
2023-02-13 06:19:42 +00:00
f92e432c8d
added random voice option back because I forgot I accidentally removed it
2023-02-13 04:57:06 +00:00
a2bac3fb2c
Fixed out of order settings causing other settings to flipflop
2023-02-13 03:43:08 +00:00
5b5e32338c
DirectML: fixed redaction/aligner by forcing it to stay on CPU
2023-02-12 20:52:04 +00:00
824ad38cca
fixed voicefixing not working as intended, load TTS before Gradio in the webui due to how long it takes to initialize tortoise (instead of just having a block to preload it)
2023-02-12 20:05:59 +00:00
4d01bbd429
added button to recalculate voice latents, added experimental switch for computing voice latents
2023-02-12 18:11:40 +00:00
88529fda43
fixed regression with computing conditioning latents outside of the CPU
2023-02-12 17:44:39 +00:00
65f74692a0
fixed silently crashing from enabling kv_cache-ing if using the DirectML backend, throw an error when reading a generated audio file that does not have any embedded metadata in it, cleaned up the blocks of code that would DMA/transfer tensors/models between GPU and CPU
2023-02-12 14:46:21 +00:00