tortoise-tts

Author	SHA1	Message	Date
deviandice	e650800447	Update 'tortoise/utils/device.py' Noticed that the autoregressive batch size was being set based on VRAM size. Adjusted it to scale for the VRAM capacity of 90 series GPUs. In this case, 16 -> 32 batches. Using the standard preset with ChungusVGAN, I went from 16 steps to 8. Averaged over 3 runs, generation took 294 seconds with 16 batches versus 234 seconds with 32. Can't complain at a 1.2x speed increase with functionally 2 lines of code. I restarted tortoise for each run and executed `torch.cuda.empty_cache()` just before loading the autoregressive model to clear the memory cache each time.	2023-03-07 14:05:27 +00:00
mrq	26133c2031	do not reload AR/vocoder if already loaded	2023-03-07 04:33:49 +00:00
mrq	e2db36af60	added loading vocoders on the fly	2023-03-07 02:44:09 +00:00
mrq	7b2aa51abc	oops	2023-03-06 21:32:20 +00:00
mrq	7f98727ad5	added option to specify autoregressive model at tts generation time (for a spicy feature later)	2023-03-06 20:31:19 +00:00
mrq	6fcd8c604f	moved bigvgan model to a huggingspace repo	2023-03-05 19:47:22 +00:00
mrq	0f3261e071	you should have migrated by now, if anything breaks it's on (You)	2023-03-05 14:03:18 +00:00
mrq	06bdf72b89	load the model on CPU because torch doesn't like loading models directly to GPU (it just follows the default vocoder loading behavior)	2023-03-03 13:53:21 +00:00
mrq	2ba0e056cd	attribution	2023-03-03 06:45:35 +00:00
mrq	aca32a71f7	added BigVGAN in place of default vocoder (credit to https://github.com/deviandice/tortoise-tts-BigVGAN )	2023-03-03 06:30:58 +00:00
mrq	a9de016230	added storing the loaded model's hash to the TTS object instead of relying on jerryrig injecting it (although I still have to for the weirdos who refuse to update the right way), added a parameter when loading voices to load a latent tagged with a model's hash so latents are per-model now	2023-03-02 00:44:42 +00:00
mrq	7b839a4263	applied the bitsandbytes wrapper to tortoise inference (not sure if it matters)	2023-02-28 01:42:10 +00:00
mrq	7cc0250a1a	added more kill checks, since it only actually did it for the first iteration of a loop	2023-02-24 23:10:04 +00:00
mrq	de46cf7831	adding magically deleted files back (might have a hunch on what happened)	2023-02-24 19:30:04 +00:00
mrq	2c7c02eb5c	moved the old readme back, to align with how DLAS is setup, sorta	2023-02-19 17:37:36 +00:00
mrq	34b232927e	Oops	2023-02-19 01:54:21 +00:00
mrq	d8c6739820	added constructor argument and function to load a user-specified autoregressive model	2023-02-18 14:08:45 +00:00
mrq	00cb19b6cf	arg to skip voice latents for grabbing voice lists (for preparing datasets)	2023-02-17 04:50:02 +00:00
mrq	b255a77a05	updated notebooks to use the new "main" setup	2023-02-17 03:31:19 +00:00
mrq	150138860c	oops	2023-02-17 01:46:38 +00:00
mrq	6ad3477bfd	one more update	2023-02-16 23:18:02 +00:00
mrq	413703b572	fixed colab to use the new repo, reorder loading tortoise before the web UI for people who don't wait	2023-02-16 22:12:13 +00:00
mrq	30298b9ca3	fixing brain worms	2023-02-16 21:36:49 +00:00
mrq	d53edf540e	pip-ifying things	2023-02-16 19:48:06 +00:00
mrq	d159346572	oops	2023-02-16 13:23:07 +00:00
mrq	eca61af016	actually for real fixed incrementing filenames because i had a regex that actually only worked if candidates or lines>1, cuda now takes priority over dml if you're a nut with both of them installed because you can just specify an override anyways	2023-02-16 01:06:32 +00:00
mrq	ec80ca632b	added setting "device-override", less naively decide the number to use for results, some other thing	2023-02-15 21:51:22 +00:00
mrq	dcc5c140e6	fixes	2023-02-15 15:33:08 +00:00
mrq	729b292515	oops x2	2023-02-15 05:57:42 +00:00
mrq	5bf98de301	oops	2023-02-15 05:55:01 +00:00
mrq	3e8365fdec	voicefixed files do not overwrite, as my autism wants to hear the difference between them, incrementing file format fixed for real	2023-02-15 05:49:28 +00:00
mrq	ea1bc770aa	added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation	2023-02-15 05:01:40 +00:00
mrq	b721e395b5	modified conversion scripts to not give a shit about bitrate and formats since torchaudio.load handles all of that anyways, and it all gets resampled anyways	2023-02-15 04:44:14 +00:00
mrq	2e777e8a67	done away with kludgy shit code, just have the user decide how many chunks to slice concat'd samples to (since it actually does improve voice replicability)	2023-02-15 04:39:31 +00:00
mrq	314feaeea1	added reset generation settings to default button, revamped utilities tab to double as plain jane voice importer (and runs through voicefixer despite it not really doing anything if your voice samples are already of decent quality anyways), ditched load_wav_to_torch or whatever it was called because it literally exists as torchaudio.load, sample voice is now a combined waveform of all your samples and will always return even if using a latents file	2023-02-14 21:20:04 +00:00
mrq	0bc2c1f540	updates chunk size to the chunked tensor length, just in case	2023-02-14 17:13:34 +00:00
mrq	48275899e8	added flag to enable/disable voicefixer using CUDA because I'll OOM on my 2060, changed from naively subdividing evenly (2,4,8,16 pieces) to just incrementing by 1 (1,2,3,4) when trying to subdivide within constraints of the max chunk size for computing voice latents	2023-02-14 16:47:34 +00:00
mrq	b648186691	history tab doesn't naively reuse the voice dir instead for results, experimental "divide total sound size until it fits under requested max chunk size" doesn't have a +1 to mess things up (need to re-evaluate how I want to calculate sizes of best fits eventually)	2023-02-14 16:23:04 +00:00
mrq	47f4b5bf81	voicefixer uses CUDA if exposed	2023-02-13 15:30:49 +00:00
mrq	8250a79b23	Implemented kv_cache "fix" (from `1f3c1b5f4a`); guess I should find out why it's crashing DirectML backend	2023-02-13 13:48:31 +00:00
mrq	80eeef01fb	Merge pull request 'Download from Gradio' (#31) from Armored1065/tortoise-tts:main into main Reviewed-on: mrq/tortoise-tts#31	2023-02-13 13:30:09 +00:00
Armored1065	8c96aa02c5	Merge pull request 'Update 'README.md'' (#1) from armored1065-patch-1 into main Reviewed-on: Armored1065/tortoise-tts#1	2023-02-13 06:21:37 +00:00
Armored1065	d458e932be	Update 'README.md' Updated text to reflect the download and playback options	2023-02-13 06:19:42 +00:00
mrq	f92e432c8d	added random voice option back because I forgot I accidentally removed it	2023-02-13 04:57:06 +00:00
mrq	a2bac3fb2c	Fixed out of order settings causing other settings to flipflop	2023-02-13 03:43:08 +00:00
mrq	5b5e32338c	DirectML: fixed redaction/aligner by forcing it to stay on CPU	2023-02-12 20:52:04 +00:00
mrq	824ad38cca	fixed voicefixing not working as intended, load TTS before Gradio in the webui due to how long it takes to initialize tortoise (instead of just having a block to preload it)	2023-02-12 20:05:59 +00:00
mrq	4d01bbd429	added button to recalculate voice latents, added experimental switch for computing voice latents	2023-02-12 18:11:40 +00:00
mrq	88529fda43	fixed regression with computing conditioning latents outside of the CPU	2023-02-12 17:44:39 +00:00
mrq	65f74692a0	fixed silently crashing from enabling kv_cache-ing if using the DirectML backend, throw an error when reading a generated audio file that does not have any embedded metadata in it, cleaned up the blocks of code that would DMA/transfer tensors/models between GPU and CPU	2023-02-12 14:46:21 +00:00
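
Commit `48275899e8` above describes switching the voice-latent chunking from doubling the piece count (2, 4, 8, 16) to incrementing it by one (1, 2, 3, 4) until each chunk fits under the max chunk size. The following is a minimal sketch of that difference, not the repository's actual code: the function names `pieces_by_incrementing` and `pieces_by_doubling` are hypothetical, and the use of ceiling division to size each chunk is an assumption.

```python
import math
from typing import Tuple


def pieces_by_incrementing(total_samples: int, max_chunk_size: int) -> Tuple[int, int]:
    """Return (pieces, chunk_size), trying 1, 2, 3, ... pieces until a chunk fits.

    Hypothetical helper: illustrates the "increment by 1" strategy only.
    """
    if total_samples <= 0 or max_chunk_size <= 0:
        raise ValueError("total_samples and max_chunk_size must be positive")
    pieces = 1
    # Assumption: chunk size is the ceiling of total / pieces.
    while math.ceil(total_samples / pieces) > max_chunk_size:
        pieces += 1
    return pieces, math.ceil(total_samples / pieces)


def pieces_by_doubling(total_samples: int, max_chunk_size: int) -> Tuple[int, int]:
    """Return (pieces, chunk_size), trying 1, 2, 4, 8, ... pieces until a chunk fits.

    Hypothetical helper: illustrates the older "subdivide evenly" strategy.
    """
    if total_samples <= 0 or max_chunk_size <= 0:
        raise ValueError("total_samples and max_chunk_size must be positive")
    pieces = 1
    while math.ceil(total_samples / pieces) > max_chunk_size:
        pieces *= 2
    return pieces, math.ceil(total_samples / pieces)
```

With 100 samples and a max chunk size of 40, incrementing stops at 3 pieces while doubling overshoots to 4, so incrementing keeps each chunk closer to the requested maximum.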

296 Commits