tortoise-tts

Author	SHA1	Message	Date
mrq	b10c58436d	pesky dot	2023-08-20 22:41:55 -05:00
mrq	cbd3c95c42	possible speedup with one simple trick (it worked for valle inferencing), also backported the voice list loading from aivc	2023-08-20 22:32:01 -05:00
mrq	e2cd07d560	Fix for redaction at end of text (#45)	2023-06-10 21:16:21 +00:00
mrq	c90ee7c529	removed kludgy wrappers for passing progress when I was a pythonlet and didn't know gradio can hook into tqdm outputs anyways	2023-05-04 23:39:39 +00:00
mrq	086aad5b49	quick hotfix to remove offending codesmell (will actually clean it when I finish eating)	2023-05-04 22:59:57 +00:00
mrq	b6a213bbbd	removed some CPU fallback wrappers because directml seems to work now without them	2023-04-29 00:46:36 +00:00
mrq	2f7d9ab932	disable BNB for inferencing by default because I'm pretty sure it makes zero difference (can be force enabled with env vars if you're relying on this for some reason)	2023-04-29 00:38:18 +00:00
aJoe	eea4c68edc	Update tortoise/utils/devices.py vram issue Added line 85 to set the name variable as it was 'None' causing vram to be incorrect	2023-04-12 05:33:30 +00:00
NtTestAlert	2cd7b72688	feat: support .flac voice files	2023-04-01 15:08:31 +02:00
mrq	d1ad634ea9	added japanese preprocessor for tokenizer	2023-03-17 20:03:02 +00:00
mrq	af78e3978a	deduce if preprocessing text by checking the JSON itself instead	2023-03-16 14:41:04 +00:00
mrq	1f674a468f	added flag to disable preprocessing (because some IPAs will turn into ASCII, implicitly enable for using the specific ipa.json tokenizer vocab)	2023-03-16 04:33:03 +00:00
mrq	97cd58e7eb	maybe solved that odd VRAM spike when doing the clvp pass	2023-03-12 12:48:29 -05:00
mrq	fec0685405	revert muh clean code	2023-03-10 00:56:29 +00:00
mrq	00be48670b	i am very smart	2023-03-09 02:06:44 +00:00
mrq	bbeee40ab3	forgot to convert to gigabytes	2023-03-09 00:51:13 +00:00
mrq	6410df569b	expose VRAM easily	2023-03-09 00:38:31 +00:00
mrq	3dd5cad324	reverting additional auto-suggested batch sizes, per mrq/ai-voice-cloning#87 proving it in fact, is not a good idea	2023-03-07 19:38:02 +00:00
mrq	cc36c0997c	didn't get a chance to commit this this morning	2023-03-07 15:43:09 +00:00
mrq	a9de016230	added storing the loaded model's hash to the TTS object instead of relying on jerryrig injecting it (although I still have to for the weirdos who refuse to update the right way), added a parameter when loading voices to load a latent tagged with a model's hash so latents are per-model now	2023-03-02 00:44:42 +00:00
mrq	7b839a4263	applied the bitsandbytes wrapper to tortoise inference (not sure if it matters)	2023-02-28 01:42:10 +00:00
mrq	00cb19b6cf	arg to skip voice latents for grabbing voice lists (for preparing datasets)	2023-02-17 04:50:02 +00:00
mrq	d159346572	oops	2023-02-16 13:23:07 +00:00
mrq	eca61af016	actually for real fixed incrementing filenames because i had a regex that actually only worked if candidates or lines>1, cuda now takes priority over dml if you're a nut with both of them installed because you can just specify an override anyways	2023-02-16 01:06:32 +00:00
mrq	ec80ca632b	added setting "device-override", less naively decide the number to use for results, some other thing	2023-02-15 21:51:22 +00:00
mrq	314feaeea1	added reset generation settings to default button, revamped utilities tab to double as plain jane voice importer (and runs through voicefixer despite it not really doing anything if your voice samples are already of decent quality anyways), ditched load_wav_to_torch or whatever it was called because it literally exists as torchaudio.load, sample voice is now a combined waveform of all your samples and will always return even if using a latents file	2023-02-14 21:20:04 +00:00
mrq	a7330164ab	Added integration for "voicefixer", fixed issue where candidates>1 and lines>1 only outputs the last combined candidate, numbered step for each generation in progress, output time per generation step	2023-02-11 15:02:11 +00:00
mrq	52a9ed7858	Moved voices out of the tortoise folder because it kept being processed for setup.py	2023-02-10 20:11:56 +00:00
mrq	729be135ef	Added option: listen path	2023-02-09 20:42:38 +00:00
mrq	3f8302a680	I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way)	2023-02-09 05:05:21 +00:00
mrq	b23d6b4b4c	owari da...	2023-02-09 01:53:25 +00:00
mrq	f7274112c3	un-hardcoded input output sampling rates (changing them "works" but leads to wrong audio, naturally)	2023-02-07 18:34:29 +00:00
mrq	55058675d2	(maybe) fixed an issue with using prompt redactions (emotions) on CPU causing a crash, because for some reason the wav2vec_alignment assumed CUDA was always available	2023-02-07 07:51:05 -06:00
mrq	b23f583c4e	Forgot to rename the cached latents to the new filename	2023-02-05 23:51:52 +00:00
mrq	c2c9b1b683	modified how conditional latents are computed (before, it just happened to only bother reading the first 102400/24000=4.26 seconds per audio input, now it will chunk it all to compute latents)	2023-02-05 23:25:41 +00:00
mrq	daebc6c21c	added button to refresh voice list, enabling KV caching for a bonerific speed increase (credit to https://github.com/152334H/tortoise-tts-fast/)	2023-02-05 17:59:13 +00:00
mrq	078dc0c6e2	Added choices to choose between diffusion samplers (p, ddim)	2023-02-05 01:28:31 +00:00
mrq	4274cce218	Added small optimization with caching latents, dropped Anaconda for just a py3.9 + pip + venv setup, added helper install scripts for such, cleaned up app.py, added flag '--low-vram' to disable minor optimizations	2023-02-04 01:50:57 +00:00
mrq	4f359bffa4	Added progress for transforming to audio, changed number inputs to sliders instead	2023-02-03 04:56:30 +00:00
Johan Nordberg	5c7a50820c	Allow running on CPU	2022-06-11 20:03:14 +09:00
James Betker	ce30b5bbe5	Merge pull request #74 from jnordberg/improved-cli Add CLI tool	2022-05-28 21:33:53 -06:00
Johan Nordberg	491fe7f6d3	Remove some assumptions about working directory This allows cli tool to run when not standing in repository dir	2022-05-29 01:10:19 +00:00
Johan Nordberg	a641d8f29b	Add tortoise_cli.py	2022-05-28 05:25:23 +00:00
Johan Nordberg	821be4171b	Typofix	2022-05-28 01:29:34 +00:00
Johan Nordberg	069e7001ad	Improve splitting on text that has many quotes	2022-05-28 01:22:21 +00:00
Johan Nordberg	cf26074fa5	Add riding hood test Also fix a bug discovered by the test that would seek past the text end if it ended in a boundary	2022-05-27 23:08:53 +00:00
Johan Nordberg	acc0891e85	Improve sentence boundary detection	2022-05-27 05:58:09 +00:00
Josh Ziegler	53f6563e3e	avoid mutable default in aligner	2022-05-26 16:20:09 -04:00
Johan Nordberg	b4fa8c86b9	Allow passing additional voice directories when loading voices	2022-05-19 21:02:11 +09:00
Danila Berezin	ef5fb5f5fc	Fix bug in load_voices in audio.py. The read.py script did not work with pth latents, so I fixed a bug in audio.py. It seems that the elif statement should use clip, clips instead of voice, voices. Also, torch stack doesn't work with tuples, so I had to split this operation.	2022-05-17 18:34:54 +03:00

66 Commits