Commit Graph

262 Commits

Author SHA1 Message Date
64ae4bb563 Updated setup scripts to use CUDA 11.8 and torch 2.0.0 to fix RTX 4090 compatibility; added an API to the generate function so it can be called from other scripts 2023-02-11 19:46:26 +11:00
mrq 9bf1ea5b0a History tab (3/10 it works) 2023-02-11 01:45:25 +00:00
mrq 340a89f883 Numbering predicates on input_#.json files instead of "number of wavs" 2023-02-10 22:51:56 +00:00
mrq 8641cc9906 Revamped result formatting, added "kludgy" stop button 2023-02-10 22:12:37 +00:00
mrq 8f789d17b9 Slight notebook adjustment 2023-02-10 20:22:12 +00:00
mrq 7471bc209c Moved voices out of the tortoise folder because it kept being processed by setup.py 2023-02-10 20:11:56 +00:00
mrq 2bce24b9dd Cleanup 2023-02-10 19:55:33 +00:00
mrq 811539b20a Added the remaining input settings 2023-02-10 16:47:57 +00:00
mrq f5ed5499a0 Added a link to the Colab notebook 2023-02-10 16:26:13 +00:00
mrq 07c54ad361 Colab notebook (part 2) 2023-02-10 16:12:11 +00:00
mrq 939c89f16e Colab notebook (part 1) 2023-02-10 15:58:56 +00:00
mrq 39b81318f2 Added new options ("Output Sample Rate", "Output Volume") and documentation 2023-02-10 03:02:09 +00:00
mrq 77b39e59ac oops 2023-02-09 22:17:57 +00:00
mrq 3621e16ef9 Added "Only Load Models Locally" setting 2023-02-09 22:06:55 +00:00
mrq dccedc3f66 Added and documented 2023-02-09 21:07:51 +00:00
mrq 8c30cd1aa4 Oops 2023-02-09 20:49:22 +00:00
mrq d7443dfa06 Added option: listen path 2023-02-09 20:42:38 +00:00
mrq 38ee19cd57 I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way) 2023-02-09 05:05:21 +00:00
mrq 716e227953 oops 2023-02-09 02:39:08 +00:00
mrq a37546ad99 owari da... ("it's over...") 2023-02-09 01:53:25 +00:00
mrq 6255c98006 Beginning to add DirectML support 2023-02-08 23:03:52 +00:00
mrq d9a9fa6a82 Added two flags/settings: embed output settings, slimmer computed voice latents 2023-02-08 14:14:28 +00:00
mrq f03b6b8d97 Disable telemetry/what-have-you if not requesting a public Gradio URL 2023-02-07 21:44:16 +00:00
mrq 479f30c808 Merge pull request 'Added convert.sh' (#8) from lightmare/tortoise-tts:convert_sh into main (reviewed on mrq/tortoise-tts#8) 2023-02-07 21:09:00 +00:00
lightmare 40f52fa8d1 Added convert.sh 2023-02-07 21:09:00 +00:00
mrq 6ebdde58f0 (finally) added the CVVP model weight slider; latents now export more data too, for weighing against CVVP 2023-02-07 20:55:56 +00:00
mrq 793515772a Un-hardcoded input/output sampling rates (changing them "works" but naturally leads to wrong audio) 2023-02-07 18:34:29 +00:00
mrq 5f934c5feb (maybe) fixed an issue where using prompt redactions (emotions) on CPU caused a crash, because for some reason wav2vec_alignment assumed CUDA was always available 2023-02-07 07:51:05 -06:00
mrq d6b5d67f79 Forgot to auto-compute batch size again if set to 0 2023-02-06 23:14:17 -06:00
mrq 66cc6e2791 Changed ROCm pip index URL from 5.2 to 5.1.1, because it's what worked for me desu 2023-02-06 22:52:40 -06:00
mrq 6515d3b6de Added shell scripts for Linux, wrapped sorted() for the voice list, I guess 2023-02-06 21:54:31 -06:00
mrq edd642c3d3 Fixed combining audio (somehow this broke, oops) 2023-02-07 00:26:22 +00:00
mrq be6fab9dcb Added setting to adjust autoregressive sample batch size 2023-02-06 22:31:06 +00:00
mrq 100b4d7e61 Added settings page, added checking for updates (disabled by default), some other things that I don't remember 2023-02-06 21:43:01 +00:00
mrq 240858487f Added encoding and ripping of the latents used to generate the voice 2023-02-06 16:32:09 +00:00
mrq 92cf9e1efe Added tab to read and copy settings from a voice clip (in the future, I'll see about embedding the latents used to generate the voice) 2023-02-06 16:00:44 +00:00
mrq 5affc777e0 Added another (somewhat adequate) example; added metadata storage to generated files (need to add a viewer later) 2023-02-06 14:17:41 +00:00
mrq b441a84615 Added flag (--cond-latent-max-chunk-size) that restricts the maximum chunk size when chunking for conditional latent calculation, to avoid OOMing on VRAM 2023-02-06 05:10:07 +00:00
mrq a1f3b6a4da Fixed up computing conditional latents 2023-02-06 03:44:34 +00:00
mrq 2cfd3bc213 Updated README (before I go mad trying to nitpick and edit it while getting distracted from an iToddler sperging) 2023-02-06 00:56:17 +00:00
mrq 945136330c Forgot to rename the cached latents to the new filename 2023-02-05 23:51:52 +00:00
mrq 5bf21fdbe1 Modified how conditional latents are computed (before, only the first 102400 samples, i.e. 102400/24000 ≈ 4.27 seconds, of each audio input were read; now all of it is chunked to compute latents) 2023-02-05 23:25:41 +00:00
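The chunking change in 5bf21fdbe1, together with the --cond-latent-max-chunk-size cap from b441a84615, can be sketched roughly as below. This is a minimal illustration, not the repository's actual code: `encode` stands in for the real conditioning encoder, and averaging the per-chunk latents is an assumption about how the chunks are merged.

```python
import numpy as np

# 102400 samples at 24 kHz is ~4.27 s; the old code read only this much.
DEFAULT_MAX_CHUNK_SIZE = 102400

def chunk_audio(samples: np.ndarray, max_chunk_size: int = DEFAULT_MAX_CHUNK_SIZE):
    """Split a mono waveform into chunks no longer than max_chunk_size,
    so each chunk's latent computation stays within VRAM limits."""
    return [samples[i:i + max_chunk_size]
            for i in range(0, len(samples), max_chunk_size)]

def compute_conditional_latents(samples: np.ndarray, encode):
    """Encode every chunk of the input (not just the first ~4.27 s) and
    merge the per-chunk latents; here they are simply averaged."""
    latents = [encode(chunk) for chunk in chunk_audio(samples)]
    return np.mean(latents, axis=0)
```

With a 250000-sample input, `chunk_audio` yields three chunks (102400, 102400, and 45200 samples), so the whole clip contributes to the latents instead of only the first chunk.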
mrq f66754b557 oops 2023-02-05 20:10:40 +00:00
mrq 1c582b5dc8 Added button to refresh the voice list, enabled KV caching for a bonerific speed increase (credit to https://github.com/152334H/tortoise-tts-fast/) 2023-02-05 17:59:13 +00:00
mrq 8831522de9 New tunable: pause size/breathing room (governs the pause at the end of clips) 2023-02-05 14:45:51 +00:00
mrq c7f85dbba2 Fix to keep the prompted emotion for every split line 2023-02-05 06:55:09 +00:00
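The fix in c7f85dbba2 addresses multi-line prompts that open with a bracketed emotion cue (TorToiSe's prompt-redaction syntax, e.g. "[I am really sad,]"): when the text is split into lines, the cue must be carried onto every line, not just the first. A hedged sketch of the idea — `apply_emotion_to_lines` is a hypothetical helper, not mrq's implementation:

```python
import re

def apply_emotion_to_lines(text: str) -> list[str]:
    """Extract a leading [bracketed] emotion cue, if any, and re-prefix it
    onto every non-empty split line so the cue survives line splitting."""
    match = re.match(r"^(\[[^\]]+\])\s*", text)
    prefix = match.group(1) + " " if match else ""
    body = text[match.end():] if match else text
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    return [prefix + line for line in lines]
```

For example, "[I am really sad,]\nfirst line\nsecond line" yields two lines, each prefixed with "[I am really sad,] ".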
mrq 79e0b85602 Updated .gitignore (which does not apply to me, because I have a bad habit of keeping a repo copy separate from a working copy) 2023-02-05 06:40:50 +00:00
mrq bc567d7263 Skip combining if not splitting; also avoids reading the audio files back to combine them, by keeping them in memory 2023-02-05 06:35:32 +00:00
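The optimization in bc567d7263 amounts to keeping each generated line's samples in memory and concatenating them directly, rather than writing each clip to disk and reading the files back to combine them. A minimal sketch under that assumption (`combine_clips` is illustrative, not the repository's function):

```python
import numpy as np

def combine_clips(clips: list[np.ndarray], split: bool) -> np.ndarray:
    """Combine per-line clips in memory; skip the work entirely when the
    text wasn't split (there is only one clip to return)."""
    if not split or len(clips) == 1:
        return clips[0]  # nothing to combine
    return np.concatenate(clips)
```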
mrq bf32efe503 Added multi-line parsing 2023-02-05 06:17:51 +00:00
mrq cd94cc8459 Fixed accidentally not passing user-provided samples/iteration values (oops); fixed error thrown when trying to write Unicode, because Python sucks 2023-02-05 05:51:57 +00:00