tortoise-tts

Commit Graph

Author	SHA1	Message	Date
mrq	0f3261e071	you should have migrated by now, if anything breaks it's on (You)	2023-03-05 14:03:18 +07:00
mrq	de46cf7831	adding magically deleted files back (might have a hunch on what happened)	2023-02-24 19:30:04 +07:00
mrq	2c7c02eb5c	moved the old readme back, to align with how DLAS is setup, sorta	2023-02-19 17:37:36 +07:00
mrq	d8c6739820	added constructor argument and function to load a user-specified autoregressive model	2023-02-18 14:08:45 +07:00
mrq	413703b572	fixed colab to use the new repo, reorder loading tortoise before the web UI for people who don't wait	2023-02-16 22:12:13 +07:00
mrq	ec80ca632b	added setting "device-override", less naively decide the number to use for results, some other thing	2023-02-15 21:51:22 +07:00
mrq	ea1bc770aa	added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation	2023-02-15 05:01:40 +07:00
mrq	b721e395b5	modified conversion scripts to not give a shit about bitrate and formats since torchaudio.load handles all of that anyways, and it all gets resampled anyways	2023-02-15 04:44:14 +07:00
mrq	2e777e8a67	done away with kludgy shit code, just have the user decide how many chunks to slice concat'd samples to (since it actually does improve voice replicability)	2023-02-15 04:39:31 +07:00
mrq	314feaeea1	added reset generation settings to default button, revamped utilities tab to double as plain jane voice importer (and runs through voicefixer despite it not really doing anything if your voice samples are already of decent quality anyways), ditched load_wav_to_torch or whatever it was called because it literally exists as torchaudio.load, sample voice is now a combined waveform of all your samples and will always return even if using a latents file	2023-02-14 21:20:04 +07:00
mrq	48275899e8	added flag to enable/disable voicefixer using CUDA because I'll OOM on my 2060, changed from naively subdividing evenly (2,4,8,16 pieces) to just incrementing by 1 (1,2,3,4) when trying to subdivide within constraints of the max chunk size for computing voice latents	2023-02-14 16:47:34 +07:00
Armored1065	d458e932be	Update 'README.md' Updated text to reflect the download and playback options	2023-02-13 06:19:42 +07:00
mrq	5b5e32338c	DirectML: fixed redaction/aligner by forcing it to stay on CPU	2023-02-12 20:52:04 +07:00
mrq	824ad38cca	fixed voicefixing not working as intended, load TTS before Gradio in the webui due to how long it takes to initialize tortoise (instead of just having a block to preload it)	2023-02-12 20:05:59 +07:00
mrq	4d01bbd429	added button to recalculate voice latents, added experimental switch for computing voice latents	2023-02-12 18:11:40 +07:00
mrq	88529fda43	fixed regression with computing conditional latents outside of the CPU	2023-02-12 17:44:39 +07:00
mrq	65f74692a0	fixed silently crashing from enabling kv_cache-ing if using the DirectML backend, throw an error when reading a generated audio file that does not have any embedded metadata in it, cleaned up the blocks of code that would DMA/transfer tensors/models between GPU and CPU	2023-02-12 14:46:21 +07:00
mrq	94757f5b41	install python3.9, wrapped try/catch when parsing args.listen in case you somehow manage to insert garbage into that field and fuck up your config, removed a very redundant setup.py install call since that only is required if you're just going to install it for using outside of the tortoise-tts folder	2023-02-12 04:35:21 +07:00
mrq	3a8ce5a110	Moved experimental settings to main tab, hidden under a check box	2023-02-11 17:21:08 +07:00
mrq	a7330164ab	Added integration for "voicefixer", fixed issue where candidates>1 and lines>1 only outputs the last combined candidate, numbered step for each generation in progress, output time per generation step	2023-02-11 15:02:11 +07:00
mrq	58e2b22b0e	History tab (3/10 it works)	2023-02-11 01:45:25 +07:00
mrq	52a9ed7858	Moved voices out of the tortoise folder because it kept being processed for setup.py	2023-02-10 20:11:56 +07:00
mrq	8b83c9083d	Cleanup	2023-02-10 19:55:33 +07:00
mrq	7baf9e3f79	Added a link to the colab notebook	2023-02-10 16:26:13 +07:00
mrq	efa556b793	Added new options: "Output Sample Rate", "Output Volume", and documentation	2023-02-10 03:02:09 +07:00
mrq	504db0d1ac	Added 'Only Load Models Locally' setting	2023-02-09 22:06:55 +07:00
mrq	460f5d6e32	Added and documented	2023-02-09 21:07:51 +07:00
mrq	3f8302a680	I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way)	2023-02-09 05:05:21 +07:00
mrq	b23d6b4b4c	owari da...	2023-02-09 01:53:25 +07:00
mrq	494f3c84a1	beginning to add DirectML support	2023-02-08 23:03:52 +07:00
mrq	81e4d261b7	Added two flags/settings: embed output settings, slimmer computed voice latents	2023-02-08 14:14:28 +07:00
mrq	e45e4431d1	(finally) added the CVVP model weight slider, latents export more data too for weighing against CVVP	2023-02-07 20:55:56 +07:00
mrq	5d76d47a49	added shell scripts for linux, wrapped sorted() for voice list, I guess	2023-02-06 21:54:31 +07:00
mrq	a3c077ba13	added setting to adjust autoregressive sample batch size	2023-02-06 22:31:06 +07:00
mrq	d8c88078f3	Added settings page, added checking for updates (disabled by default), some other things that I don't remember	2023-02-06 21:43:01 +07:00
mrq	edb6a173d3	added another (somewhat adequate) example, added metadata storage to generated files (need to add in a viewer later)	2023-02-06 14:17:41 +07:00
mrq	3c0648beaf	updated README (before I go mad trying to nitpick and edit it while getting distracted from an iToddler sperging)	2023-02-06 00:56:17 +07:00
mrq	b23f583c4e	Forgot to rename the cached latents to the new filename	2023-02-05 23:51:52 +07:00
mrq	c2c9b1b683	modified how conditional latents are computed (before, it just happened to only bother reading the first 102400/24000=4.26 seconds per audio input, now it will chunk it all to compute latents)	2023-02-05 23:25:41 +07:00
mrq	daebc6c21c	added button to refresh voice list, enabling KV caching for a bonerific speed increase (credit to https://github.com/152334H/tortoise-tts-fast/)	2023-02-05 17:59:13 +07:00
mrq	7b767e1442	New tunable: pause size/breathing room (governs pause at the end of clips)	2023-02-05 14:45:51 +07:00
mrq	98dbf56d44	Skip combining if not splitting, also avoids reading back the audio files to combine them by keeping them in memory	2023-02-05 06:35:32 +07:00
mrq	d2aeadd754	cleaned up element order with Blocks, also added preset updating the samples/iterations counts	2023-02-05 03:53:46 +07:00
mrq	4274cce218	Added small optimization with caching latents, dropped Anaconda for just a py3.9 + pip + venv setup, added helper install scripts for such, cleaned up app.py, added flag '--low-vram' to disable minor optimizations	2023-02-04 01:50:57 +07:00
mrq	aafef3a140	Cleaned up the good-morning-sirs-dialect labels, fixed seed=0 not being a random seed, show seed on output	2023-02-03 01:25:03 +07:00
mrq	1eb92a1236	QoL fixes	2023-02-02 21:13:28 +07:00
James Betker	aad67d0e78	Merge pull request #233 from kianmeng/fix-typos Fix typos	2023-01-17 18:24:24 +07:00
chris	0793800526	add explicit requirements.txt usage for dep installation	2023-01-11 10:50:18 +07:00
원빈 정	092b15eded	Add reference of univnet implementation	2023-01-06 15:57:02 +07:00
Kian-Meng Ang	49bbdd597e	Fix typos Found via `codespell -S *.json -L splitted,nd,ser,broadcat`	2023-01-06 11:04:36 +07:00


75 Commits (main)