Commit Graph

69 Commits

Author SHA1 Message Date
mrq
f4d2d0d7f8 added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation 2023-02-15 05:01:40 +00:00
mrq
defa460028 modified conversion scripts to not give a shit about bitrate and formats, since torchaudio.load handles all of that and everything gets resampled anyway 2023-02-15 04:44:14 +00:00
mrq
2ee6068f98 done away with kludgy shit code; just have the user decide how many chunks to slice concat'd samples into (since it actually does improve voice replicability) 2023-02-15 04:39:31 +00:00
mrq
c12ada600b added a button to reset generation settings to defaults; revamped the utilities tab to double as a plain-jane voice importer (it runs through voicefixer, even though that doesn't do much if your voice samples are already decent quality); ditched load_wav_to_torch (or whatever it was called) because torchaudio.load already does the same thing; the sample voice is now a combined waveform of all your samples and will always be returned, even when using a latents file 2023-02-14 21:20:04 +00:00
mrq
b4ca260de9 added a flag to enable/disable voicefixer using CUDA (because I'll OOM on my 2060); changed from naively subdividing evenly (2, 4, 8, 16 pieces) to just incrementing by 1 (1, 2, 3, 4) when trying to subdivide within the constraints of the max chunk size for computing voice latents 2023-02-14 16:47:34 +00:00
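The subdivision change in that commit boils down to searching chunk counts linearly instead of by doubling. A minimal sketch of the idea (the helper name and sample-count math are illustrative assumptions, not the repo's actual code):

```python
def pick_chunk_count(total_samples: int, max_chunk_size: int) -> int:
    # Increment the chunk count by 1 (1, 2, 3, 4, ...) until every chunk
    # fits under the cap, instead of doubling (2, 4, 8, 16), which can
    # badly overshoot the smallest count that satisfies the constraint.
    count = 1
    while total_samples / count > max_chunk_size:
        count += 1
    return count
```

With a 300,000-sample input and a 102,400-sample cap, doubling would jump straight to 4 pieces, while incrementing finds that 3 suffice.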
Armored1065
99f901baa9 Update 'README.md'
Updated text to reflect the download and playback options
2023-02-13 06:19:42 +00:00
mrq
4ced0296a2 DirectML: fixed redaction/aligner by forcing it to stay on CPU 2023-02-12 20:52:04 +00:00
mrq
409dec98d5 fixed voicefixing not working as intended, load TTS before Gradio in the webui due to how long it takes to initialize tortoise (instead of just having a block to preload it) 2023-02-12 20:05:59 +00:00
mrq
b85c9921d7 added button to recalculate voice latents, added experimental switch for computing voice latents 2023-02-12 18:11:40 +00:00
mrq
2210b49cb6 fixed regression with computing conditioning latents outside of the CPU 2023-02-12 17:44:39 +00:00
mrq
a2d95fe208 fixed silently crashing from enabling kv_cache-ing if using the DirectML backend, throw an error when reading a generated audio file that does not have any embedded metadata in it, cleaned up the blocks of code that would DMA/transfer tensors/models between GPU and CPU 2023-02-12 14:46:21 +00:00
mrq
25e70dce1a install python3.9; wrapped a try/catch around parsing args.listen in case you somehow manage to insert garbage into that field and fuck up your config; removed a very redundant setup.py install call, since that's only required if you're going to use it outside of the tortoise-tts folder 2023-02-12 04:35:21 +00:00
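The args.listen hardening described there amounts to wrapping the parse in a try/except with a fallback. A hedged sketch (the `host:port` format and the default are assumptions, not the repo's exact code):

```python
def parse_listen(value, default=("127.0.0.1", 8000)):
    # If the config field holds garbage (or nothing at all), fall back
    # to a sane default instead of crashing at startup.
    try:
        host, port = value.rsplit(":", 1)
        return host, int(port)
    except (ValueError, AttributeError):
        return default
```

Anything that fails to split into a host and an integer port silently yields the default, so a mangled config can't brick the webui.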
mrq
84316d8f80 Moved experimental settings to main tab, hidden under a check box 2023-02-11 17:21:08 +00:00
mrq
c5337a6b51 Added integration for "voicefixer", fixed issue where candidates>1 and lines>1 only outputs the last combined candidate, numbered step for each generation in progress, output time per generation step 2023-02-11 15:02:11 +00:00
mrq
9bf1ea5b0a History tab (3/10 it works) 2023-02-11 01:45:25 +00:00
mrq
7471bc209c Moved voices out of the tortoise folder because it kept being processed for setup.py 2023-02-10 20:11:56 +00:00
mrq
2bce24b9dd Cleanup 2023-02-10 19:55:33 +00:00
mrq
f5ed5499a0 Added a link to the colab notebook 2023-02-10 16:26:13 +00:00
mrq
39b81318f2 Added new options: "Output Sample Rate", "Output Volume", and documentation 2023-02-10 03:02:09 +00:00
mrq
3621e16ef9 Added 'Only Load Models Locally' setting 2023-02-09 22:06:55 +00:00
mrq
dccedc3f66 Added and documented 2023-02-09 21:07:51 +00:00
mrq
38ee19cd57 I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way) 2023-02-09 05:05:21 +00:00
mrq
a37546ad99 owari da... ("it's over...") 2023-02-09 01:53:25 +00:00
mrq
6255c98006 beginning to add DirectML support 2023-02-08 23:03:52 +00:00
mrq
d9a9fa6a82 Added two flags/settings: embed output settings, slimmer computed voice latents 2023-02-08 14:14:28 +00:00
mrq
6ebdde58f0 (finally) added the CVVP model weight slider; latents now export more data too, for weighting against CVVP 2023-02-07 20:55:56 +00:00
mrq
6515d3b6de added shell scripts for linux, wrapped sorted() for voice list, I guess 2023-02-06 21:54:31 -06:00
mrq
be6fab9dcb added setting to adjust autoregressive sample batch size 2023-02-06 22:31:06 +00:00
mrq
100b4d7e61 Added settings page, added checking for updates (disabled by default), some other things that I don't remember 2023-02-06 21:43:01 +00:00
mrq
5affc777e0 added another (somewhat adequate) example, added metadata storage to generated files (need to add in a viewer later) 2023-02-06 14:17:41 +00:00
mrq
2cfd3bc213 updated README (before I go mad trying to nitpick and edit it while getting distracted from an iToddler sperging) 2023-02-06 00:56:17 +00:00
mrq
945136330c Forgot to rename the cached latents to the new filename 2023-02-05 23:51:52 +00:00
mrq
5bf21fdbe1 modified how conditional latents are computed (before, it only read the first 102400 samples per audio input, i.e. 102400/24000 ≈ 4.27 seconds; now it chunks the whole clip to compute latents) 2023-02-05 23:25:41 +00:00
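The before/after in that commit is the difference between truncating at a fixed sample count and chunking the entire clip. A hedged sketch using plain lists (the real code operates on tensors; names are illustrative):

```python
def chunk_samples(samples, chunk_size=102400):
    # Old behavior: latents only ever saw samples[:chunk_size], i.e. the
    # first 102400 / 24000 Hz ≈ 4.27 seconds of each input. New behavior:
    # split the whole clip into chunk_size pieces and compute latents
    # over all of them.
    return [samples[i:i + chunk_size]
            for i in range(0, len(samples), chunk_size)]
```

A 250,000-sample clip becomes three chunks (102400, 102400, 45200) instead of a single truncated 102400-sample slice.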
mrq
1c582b5dc8 added button to refresh voice list, enabling KV caching for a bonerific speed increase (credit to https://github.com/152334H/tortoise-tts-fast/) 2023-02-05 17:59:13 +00:00
mrq
8831522de9 New tunable: pause size/breathing room (governs pause at the end of clips) 2023-02-05 14:45:51 +00:00
mrq
bc567d7263 Skip combining if not splitting, also avoids reading back the audio files to combine them by keeping them in memory 2023-02-05 06:35:32 +00:00
mrq
d29ba75dd6 cleaned up element order with Blocks, also added preset updating the samples/iterations counts 2023-02-05 03:53:46 +00:00
mrq
5c876b81f3 Added small optimization with caching latents, dropped Anaconda for just a py3.9 + pip + venv setup, added helper install scripts for such, cleaned up app.py, added flag '--low-vram' to disable minor optimizations 2023-02-04 01:50:57 +00:00
mrq
43f45274dd Cleaned up the good-morning-sirs-dialect labels, fixed seed=0 not being a random seed, show seed on output 2023-02-03 01:25:03 +00:00
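The seed fix in that commit is the usual "zero means random" pattern, paired with returning the resolved seed so it can be shown on the output. A minimal sketch (the range and the display convention are assumptions):

```python
import random

def resolve_seed(seed):
    # seed=0 used to be passed through literally, producing the same
    # output every run; treat it (and None) as "pick a fresh random
    # seed", then return the resolved value so the UI can display it.
    if not seed:
        seed = random.randint(1, 2**32 - 1)
    random.seed(seed)
    return seed
```

Returning the seed also makes runs reproducible after the fact: the user can copy the displayed value back into the field.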
mrq
74f447e5d0 QoL fixes 2023-02-02 21:13:28 +00:00
James Betker
5dc3e269b3
Merge pull request #233 from kianmeng/fix-typos
Fix typos
2023-01-17 18:24:24 -07:00
chris
7ce3dc7bf1 add explicit requirements.txt usage for dep installation 2023-01-11 10:50:18 -05:00
원빈 정
b3d67dcc6b Add reference of univnet implementation 2023-01-06 15:57:02 +09:00
Kian-Meng Ang
551fe655ff Fix typos
Found via `codespell -S *.json -L splitted,nd,ser,broadcat`
2023-01-06 11:04:36 +08:00
James Betker
f28a116b48
Update README.md 2022-12-05 13:16:36 -08:00
Harry Coultas Blum
2efc5a3e50 Added keyword argument 2022-07-08 14:28:24 +01:00
James Betker
00f8bc5e78
Update README.md 2022-06-23 15:57:50 -07:00
Jai Mu
5bff5dd819
Update README.md
Useless update but it was bothering me.
2022-05-22 00:56:06 +09:30
James Betker
5d5aacc38c v2.4 2022-05-17 12:15:13 -06:00
James Betker
8c0b3855bf Release notes for 2.3 2022-05-12 20:26:24 -06:00