|
2c7c02eb5c
|
moved the old readme back, to align with how DLAS is set up, sorta
|
2023-02-19 17:37:36 +00:00 |
|
|
d8c6739820
|
added constructor argument and function to load a user-specified autoregressive model
|
2023-02-18 14:08:45 +00:00 |
|
|
413703b572
|
fixed colab to use the new repo, reorder loading tortoise before the web UI for people who don't wait
|
2023-02-16 22:12:13 +00:00 |
|
|
ec80ca632b
|
added setting "device-override", less naively decide the number to use for results, some other thing
|
2023-02-15 21:51:22 +00:00 |
|
|
ea1bc770aa
|
added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation
|
2023-02-15 05:01:40 +00:00 |
|
|
b721e395b5
|
modified conversion scripts to not give a shit about bitrate and formats since torchaudio.load handles all of that anyways, and it all gets resampled anyways
|
2023-02-15 04:44:14 +00:00 |
|
|
2e777e8a67
|
done away with kludgy shit code, just have the user decide how many chunks to slice concat'd samples to (since it actually does improve voice replicability)
|
2023-02-15 04:39:31 +00:00 |
|
|
314feaeea1
|
added reset generation settings to default button, revamped utilities tab to double as plain jane voice importer (and runs through voicefixer despite it not really doing anything if your voice samples are already of decent quality anyways), ditched load_wav_to_torch or whatever it was called because it literally exists as torchaudio.load, sample voice is now a combined waveform of all your samples and will always return even if using a latents file
|
2023-02-14 21:20:04 +00:00 |
|
|
48275899e8
|
added flag to enable/disable voicefixer using CUDA because I'll OOM on my 2060, changed from naively subdividing evenly (2,4,8,16 pieces) to just incrementing by 1 (1,2,3,4) when trying to subdivide within constraints of the max chunk size for computing voice latents
|
2023-02-14 16:47:34 +00:00 |
|
Armored1065
|
d458e932be
|
Update 'README.md'
Updated text to reflect the download and playback options
|
2023-02-13 06:19:42 +00:00 |
|
|
5b5e32338c
|
DirectML: fixed redaction/aligner by forcing it to stay on CPU
|
2023-02-12 20:52:04 +00:00 |
|
|
824ad38cca
|
fixed voicefixing not working as intended, load TTS before Gradio in the webui due to how long it takes to initialize tortoise (instead of just having a block to preload it)
|
2023-02-12 20:05:59 +00:00 |
|
|
4d01bbd429
|
added button to recalculate voice latents, added experimental switch for computing voice latents
|
2023-02-12 18:11:40 +00:00 |
|
|
88529fda43
|
fixed regression with computing conditional latents outside of the CPU
|
2023-02-12 17:44:39 +00:00 |
|
|
65f74692a0
|
fixed silently crashing from enabling kv_cache-ing if using the DirectML backend, throw an error when reading a generated audio file that does not have any embedded metadata in it, cleaned up the blocks of code that would DMA/transfer tensors/models between GPU and CPU
|
2023-02-12 14:46:21 +00:00 |
|
|
94757f5b41
|
install python3.9, wrapped try/catch when parsing args.listen in case you somehow manage to insert garbage into that field and fuck up your config, removed a very redundant setup.py install call since that is only required if you're just going to install it for using outside of the tortoise-tts folder
|
2023-02-12 04:35:21 +00:00 |
|
|
3a8ce5a110
|
Moved experimental settings to main tab, hidden under a check box
|
2023-02-11 17:21:08 +00:00 |
|
|
a7330164ab
|
Added integration for "voicefixer", fixed issue where candidates>1 and lines>1 only outputs the last combined candidate, numbered step for each generation in progress, output time per generation step
|
2023-02-11 15:02:11 +00:00 |
|
|
58e2b22b0e
|
History tab (3/10 it works)
|
2023-02-11 01:45:25 +00:00 |
|
|
52a9ed7858
|
Moved voices out of the tortoise folder because it kept being processed for setup.py
|
2023-02-10 20:11:56 +00:00 |
|
|
8b83c9083d
|
Cleanup
|
2023-02-10 19:55:33 +00:00 |
|
|
7baf9e3f79
|
Added a link to the colab notebook
|
2023-02-10 16:26:13 +00:00 |
|
|
efa556b793
|
Added new options: "Output Sample Rate", "Output Volume", and documentation
|
2023-02-10 03:02:09 +00:00 |
|
|
504db0d1ac
|
Added 'Only Load Models Locally' setting
|
2023-02-09 22:06:55 +00:00 |
|
|
460f5d6e32
|
Added and documented
|
2023-02-09 21:07:51 +00:00 |
|
|
3f8302a680
|
I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way)
|
2023-02-09 05:05:21 +00:00 |
|
|
b23d6b4b4c
|
owari da...
|
2023-02-09 01:53:25 +00:00 |
|
|
494f3c84a1
|
beginning to add DirectML support
|
2023-02-08 23:03:52 +00:00 |
|
|
81e4d261b7
|
Added two flags/settings: embed output settings, slimmer computed voice latents
|
2023-02-08 14:14:28 +00:00 |
|
|
e45e4431d1
|
(finally) added the CVVP model weight slider, latents export more data too for weighing against CVVP
|
2023-02-07 20:55:56 +00:00 |
|
|
5d76d47a49
|
added shell scripts for linux, wrapped sorted() for voice list, I guess
|
2023-02-06 21:54:31 -06:00 |
|
|
a3c077ba13
|
added setting to adjust autoregressive sample batch size
|
2023-02-06 22:31:06 +00:00 |
|
|
d8c88078f3
|
Added settings page, added checking for updates (disabled by default), some other things that I don't remember
|
2023-02-06 21:43:01 +00:00 |
|
|
edb6a173d3
|
added another (somewhat adequate) example, added metadata storage to generated files (need to add in a viewer later)
|
2023-02-06 14:17:41 +00:00 |
|
|
3c0648beaf
|
updated README (before I go mad trying to nitpick and edit it while getting distracted from an iToddler sperging)
|
2023-02-06 00:56:17 +00:00 |
|
|
b23f583c4e
|
Forgot to rename the cached latents to the new filename
|
2023-02-05 23:51:52 +00:00 |
|
|
c2c9b1b683
|
modified how conditional latents are computed (before, it just happened to only bother reading the first 102400/24000=4.26 seconds per audio input, now it will chunk it all to compute latents)
|
2023-02-05 23:25:41 +00:00 |
|
|
daebc6c21c
|
added button to refresh voice list, enabling KV caching for a bonerific speed increase (credit to https://github.com/152334H/tortoise-tts-fast/)
|
2023-02-05 17:59:13 +00:00 |
|
|
7b767e1442
|
New tunable: pause size/breathing room (governs pause at the end of clips)
|
2023-02-05 14:45:51 +00:00 |
|
|
98dbf56d44
|
Skip combining if not splitting, also avoids reading back the audio files to combine them by keeping them in memory
|
2023-02-05 06:35:32 +00:00 |
|
|
d2aeadd754
|
cleaned up element order with Blocks, also added preset updating the samples/iterations counts
|
2023-02-05 03:53:46 +00:00 |
|
|
4274cce218
|
Added small optimization with caching latents, dropped Anaconda for just a py3.9 + pip + venv setup, added helper install scripts for such, cleaned up app.py, added flag '--low-vram' to disable minor optimizations
|
2023-02-04 01:50:57 +00:00 |
|
|
aafef3a140
|
Cleaned up the good-morning-sirs-dialect labels, fixed seed=0 not being a random seed, show seed on output
|
2023-02-03 01:25:03 +00:00 |
|
|
1eb92a1236
|
QoL fixes
|
2023-02-02 21:13:28 +00:00 |
|
James Betker
|
aad67d0e78
|
Merge pull request #233 from kianmeng/fix-typos
Fix typos
|
2023-01-17 18:24:24 -07:00 |
|
chris
|
0793800526
|
add explicit requirements.txt usage for dep installation
|
2023-01-11 10:50:18 -05:00 |
|
원빈 정
|
092b15eded
|
Add reference of univnet implementation
|
2023-01-06 15:57:02 +09:00 |
|
Kian-Meng Ang
|
49bbdd597e
|
Fix typos
Found via `codespell -S *.json -L splitted,nd,ser,broadcat`
|
2023-01-06 11:04:36 +08:00 |
|
James Betker
|
a5a0907e76
|
Update README.md
|
2022-12-05 13:16:36 -08:00 |
|
Harry Coultas Blum
|
75e920438a
|
Added keyword argument
|
2022-07-08 14:28:24 +01:00 |
|
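
Note on 48275899e8: the change from doubling the piece count (2,4,8,16) to incrementing it by 1 (1,2,3,4) can be sketched roughly as below. This is a minimal illustration, not the repository's code: the function names `pick_chunk_count` and `split_into_chunks` and the sample counts are hypothetical, and the only thing taken from the log is the idea of subdividing within a max chunk size.

```python
import math


def pick_chunk_count(total_samples: int, max_chunk_size: int) -> int:
    """Return the smallest piece count whose pieces each fit in max_chunk_size.

    Hypothetical helper: counts up 1, 2, 3, ... instead of doubling.
    """
    if total_samples <= 0 or max_chunk_size <= 0:
        raise ValueError("total_samples and max_chunk_size must be positive")
    pieces = 1
    # Increment by one until a single piece is no longer than the limit.
    while math.ceil(total_samples / pieces) > max_chunk_size:
        pieces += 1
    return pieces


def split_into_chunks(samples, pieces: int):
    """Slice a sequence into at most `pieces` consecutive chunks of equal size
    (the last chunk may be shorter)."""
    size = math.ceil(len(samples) / pieces)
    return [samples[i:i + size] for i in range(0, len(samples), size)]


if __name__ == "__main__":
    # Assumed numbers for illustration: 240000 samples, limit of 102400 per chunk.
    count = pick_chunk_count(240_000, 102_400)
    print(count)
    print([len(c) for c in split_into_chunks(list(range(240_000)), count)])
```

With these assumed numbers, counting up by 1 settles on three pieces, whereas a doubling scheme would skip from two to four and produce smaller chunks than necessary.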