Commit Graph

313 Commits

Author SHA1 Message Date
mrq
815ae5d707 Merge pull request 'feat: support .flac voice files' (#43) from NtTestAlert/tortoise-tts:support_flac_voice into main
Reviewed-on: mrq/tortoise-tts#43
2023-04-01 16:37:56 +00:00
2cd7b72688 feat: support .flac voice files 2023-04-01 15:08:31 +02:00
mrq
0bcdf81d04 option to decouple sample batch size from CLVP candidate selection size (currently just unsqueezes the batches) 2023-03-21 21:33:46 +00:00
mrq
d1ad634ea9 added japanese preprocessor for tokenizer 2023-03-17 20:03:02 +00:00
mrq
af78e3978a deduce if preprocessing text by checking the JSON itself instead 2023-03-16 14:41:04 +00:00
mrq
e201746eeb added diffusion_model and tokenizer_json as arguments for settings editing 2023-03-16 14:19:24 +00:00
mrq
1f674a468f added flag to disable preprocessing (because some IPAs will turn into ASCII, implicitly enable for using the specific ipa.json tokenizer vocab) 2023-03-16 04:33:03 +00:00
mrq
42cb1f3674 added args for tokenizer and diffusion model (so I don't have to add it later) 2023-03-15 00:30:28 +00:00
mrq
65a43deb9e why didn't I also have it use chunks for computing the AR conditional latents (instead of just the diffusion aspect) 2023-03-14 01:13:49 +00:00
mrq
97cd58e7eb maybe solved that odd VRAM spike when doing the clvp pass 2023-03-12 12:48:29 -05:00
mrq
fec0685405 revert muh clean code 2023-03-10 00:56:29 +00:00
mrq
0514f011ff how did I botch this, I don't think it affects anything since it never threw an error 2023-03-09 22:36:12 +00:00
mrq
00be48670b i am very smart 2023-03-09 02:06:44 +00:00
mrq
bbeee40ab3 forgot to convert to gigabytes 2023-03-09 00:51:13 +00:00
mrq
6410df569b expose VRAM easily 2023-03-09 00:38:31 +00:00
mrq
3dd5cad324 reverting additional auto-suggested batch sizes, per mrq/ai-voice-cloning#87 proving it is, in fact, not a good idea 2023-03-07 19:38:02 +00:00
mrq
cc36c0997c didn't get a chance to commit this this morning 2023-03-07 15:43:09 +00:00
mrq
fffea7fc03 unmarried the config.json to the bigvgan by downloading the right one 2023-03-07 13:37:45 +00:00
mrq
26133c2031 do not reload AR/vocoder if already loaded 2023-03-07 04:33:49 +00:00
mrq
e2db36af60 added loading vocoders on the fly 2023-03-07 02:44:09 +00:00
mrq
7b2aa51abc oops 2023-03-06 21:32:20 +00:00
mrq
7f98727ad5 added option to specify autoregressive model at tts generation time (for a spicy feature later) 2023-03-06 20:31:19 +00:00
mrq
6fcd8c604f moved bigvgan model to a huggingspace repo 2023-03-05 19:47:22 +00:00
mrq
0f3261e071 you should have migrated by now, if anything breaks it's on (You) 2023-03-05 14:03:18 +00:00
mrq
06bdf72b89 load the model on CPU because torch doesn't like loading models directly to GPU (it just follows the default vocoder loading behavior) 2023-03-03 13:53:21 +00:00
mrq
2ba0e056cd attribution 2023-03-03 06:45:35 +00:00
mrq
aca32a71f7 added BigVGAN in place of default vocoder (credit to https://github.com/deviandice/tortoise-tts-BigVGAN) 2023-03-03 06:30:58 +00:00
mrq
a9de016230 added storing the loaded model's hash to the TTS object instead of relying on jerryrig injecting it (although I still have to for the weirdos who refuse to update the right way), added a parameter when loading voices to load a latent tagged with a model's hash so latents are per-model now 2023-03-02 00:44:42 +00:00
mrq
7b839a4263 applied the bitsandbytes wrapper to tortoise inference (not sure if it matters) 2023-02-28 01:42:10 +00:00
mrq
7cc0250a1a added more kill checks, since it only actually did it for the first iteration of a loop 2023-02-24 23:10:04 +00:00
mrq
de46cf7831 adding magically deleted files back (might have a hunch on what happened) 2023-02-24 19:30:04 +00:00
mrq
2c7c02eb5c moved the old readme back, to align with how DLAS is set up, sorta 2023-02-19 17:37:36 +00:00
mrq
34b232927e Oops 2023-02-19 01:54:21 +00:00
mrq
d8c6739820 added constructor argument and function to load a user-specified autoregressive model 2023-02-18 14:08:45 +00:00
mrq
00cb19b6cf arg to skip voice latents for grabbing voice lists (for preparing datasets) 2023-02-17 04:50:02 +00:00
mrq
b255a77a05 updated notebooks to use the new "main" setup 2023-02-17 03:31:19 +00:00
mrq
150138860c oops 2023-02-17 01:46:38 +00:00
mrq
6ad3477bfd one more update 2023-02-16 23:18:02 +00:00
mrq
413703b572 fixed colab to use the new repo, reorder loading tortoise before the web UI for people who don't wait 2023-02-16 22:12:13 +00:00
mrq
30298b9ca3 fixing brain worms 2023-02-16 21:36:49 +00:00
mrq
d53edf540e pip-ifying things 2023-02-16 19:48:06 +00:00
mrq
d159346572 oops 2023-02-16 13:23:07 +00:00
mrq
eca61af016 actually for real fixed incrementing filenames because i had a regex that actually only worked if candidates or lines>1, cuda now takes priority over dml if you're a nut with both of them installed because you can just specify an override anyways 2023-02-16 01:06:32 +00:00
mrq
ec80ca632b added setting "device-override", less naively decide the number to use for results, some other thing 2023-02-15 21:51:22 +00:00
mrq
dcc5c140e6 fixes 2023-02-15 15:33:08 +00:00
mrq
729b292515 oops x2 2023-02-15 05:57:42 +00:00
mrq
5bf98de301 oops 2023-02-15 05:55:01 +00:00
mrq
3e8365fdec voicefixed files do not overwrite, as my autism wants to hear the difference between them, incrementing file format fixed for real 2023-02-15 05:49:28 +00:00
mrq
ea1bc770aa added option: force cpu for conditioning latents, for when you want low chunk counts but your GPU keeps OOMing because fuck fragmentation 2023-02-15 05:01:40 +00:00
mrq
b721e395b5 modified conversion scripts to not give a shit about bitrate and formats since torchaudio.load handles all of that anyways, and it all gets resampled anyways 2023-02-15 04:44:14 +00:00