Commit Graph

332 Commits (main)

Author SHA1 Message Date
mrq 95f679f4ba possible fix for when candidates >= samples 2023-10-10 15:30:08 +07:00
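
A minimal sketch of the clamp this commit message suggests, assuming the candidate and sample counts are plain integers; the function name is illustrative:

```python
# Hypothetical clamp: never ask CLVP to rank more candidates than there are
# generated samples, which would otherwise index past the sample list.
def clamp_candidates(k_candidates: int, num_samples: int) -> int:
    return min(k_candidates, num_samples)
```
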
mrq bf3b6c87aa added compat for coqui's XTTS 2023-09-16 03:38:21 +07:00
mrq d7e6914fb8 Merge pull request 'main' (#47) from ken11o2/tortoise-tts:main into main (Reviewed-on: #47) 2023-09-04 20:01:14 +07:00
ken11o2 b7c7fd1c5f add arg use_deepspeed 2023-09-04 19:14:53 +07:00
ken11o2 2478dc255e update TextToSpeech 2023-09-04 19:13:45 +07:00
ken11o2 18adfaf785 add use_deepspeed to constructor and update method post_init_gpt2_config 2023-09-04 19:12:13 +07:00
ken11o2 ac97c17bf7 add use_deepspeed 2023-09-04 19:10:27 +07:00
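
A minimal sketch of the pattern these four commits describe, assuming DeepSpeed's `init_inference` API; the class and the stand-in model are illustrative, not the actual tortoise-tts code:

```python
import torch

class TextToSpeech:
    def __init__(self, use_deepspeed: bool = False):
        # thread the flag from the constructor into post-init configuration
        self.use_deepspeed = use_deepspeed
        self.gpt = torch.nn.Linear(8, 8)  # stand-in for the real GPT2 AR model
        self.post_init_gpt2_config()

    def post_init_gpt2_config(self):
        if self.use_deepspeed:
            import deepspeed  # imported lazily so the dependency stays optional
            self.gpt = deepspeed.init_inference(
                self.gpt,
                dtype=torch.float16,
                replace_with_kernel_inject=True,
            )
```
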
mrq b10c58436d pesky dot 2023-08-20 22:41:55 +07:00
mrq cbd3c95c42 possible speedup with one simple trick (it worked for valle inferencing); also backported the voice list loading from aivc 2023-08-20 22:32:01 +07:00
mrq 9afa71542b little sloppy hack to try and not load the same model when it was already loaded 2023-08-11 04:02:36 +07:00
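
A hedged sketch of the guard this commit describes: cache the loaded model keyed by its checkpoint path and skip the reload on a hit. The globals and loader are illustrative:

```python
import torch

_loaded = {"path": None, "model": None}

def get_model(checkpoint_path: str, device: str = "cpu"):
    if _loaded["path"] == checkpoint_path and _loaded["model"] is not None:
        return _loaded["model"]  # already loaded; skip the expensive torch.load
    model = torch.load(checkpoint_path, map_location=device)
    _loaded.update(path=checkpoint_path, model=model)
    return model
```
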
mrq e2cd07d560 Fix for redaction at end of text (#45) 2023-06-10 21:16:21 +07:00
mrq 5ff00bf3bf added flags to revert to the default method of latent generation (separately for the AR and Diffusion latents, as some voices don't play nicely with the chunk-for-all method) 2023-05-21 01:46:55 +07:00
mrq c90ee7c529 removed kludgy wrappers for passing progress when I was a pythonlet and didn't know gradio can hook into tqdm outputs anyways 2023-05-04 23:39:39 +07:00
mrq 086aad5b49 quick hotfix to remove offending codesmell (will actually clean it when I finish eating) 2023-05-04 22:59:57 +07:00
mrq 04b7049811 freeze numpy to 1.23.5 because latest version will moan about deprecating complex 2023-05-04 01:54:41 +07:00
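
An illustration of the breakage behind the pin: `np.complex` was deprecated in numpy 1.20 and removed in 1.24, so code still using the alias fails on newer releases:

```python
import numpy as np

# On numpy <= 1.23.x this resolves to the deprecated np.complex alias (with a
# warning); on 1.24+ the attribute is gone, so fall back to builtin complex.
z = getattr(np, "complex", complex)(1.0)
print(type(z))  # <class 'complex'>
```
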
mrq b6a213bbbd removed some CPU fallback wrappers because directml seems to work now without them 2023-04-29 00:46:36 +07:00
mrq 2f7d9ab932 disable BNB for inferencing by default because I'm pretty sure it makes zero difference (can be force-enabled with env vars if you're relying on this for some reason) 2023-04-29 00:38:18 +07:00
mrq f025470d60 Merge pull request 'Update tortoise/utils/devices.py vram issue' (#44) from aJoe/tortoise-tts:main into main (Reviewed-on: #44) 2023-04-12 19:58:02 +07:00
aJoe eea4c68edc Update tortoise/utils/devices.py vram issue (added line 85 to set the name variable, as it was 'None', causing VRAM to be reported incorrectly) 2023-04-12 05:33:30 +07:00
mrq 815ae5d707 Merge pull request 'feat: support .flac voice files' (#43) from NtTestAlert/tortoise-tts:support_flac_voice into main (Reviewed-on: #43) 2023-04-01 16:37:56 +07:00
NtTestAlert 2cd7b72688 feat: support .flac voice files 2023-04-01 15:08:31 +07:00
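
A sketch of the pattern behind this commit: accept .flac alongside .wav when collecting voice clips, assuming torchaudio for decoding; the directory layout is illustrative:

```python
from pathlib import Path
import torchaudio

def load_voice_clips(voice_dir: str):
    clips = []
    for path in sorted(Path(voice_dir).iterdir()):
        if path.suffix.lower() in (".wav", ".flac"):  # .flac newly accepted
            audio, sample_rate = torchaudio.load(str(path))
            clips.append((audio, sample_rate))
    return clips
```
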
mrq 0bcdf81d04 option to decouple sample batch size from CLVP candidate selection size (currently just unsqueezes the batches) 2023-03-21 21:33:46 +07:00
mrq d1ad634ea9 added japanese preprocessor for tokenizer 2023-03-17 20:03:02 +07:00
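
A hedged sketch of a Japanese text preprocessor for a romanized tokenizer vocab, assuming pykakasi for romanization (the commit does not say which library the repo actually uses):

```python
import pykakasi

def preprocess_japanese(text: str) -> str:
    kks = pykakasi.kakasi()
    # romanize each segment to Hepburn so a Latin-alphabet vocab can tokenize it
    return "".join(item["hepburn"] for item in kks.convert(text))

print(preprocess_japanese("こんにちは"))  # e.g. "konnichiha"
```
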
mrq af78e3978a deduce if preprocessing text by checking the JSON itself instead 2023-03-16 14:41:04 +07:00
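
A sketch of the deduction this commit describes: inspect the tokenizer JSON's vocab and only preprocess when every token is plain ASCII. The key layout assumes a Hugging Face tokenizers file, and the heuristic itself is an assumption:

```python
import json

def should_preprocess(tokenizer_json_path: str) -> bool:
    with open(tokenizer_json_path, encoding="utf-8") as f:
        vocab = json.load(f).get("model", {}).get("vocab", {})
    # an all-ASCII vocab implies text must be reduced to ASCII first;
    # an IPA vocab (non-ASCII tokens) implies preprocessing should be skipped
    return all(ord(ch) < 128 for token in vocab for ch in token)
```
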
mrq e201746eeb added diffusion_model and tokenizer_json as arguments for settings editing 2023-03-16 14:19:24 +07:00
mrq 1f674a468f added flag to disable preprocessing (because some IPAs will turn into ASCII; implicitly enabled when using the specific ipa.json tokenizer vocab) 2023-03-16 04:33:03 +07:00
mrq 42cb1f3674 added args for tokenizer and diffusion model (so I don't have to add it later) 2023-03-15 00:30:28 +07:00
mrq 65a43deb9e why didn't I also have it use chunks for computing the AR conditional latents (instead of just the diffusion aspect) 2023-03-14 01:13:49 +07:00
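
A minimal sketch of the chunked computation this commit alludes to: embed each conditioning clip separately and average, instead of embedding one giant concatenation; `get_conditioning` stands in for the model's real conditioning pass:

```python
import torch

def conditioning_latent(clips, get_conditioning):
    # one forward pass per chunk keeps peak memory bounded by the longest clip
    latents = [get_conditioning(clip) for clip in clips]
    return torch.stack(latents).mean(dim=0)  # average across chunks
```
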
mrq 97cd58e7eb maybe solved that odd VRAM spike when doing the clvp pass 2023-03-12 12:48:29 +07:00
mrq fec0685405 revert muh clean code 2023-03-10 00:56:29 +07:00
mrq 0514f011ff how did I botch this; I don't think it affects anything, since it never threw an error 2023-03-09 22:36:12 +07:00
mrq 00be48670b i am very smart 2023-03-09 02:06:44 +07:00
mrq bbeee40ab3 forgot to convert to gigabytes 2023-03-09 00:51:13 +07:00
mrq 6410df569b expose VRAM easily 2023-03-09 00:38:31 +07:00
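
A sketch covering the two commits above, assuming torch's device-properties API: the total is reported in bytes, hence the division the "forgot to convert to gigabytes" fix added:

```python
import torch

def get_vram_gb(device_index: int = 0) -> float:
    if not torch.cuda.is_available():
        return 0.0
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / (1024 ** 3)  # bytes -> GiB
```
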
mrq 3dd5cad324 reverting additional auto-suggested batch sizes, per mrq/ai-voice-cloning#87 proving that it is, in fact, not a good idea 2023-03-07 19:38:02 +07:00
mrq cc36c0997c didn't get a chance to commit this this morning 2023-03-07 15:43:09 +07:00
mrq fffea7fc03 unmarried the config.json from the bigvgan by downloading the right one 2023-03-07 13:37:45 +07:00
mrq 26133c2031 do not reload AR/vocoder if already loaded 2023-03-07 04:33:49 +07:00
mrq e2db36af60 added loading vocoders on the fly 2023-03-07 02:44:09 +07:00
mrq 7b2aa51abc oops 2023-03-06 21:32:20 +07:00
mrq 7f98727ad5 added option to specify autoregressive model at tts generation time (for a spicy feature later) 2023-03-06 20:31:19 +07:00
mrq 6fcd8c604f moved bigvgan model to a huggingface repo 2023-03-05 19:47:22 +07:00
mrq 0f3261e071 you should have migrated by now, if anything breaks it's on (You) 2023-03-05 14:03:18 +07:00
mrq 06bdf72b89 load the model on CPU because torch doesn't like loading models directly to GPU (it just follows the default vocoder loading behavior) 2023-03-03 13:53:21 +07:00
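
A sketch of the load-on-CPU-then-move behavior this commit describes; the vocoder module is a stand-in, not the real BigVGAN class:

```python
import torch
import torch.nn as nn

class Vocoder(nn.Module):  # stand-in for the real vocoder
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(80, 1)

    def forward(self, mel):
        return self.net(mel)

def load_vocoder(checkpoint_path: str, device: str = "cuda"):
    state = torch.load(checkpoint_path, map_location="cpu")  # load on CPU first
    model = Vocoder()
    model.load_state_dict(state)
    return model.to(device)  # then move to the target device
```
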
mrq 2ba0e056cd attribution 2023-03-03 06:45:35 +07:00
mrq aca32a71f7 added BigVGAN in place of default vocoder (credit to https://github.com/deviandice/tortoise-tts-BigVGAN) 2023-03-03 06:30:58 +07:00
mrq a9de016230 added storing the loaded model's hash to the TTS object instead of relying on jerryrigged injection (although I still have to for the weirdos who refuse to update the right way); added a parameter when loading voices to load a latent tagged with a model's hash, so latents are per-model now 2023-03-02 00:44:42 +07:00
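
A hedged sketch of per-model latents keyed by a checkpoint hash, as this commit describes; the filename scheme and helper names are assumptions:

```python
import hashlib

def model_hash(checkpoint_path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(checkpoint_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()[:8]

def latent_path(voice_dir: str, checkpoint_path: str) -> str:
    # tag saved latents with the hash so each model gets its own latents
    return f"{voice_dir}/cond_latents_{model_hash(checkpoint_path)}.pth"
```
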
mrq 7b839a4263 applied the bitsandbytes wrapper to tortoise inference (not sure if it matters) 2023-02-28 01:42:10 +07:00
mrq 7cc0250a1a added more kill checks, since it only actually did it for the first iteration of a loop 2023-02-24 23:10:04 +07:00
mrq de46cf7831 adding magically deleted files back (might have a hunch on what happened) 2023-02-24 19:30:04 +07:00