|
bf3b6c87aa
|
added compat for coqui's XTTS
|
2023-09-16 03:38:21 +00:00 |
|
ken11o2
|
b7c7fd1c5f
|
add arg use_deepspeed
|
2023-09-04 19:14:53 +00:00 |
|
ken11o2
|
2478dc255e
|
update TextToSpeech
|
2023-09-04 19:13:45 +00:00 |
|
ken11o2
|
18adfaf785
|
add use_deepspeed to constructor and update method post_init_gpt2_config
|
2023-09-04 19:12:13 +00:00 |
|
ken11o2
|
ac97c17bf7
|
add use_deepspeed
|
2023-09-04 19:10:27 +00:00 |
|
|
b10c58436d
|
pesky dot
|
2023-08-20 22:41:55 -05:00 |
|
|
cbd3c95c42
|
possible speedup with one simple trick (it worked for valle inferencing), also backported the voice list loading from aivc
|
2023-08-20 22:32:01 -05:00 |
|
|
9afa71542b
|
little sloppy hack to try and not load the same model when it was already loaded
|
2023-08-11 04:02:36 +00:00 |
|
|
e2cd07d560
|
Fix for redaction at end of text (#45)
|
2023-06-10 21:16:21 +00:00 |
|
|
5ff00bf3bf
|
added flags to revert to the default method of latent generation (separately for the AR and Diffusion latents, as some voices don't play nicely with the chunk-for-all method)
|
2023-05-21 01:46:55 +00:00 |
|
|
c90ee7c529
|
removed kludgy wrappers for passing progress when I was a pythonlet and didn't know gradio can hook into tqdm outputs anyways
|
2023-05-04 23:39:39 +00:00 |
|
|
086aad5b49
|
quick hotfix to remove offending codesmell (will actually clean it when I finish eating)
|
2023-05-04 22:59:57 +00:00 |
|
|
b6a213bbbd
|
removed some CPU fallback wrappers because directml seems to work now without them
|
2023-04-29 00:46:36 +00:00 |
|
|
2f7d9ab932
|
disable BNB for inferencing by default because I'm pretty sure it makes zero difference (can be force-enabled with env vars if you're relying on this for some reason)
|
2023-04-29 00:38:18 +00:00 |
|
aJoe
|
eea4c68edc
|
Update tortoise/utils/devices.py vram issue
Added line 85 to set the name variable as it was 'None' causing vram to be incorrect
|
2023-04-12 05:33:30 +00:00 |
|
|
2cd7b72688
|
feat: support .flac voice files
|
2023-04-01 15:08:31 +02:00 |
|
|
0bcdf81d04
|
option to decouple sample batch size from CLVP candidate selection size (currently just unsqueezes the batches)
|
2023-03-21 21:33:46 +00:00 |
|
|
d1ad634ea9
|
added japanese preprocessor for tokenizer
|
2023-03-17 20:03:02 +00:00 |
|
|
af78e3978a
|
deduce if preprocessing text by checking the JSON itself instead
|
2023-03-16 14:41:04 +00:00 |
|
|
e201746eeb
|
added diffusion_model and tokenizer_json as arguments for settings editing
|
2023-03-16 14:19:24 +00:00 |
|
|
1f674a468f
|
added flag to disable preprocessing (because some IPAs will turn into ASCII, implicitly enable for using the specific ipa.json tokenizer vocab)
|
2023-03-16 04:33:03 +00:00 |
|
|
42cb1f3674
|
added args for tokenizer and diffusion model (so I don't have to add it later)
|
2023-03-15 00:30:28 +00:00 |
|
|
65a43deb9e
|
why didn't I also have it use chunks for computing the AR conditional latents (instead of just the diffusion aspect)
|
2023-03-14 01:13:49 +00:00 |
|
|
97cd58e7eb
|
maybe solved that odd VRAM spike when doing the clvp pass
|
2023-03-12 12:48:29 -05:00 |
|
|
fec0685405
|
revert muh clean code
|
2023-03-10 00:56:29 +00:00 |
|
|
0514f011ff
|
how did I botch this, I don't think it affects anything since it never threw an error
|
2023-03-09 22:36:12 +00:00 |
|
|
00be48670b
|
i am very smart
|
2023-03-09 02:06:44 +00:00 |
|
|
bbeee40ab3
|
forgot to convert to gigabytes
|
2023-03-09 00:51:13 +00:00 |
|
|
6410df569b
|
expose VRAM easily
|
2023-03-09 00:38:31 +00:00 |
|
|
3dd5cad324
|
reverting additional auto-suggested batch sizes, per mrq/ai-voice-cloning#87 proving it is, in fact, not a good idea
|
2023-03-07 19:38:02 +00:00 |
|
|
cc36c0997c
|
didn't get a chance to commit this this morning
|
2023-03-07 15:43:09 +00:00 |
|
|
fffea7fc03
|
unmarried the config.json to the bigvgan by downloading the right one
|
2023-03-07 13:37:45 +00:00 |
|
|
26133c2031
|
do not reload AR/vocoder if already loaded
|
2023-03-07 04:33:49 +00:00 |
|
|
e2db36af60
|
added loading vocoders on the fly
|
2023-03-07 02:44:09 +00:00 |
|
|
7b2aa51abc
|
oops
|
2023-03-06 21:32:20 +00:00 |
|
|
7f98727ad5
|
added option to specify autoregressive model at tts generation time (for a spicy feature later)
|
2023-03-06 20:31:19 +00:00 |
|
|
6fcd8c604f
|
moved bigvgan model to a huggingspace repo
|
2023-03-05 19:47:22 +00:00 |
|
|
06bdf72b89
|
load the model on CPU because torch doesn't like loading models directly to GPU (it just follows the default vocoder loading behavior)
|
2023-03-03 13:53:21 +00:00 |
|
|
2ba0e056cd
|
attribution
|
2023-03-03 06:45:35 +00:00 |
|
|
aca32a71f7
|
added BigVGAN in place of default vocoder (credit to https://github.com/deviandice/tortoise-tts-BigVGAN)
|
2023-03-03 06:30:58 +00:00 |
|
|
a9de016230
|
added storing the loaded model's hash to the TTS object instead of relying on jerryrig injecting it (although I still have to for the weirdos who refuse to update the right way), added a parameter when loading voices to load a latent tagged with a model's hash so latents are per-model now
|
2023-03-02 00:44:42 +00:00 |
|
|
7b839a4263
|
applied the bitsandbytes wrapper to tortoise inference (not sure if it matters)
|
2023-02-28 01:42:10 +00:00 |
|
|
7cc0250a1a
|
added more kill checks, since it only actually did it for the first iteration of a loop
|
2023-02-24 23:10:04 +00:00 |
|
|
de46cf7831
|
adding magically deleted files back (might have a hunch on what happened)
|
2023-02-24 19:30:04 +00:00 |
|
|
34b232927e
|
Oops
|
2023-02-19 01:54:21 +00:00 |
|
|
d8c6739820
|
added constructor argument and function to load a user-specified autoregressive model
|
2023-02-18 14:08:45 +00:00 |
|
|
00cb19b6cf
|
arg to skip voice latents for grabbing voice lists (for preparing datasets)
|
2023-02-17 04:50:02 +00:00 |
|
|
6ad3477bfd
|
one more update
|
2023-02-16 23:18:02 +00:00 |
|
|
30298b9ca3
|
fixing brain worms
|
2023-02-16 21:36:49 +00:00 |
|
|
d159346572
|
oops
|
2023-02-16 13:23:07 +00:00 |
|