tortoise-tts

mrq/tortoise-tts


Commit Graph

d1e811e6ea sane download bar new mrq 2024-10-27 09:02:48 -0500
a05faf0dfa backport mrq 2024-07-22 20:56:45 -0500
f25e765682 maybe backported some weird fixes for LoRA loading from mrq/vall-e ? mrq 2024-07-22 20:48:06 -0500
90ecf3da7d more backporting mrq 2024-06-28 22:44:42 -0500
43d85d97aa backported additions from e-c-k-e-r/vall-e (paths sorted-by-duration and batched sampling) mrq 2024-06-28 22:29:42 -0500
e0a93a6400 readme tweaks mrq 2024-06-28 21:02:40 -0500
80d6494973 might help to resample to the right sample rate for the AR / dvae,,, mrq 2024-06-25 19:48:45 -0500
20789a0b8a i swear it worked before and now it didnt mrq 2024-06-25 19:17:14 -0500
6ee5f21ddc oops, needed some fixes mrq 2024-06-25 13:40:39 -0500
286681c87c oops mrq 2024-06-21 00:20:53 -0500
79fc406c78 calm_token was set wrong, somehow mrq 2024-06-19 22:20:06 -0500
e2c9b0465f set seed on inference, since it seems to be set to 0 every time mrq 2024-06-19 22:10:59 -0500
0b1a71430c added BigVGAN and HiFiGAN (from https://git.ecker.tech/jarod/tortoise-tts), vocoder selectable in webUI mrq 2024-06-19 21:43:29 -0500
a5c21d65d2 added automatically loading default YAML if --yaml is not profided (although I think it already does this by using defaults), default YAML will use local backend + deepspeed inferencing for speedups mrq 2024-06-19 18:49:39 -0500
f4fcc35aa8 fixed it breaking on subsequent utterances through the web UI from latents being on the CPU mrq 2024-06-19 18:26:15 -0500
96b74f38ef sampler and cond_free selectable in webUI, re-enabled cond_free as default (somehow it's working again) mrq 2024-06-19 17:12:28 -0500
73f271fb8a added automagic offloading models to GPU then CPU when theyre done during inference mrq 2024-06-19 17:01:05 -0500
5d24631bfb don't pad output mel tokens to speed up diffusion (despite copying it exactly from tortoise) mrq 2024-06-19 15:27:11 -0500
849de13f27 added tqdm bar for AR mrq 2024-06-19 15:00:14 -0500
99be487482 backported old fork features (kv_cache (which looking back seems like a spook), ddim sampling, etc) mrq 2024-06-19 14:49:24 -0500
268ba17485 crammed in HF attention selection mechanisms for the AR mrq 2024-06-19 10:21:43 -0500
e5136613f5 semblance of documentation, automagic model downloading, a little saner inference results folder mrq 2024-06-19 10:08:14 -0500
6c2e00ce2a load exported LoRA weights if exists (to-do: make a better LoRA loading mechanism) mrq 2024-06-18 21:46:42 -0500
7c9144ff22 working webui mrq 2024-06-18 21:03:25 -0500
fb313d7ef4 working, the vocoder was just loading wrong mrq 2024-06-18 20:55:50 -0500
b5570f1b86 progress mrq 2024-06-18 17:09:50 -0500
7aae9d48ab training + LoRA training works? (keeps OOMing after a step) mrq 2024-06-18 13:28:50 -0500
d7b63d2f70 encoding mel tokens + dataset preparation mrq 2024-06-18 10:30:54 -0500
37ec9f1b79 initial "refractoring" mrq 2024-06-17 22:48:34 -0500
156bb5e7da Add files required for hifigan, including autoregressive.py modification Jarod Mica 2023-11-26 21:51:28 -0800
95f679f4ba possible fix for when candidates >= samples main master mrq 2023-10-10 15:30:08 +0000
bf3b6c87aa added compat for coqui's XTTS mrq 2023-09-16 03:38:21 +0000
9ddfcb57aa Update tortoise/api.py HarkonCollider 2023-09-09 22:00:21 +0000
d7e6914fb8 Merge pull request 'main' (#47) from ken11o2/tortoise-tts:main into main mrq 2023-09-04 20:01:14 +0000
b7c7fd1c5f add arg use_deepspeed ken11o2 2023-09-04 19:14:53 +0000
2478dc255e update TextToSpeech ken11o2 2023-09-04 19:13:45 +0000
18adfaf785 add use_deepspeed to contructor and update method post_init_gpt2_config ken11o2 2023-09-04 19:12:13 +0000
ac97c17bf7 add use_deepspeed ken11o2 2023-09-04 19:10:27 +0000
f05dfd0bea Specify numpy 1.23 Newer numpy versions don't work. a-One-Fan 2023-08-30 08:25:05 +0300
b10c58436d pesky dot mrq 2023-08-20 22:41:55 -0500
cbd3c95c42 possible speedup with one simple trick (it worked for valle inferencing), also backported the voice list loading from aivc mrq 2023-08-20 22:32:01 -0500
9afa71542b little sloppy hack to try and not load the same model when it was already loaded mrq 2023-08-11 04:02:36 +0000
e99a905d7c Fix lowercasing of kernel a-One-Fan 2023-07-13 22:14:16 +0300
1271237d89 Workaround for WSL VRAM leaks a-One-Fan 2023-07-13 10:16:57 +0300
e2cd07d560 Fix for redaction at end of text (#45) mrq 2023-06-10 21:16:21 +0000
5ff00bf3bf added flags to rever to default method of latent generation (separately for the AR and Diffusion latents, as some voices don't play nicely with the chunk-for-all method) mrq 2023-05-21 01:46:55 +0000
c90ee7c529 removed kludgy wrappers for passing progress when I was a pythonlet and didn't know gradio can hook into tqdm outputs anyways mrq 2023-05-04 23:39:39 +0000
086aad5b49 quick hotfix to remove offending codesmell (will actually clean it when I finish eating) mrq 2023-05-04 22:59:57 +0000
8618922a33 Implement correct XPU device count Forgot to do that a-One-Fan 2023-05-04 21:14:07 +0300
04b7049811 freeze numpy to 1.23.5 because latest version will moan about deprecating complex mrq 2023-05-04 01:54:41 +0000
44d2dcbb19 Add initial oneAPI support a-One-Fan 2023-04-30 23:05:24 +0300
b6a213bbbd removed some CPU fallback wrappers because directml seems to work now without them mrq 2023-04-29 00:46:36 +0000
2f7d9ab932 disable BNB for inferencing by default because I'm pretty sure it makes zero differences (can be force enabled with env vars if you'r erelying on this for some reason) mrq 2023-04-29 00:38:18 +0000
f025470d60 Merge pull request 'Update tortoise/utils/devices.py vram issue' (#44) from aJoe/tortoise-tts:main into main mrq 2023-04-12 19:58:02 +0000
eea4c68edc Update tortoise/utils/devices.py vram issue aJoe 2023-04-12 05:33:30 +0000
815ae5d707 Merge pull request 'feat: support .flac voice files' (#43) from NtTestAlert/tortoise-tts:support_flac_voice into main mrq 2023-04-01 16:37:56 +0000
2cd7b72688 feat: support .flac voice files NtTestAlert 2023-04-01 15:08:31 +0200
0bcdf81d04 option to decouple sample batch size from CLVP candidate selection size (currently just unsqueezes the batches) mrq 2023-03-21 21:33:46 +0000
d1ad634ea9 added japanese preprocessor for tokenizer mrq 2023-03-17 20:03:02 +0000
af78e3978a deduce if preprocessing text by checking the JSON itself instead mrq 2023-03-16 14:41:04 +0000
e201746eeb added diffusion_model and tokenizer_json as arguments for settings editing mrq 2023-03-16 14:19:24 +0000
1f674a468f added flag to disable preprocessing (because some IPAs will turn into ASCII, implicitly enable for using the specific ipa.json tokenizer vocab) mrq 2023-03-16 04:33:03 +0000
42cb1f3674 added args for tokenizer and diffusion model (so I don't have to add it later) mrq 2023-03-15 00:30:28 +0000
65a43deb9e why didn't I also have it use chunks for computing the AR conditional latents (instead of just the diffusion aspect) mrq 2023-03-14 01:13:49 +0000
97cd58e7eb maybe solved that odd VRAM spike when doing the clvp pass mrq 2023-03-12 12:48:29 -0500
fec0685405 revert muh clean code mrq 2023-03-10 00:56:29 +0000
0514f011ff how did I botch this, I don't think it affects anything since it never thrown an error mrq 2023-03-09 22:36:12 +0000
00be48670b i am very smart mrq 2023-03-09 02:06:44 +0000
bbeee40ab3 forgot to convert to gigabytes mrq 2023-03-09 00:51:13 +0000
6410df569b expose VRAM easily mrq 2023-03-09 00:38:31 +0000
3dd5cad324 reverting additional auto-suggested batch sizes, per mrq/ai-voice-cloning#87 proving it in fact, is not a good idea mrq 2023-03-07 19:38:02 +0000
cc36c0997c didn't get a chance to commit this this morning mrq 2023-03-07 15:43:09 +0000
e650800447 Update 'tortoise/utils/device.py' deviandice 2023-03-07 14:05:27 +0000
fffea7fc03 unmarried the config.json to the bigvgan by downloading the right one mrq 2023-03-07 13:37:45 +0000
26133c2031 do not reload AR/vocoder if already loaded mrq 2023-03-07 04:33:49 +0000
e2db36af60 added loading vocoders on the fly mrq 2023-03-07 02:44:09 +0000
7b2aa51abc oops mrq 2023-03-06 21:32:20 +0000
7f98727ad5 added option to specify autoregressive model at tts generation time (for a spicy feature later) mrq 2023-03-06 20:31:19 +0000
6fcd8c604f moved bigvgan model to a huggingspace repo mrq 2023-03-05 19:47:22 +0000
0f3261e071 you should have migrated by now, if anything breaks it's on (You) mrq 2023-03-05 14:03:18 +0000
06bdf72b89 load the model on CPU because torch doesn't like loading models directly to GPU (it just follows the default vocoder loading behavior) mrq 2023-03-03 13:53:21 +0000
2ba0e056cd attribution mrq 2023-03-03 06:45:35 +0000
aca32a71f7 added BigVGAN in place of default vocoder (credit to https://github.com/deviandice/tortoise-tts-BigVGAN) mrq 2023-03-03 06:30:58 +0000
a9de016230 added storing the loaded model's hash to the TTS object instead of relying on jerryrig injecting it (although I still have to for the weirdos who refuse to update the right way), added a parameter when loading voices to load a latent tagged with a model's hash so latents are per-model now mrq 2023-03-02 00:44:42 +0000
7b839a4263 applied the bitsandbytes wrapper to tortoise inference (not sure if it matters) mrq 2023-02-28 01:42:10 +0000
7cc0250a1a added more kill checks, since it only actually did it for the first iteration of a loop mrq 2023-02-24 23:10:04 +0000
de46cf7831 adding magically deleted files back (might have a hunch on what happened) mrq 2023-02-24 19:30:04 +0000
2c7c02eb5c moved the old readme back, to align with how DLAS is setup, sorta mrq 2023-02-19 17:37:36 +0000
34b232927e Oops mrq 2023-02-19 01:54:21 +0000
d8c6739820 added constructor argument and function to load a user-specified autoregressive model mrq 2023-02-18 14:08:45 +0000
00cb19b6cf arg to skip voice latents for grabbing voice lists (for preparing datasets) mrq 2023-02-17 04:50:02 +0000
b255a77a05 updated notebooks to use the new "main" setup mrq 2023-02-17 03:31:19 +0000
150138860c oops mrq 2023-02-17 01:46:38 +0000
6ad3477bfd one more update mrq 2023-02-16 23:18:02 +0000
413703b572 fixed colab to use the new repo, reorder loading tortoise before the web UI for people who don't wait mrq 2023-02-16 22:12:13 +0000
30298b9ca3 fixing brain worms mrq 2023-02-16 21:36:49 +0000
d53edf540e pip-ifying things mrq 2023-02-16 19:48:06 +0000
d159346572 oops mrq 2023-02-16 13:23:07 +0000
eca61af016 actually for real fixed incrementing filenames because i had a regex that actually only worked if candidates or lines>1, cuda now takes priority over dml if you're a nut with both of them installed because you can just specify an override anyways mrq 2023-02-16 01:06:32 +0000
ec80ca632b added setting "device-override", less naively decide the number to use for results, some other thing mrq 2023-02-15 21:51:22 +0000
