709 Commits

Author SHA1 Message Date
mrq
9fa87c417a added option to use raw text rather than the IPA phonemes (it requires a model trained on raw text) 2025-01-06 00:10:43 -06:00
mrq
3ab11bdc7b oops 2025-01-05 23:53:17 -06:00
mrq
b445f4abb6 experimental 2025-01-05 19:05:00 -06:00
mrq
2e6a7625e4 experimental 2025-01-05 12:47:03 -06:00
mrq
31cfef59c4 when you do more training thinking the original model that can do NS/SR got deleted but it was actually a string not having its quotes in the right place....... 2024-12-27 18:16:57 -06:00
mrq
9b0d2ccbe1 2024-12-26 21:42:17 -06:00
mrq
25a02f2c3f oops 2024-12-25 00:36:19 -06:00
mrq
b9d2cd5513 vall_e.cpp cli 2024-12-25 00:28:34 -06:00
mrq
59f56ad099 cleanup 2024-12-24 23:14:32 -06:00
mrq
6bf59bbd8b vall_e.cpp phonemizing and tokenizing 2024-12-24 22:39:32 -06:00
mrq
8516bab15c cleanup 2024-12-24 20:29:03 -06:00
mrq
82e8592f2a working vall_e.cpp 2024-12-24 17:54:48 -06:00
mrq
2b4d783299 ugh 2024-12-23 23:42:44 -06:00
mrq
532200de2a nvm fixed 2024-12-23 22:23:43 -06:00
mrq
f62f99b8de more work on vall_e.cpp (need to resolve why the embeddings (and maybe the weights as a whole) are different from the base model) 2024-12-23 20:36:40 -06:00
mrq
6ecdb715b6 more work on vall_e.cpp (some more cleanup, NAR-len demasking, but still need to iron out some kinks) 2024-12-23 17:20:04 -06:00
mrq
a6945f981d vall_e.cpp cleanup (having to keep a map of something that can work without touching llama.cpp AND something minimally invasive, AND adhere to a C++ style that isn't mine, is making me bipolar) 2024-12-23 14:16:16 -06:00
mrq
497bdfc67b more work (the wall is non-causal decoding......) 2024-12-22 20:11:31 -06:00
mrq
5f289db275 ugh 2024-12-22 16:15:24 -06:00
mrq
0d4329d2e3 sanity cleanup 2024-12-22 15:05:45 -06:00
mrq
353e478e68 agony 2024-12-21 22:52:10 -06:00
mrq
2542ed067d ugh 2024-12-21 19:59:56 -06:00
mrq
70a0f5724b i hate learning APIs so much 2024-12-21 19:40:19 -06:00
mrq
1b4a69ce29 more updates to vall_e.cpp 2024-12-21 19:16:44 -06:00
mrq
503124d0d3 crammed encodec.cpp in 2024-12-21 15:48:12 -06:00
mrq
979c1f797c quant 2024-12-21 11:56:22 -06:00
mrq
5788db849b added extremely barebones vall_e.cpp so I can stop having to juggle this file around so much 2024-12-21 10:57:02 -06:00
mrq
91caf00212 ugh 2024-12-20 17:13:37 -06:00
mrq
d85273609e corrected export.py's --hf 2024-12-20 15:17:13 -06:00
mrq
59bf6b8b33 exposed additional task (ns, sr, vc) (vc is experimental) 2024-12-20 11:15:29 -06:00
mrq
53230efd74 changed prompt_inject_noise to prompt_inject_noise_p so I can have another reason to do this post-training 2024-12-19 19:28:50 -06:00
mrq
e7e7f48043 livid 2024-12-19 19:25:27 -06:00
mrq
8838babcba sanity checks (and I realized that the model actually had langs set to 4 in the yaml for KO/ZH so................ 2024-12-19 19:08:57 -06:00
mrq
7617b6485f instead just compute a bunch of stuff on the transcriptions to store later in different names so I can just retrieve what I want, also added tongue twisters for nefarious reasons 2024-12-18 23:43:11 -06:00
mrq
4775edaa41 added text cleaning/normalization for wer purposes but it amounts to nothing desu 2024-12-18 19:58:53 -06:00
mrq
9f2bd7f6e4 ugh 2024-12-17 23:17:12 -06:00
mrq
9090c34f10 cringe script to process seed-tts-eval's eval dataset into something i can easily use 2024-12-17 22:47:12 -06:00
mrq
ed152f78df tweaks to prompt duration to allow me to divorce how i use it for training with how I'm using it for the demo page, and demo page tweaks to make my life easier 2024-12-17 19:33:04 -06:00
mrq
7129582303 actually do proper wer/cer calculation by un-normalizing the scores 2024-12-17 14:22:30 -06:00
mrq
c2c6d912ac actually do speaker verification 2024-12-17 10:11:14 -06:00
mrq
c2e17e287b really shoddy voice conversion implementation (it sort of works...) 2024-12-16 22:54:53 -06:00
mrq
8515038968 imagine my disappointment when the epoch finished just for it to throw an exception 2024-12-16 18:28:01 -06:00
mrq
4a65ac9eb7 oops 2024-12-15 17:21:51 -06:00
mrq
cd4a5f427c KO/ZH model soon 2024-12-15 17:01:14 -06:00
mrq
4800e7179a remove nan checks because it causes problems in distributed training because I'm not syncing between GPUs (and nan losses gets ignored anyways with loss scaling) 2024-12-15 09:42:54 -06:00
mrq
2ba6b483dc ugh 2024-12-14 22:43:51 -06:00
mrq
3dd31e74d1 finally figured out a clean way to handle "resuming" the tqdm bar 2024-12-14 18:44:43 -06:00
mrq
35389481ee move lazy-stored ortho matrix to the grad device for apollo because agony 2024-12-13 23:22:26 -06:00
mrq
09804ecc16 APOLLO tweaks to make it work with deepspeed 2024-12-13 23:03:52 -06:00
mrq
64c67160a3 tweaks 2024-12-13 19:00:35 -06:00