12 Commits

Author SHA1 Message Date
mrq
92139b6da9 additional cruft, added a note in documentation to be aware of NUMA node topology when running vall_e.emb.process with more than one process 2025-02-18 19:56:30 -06:00
mrq
0dc49ef4d5 documentation update while I wait for more audio (between 4 and 8 seconds per utterance) to quantize for nvidia/audio-codec-44khz (I was foolish to think I can get something serviceable with just 4 seconds max for an utterance) 2025-02-15 17:42:06 -06:00
mrq
04fef5dad5 agony 2025-02-12 00:18:24 -06:00
mrq
1c0ed6abac added notes on this unfruitful experiment 2025-02-11 16:21:43 -06:00
mrq
59bf6b8b33 exposed additional task (ns, sr, vc) (vc is experimental) 2024-12-20 11:15:29 -06:00
mrq
8568a93dad added WER/SIM-O metrics, added APOLLO but I need to test it 2024-12-10 20:13:21 -06:00
mrq
a6c745bafb chinese (mandarin?) support added (I guess I don't need pinyin, but tone markers are handled), korean validated, vocab adjusted 2024-12-09 14:26:19 -06:00
mrq
a032ff588f doc update, added automatically deducing language from a given text, also checks if the input is already phonemized text to allow direct control without being cringe (procrastinating adding WER/SIM-O) 2024-12-07 22:34:25 -06:00
mrq
39096f8ff3 redid loss calculation to be cleaner, and position ID generation, and other things (I might need to train the NAR-len from scratch and not resume from an existing checkpoint.........) 2024-11-14 22:17:47 -06:00
mrq
f7b8b1e825 dropped subtrain dataloader since it's useless to duplicate 2024-11-11 17:00:49 -06:00
mrq
bcabde3454 more notes 2024-11-06 13:51:28 -06:00
mrq
9901c4f8ca documentation under ./docs/ 2024-11-05 16:11:01 -06:00