mrq/vall-e

Author	SHA1	Message	Date
mrq	92139b6da9	additional cruft, added a note in documentation to be aware of NUMA node topology when running vall_e.emb.process with more than one process	2025-02-18 19:56:30 -06:00
mrq	0dc49ef4d5	documentation update while I wait for more audio (between 4 and 8 seconds per utterance) to quantize for nvidia/audio-codec-44khz (I was foolish to think I could get something serviceable with just 4 seconds max for an utterance)	2025-02-15 17:42:06 -06:00
mrq	04fef5dad5	agony	2025-02-12 00:18:24 -06:00
mrq	1c0ed6abac	added notes on this unfruitful experiment	2025-02-11 16:21:43 -06:00
mrq	59bf6b8b33	exposed additional task (ns, sr, vc) (vc is experimental)	2024-12-20 11:15:29 -06:00
mrq	8568a93dad	added WER/SIM-O metrics, added APOLLO but I need to test it	2024-12-10 20:13:21 -06:00
mrq	a6c745bafb	chinese (mandarin?) support added (I guess I don't need pinyin, but tone markers are handled), korean validated, vocab adjusted	2024-12-09 14:26:19 -06:00
mrq	a032ff588f	doc update, added automatically deducing language from a given text, also checks if the input is already phonemized text to allow direct control without being cringe (procrastinating adding WER/SIM-O)	2024-12-07 22:34:25 -06:00
mrq	39096f8ff3	redid loss calculation to be cleaner, and position ID generation, and other things (I might need to train the NAR-len from scratch and not resume from an existing checkpoint.........)	2024-11-14 22:17:47 -06:00
mrq	f7b8b1e825	dropped subtrain dataloader since it's useless to duplicate	2024-11-11 17:00:49 -06:00
mrq	bcabde3454	more notes	2024-11-06 13:51:28 -06:00
mrq	9901c4f8ca	documentation under ./docs/	2024-11-05 16:11:01 -06:00

12 Commits