vall-e

mrq/vall-e

Author	SHA1	Message	Date
mrq	953015748f	ugh	2025-02-07 20:49:28 -06:00
mrq	ed94b261dc	could have sworn i had 'vall_e.emb.process --dtype' working, also possible RAM optimization so I can stop locking up my server when firing four encoding processes	2025-02-07 18:52:19 -06:00
mrq	47eb498046	more tweaks	2025-02-06 23:26:26 -06:00
mrq	67a9401cce	oops	2025-02-06 15:14:14 -06:00
mrq	712ce4af5d	maybe fixed errors with DAC backend, added option to limit by duration in emb.process (because I only really need short utterances right now and I'm not ready to spend a week on processing everything again)	2025-02-06 12:37:18 -06:00
mrq	299cc88821	re-added amp encoding/decoding for audio, possible bad idea to ignore using amp instead if requested	2025-02-05 21:55:06 -06:00
mrq	7592befc53	updated vall_e.emb.process to allow for batched processing, some typo fixes (it's painfully slow on my 7900XTX...)	2025-02-05 21:13:20 -06:00
mrq	79c504c278	cleaned up encode/decode functions to make them a little more coherent, added option to batch encode/decode (would have been very nice in the past, but this should speed things up for me when i fall for the latest meme codec)	2025-02-05 20:54:31 -06:00
mrq	84174c1c1b	oops	2025-02-05 10:25:03 -06:00
mrq	bb2ebe1ca2	fixed issues that may arise from updating transformers with attention, added nvidia/audio-codec-44khz backend support (by gutting everything necessary because I do NOT want to install more dependencies)	2025-02-04 20:30:07 -06:00
mrq	0841f366e8	I should really just grab modelling_llama wholesale (fix for the adapted attention class)	2025-01-28 21:55:05 -06:00
mrq	e5f9da2221	oops	2025-01-21 11:59:24 -06:00
mrq	69c1d2991f	updated mixtral backend (need this for something else)	2025-01-20 21:50:56 -06:00
mrq	1a26f789a5	added option to playback audio directly, removed no-phonemize option since I swear it worked in testing but it doesn't actually work	2025-01-12 21:52:49 -06:00
mrq	9fa87c417a	added option to use raw text rather than the IPA phonemes (it requires a model trained on raw text)	2025-01-06 00:10:43 -06:00
mrq	3ab11bdc7b	oops	2025-01-05 23:53:17 -06:00
mrq	b445f4abb6	experimental	2025-01-05 19:05:00 -06:00
mrq	2e6a7625e4	experimental	2025-01-05 12:47:03 -06:00
mrq	31cfef59c4	when you do more training thinking the original model that can do NS/SR got deleted but it was actually a string not having its quotes in the right place.......	2024-12-27 18:16:57 -06:00
mrq	9b0d2ccbe1		2024-12-26 21:42:17 -06:00
mrq	25a02f2c3f	oops	2024-12-25 00:36:19 -06:00
mrq	b9d2cd5513	vall_e.cpp cli	2024-12-25 00:28:34 -06:00
mrq	59f56ad099	cleanup	2024-12-24 23:14:32 -06:00
mrq	6bf59bbd8b	vall_e.cpp phonemizing and tokenizing	2024-12-24 22:39:32 -06:00
mrq	8516bab15c	cleanup	2024-12-24 20:29:03 -06:00
mrq	82e8592f2a	working vall_e.cpp	2024-12-24 17:54:48 -06:00
mrq	2b4d783299	ugh	2024-12-23 23:42:44 -06:00
mrq	532200de2a	nvm fixed	2024-12-23 22:23:43 -06:00
mrq	f62f99b8de	more work on vall_e.cpp (need to resolve why the embeddings (and maybe the weights as a whole) are different from the base model)	2024-12-23 20:36:40 -06:00
mrq	6ecdb715b6	more work on vall_e.cpp (some more cleanup, NAR-len demasking, but still need to iron out some kinks)	2024-12-23 17:20:04 -06:00
mrq	a6945f981d	vall_e.cpp cleanup (having to keep a map of something that can work without touching llama.cpp AND something minimally invasive, AND adhere to a C++ style that isn't mine, is making me bipolar)	2024-12-23 14:16:16 -06:00
mrq	497bdfc67b	more work (the wall is non-causal decoding......)	2024-12-22 20:11:31 -06:00
mrq	5f289db275	ugh	2024-12-22 16:15:24 -06:00
mrq	0d4329d2e3	sanity cleanup	2024-12-22 15:05:45 -06:00
mrq	353e478e68	agony	2024-12-21 22:52:10 -06:00
mrq	2542ed067d	ugh	2024-12-21 19:59:56 -06:00
mrq	70a0f5724b	i hate learning APIs so much	2024-12-21 19:40:19 -06:00
mrq	1b4a69ce29	more updates to vall_e.cpp	2024-12-21 19:16:44 -06:00
mrq	503124d0d3	crammed encodec.cpp in	2024-12-21 15:48:12 -06:00
mrq	979c1f797c	quant	2024-12-21 11:56:22 -06:00
mrq	5788db849b	added extremely barebones vall_e.cpp so I can stop having to juggle this file around so much	2024-12-21 10:57:02 -06:00
mrq	91caf00212	ugh	2024-12-20 17:13:37 -06:00
mrq	d85273609e	corrected export.py's --hf	2024-12-20 15:17:13 -06:00
mrq	59bf6b8b33	exposed additional task (ns, sr, vc) (vc is experimental)	2024-12-20 11:15:29 -06:00
mrq	53230efd74	changed prompt_inject_noise to prompt_inject_noise_p so I can have another reason to do this post-training	2024-12-19 19:28:50 -06:00
mrq	e7e7f48043	livid	2024-12-19 19:25:27 -06:00
mrq	8838babcba	sanity checks (and I realized that the model actually had langs set to 4 in the yaml for KO/ZH so................	2024-12-19 19:08:57 -06:00
mrq	7617b6485f	instead just compute a bunch of stuff on the transcriptions to store later in different names so I can just retrieve what I want, also added tongue twisters for nefarious reasons	2024-12-18 23:43:11 -06:00
mrq	4775edaa41	added text cleaning/normalization for wer purposes but it amounts to nothing desu	2024-12-18 19:58:53 -06:00
mrq	9f2bd7f6e4	ugh	2024-12-17 23:17:12 -06:00

723 Commits