mrq/vall-e

Author	SHA1	Message	Date
mrq	2ea387c08a	segregated experimental changes into its own streamlined file to avoid breaking the existing model, and it can pivot to the cleaned up code if it actually works (nothing is working)	2025-02-26 21:26:13 -06:00
mrq	9b0d2ccbe1		2024-12-26 21:42:17 -06:00
mrq	c2c6d912ac	actually do speaker verification	2024-12-17 10:11:14 -06:00
mrq	4a65ac9eb7	oops	2024-12-15 17:21:51 -06:00
mrq	9cb0b6901b	unified nar.py into ar_nar.py	2024-11-10 12:19:48 -06:00
mrq	aefe8fcdad	UGH	2024-11-05 22:13:58 -06:00
mrq	ccf71dc1b6	added option to load from a model state dict directly instead of a yaml (to-do: do this for LoRAs too), automatically download the default model if none is provided	2024-10-25 22:15:15 -05:00
mrq	acdce66d4e	readme tweaks, set the (unused) default model download URL back to the base ar+nar-llama-8 model, as ar+nar-tts+stt-llama-8 was renamed back to it since it performs well	2024-10-05 22:53:53 -05:00
mrq	31e8b7edb8	tweaks and fixes for lora stuffs	2024-09-08 18:05:21 -05:00
mrq	32287710a2	moved prints to use logger, edited readme (fused_attn doesn't seem stable for training)	2024-08-29 13:27:16 -05:00
mrq	b7b99a25f1	added ability to specify attention backend for CLI and webui (because I'm tired of editing the yaml)	2024-08-26 19:33:51 -05:00
mrq	bc2a6fa756	sanity cleanup: moved experimental features under its own thing	2024-06-30 10:37:33 -05:00
mrq	cca542a4c0	ugh	2024-06-11 23:59:28 -05:00
mrq	8d068fa3f9	reticulating splines	2024-06-08 20:30:15 -05:00
mrq	b2194b859a	re-added loading multiple models because I'm now entertaining having split AR/NAR models again (and need a way to load both at once)	2024-06-06 09:48:43 -05:00
mrq	880b4ecd1b	cleanup, putting some thoughts in comments before I forget about them	2024-06-05 19:50:06 -05:00
mrq	c93d5863fd	fixes	2024-06-04 00:07:00 -05:00
mrq	934672252b	feverish cleanup	2024-06-03 21:28:49 -05:00
mrq	0b6499601b	sanitizing	2024-05-11 16:31:05 -05:00
mrq	545162195b	deprecate sole AR/NAR model by only keeping the AR+NAR (the beauty of no one using this is that I can break compat as much as I want), add tone token for when I classify my dataset with tone/emotion in the future, some other things	2024-04-15 19:54:32 -05:00
mrq	9d97eb5104	added FP8 support through `NVIDIA/TransformerEngine`, added RetNet_HF through `syncdoth/RetNet` (as an alternative to branch away from torchscale)	2024-04-08 20:14:51 -05:00
mrq	3da1518ace	added Mistral (non-Mixtral) backend, useless optimization when not training, proper adjustment of the LR for Prodigyopt through d_coeff (maybe), recurrent sampling for LLaMA/Mistral/Mixtral backends (again, doesn't actually work)	2024-01-31 21:48:36 -06:00
mrq	e513d2ef19	experts weren't forwarded into constructor (wasted a few days of training garbage)	2023-12-23 16:08:17 -06:00
mrq	65f500083d	tweaks to try and get deepspeed quantized inferencing, validating bitsandbytes and deepspeed quantization, nothing seems to work	2023-10-12 22:21:43 -05:00
mrq	100ca6b7d0	added option to use SGD optimizer through the YAML, added option to pass in additional optimizer parameters through the YAML, added experimental unified AR+NAR model (does not seem fruitful in testing)	2023-09-06 18:58:35 -05:00
mrq	451726fdd5	added ability to disable activation checkpointing through the YAML (it is very VRAM intensive at double layer size)	2023-09-05 15:38:21 -05:00
mrq	8a6c203277	added per-speaker samplers	2023-09-03 21:27:13 -05:00
mrq	0a524f1d59	reticulating splines	2023-08-03 21:39:00 -05:00
mrq	c85101403f	big cleanup	2023-08-03 20:26:36 -05:00
mrq	7a06b27a9c	Tweaks	2023-08-02 22:06:39 +00:00

30 Commits