mrq/vall-e

Author	SHA1	Message	Date
mrq	4d75ee066c	actually do the Linear replacement with TE's Linear	2024-04-09 14:41:13 -05:00
mrq	7075c2a5f0	added an option to allow injecting embeddings from another model, because it dawned upon me how valuable embeddings from a good model can be for subsequent trainings (defined under cfg.models._embeddings as a relative path to the yaml)	2024-04-04 19:11:49 -05:00
mrq	47435207f7	Added cfg.bitsandbytes.replace as a less intrusive alternative to cfg.bitsandbytes.inject to replace all Linear modules in a model	2024-03-01 19:20:10 -06:00
mrq	3da1518ace	added Mistral (non-Mixtral) backend, useless optimization when not training, proper adjustment of the LR for Prodigyopt through d_coeff (maybe), recurrent sampling for LLaMA/Mistral/Mixtral backends (again, doesn't actually work)	2024-01-31 21:48:36 -06:00
mrq	c690aa509d	fixes and compat (MoE-fying an existing model and retraining from there just ruins it after a second of audio...)	2023-12-25 21:20:32 -06:00
mrq	32d4271ca8	fixed issue with training from scratch (oops)	2023-10-21 09:55:38 -05:00
mrq	65f500083d	tweaks to try and get deepspeed quantized inferencing, validating bitsandbytes and deepspeed quantization, nothing seems to work	2023-10-12 22:21:43 -05:00
mrq	893a610fad	cleanup, use deepspeed inferencing pathway if requested	2023-10-09 15:24:04 -05:00
mrq	87c4bfedba	added ability to mark models as disabled for training, and hotloading them for eval/validation (useful if training only one model, or training a model per GPU)	2023-08-27 12:26:12 -05:00
mrq	d89568a96e	some fixes for the local framework	2023-08-05 03:22:15 +00:00
mrq	0a524f1d59	reticulating splines	2023-08-03 21:39:00 -05:00
mrq	c85101403f	big cleanup	2023-08-03 20:26:36 -05:00

12 Commits