mrq/vall-e

Author	SHA1	Message	Date
mrq	c690aa509d	fixes and compat (MoE-fying an existing model and retraining from there just ruins it after a second of audio...)	2023-12-25 21:20:32 -06:00
mrq	e513d2ef19	experts weren't forwarded into constructor (wasted a few days of training garbage)	2023-12-23 16:08:17 -06:00
mrq	0db3203b21	added LLaMA/Mixtral (if experts>1) model arches, utilize XMoE's loss as well, set MoE frequency to 1 to make every layer MoE'd for RetNet, etc. (going to do tests without burning out again to see how things go)	2023-12-22 19:27:36 -06:00
mrq	9c198eb75a	added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works)	2023-12-20 18:45:58 -06:00
mrq	6c51a629cc	resetting step count resets the samples processed and other metrics	2023-10-29 12:11:19 -05:00
mrq	0aa2a3cc07	evaluation/validation passes language ID during training (oops)	2023-10-29 12:00:40 -05:00
mrq	ed54f4ebec	un 'experimental' the better target sequence preparation	2023-10-22 09:06:59 -05:00
mrq	9a6040383e	make validation samplers ignore sampler type	2023-10-22 09:01:47 -05:00
mrq	32d4271ca8	fixed issue with training from scratch (oops)	2023-10-21 09:55:38 -05:00
mrq	3195026dba	fixed issue with the 'add another target audio to artificially create longer sequences' for HDF5 just duplicating the utterance initially sampled	2023-10-18 20:38:33 -05:00
mrq	09cda7d3f9	added sampling by speaker group name (might be better to de-emphasize the LibriVox/Audiobooks that are in large numbers, and emphasize the smaller pools), log cleanup	2023-10-16 19:30:38 -05:00
mrq	a539f6889f	mucked around with the loss calculation, this seems better?	2023-10-13 18:22:21 -05:00
mrq	fb467b19ba	exposed rolling resp context to the web UI, added passing in language to inferencing command line	2023-10-12 23:21:01 -05:00
mrq	298fd9a5f9	fixed issue with webui	2023-10-12 22:49:25 -05:00
mrq	65f500083d	tweaks to try and get deepspeed quantized inferencing, validating bitsandbytes and deepspeed quantization, nothing seems to work	2023-10-12 22:21:43 -05:00
mrq	08bae355eb	actually use langs from the dataloader	2023-10-11 21:21:50 -05:00
mrq	3af19d79fd	oops	2023-10-11 20:49:54 -05:00
mrq	8740cdefc6	added initial support for languages (still testing, marked as model version 3), added experimental 'context extend by limiting the resp context' (untested)	2023-10-11 20:38:40 -05:00
mrq	6045cbce94	added experimental option to append utterances for training target (emphasis on experimental)	2023-10-11 17:32:45 -05:00
mrq	7facacf7c9	separated samplers into their own file, don't bother copying the logits back to the GPU after sampling, it's not necessary	2023-10-11 12:25:31 -05:00
mrq	100dd164e6	apply phoneme cleanup in inferencing as well	2023-10-10 19:21:19 -05:00
mrq	b4405c98ea	remove double spaces in the text phonemes (might have caused problems.........)	2023-10-10 19:18:24 -05:00
mrq	47b3077415	fixed mirostat issue	2023-10-10 18:09:49 -05:00
mrq	99e980d323	documentation and more better-er attribution	2023-10-10 17:15:16 -05:00
mrq	e727b6e5c1	changed dynamic temperature trigger to be a min-(n)ar-temp value between [0,(n)ar-temp), flags to set min temp, checkbox in web UI to request it	2023-10-10 17:02:33 -05:00
mrq	ec25f56bd9	used torch.max fixes things, somehow, for dynamic temp sampling	2023-10-10 16:42:24 -05:00
mrq	87db03dd93	trim the input prompt to 3 seconds when training NAR tasks (marked as experimental; the paper mentions doing so, but I don't know how much this would harm the retention heads)	2023-10-09 22:03:58 -05:00
mrq	893a610fad	cleanup, use deepspeed inferencing pathway if requested	2023-10-09 15:24:04 -05:00
mrq	26fbb92ec6	reduced dynamic temperature threshold to > 1.0, as it seems to not quite be useful for audio LMs, sped up any sampling that touches logits by copying them to CPU first (as accessing tensors on the GPU is slow as balls)	2023-10-09 14:46:17 -05:00
mrq	29873e6ded	extend the max temps in the web UI to actually allow dynamic temp sampling	2023-10-09 13:30:45 -05:00
mrq	27483e56f0	disabled preparing of SpeechX tasks, added dynamic temperature testing (to-do: test it, credited in the function)	2023-10-09 13:01:40 -05:00
mrq	2deb995cc9	updated setup script	2023-10-06 20:08:28 -05:00
mrq	1fd91b6437	cleanup	2023-10-06 10:13:54 -05:00
mrq	3db7e7dea1	implicitly load checkpoint if deepspeed checkpoint not found, updated setup script to grab the diskcached dataloader things	2023-10-06 10:02:45 -05:00
mrq	82f02ae9b1	oops	2023-10-06 09:26:52 -05:00
mrq	2f2505b12f	updated setup script	2023-10-06 08:08:28 -05:00
mrq	63cc9cf37a	added compat flags for torchscale because the maintainer for torchscale broke compat for existing models	2023-10-05 16:39:46 -05:00
mrq	12cfc9e502	added prodigyopt as a dependency because I keep forgetting	2023-10-04 19:42:56 -05:00
mrq	153f8b293c	added min-x and min-y arguments to plot.py, helper script to download from my existing checkpoint	2023-10-04 19:41:37 -05:00
mrq	777ba43305	oops	2023-10-03 15:01:37 -05:00
mrq	d12877ee09	added option to set probability of selecting the AR during training under a monolithic AR+NAR, added some more to-dos while I have them in mind	2023-10-02 16:52:42 -05:00
mrq	e85b798fbf	set default NAR levels to max for the web UI	2023-09-29 19:14:16 -05:00
mrq	c7fb740d41	do not specify a default dtype for the web UI, let it implicitly load from the yaml instead	2023-09-24 17:54:03 -05:00
mrq	4abd6564d1	fixed training stats not loading from exported weights, a bit of a readme cleanup, updated example training yaml	2023-09-23 19:59:00 -05:00
mrq	9384900ce6	revert the frankensteined "train one model but hotload the other" since it kept loading the last exported weights and I'm not supporting this usecase anymore anyways	2023-09-22 13:04:17 -05:00
mrq	e7da1eb90d	edge case	2023-09-20 19:20:17 -05:00
mrq	c0b25541e3	restructured some things with the model to remove dead weights	2023-09-20 19:10:59 -05:00
mrq	a6bfe43590	added mirostat sampling (given a partially trained model, it got far more decent output than I expected, need to test on a better trained model)	2023-09-18 18:55:41 -05:00
mrq	2567e082b5	UGH	2023-09-16 00:26:13 -05:00
mrq	22ffaf3a33	have loss for the NAR not-ignore the text prompt, I imagine this should help the NAR and explain why it's always had a bit of an issue with training	2023-09-15 19:08:44 -05:00

308 Commits