Commit Graph

2137 Commits

Author SHA1 Message Date
mrq
0f04206aa2 added ability to toggle some settings with envvars for later testing without needing to manually edit this file (and some other things like disabling it when a user requests it in the future) 2023-02-24 23:08:56 +00:00
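The envvar toggles described in the commit above can be sketched like this; the variable names (e.g. MRQ_USE_BITSANDBYTES) are illustrative placeholders, not the repo's actual settings:

    # Minimal sketch of overriding defaults via environment variables so settings can be
    # toggled for testing without editing the file. Variable names here are hypothetical.
    import os

    def env_flag(name, default):
        # Accept common truthy strings; anything unset falls back to the hard-coded default.
        value = os.environ.get(name)
        if value is None:
            return default
        return value.strip().lower() in ("1", "true", "yes", "on")

    USE_BITSANDBYTES = env_flag("MRQ_USE_BITSANDBYTES", False)
    LOW_VRAM = env_flag("MRQ_LOW_VRAM", False)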
mrq
1433b7c0ea working Embedding override 2023-02-23 07:28:27 +00:00
mrq
94aefa3e4c silence 2023-02-23 07:25:09 +00:00
mrq
fd66c4104b ugh 2023-02-23 07:18:07 +00:00
mrq
7bcedca771 I guess I can't easily toggle it outside of here, but it works 2023-02-23 07:02:06 +00:00
mrq
0ef8ab6872 shut up 2023-02-23 06:12:27 +00:00
mrq
58600274ac Disabling bitsandbytes optimization as default for now, on the off chance that it actually produces garbage (which shouldn't happen, there's no chance, if training at float16 from a model at float16 works fine, then this has to work) 2023-02-23 03:22:59 +00:00
mrq
918473807f Merge pull request 'bitsandbytes' (#2) from bitsandbytes into master
Reviewed-on: #2
2023-02-23 03:16:25 +00:00
mrq
6676c89c0e I sucked off the hypothetical wizard again, just using BNB's ADAM optimizer nets HUGE savings, but I don't know the output costs, will need to test 2023-02-23 02:42:17 +00:00
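The savings mentioned above come from replacing the stock optimizer with bitsandbytes' 8-bit Adam, which keeps optimizer state in 8-bit. A minimal sketch (the tiny linear model, learning rate, and CUDA device are placeholders, not this repo's training code):

    # Swap torch.optim.Adam for bitsandbytes' 8-bit Adam to shrink optimizer-state memory.
    # Requires a CUDA device; the dummy model exists only to make the example runnable.
    import torch
    import bitsandbytes as bnb

    model = torch.nn.Linear(512, 512).cuda()
    optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-4, betas=(0.9, 0.999))

    x = torch.randn(8, 512, device="cuda")
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()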
mrq
01c0941a40 binaries 2023-02-22 23:09:27 +00:00
mrq
4427d7fb84 initial conversion (errors out) 2023-02-22 23:07:05 +00:00
mrq
6c284ef8ec oops 2023-02-18 03:27:04 +00:00
mrq
8db762fa17 thought I copied this over 2023-02-18 03:15:44 +00:00
mrq
73d9c3bd46 set output folder to be sane with the cwd as a reference point 2023-02-18 02:01:09 +00:00
mrq
5ecf7da881 Fix later 2023-02-17 20:49:29 +00:00
mrq
e3e8801e5f Fix I thought wasn't needed since it literally worked without it earlier 2023-02-17 20:41:20 +00:00
mrq
535549c3f3 add some snark about the kludge I had to fix, and the kludge I used to fix it 2023-02-17 19:20:19 +00:00
mrq
a09cf98c7f more cleanup, pip-ifying won't work, got an alternative 2023-02-17 15:47:55 +00:00
mrq
6afa2c299e break if your dataset size is smaller than your batch size 2023-02-17 04:08:27 +00:00
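The change above makes training fail fast instead of misbehaving when the dataset is smaller than the batch size; a minimal sketch of that kind of guard (the helper name and message wording are illustrative):

    # Raise early if the dataset cannot fill a single batch.
    def validate_batch_size(dataset, batch_size):
        if len(dataset) < batch_size:
            raise ValueError(
                f"Dataset has {len(dataset)} samples but batch size is {batch_size}; "
                "lower the batch size or add more data."
            )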
mrq
94d0f16608 Necessary fixes to get it to work 2023-02-17 02:03:00 +00:00
mrq
49e23b226b pip-ify 2023-02-17 00:33:50 +00:00
James Betker
f31a333c4f more sampling fixes 2022-10-10 20:11:28 -06:00
James Betker
5d172fbf7e Fix eval 2022-10-10 14:22:36 -06:00
James Betker
9502e0755e ugh 2022-10-10 12:15:51 -06:00
James Betker
fce2c8f5db and listify them 2022-10-10 12:13:49 -06:00
James Betker
3cf78e3c44 train mel head even when not 2022-10-10 12:10:56 -06:00
James Betker
cc74a43675 Checkin 2022-10-10 11:30:20 -06:00
James Betker
3cb14123bc glc fix 2022-07-29 11:24:36 -06:00
James Betker
4ddd01a7fb support generating cheaters from the new cheater network 2022-07-29 09:19:20 -06:00
James Betker
27a9b1b750 rename perplexity->log perplexity 2022-07-28 09:48:40 -06:00
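For context on the rename above: log perplexity is just the mean negative log-likelihood, and perplexity is its exponential. A minimal sketch with placeholder tensors (not the repo's eval code):

    # log perplexity = mean NLL; perplexity = exp(log perplexity)
    import torch
    import torch.nn.functional as F

    logits = torch.randn(32, 1000)            # (batch, vocab) dummy scores
    targets = torch.randint(0, 1000, (32,))   # dummy labels

    log_perplexity = F.cross_entropy(logits, targets)
    perplexity = log_perplexity.exp()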
James Betker
1d68624828 fix some imports.. 2022-07-28 02:35:32 -06:00
James Betker
cfe907f13f i like this better 2022-07-28 02:33:23 -06:00
James Betker
d44ed5d12d probably too harsh on ninfs 2022-07-28 01:33:54 -06:00
James Betker
4509cfc705 track logperp for diffusion evals 2022-07-28 01:30:44 -06:00
James Betker
19eb939ccf gd perplexity
# Conflicts:
#	codes/trainer/eval/music_diffusion_fid.py
2022-07-28 00:25:05 -06:00
James Betker
a1bbde8a43 few things 2022-07-26 11:52:03 -06:00
James Betker
f8108cfdb2 update environment and fix a bunch of deps 2022-07-24 23:43:25 -06:00
James Betker
45afefabed fix booboo 2022-07-24 18:00:14 -06:00
James Betker
cc62ba9cba few more tfd13 things 2022-07-24 17:39:33 -06:00
James Betker
f3d967dbf5 remove eta from mdf 2022-07-24 17:21:20 -06:00
James Betker
76464ca063 some fixes to mdf to support new archs 2022-07-21 10:55:50 -06:00
James Betker
13c263e9fb go all in on m2wv3 2022-07-21 00:51:27 -06:00
James Betker
24a78bd7d1 update tfd14 too 2022-07-21 00:45:33 -06:00
James Betker
02ebda42f2 #yolo 2022-07-21 00:43:03 -06:00
James Betker
b92ff8de78 misc 2022-07-20 23:59:32 -06:00
James Betker
a1743d26aa Revert "Try to squeeze a bit more performance out of this arch"
This reverts commit 767f963392.
2022-07-20 23:57:56 -06:00
James Betker
767f963392 Try to squeeze a bit more performance out of this arch 2022-07-20 23:51:11 -06:00
James Betker
b9d0f7e6de simplify parameterization a bit 2022-07-20 23:41:54 -06:00
James Betker
ee8ceed6da rework tfd13 further
- use a gated activation layer for both attention & convs
- add a relativistic learned position bias. I believe this is similar to the T5 position encodings but it is simpler and learned
- get rid of prepending to the attention matrix - this doesn't really work that well. the model eventually learns to attend one of its heads to these blocks but why not just concat if it is doing that?
2022-07-20 23:28:29 -06:00
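The gated activation layer mentioned in the tfd13 rework above can be sketched as a GLU-style gate; the exact gating and position-bias details in tfd13 may differ, so treat this as an assumption-laden illustration rather than the repo's implementation:

    # GLU-style gated activation: project to twice the width, split into value and gate,
    # and modulate the value path with a sigmoid of the gate path.
    import torch
    import torch.nn as nn

    class GatedActivation(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.proj = nn.Linear(dim, dim * 2)

        def forward(self, x):
            value, gate = self.proj(x).chunk(2, dim=-1)
            return value * torch.sigmoid(gate)

    layer = GatedActivation(256)
    out = layer(torch.randn(4, 10, 256))      # (batch, sequence, channels)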
James Betker
40427de8e3 update tfd13 for inference 2022-07-20 21:51:25 -06:00