James Betker
767f963392
Try to squeeze a bit more performance out of this arch
2022-07-20 23:51:11 -06:00
James Betker
b9d0f7e6de
simplify parameterization a bit
2022-07-20 23:41:54 -06:00
James Betker
ee8ceed6da
rework tfd13 further
...
- Use a gated activation layer for both attention & convs
- Add a relative, learned position bias. I believe this is similar to the T5 position encodings, but it is simpler and learned.
- Get rid of prepending to the attention matrix - it doesn't really work that well. The model eventually learns to attend one of its heads to these blocks, but if it's doing that, why not just concat?
2022-07-20 23:28:29 -06:00
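The commit above only names the position-bias change; as a hypothetical sketch (the function name, the one-scalar-per-offset table, and the shapes are my assumptions, not the repo's code), a "simpler than T5" relative position bias can be read as a directly learned scalar per relative offset, added to the attention logits:

```python
def relative_position_bias(seq_len, bias_table):
    # bias_table holds one learned scalar per relative offset in
    # [-(seq_len - 1), seq_len - 1]; offset d is stored at index d + seq_len - 1.
    # The resulting (seq_len x seq_len) matrix would be added to the
    # attention logits before softmax.
    return [
        [bias_table[j - i + seq_len - 1] for j in range(seq_len)]
        for i in range(seq_len)
    ]

# 3-token sequence: offsets -2..2 map to table indices 0..4
bias = relative_position_bias(3, [0.0, 0.1, 0.2, 0.3, 0.4])
```

Unlike T5's bucketed scheme, every offset here gets its own learned parameter; per-head tables and a maximum-offset clamp are omitted for brevity.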
James Betker
40427de8e3
update tfd13 for inference
2022-07-20 21:51:25 -06:00
James Betker
dbebe18602
Fix ts=0 with new formulation
2022-07-20 12:12:33 -06:00
James Betker
82bd62019f
diffuse the cascaded prior for continuous sr model
2022-07-20 11:54:09 -06:00
James Betker
b0e3be0a17
transition to nearest interpolation mode for downsampling
2022-07-20 10:56:17 -06:00
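For reference, nearest-mode downsampling just index-maps each output sample back to a source sample; a minimal plain-Python sketch (the helper name is mine, not the repo's), consistent with `F.interpolate(mode='nearest')` for integer factors:

```python
def nearest_downsample(signal, out_len):
    # Nearest-mode resampling: output index i reads source index
    # floor(i * in_len / out_len). No averaging, so no new values are
    # introduced - only existing samples are selected.
    in_len = len(signal)
    return [signal[i * in_len // out_len] for i in range(out_len)]

halved = nearest_downsample([10, 20, 30, 40, 50, 60], 3)  # factor-2 downsample
```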
James Betker
15decfdb98
misc
2022-07-20 10:19:02 -06:00
James Betker
fc0b291b21
do masking up properly
2022-07-19 16:32:17 -06:00
James Betker
c00398e955
scope attention in tfd13 as well
2022-07-19 14:59:43 -06:00
James Betker
6b1cfe8e66
ugh
2022-07-19 11:14:20 -06:00
James Betker
eecb534e66
a few fixes to multiresolution sr
2022-07-19 11:11:15 -06:00
James Betker
df27b98730
DDP doesn't like dropout on checkpointed values
2022-07-18 17:17:04 -06:00
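The commit doesn't say which failure mode was hit, but one known hazard of dropout inside activation checkpointing is RNG drift: the backward pass reruns the forward, and if the RNG state isn't restored, the recomputed dropout mask disagrees with the one the graph was built from. A hypothetical plain-Python illustration (not the repo's code; `torch.utils.checkpoint`'s `preserve_rng_state` guards against exactly this):

```python
import random

def dropout(x, p, rng):
    # Inverted dropout: zero each element with prob p, scale survivors by 1/(1-p).
    return [0.0 if rng.random() < p else v / (1 - p) for v in x]

rng = random.Random(0)
state = rng.getstate()                # snapshot RNG before the forward pass
first = dropout([1.0] * 8, 0.5, rng)  # mask used in the real forward

# Checkpointing reruns the forward during backward. Without restoring the
# RNG, fresh draws produce a different mask than the original pass.
recomputed_drifted = dropout([1.0] * 8, 0.5, rng)

rng.setstate(state)                   # restore, as preserve_rng_state does
recomputed_correct = dropout([1.0] * 8, 0.5, rng)
```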
James Betker
c959e530cb
good ole DDP..
2022-07-18 17:13:45 -06:00
James Betker
cf57c352c8
Another fix
2022-07-18 17:09:13 -06:00
James Betker
83a4ef4149
default to use input for conditioning & add preprocessed input to GDI
2022-07-18 17:01:19 -06:00
James Betker
1b4d9567f3
tfd13 for multi-resolution superscaling
2022-07-18 16:36:22 -06:00