- Use a gated activation layer for both attention and convolutions.
- Add a relative learned position bias. I believe this is similar to the T5 position encodings, but it is simpler and fully learned.
- Get rid of prepending to the attention matrix; that doesn't really work well. The model eventually learns to attend one of its heads to these prepended blocks, so why not just concatenate if that is what it is doing anyway? (A minimal sketch of the first two points follows below.)
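Below is a minimal PyTorch sketch of the first two ideas: a learned relative position bias added to the attention logits, and a GLU-style gated activation. The class names, shapes, and the `heads` / `max_positions` parameters are illustrative assumptions, not taken from this repository; in particular, "simpler and learned" is read here as a plain learned bias table indexed by relative offset, without T5's distance bucketing.

```python
# Sketch only: assumed shapes and hyperparameters, not this repo's actual modules.
import torch
import torch.nn as nn


class LearnedRelativePositionBias(nn.Module):
    """Learned bias added to attention logits, indexed by relative distance.

    Similar in spirit to T5's relative position encodings, but just a learned
    table with one entry per (head, relative offset) and no bucketing.
    """

    def __init__(self, heads: int, max_positions: int):
        super().__init__()
        # Offsets range over [-(max_positions-1), max_positions-1].
        self.bias = nn.Parameter(torch.zeros(heads, 2 * max_positions - 1))
        self.max_positions = max_positions

    def forward(self, seq_len: int) -> torch.Tensor:
        pos = torch.arange(seq_len, device=self.bias.device)
        # rel[i, j] = (j - i), shifted so it is a valid index into the table.
        rel = pos[None, :] - pos[:, None] + self.max_positions - 1
        return self.bias[:, rel]  # (heads, seq_len, seq_len), broadcasts over batch


class GatedActivation(nn.Module):
    """GLU-style gate: half the channels gate the other half."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        value, gate = x.chunk(2, dim=1)
        return value * torch.sigmoid(gate)


if __name__ == "__main__":
    heads, seq_len, dim = 4, 16, 32
    q = k = torch.randn(2, heads, seq_len, dim)
    logits = q @ k.transpose(-2, -1) / dim ** 0.5
    logits = logits + LearnedRelativePositionBias(heads, max_positions=64)(seq_len)
    attn = logits.softmax(dim=-1)
    print(attn.shape)  # torch.Size([2, 4, 16, 16])

    # A conv producing 2x channels, then gated down to the target width.
    conv_out = torch.randn(2, 16, seq_len)
    print(GatedActivation()(conv_out).shape)  # torch.Size([2, 8, 16])
```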
Directory contents:

- asr
- music
- tts
- vocoders
- __init__.py
- audio_resnet.py
- mel2vec.py