forked from mrq/DL-Art-School
Commit 009a1e8404
This new one has a "cheating" top layer that does not feed down into the unet encoder but does consume the outputs of the unet. This cheater operates on only half of the input, while the rest of the unet operates on the full input. This limits the dimensionality of the last layer, on the assumption that these top layers consume by far the most computation and memory but do not require the full input context. Losses are only computed on half of the aggregate input.
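For intuition, here is a minimal PyTorch sketch of that arrangement. All names below (`CheaterTop`, `half_loss`, the tensor shapes) are hypothetical stand-ins for illustration, not the actual code in `unet_diffusion_vocoder_with_ref_trunc_top.py`.

```python
# Sketch of the "cheating" top layer: the base unet runs on the full
# input, and this block only consumes the unet's output, never feeding
# the encoder, and only over half of the sequence. Hypothetical names.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CheaterTop(nn.Module):
    """Top-level block operating on half the input, at full unet width."""

    def __init__(self, channels: int):
        super().__init__()
        self.proc = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv1d(channels, 1, kernel_size=3, padding=1),
        )

    def forward(self, unet_out: torch.Tensor) -> torch.Tensor:
        # Only the second half of the sequence is processed here; the
        # unet below already saw the full input, so this half still has
        # full context flowing up from the lower layers.
        half = unet_out.shape[-1] // 2
        return self.proc(unet_out[..., half:])


def half_loss(pred_half: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Loss is computed on only half of the aggregate input.
    half = target.shape[-1] // 2
    return F.mse_loss(pred_half, target[..., half:])


# Usage with a pretend unet output of shape (batch, channels, time):
unet_out = torch.randn(2, 64, 1024)
target = torch.randn(2, 1, 1024)
loss = half_loss(CheaterTop(64)(unet_out), target)
```

Because the expensive full-width convolutions in `CheaterTop` run over half as many timesteps, the top of the network costs roughly half the compute and memory it otherwise would.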
__init__.py
gpt_asr_hf.py
gpt_asr_hf2.py
gpt_tts_hf.py
lucidrains_dvae.py
mini_encoder.py
pixelshuffle_1d.py
text_voice_clip.py
transformer_builders.py
unet_diffusion_vocoder_with_ref_trunc_top.py
unet_diffusion_vocoder_with_ref.py
unified_voice_bilevel.py
unified_voice.py
unified_voice2.py
voice_voice_clip.py