DL-Art-School/codes/models/gpt_voice
James Betker 009a1e8404 Add a new diffusion_vocoder that should be trainable faster
This new one has a "cheating" top layer that does not feed down into the unet encoder,
but does consume the outputs of the unet. This cheater operates on only half of the input,
while the rest of the unet operates on the full input. This limits the dimensionality of the last
layer, on the assumption that these top layers consume by far the most computation and memory
but do not require the full input context.

Losses are only computed on half of the aggregate input.
2022-01-11 17:26:07 -07:00
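The commit message above can be illustrated with a minimal PyTorch sketch. This is a toy stand-in, not the actual code in unet_diffusion_vocoder_with_ref_trunc_top.py: the class and layer names (`CheaterTopSketch`, `trunk`, `cheater`) are invented for illustration. A trunk (standing in for the unet) sees the full input, while a "cheating" top layer consumes the trunk's features for only half of the sequence and feeds nothing back into it, so the most expensive layer runs at half the length; the loss is then computed on only that half of the target.

```python
import torch
import torch.nn as nn

class CheaterTopSketch(nn.Module):
    # Hypothetical illustration of the idea; names do not come from the repo.
    def __init__(self, channels=16):
        super().__init__()
        # Stand-in for the unet: operates on the full input.
        self.trunk = nn.Conv1d(1, channels, kernel_size=3, padding=1)
        # "Cheating" top layer: consumes trunk outputs but does not feed
        # anything back down into the trunk/encoder.
        self.cheater = nn.Conv1d(channels, 1, kernel_size=3, padding=1)

    def forward(self, x):
        feats = self.trunk(x)        # full-length features from the trunk
        half = feats.shape[-1] // 2
        # The cheater only operates on half of the input's features.
        return self.cheater(feats[..., half:])

model = CheaterTopSketch()
x = torch.randn(2, 1, 64)            # (batch, channels, samples)
out = model(x)                       # half-length output: (2, 1, 32)
target = torch.randn(2, 1, 64)
# Loss is only computed on the matching half of the aggregate input.
loss = nn.functional.mse_loss(out, target[..., 32:])
```

The saving comes from the top layer processing a sequence half as long, which matters most when the widest layers dominate memory and compute.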
__init__.py
gpt_asr_hf.py
gpt_asr_hf2.py
gpt_tts_hf.py
lucidrains_dvae.py
mini_encoder.py
pixelshuffle_1d.py
text_voice_clip.py Fixes 2022-01-10 14:32:04 -07:00
transformer_builders.py
unet_diffusion_vocoder_with_ref_trunc_top.py Add a new diffusion_vocoder that should be trainable faster 2022-01-11 17:26:07 -07:00
unet_diffusion_vocoder_with_ref.py
unified_voice_bilevel.py
unified_voice.py
unified_voice2.py fix unified_voice 2022-01-10 16:17:31 -07:00
voice_voice_clip.py