Commit Graph

1268 Commits

Author SHA1 Message Date
James Betker
c0f61a2e15 Rework how DVAE tokens are ordered
It might make more sense to have top tokens, then bottom tokens
with top tokens having different discretized values.
2021-08-05 07:07:17 -06:00
James Betker
4017236ba9 Fix up inference for gpt_tts 2021-08-05 06:46:30 -06:00
James Betker
5037220ac7 Mods to support contrastive learning on audio files 2021-08-05 05:57:04 -06:00
James Betker
341f28dd82 It works! 2021-08-04 20:07:51 -06:00
James Betker
36c7c1fbdb Fix training flow for NEXT TOKEN prediction instead of same token prediction
doh
2021-08-04 10:28:09 -06:00
James Betker
d9936df363 Add gpt_tts dataset and implement inference
- Adds a script which preprocesses quantized mels given a DVAE
- Adds a dataset which can consume preprocessed qmels
- Reworks GPT TTS to consume the outputs of that dataset (removes logic to add padding and start/end tokens)
- Adds inference to gpt_tts
2021-08-04 00:44:04 -06:00
James Betker
4c98b9703f Get dalle-style TTS to "work" 2021-08-03 21:08:27 -06:00
James Betker
2814307eee Alterations to support VQVAE on mel spectrograms 2021-08-01 07:54:21 -06:00
James Betker
965f6e6b52 Fixes to weight_decay in adamw 2021-07-31 15:58:41 -06:00
James Betker
0c9e75bc69 Improvements to GptTts 2021-07-31 15:57:57 -06:00
James Betker
31ee9ae262 Checkin 2021-07-30 23:07:35 -06:00
James Betker
dadc54795c Add gpt_tts 2021-07-27 20:33:30 -06:00
James Betker
398185e109 More work on wave-diffusion 2021-07-27 05:36:17 -06:00
James Betker
49e3b310ea Allow audio sample rate interpolation for faster training 2021-07-26 17:44:06 -06:00
James Betker
96e90e7047 Add support for a gaussian-diffusion-based wave tacotron 2021-07-26 16:27:31 -06:00
James Betker
97d7cbbc34 Additional work for audio xformer (which doesnt really do a great job) 2021-07-23 10:58:14 -06:00
James Betker
2325e7a88c Allow inference for vqvae 2021-07-20 10:40:05 -06:00
James Betker
dfbb806a6e Add tacotron2 recipe 2021-07-20 08:37:57 -06:00
James Betker
d81386c1be Mods to support vqvae in audio mode (1d) 2021-07-20 08:36:46 -06:00
James Betker
5584cfcc7a tacotron2 work 2021-07-14 21:41:57 -06:00
James Betker
fe0c699ced Various fixes 2021-07-14 00:08:42 -06:00
James Betker
be2745f42d Add waveglow & inference capabilities to audio generator 2021-07-08 23:07:36 -06:00
James Betker
3febe6cbf4 Sample tacotron config 2021-07-08 22:14:10 -06:00
James Betker
1ff434218e tacotron2, ready for prime time! 2021-07-08 22:13:44 -06:00
James Betker
86fd3ad7fd Initial checkin of nvidia tacotron model & dataset
These two are tested, full support for training to come.
2021-07-06 11:11:35 -06:00
James Betker
3801d5d55e diffusion surfin' 2021-07-06 09:36:52 -06:00
James Betker
afa41f1804 Allow hq color jittering and corruptions that are not included in the corruption factor 2021-06-30 09:44:46 -06:00
James Betker
6fd16ea9c8 Add meta-anomaly detection, colorjitter augmentation 2021-06-29 13:41:55 -06:00
James Betker
46e9f62be0 Add unet with latent guide
This is a diffusion network that uses both a LQ image
and a reference sample HQ image that is compressed into
a latent vector to perform upsampling

The hope is that we can steer the upsampling network
with sample images.
2021-06-26 11:02:58 -06:00
James Betker
0ded106562 Merge remote-tracking branch 'origin/master' 2021-06-25 13:16:28 -06:00
James Betker
a57ed8e960 Various mods to support better jpeg image filtering 2021-06-25 13:16:15 -06:00
James Betker
61e7ca39cd
Update image_folder_dataset.py 2021-06-25 11:48:31 -06:00
James Betker
a0ef07ddb8
Create unet_latent_guide.py 2021-06-25 11:25:14 -06:00
James Betker
e7890dc0ba Misc fixes for diffusion nets 2021-06-21 10:38:07 -06:00
James Betker
8e3a33e001 Fix a bug where non-rank-0 is computing FID before all images are saved. 2021-06-16 16:27:09 -06:00
James Betker
68cbbed886 Add some cool diffusion testing scripts 2021-06-16 16:26:36 -06:00
James Betker
730c0135fd Guided diffusion documentation 2021-06-15 17:14:37 -06:00
James Betker
ae8de0cb9d fid saving images across all rank fix 2021-06-15 10:31:07 -06:00
James Betker
6a75bd0777 Another fix 2021-06-14 09:51:44 -06:00
James Betker
54bff35171 Fix issue where eval was not being used by all ddp processes 2021-06-14 09:50:04 -06:00
James Betker
60079a1572 Fix saver in distributed mode 2021-06-14 09:41:06 -06:00
James Betker
545f2db170 Distributed FID dataset across processes 2021-06-14 09:33:44 -06:00
James Betker
6b32c87dcb Try to make diffusion fid more deterministic 2021-06-14 09:27:43 -06:00
James Betker
5b4f86293f Add FID evaluator for diffusion models 2021-06-14 09:14:30 -06:00
James Betker
9cfe840872 Attempt to fix syncing multiple times when doing gradient accumulation 2021-06-13 14:30:30 -06:00
James Betker
1cd75dfd33 Fix ddp bug 2021-06-13 10:25:23 -06:00
James Betker
3e3ad7825f Add support for training an EMA network alongside the main networks 2021-06-12 21:01:41 -06:00
James Betker
696f320820 Get rid of feature networks 2021-06-11 20:50:07 -06:00
James Betker
65c474eecf Various changes to fix testing 2021-06-11 15:31:10 -06:00
James Betker
220f11a5e4 Half channel sizes in cifar_resnet 2021-06-09 17:06:37 -06:00