James Betker
0604060580
Finish up mods for next version of GptAsrHf
2021-11-20 21:33:49 -07:00
James Betker
14f3155ec4
misc
2021-11-20 17:45:14 -07:00
James Betker
687e0746b3
Add Torch-derived MelSpectrogramInjector
2021-11-18 20:02:45 -07:00
James Betker
c30a38cdf1
Undo baseline GDI changes
2021-11-18 20:02:09 -07:00
James Betker
f36bab95dd
Audio resample injector
2021-11-10 20:06:33 -07:00
James Betker
79367f753d
Fix error & add nonfinite warning
2021-11-09 23:58:41 -07:00
James Betker
d43f25cc20
Update losses
2021-11-08 20:10:07 -07:00
James Betker
596a62fe01
Apply fix to gpt_asr_hf and prep it for inference
...
Fix is that we were predicting two characters in advance, not next character
2021-11-04 10:09:24 -06:00
James Betker
993bd52d42
Add spec_augment injector
2021-11-01 18:43:11 -06:00
James Betker
ee9b199d2b
Build in capacity to revert & resume networks that encounter a NaN
...
I'm increasingly seeing issues where something like this can be useful. In many (most?)
cases it's just a waste of compute, though. Still, better than a cold computer for a whole
night.
2021-11-01 16:14:59 -06:00
James Betker
87364b890f
Add custom clip_grad_norm that prints out the param names in error.
2021-11-01 11:12:20 -06:00
James Betker
b404a3b747
Revert recent changes to extr
2021-10-30 20:48:06 -06:00
James Betker
e9dc37f19c
Mod trainer to copy config file into experiments root
2021-10-30 17:00:24 -06:00
James Betker
928e7026c2
Mod STFT injector to be specifiable
2021-10-28 22:34:12 -06:00
James Betker
c3421b7f6d
Dataset work for audio quality processor
2021-10-24 09:09:34 -06:00
James Betker
9a3e89ec53
Force LR fix
2021-10-21 12:01:01 -06:00
James Betker
40cb25292a
Fix force_lr logic
2021-10-21 11:51:30 -06:00
James Betker
d016a2fbad
Go back to vanilla flavor of diffusion
2021-10-17 17:32:46 -06:00
James Betker
4914c526dc
More cleanup
2021-09-29 14:24:49 -06:00
James Betker
e24c619387
Fix
2021-09-23 16:07:58 -06:00
James Betker
5c8d266d4f
chk
2021-09-17 09:15:36 -06:00
James Betker
94899d88f3
Fix overuse of checkpointing
2021-09-16 23:00:28 -06:00
James Betker
f78ce9d924
Get diffusion_dvae ready for prime time!
2021-09-16 22:43:10 -06:00
James Betker
6f48674647
Support diffusion models with extra return values & inference in diffusion_dvae
2021-09-16 10:53:46 -06:00
James Betker
b8f2e0f452
mydvae
2021-09-06 17:45:30 -06:00
James Betker
92e7e57f81
Update diffusion_noise_surfer to support audio
2021-09-01 08:34:47 -06:00
James Betker
3e073cff85
Set kernel_size in diffusion_vocoder
2021-09-01 08:33:46 -06:00
James Betker
dabd87246d
Add unet_diffusion_vocoder
2021-08-31 14:38:33 -06:00
James Betker
909754cc27
Add find_faulty_files.py
2021-08-25 18:00:43 -06:00
James Betker
cfd284f425
Fix up some stuff that allows the MEL to be computed on-GPU
2021-08-13 18:35:55 -06:00
James Betker
cdee31c60b
GPT_ASR
2021-08-13 15:02:18 -06:00
James Betker
04d14b3acc
No batch factors for eval
2021-08-09 16:02:01 -06:00
James Betker
82fc69abfa
Add "pure" evaluator
...
Which simply computes the training loss against an eval dataset
2021-08-09 14:58:35 -06:00
James Betker
b43683b772
Add lucidrains_dvae
2021-08-06 12:03:46 -06:00
James Betker
3ca51e80b2
Only fix weird path bug in windows
2021-08-05 22:21:25 -06:00
James Betker
5037220ac7
Mods to support contrastive learning on audio files
2021-08-05 05:57:04 -06:00
James Betker
341f28dd82
It works!
2021-08-04 20:07:51 -06:00
James Betker
4c98b9703f
Get dalle-style TTS to "work"
2021-08-03 21:08:27 -06:00
James Betker
2814307eee
Alterations to support VQVAE on mel spectrograms
2021-08-01 07:54:21 -06:00
James Betker
965f6e6b52
Fixes to weight_decay in adamw
2021-07-31 15:58:41 -06:00
James Betker
0c9e75bc69
Improvements to GptTts
2021-07-31 15:57:57 -06:00
James Betker
96e90e7047
Add support for a gaussian-diffusion-based wave tacotron
2021-07-26 16:27:31 -06:00
James Betker
97d7cbbc34
Additional work for audio xformer (which doesnt really do a great job)
2021-07-23 10:58:14 -06:00
James Betker
2325e7a88c
Allow inference for vqvae
2021-07-20 10:40:05 -06:00
James Betker
d81386c1be
Mods to support vqvae in audio mode (1d)
2021-07-20 08:36:46 -06:00
James Betker
5584cfcc7a
tacotron2 work
2021-07-14 21:41:57 -06:00
James Betker
be2745f42d
Add waveglow & inference capabilities to audio generator
2021-07-08 23:07:36 -06:00
James Betker
1ff434218e
tacotron2, ready for prime time!
2021-07-08 22:13:44 -06:00
James Betker
86fd3ad7fd
Initial checkin of nvidia tacotron model & dataset
...
These two are tested, full support for training to come.
2021-07-06 11:11:35 -06:00
James Betker
6fd16ea9c8
Add meta-anomaly detection, colorjitter augmentation
2021-06-29 13:41:55 -06:00