Commit Graph

838 Commits

Author SHA1 Message Date
James Betker
0dee15f875 base DVAE & vector_quantizer 2021-10-20 21:19:38 -06:00
James Betker
f2a31702b5 Clean stuff up, move more things into arch_util 2021-10-20 21:19:25 -06:00
James Betker
a6f0f854b9 Fix codes when inferring from dvae 2021-10-17 22:51:17 -06:00
James Betker
d016a2fbad Go back to vanilla flavor of diffusion 2021-10-17 17:32:46 -06:00
James Betker
23da073037 Norm decoder outputs now 2021-10-16 09:07:10 -06:00
James Betker
0edc98f6c4 Throw out the idea of conditioning on discrete codes. Oh well :( 2021-10-16 09:02:01 -06:00
James Betker
62c8c5d93e Zero out spectrogram code inputs initially. 2021-10-15 12:10:11 -06:00
James Betker
1d0b44ebc2 More tweaks to diffusion-vocoder 2021-10-15 11:51:17 -06:00
James Betker
3b19581f9a Allow num_resblocks to be specified per-level 2021-10-14 11:26:04 -06:00
James Betker
83798887a8 Mods to support unet diffusion vocoder with conditioning 2021-10-13 21:23:18 -06:00
James Betker
33120cb35c Add norming to discretization_loss 2021-10-06 17:10:50 -06:00
James Betker
f2977d360c Allow attention_dim in channel attention to be specified, add converter 2021-10-05 17:29:38 -06:00
James Betker
9c0d7288ea Discretization loss attempt 2021-10-04 20:59:21 -06:00
James Betker
66f99a159c Rev2 2021-10-03 15:20:50 -06:00
James Betker
09f373e3b1 Add dvae with channel attention 2021-10-03 10:52:01 -06:00
James Betker
0396a9d2ca Increase baseline codes recording across all dvae models 2021-09-30 08:09:07 -06:00
James Betker
f84ccbdfb2 Fix quantizer with balancing_heuristic 2021-09-29 14:46:05 -06:00
James Betker
4914c526dc More cleanup 2021-09-29 14:24:49 -06:00
James Betker
6e550edfe3 Attentive dvae 2021-09-29 14:17:29 -06:00
James Betker
55b58fb67f Clean up codebase
Remove stuff that I'm likely not going to use again (or generally failed experiments)
2021-09-29 09:21:44 -06:00
James Betker
4d1a42e944 Add switchnorm to gumbel_quantizer 2021-09-24 18:49:25 -06:00
James Betker
ac57cdc794 Add scheduling to quantizer, enable cudnn_benchmarking to be disabled 2021-09-24 17:01:36 -06:00
James Betker
3e64e847c2 Gumbel quantizer 2021-09-23 23:32:03 -06:00
James Betker
c5297ccec6 Add dvae balancing heuristic 2021-09-23 21:19:36 -06:00
James Betker
e24c619387 Fix 2021-09-23 16:07:58 -06:00
James Betker
6833048bf7 Alterations to diffusion_dvae so it can be used directly on spectrograms 2021-09-23 15:56:25 -06:00
James Betker
5c8d266d4f chk 2021-09-17 09:15:36 -06:00
James Betker
a6544f1684 More checkpointing fixes 2021-09-16 23:12:43 -06:00
James Betker
94899d88f3 Fix overuse of checkpointing 2021-09-16 23:00:28 -06:00
James Betker
f78ce9d924 Get diffusion_dvae ready for prime time! 2021-09-16 22:43:10 -06:00
James Betker
6f48674647 Support diffusion models with extra return values & inference in diffusion_dvae 2021-09-16 10:53:46 -06:00
James Betker
0382660159 Get diffusion_dvae functional 2021-09-14 17:43:31 -06:00
James Betker
76e2c497f7 Improvements to splitter 2021-09-09 23:34:56 -06:00
James Betker
742f9b4010 Batch spleeter cleaner using GPU 2021-09-09 23:14:32 -06:00
James Betker
73b930c0f6 Add diffusion_dvae
Increase split_on_silence interval
2021-09-09 16:22:05 -06:00
James Betker
b8f2e0f452 mydvae 2021-09-06 17:45:30 -06:00
James Betker
3e073cff85 Set kernel_size in diffusion_vocoder 2021-09-01 08:33:46 -06:00
James Betker
dabd87246d Add unet_diffusion_vocoder 2021-08-31 14:38:33 -06:00
James Betker
909754cc27 Add find_faulty_files.py 2021-08-25 18:00:43 -06:00
James Betker
08b33c8e3a Support silu activation 2021-08-25 09:03:14 -06:00
James Betker
67bf7f5219 dvae mods
Trying to squeeze as much performance out of this net as possible
2021-08-25 08:55:13 -06:00
James Betker
b521d94b01 Make gpt-asr more configurable 2021-08-19 16:33:41 -06:00
James Betker
570ed327ed Stop dataset - attempt #2 2021-08-18 18:29:38 -06:00
James Betker
17453ccbe8 Revert mods to lrdvae
They didn't really change anything
2021-08-17 09:09:29 -06:00
James Betker
8332923f5c Two more tools to test the audio segmentor 2021-08-17 09:09:11 -06:00
James Betker
1fede41b7b Audio segmentor 2021-08-16 22:51:53 -06:00
James Betker
729c1fd5a9 Fix up max lengths to save memory 2021-08-15 21:29:28 -06:00
James Betker
9e47e64d5a Add gpt_segmentor model
The idea is to specifically train a model that extracts phrases from
audio clips.
2021-08-15 21:23:07 -06:00
James Betker
a826d5f658 Mods to dvae
- Add resblock to each layer
- Increase filter size for each layer
- Use SiLU
2021-08-15 20:54:10 -06:00
James Betker
b8bec22f1a Fix gpt_asr inference bug 2021-08-15 20:53:42 -06:00