Commit Graph

829 Commits

Author SHA1 Message Date
James Betker
83798887a8 Mods to support unet diffusion vocoder with conditioning 2021-10-13 21:23:18 -06:00
James Betker
33120cb35c Add norming to discretization_loss 2021-10-06 17:10:50 -06:00
James Betker
f2977d360c Allow attention_dim in channel attention to be specified, add converter 2021-10-05 17:29:38 -06:00
James Betker
9c0d7288ea Discretization loss attempt 2021-10-04 20:59:21 -06:00
James Betker
66f99a159c Rev2 2021-10-03 15:20:50 -06:00
James Betker
09f373e3b1 Add dvae with channel attention 2021-10-03 10:52:01 -06:00
James Betker
0396a9d2ca Increase baseline codes recording across all dvae models 2021-09-30 08:09:07 -06:00
James Betker
f84ccbdfb2 Fix quantizer with balancing_heuristic 2021-09-29 14:46:05 -06:00
James Betker
4914c526dc More cleanup 2021-09-29 14:24:49 -06:00
James Betker
6e550edfe3 Attentive dvae 2021-09-29 14:17:29 -06:00
James Betker
55b58fb67f Clean up codebase
Remove stuff that I'm likely not going to use again (or generally failed experiments)
2021-09-29 09:21:44 -06:00
James Betker
4d1a42e944 Add switchnorm to gumbel_quantizer 2021-09-24 18:49:25 -06:00
James Betker
ac57cdc794 Add scheduling to quantizer, enable cudnn_benchmarking to be disabled 2021-09-24 17:01:36 -06:00
James Betker
3e64e847c2 Gumbel quantizer 2021-09-23 23:32:03 -06:00
James Betker
c5297ccec6 Add dvae balancing heuristic 2021-09-23 21:19:36 -06:00
James Betker
e24c619387 Fix 2021-09-23 16:07:58 -06:00
James Betker
6833048bf7 Alterations to diffusion_dvae so it can be used directly on spectrograms 2021-09-23 15:56:25 -06:00
James Betker
5c8d266d4f chk 2021-09-17 09:15:36 -06:00
James Betker
a6544f1684 More checkpointing fixes 2021-09-16 23:12:43 -06:00
James Betker
94899d88f3 Fix overuse of checkpointing 2021-09-16 23:00:28 -06:00
James Betker
f78ce9d924 Get diffusion_dvae ready for prime time! 2021-09-16 22:43:10 -06:00
James Betker
6f48674647 Support diffusion models with extra return values & inference in diffusion_dvae 2021-09-16 10:53:46 -06:00
James Betker
0382660159 Get diffusion_dvae functional 2021-09-14 17:43:31 -06:00
James Betker
76e2c497f7 Improvements to splitter 2021-09-09 23:34:56 -06:00
James Betker
742f9b4010 Batch spleeter cleaner using GPU 2021-09-09 23:14:32 -06:00
James Betker
73b930c0f6 Add diffusion_dvae
Increase split_on_silence interval
2021-09-09 16:22:05 -06:00
James Betker
b8f2e0f452 mydvae 2021-09-06 17:45:30 -06:00
James Betker
3e073cff85 Set kernel_size in diffusion_vocoder 2021-09-01 08:33:46 -06:00
James Betker
dabd87246d Add unet_diffusion_vocoder 2021-08-31 14:38:33 -06:00
James Betker
909754cc27 Add find_faulty_files.py 2021-08-25 18:00:43 -06:00
James Betker
08b33c8e3a Support silu activation 2021-08-25 09:03:14 -06:00
James Betker
67bf7f5219 dvae mods
Trying to squeeze as much performance out of this net as possible
2021-08-25 08:55:13 -06:00
James Betker
b521d94b01 Make gpt-asr more configurable 2021-08-19 16:33:41 -06:00
James Betker
570ed327ed Stop dataset - attempt #2 2021-08-18 18:29:38 -06:00
James Betker
17453ccbe8 Revert mods to lrdvae
They didn't really change anything
2021-08-17 09:09:29 -06:00
James Betker
8332923f5c Two more tools to test the audio segmentor 2021-08-17 09:09:11 -06:00
James Betker
1fede41b7b Audio segmentor 2021-08-16 22:51:53 -06:00
James Betker
729c1fd5a9 Fix up max lengths to save memory 2021-08-15 21:29:28 -06:00
James Betker
9e47e64d5a Add gpt_segmentor model
The idea is to specifically train a model that extracts phrases from
audio clips.
2021-08-15 21:23:07 -06:00
James Betker
a826d5f658 Mods to dvae
- Add resblock to each layer
- Increase filter size for each layer
- Use SiLU
2021-08-15 20:54:10 -06:00
James Betker
b8bec22f1a Fix gpt_asr inference bug 2021-08-15 20:53:42 -06:00
James Betker
a523c4f932 Auto-normalize wav files by data type 2021-08-15 09:09:51 -06:00
James Betker
98057b6516 Make lrdvae use quantized mode in eval() 2021-08-14 23:43:01 -06:00
James Betker
ad3391bd96 Fix nan issue when interpolating audio 2021-08-14 20:42:01 -06:00
James Betker
d6a73acaed Allow processing of multiple audio sources at once from nv_tacotron_dataset 2021-08-14 16:04:05 -06:00
James Betker
007976082b GPT_asr for inference 2021-08-14 14:37:17 -06:00
James Betker
e1bdd3f7c7 Fix gpt_asr bug. Initial implementation of beam search 2021-08-13 22:47:00 -06:00
James Betker
cdee31c60b GPT_ASR 2021-08-13 15:02:18 -06:00
James Betker
f5a9b88ef6 tacotron cleaners: remove quotation marks
these don't really have relevance for tts or asr
2021-08-11 16:18:44 -06:00
James Betker
20586a8edc Fix LRDVAE bug with quantizer integration 2021-08-11 16:17:22 -06:00