1171 Commits

Author SHA1 Message Date
James Betker
97ea329a59 Make spleeter filter simpler (and hopefully much faster) 2021-09-17 15:29:42 -06:00
James Betker
359e9e27a7 unsupervised_audio_dataset: try to recover from failures of audio2numpy 2021-09-17 15:25:57 -06:00
James Betker
5c8d266d4f chk 2021-09-17 09:15:36 -06:00
James Betker
a6544f1684 More checkpointing fixes 2021-09-16 23:12:43 -06:00
James Betker
94899d88f3 Fix overuse of checkpointing 2021-09-16 23:00:28 -06:00
James Betker
f78ce9d924 Get diffusion_dvae ready for prime time! 2021-09-16 22:43:10 -06:00
James Betker
1197ae1928 Misc 2021-09-16 10:53:56 -06:00
James Betker
6f48674647 Support diffusion models with extra return values & inference in diffusion_dvae 2021-09-16 10:53:46 -06:00
James Betker
8d9857f33d More fixes 2021-09-14 20:45:05 -06:00
James Betker
9a9c90660f Fixes 2021-09-14 18:29:17 -06:00
James Betker
4334a67924 Spleeter mods 2021-09-14 17:43:40 -06:00
James Betker
0382660159 Get diffusion_dvae functional 2021-09-14 17:43:31 -06:00
James Betker
e513052fca Add unsupervised_audio_dataset 2021-09-14 17:43:16 -06:00
James Betker
bc603c3231 Script adjustments and fixes 2021-09-12 21:26:45 -06:00
James Betker
76e2c497f7 Improvements to splitter 2021-09-09 23:34:56 -06:00
James Betker
742f9b4010 Batch spleeter cleaner using GPU 2021-09-09 23:14:32 -06:00
James Betker
73b930c0f6 Add diffusion_dvae
Increase split_on_silence interval
2021-09-09 16:22:05 -06:00
James Betker
b8f2e0f452 mydvae 2021-09-06 17:45:30 -06:00
James Betker
92e7e57f81 Update diffusion_noise_surfer to support audio 2021-09-01 08:34:47 -06:00
James Betker
3e073cff85 Set kernel_size in diffusion_vocoder 2021-09-01 08:33:46 -06:00
James Betker
30cd33fe44 another fix 2021-08-31 14:46:46 -06:00
James Betker
8810d3de97 fix wavfile_dataset 2021-08-31 14:45:29 -06:00
James Betker
dabd87246d Add unet_diffusion_vocoder 2021-08-31 14:38:33 -06:00
James Betker
fb69985dfd Update gitignore 2021-08-31 11:36:50 -06:00
James Betker
274d352e6f dug 2021-08-30 21:45:58 -06:00
James Betker
f1a0c21fb2 asr_eval 2021-08-30 21:41:34 -06:00
James Betker
ed6eae407f More scripts for splitting and formatting audio 2021-08-30 21:20:52 -06:00
James Betker
909754cc27 Add find_faulty_files.py 2021-08-25 18:00:43 -06:00
James Betker
08b33c8e3a Support silu activation 2021-08-25 09:03:14 -06:00
James Betker
67bf7f5219 dvae mods
Trying to squeeze as much performance out of this net as possible
2021-08-25 08:55:13 -06:00
James Betker
d05cc1f46c Misc 2021-08-24 17:12:04 -06:00
James Betker
9dfe936c16 Fix ddp for sampler 2021-08-19 16:45:34 -06:00
James Betker
b521d94b01 Make gpt-asr more configurable 2021-08-19 16:33:41 -06:00
James Betker
570ed327ed Stop dataset - attempt #2 2021-08-18 18:29:38 -06:00
James Betker
17453ccbe8 Revert mods to lrdvae
They didn't really change anything
2021-08-17 09:09:29 -06:00
James Betker
8332923f5c Two more tools to test the audio segmentor 2021-08-17 09:09:11 -06:00
James Betker
7c086d0c2c libritts - only write on successful check 2021-08-16 22:52:55 -06:00
James Betker
93e903af15 Rework wavfile dataset to be usable for things other than augments 2021-08-16 22:52:35 -06:00
James Betker
d7f30232c3 Oh yeah 2021-08-16 22:52:15 -06:00
James Betker
4c01d82265 Fix for voxpopuli 2021-08-16 22:52:05 -06:00
James Betker
1fede41b7b Audio segmentor 2021-08-16 22:51:53 -06:00
James Betker
2d3372054d Add support for voxpopuli to nv_tacotron_dataset 2021-08-16 17:13:40 -06:00
James Betker
729c1fd5a9 Fix up max lengths to save memory 2021-08-15 21:29:28 -06:00
James Betker
9e47e64d5a Add gpt_segmentor model
The idea is to specifically train a model that extracts phrases from
audio clips.
2021-08-15 21:23:07 -06:00
James Betker
a826d5f658 Mods to dvae
- Add resblock to each layer
- Increase filter size for each layer
- Use SiLU
2021-08-15 20:54:10 -06:00
James Betker
b8bec22f1a Fix gpt_asr inference bug 2021-08-15 20:53:42 -06:00
James Betker
3580c52eac Fix up wavfile_dataset to be able to provide a full clip 2021-08-15 20:53:26 -06:00
James Betker
a523c4f932 Auto-normalize wav files by data type 2021-08-15 09:09:51 -06:00
James Betker
98057b6516 Make lrdvae use quantized mode in eval() 2021-08-14 23:43:01 -06:00
James Betker
c28f657ab8 Allow usage of pre-rendered mels saved to npy files 2021-08-14 23:38:15 -06:00