1872 Commits

Author SHA1 Message Date
James Betker
1d29999648 Updates to the TTS production scripts 2022-02-03 20:00:01 -07:00
James Betker
bc506d4bcd Mods to unet_diffusion_tts6 to support super resolution mode 2022-02-03 19:59:39 -07:00
James Betker
4249681c4b Mods to support an autoregressive CTC code generator 2022-02-03 19:58:54 -07:00
James Betker
8132766d38 tts6 2022-01-31 20:15:06 -07:00
James Betker
fbea6e8eac Adjustments to diffusion networks 2022-01-30 16:14:06 -07:00
James Betker
e58dab14c3 new diffusion updates from testing 2022-01-29 11:01:01 -07:00
James Betker
935a4e853e get rid of nil tokens in <2> 2022-01-27 22:45:57 -07:00
James Betker
0152174c0e Add wandb_step_factor argument 2022-01-27 19:58:58 -07:00
James Betker
e0e36ed98c Update use_diffuse_tts 2022-01-27 19:57:28 -07:00
James Betker
a77d376ad2 rename unet diffusion tts and add 3 2022-01-27 19:56:24 -07:00
James Betker
7badbf1b4d update usage scripts 2022-01-25 17:57:26 -07:00
James Betker
8c255811ad more fixes 2022-01-25 17:57:16 -07:00
James Betker
0f3ca28e39 Allow diffusion model to be trained with masking tokens 2022-01-25 14:26:21 -07:00
James Betker
798ed7730a i like wasting time 2022-01-24 18:12:08 -07:00
James Betker
fc09cff4b3 angry 2022-01-24 18:09:29 -07:00
James Betker
cc0d9f7216 Fix 2022-01-24 18:05:45 -07:00
James Betker
3a9e3a9db3 consolidate state 2022-01-24 17:59:31 -07:00
James Betker
dfef34ba39 Load ema to cpu memory if specified 2022-01-24 15:08:29 -07:00
James Betker
49edffb6ad Revise device mapping 2022-01-24 15:08:13 -07:00
James Betker
33511243d5 load model state dicts into the correct device
it's not clear to me that this will make a huge difference, but it's a good idea anyways
2022-01-24 14:40:09 -07:00
James Betker
3e16c509f6 Misc fixes 2022-01-24 14:31:43 -07:00
James Betker
e2ed0adbd8 use_diffuse_tts updates 2022-01-24 14:31:28 -07:00
James Betker
e420df479f Allow steps to specify which state keys to carry forward (reducing memory utilization) 2022-01-24 11:01:27 -07:00
James Betker
62475005e4 Sort data items in descending order, which I suspect will improve performance because we will hit GC less 2022-01-23 19:05:32 -07:00
James Betker
d18aec793a Revert "(re) attempt diffusion checkpointing logic"
This reverts commit b22eec8fe3.
2022-01-22 09:14:50 -07:00
James Betker
b22eec8fe3 (re) attempt diffusion checkpointing logic 2022-01-22 08:34:40 -07:00
James Betker
8f48848f91 misc 2022-01-22 08:23:29 -07:00
James Betker
851070075a text<->cond clip
I need that universal clip..
2022-01-22 08:23:14 -07:00
James Betker
8ada52ccdc Update LR layers to checkpoint better 2022-01-22 08:22:57 -07:00
James Betker
ce929a6b3f Allow grad scaler to be enabled even in fp32 mode 2022-01-21 23:13:24 -07:00
James Betker
91b4b240ac dont pickle unique files 2022-01-21 00:02:06 -07:00
James Betker
7fef7fb9ff Update fast_paired_dataset to report how many audio files it is actually using 2022-01-20 21:49:38 -07:00
James Betker
ed35cfe393 Update inference scripts 2022-01-20 11:28:50 -07:00
James Betker
20312211e0 Fix bug in code alignment 2022-01-20 11:28:12 -07:00
James Betker
8e2439f50d Decrease resolution requirements to 2048 2022-01-20 11:27:49 -07:00
James Betker
4af8525dc3 Adjust diffusion vocoder to allow training individual levels 2022-01-19 13:37:59 -07:00
James Betker
ac13bfefe8 use_diffuse_tts 2022-01-19 00:35:24 -07:00
James Betker
bcd8cc51e1 Enable collated data for diffusion purposes 2022-01-19 00:35:08 -07:00
James Betker
dc9cd8c206 Update use_gpt_tts to be usable with unified_voice2 2022-01-18 21:14:17 -07:00
James Betker
7b4544b83a Add an experimental unet_diffusion_tts to perform experiments on 2022-01-18 08:38:24 -07:00
James Betker
b6190e96b2 fast_paired 2022-01-17 15:46:02 -07:00
James Betker
1d30d79e34 De-specify fast-paired-dataset 2022-01-16 21:20:00 -07:00
James Betker
2b36ca5f8e Revert paired back 2022-01-16 21:10:46 -07:00
James Betker
ad3e7df086 Split the fast random into its own new dataset 2022-01-16 21:10:11 -07:00
James Betker
7331862755 Updated paired to randomly index data, offsetting memory costs and speeding up initialization 2022-01-16 21:09:22 -07:00
James Betker
37e4e737b5 a few fixes 2022-01-16 15:17:17 -07:00
James Betker
35db5ebf41 paired_voice_audio_dataset - aligned codes support 2022-01-15 17:38:26 -07:00
James Betker
3f177cd2b3 requirements 2022-01-15 17:28:59 -07:00
James Betker
b398ecca01 wer fix 2022-01-15 17:28:17 -07:00
James Betker
9100e7fa9b Add a diffusion network that takes aligned text instead of MELs 2022-01-15 17:28:02 -07:00