1707 Commits

Author SHA1 Message Date
James Betker
2134f06516 Implement conditioning-free diffusion at the eval level 2022-02-27 15:11:42 -07:00
James Betker
436fe24822 Add conditioning-free guidance 2022-02-27 15:00:06 -07:00
James Betker
ac920798bb misc 2022-02-27 14:49:11 -07:00
James Betker
ba155e4e2f script for uploading models to the HF hub 2022-02-27 14:48:38 -07:00
James Betker
dbc74e96b2 w2v_matcher 2022-02-27 14:48:23 -07:00
James Betker
42879d7296 w2v_wrapper ramping dropout mode; this is an experimental feature that needs some testing 2022-02-27 14:47:51 -07:00
James Betker
c375287db9 Re-instate autocasting 2022-02-25 11:06:18 -07:00
James Betker
34ee32a90e get rid of autocasting in tts7 2022-02-24 21:53:51 -07:00
James Betker
f458f5d8f1 abort early if losses reach nan too much, and save the model 2022-02-24 20:55:30 -07:00
James Betker
18dc62453f Don't step if NaN losses are encountered. 2022-02-24 17:45:08 -07:00
James Betker
ea500ad42a Use clustered masking in udtts7 2022-02-24 07:57:26 -07:00
James Betker
7c17c8e674 gurgl 2022-02-23 21:28:24 -07:00
James Betker
e6824e398f Load dvae to cpu 2022-02-23 21:21:45 -07:00
James Betker
81017d9696 put frechet_distance on cuda 2022-02-23 21:21:13 -07:00
James Betker
9a7bbf33df f 2022-02-23 18:03:38 -07:00
James Betker
68726eac74 . 2022-02-23 17:58:07 -07:00
James Betker
b7319ab518 Support vocoder type diffusion in audio_diffusion_fid 2022-02-23 17:25:16 -07:00
James Betker
58f6c9805b adf 2022-02-22 23:12:58 -07:00
James Betker
03752c1cd6 Report NaN 2022-02-22 23:09:37 -07:00
James Betker
7201b4500c default text_to_sequence cleaners 2022-02-21 19:14:22 -07:00
James Betker
ba7f54c162 w2v: new inference function 2022-02-21 19:13:03 -07:00
James Betker
896ac029ae allow continuation of samples encountered 2022-02-21 19:12:50 -07:00
James Betker
6313a94f96 eval: integrate a n-gram language model into decoding 2022-02-21 19:12:34 -07:00
James Betker
af50afe222 pairedvoice: error out if clip is too short 2022-02-21 19:11:10 -07:00
James Betker
38802a96c8 remove timesteps from cond calculation 2022-02-21 12:32:21 -07:00
James Betker
668876799d unet_diffusion_tts7 2022-02-20 15:22:38 -07:00
James Betker
0872e17e60 unified_voice mods 2022-02-19 20:37:35 -07:00
James Betker
7b12799370 Reformat mel_text_clip for use in eval 2022-02-19 20:37:26 -07:00
James Betker
bcba65c539 DataParallel Fix 2022-02-19 20:36:35 -07:00
James Betker
34001ad765 et 2022-02-18 18:52:33 -07:00
James Betker
baf7b65566 Attempt to make w2v play with DDP AND checkpointing 2022-02-18 18:47:11 -07:00
James Betker
f3776f1992 reset ctc loss from "mean" to "sum" 2022-02-17 22:00:58 -07:00
James Betker
2b20da679c make spec_augment a parameter 2022-02-17 20:22:05 -07:00
James Betker
a813fbed9c Update to evaluator 2022-02-17 17:30:33 -07:00
James Betker
e1d71e1bd5 w2v_wrapper: get rid of ctc attention mask 2022-02-15 20:54:40 -07:00
James Betker
79e8f36d30 Convert CLIP models into new folder 2022-02-15 20:53:07 -07:00
James Betker
8f767b8b4f ... 2022-02-15 07:08:17 -07:00
James Betker
29e07913a8 Fix 2022-02-15 06:58:11 -07:00
James Betker
dd585df772 LAMB optimizer 2022-02-15 06:48:13 -07:00
James Betker
2bdb515068 A few mods to make wav2vec2 trainable with DDP on DLAS 2022-02-15 06:28:54 -07:00
James Betker
52b61b9f77 Update scripts and attempt to figure out how UnifiedVoice could be used to produce CTC codes 2022-02-13 20:48:06 -07:00
James Betker
a4f1641eea Add & refine WER evaluator for w2v 2022-02-13 20:47:29 -07:00
James Betker
e16af944c0 BSO fix 2022-02-12 20:01:04 -07:00
James Betker
29534180b2 w2v fine tuner 2022-02-12 20:00:59 -07:00
James Betker
0c3cc5ebad use script updates to fix output size disparities 2022-02-12 20:00:46 -07:00
James Betker
15fd60aad3 Allow EMA training to be disabled 2022-02-12 20:00:23 -07:00
James Betker
3252972057 ctc_code_gen mods 2022-02-12 19:59:54 -07:00
James Betker
35170c77b3 fix sweep 2022-02-11 11:43:11 -07:00
James Betker
c6b6d120fe fix ranking 2022-02-11 11:34:57 -07:00
James Betker
095944569c deep_update dicts 2022-02-11 11:32:25 -07:00