Commit Graph

1587 Commits

Author SHA1 Message Date
James Betker
3c242403f5 adjust location of pre-optimizer step so I can visualize the new grad norms 2022-03-04 08:56:42 -07:00
James Betker
58019a2ce3 audio diffusion fid updates 2022-03-03 21:53:32 -07:00
James Betker
998c53ad4f w2v_matcher mods 2022-03-03 21:52:51 -07:00
James Betker
9029e4f20c Add a base-wrapper 2022-03-03 21:52:28 -07:00
James Betker
6873ad6660 Support functionality 2022-03-03 21:52:16 -07:00
James Betker
6af5d129ce Add experimental gradient boosting into tts7 2022-03-03 21:51:40 -07:00
James Betker
7ea84f1ac3 asdf 2022-03-03 13:43:44 -07:00
James Betker
3cd6c7f428 Get rid of unused codes in vq 2022-03-03 13:41:38 -07:00
James Betker
619da9ea28 Get rid of discretization loss 2022-03-03 13:36:25 -07:00
James Betker
beb7c8a39d asdf 2022-03-01 21:41:31 -07:00
James Betker
70fa780edb Add mechanism to export grad norms 2022-03-01 20:19:52 -07:00
James Betker
d9f8f92840 Codified fp16 2022-03-01 15:46:04 -07:00
James Betker
45ab444c04 Rework minicoder to always checkpoint 2022-03-01 14:09:18 -07:00
James Betker
db0c3340ac Implement guidance-free diffusion in eval
And a few other fixes
2022-03-01 11:49:36 -07:00
James Betker
2134f06516 Implement conditioning-free diffusion at the eval level 2022-02-27 15:11:42 -07:00
James Betker
436fe24822 Add conditioning-free guidance 2022-02-27 15:00:06 -07:00
James Betker
ac920798bb misc 2022-02-27 14:49:11 -07:00
James Betker
ba155e4e2f script for uploading models to the HF hub 2022-02-27 14:48:38 -07:00
James Betker
dbc74e96b2 w2v_matcher 2022-02-27 14:48:23 -07:00
James Betker
42879d7296 w2v_wrapper ramping dropout mode
this is an experimental feature that needs some testing
2022-02-27 14:47:51 -07:00
James Betker
c375287db9 Re-instate autocasting 2022-02-25 11:06:18 -07:00
James Betker
34ee32a90e get rid of autocasting in tts7 2022-02-24 21:53:51 -07:00
James Betker
f458f5d8f1 abort early if losses reach nan too much, and save the model 2022-02-24 20:55:30 -07:00
James Betker
18dc62453f Don't step if NaN losses are encountered. 2022-02-24 17:45:08 -07:00
James Betker
ea500ad42a Use clustered masking in udtts7 2022-02-24 07:57:26 -07:00
James Betker
7c17c8e674 gurgl 2022-02-23 21:28:24 -07:00
James Betker
e6824e398f Load dvae to cpu 2022-02-23 21:21:45 -07:00
James Betker
81017d9696 put frechet_distance on cuda 2022-02-23 21:21:13 -07:00
James Betker
9a7bbf33df f 2022-02-23 18:03:38 -07:00
James Betker
68726eac74 . 2022-02-23 17:58:07 -07:00
James Betker
b7319ab518 Support vocoder type diffusion in audio_diffusion_fid 2022-02-23 17:25:16 -07:00
James Betker
58f6c9805b adf 2022-02-22 23:12:58 -07:00
James Betker
03752c1cd6 Report NaN 2022-02-22 23:09:37 -07:00
James Betker
7201b4500c default text_to_sequence cleaners 2022-02-21 19:14:22 -07:00
James Betker
ba7f54c162 w2v: new inference function 2022-02-21 19:13:03 -07:00
James Betker
896ac029ae allow continuation of samples encountered 2022-02-21 19:12:50 -07:00
James Betker
6313a94f96 eval: integrate a n-gram language model into decoding 2022-02-21 19:12:34 -07:00
James Betker
af50afe222 pairedvoice: error out if clip is too short 2022-02-21 19:11:10 -07:00
James Betker
38802a96c8 remove timesteps from cond calculation 2022-02-21 12:32:21 -07:00
James Betker
668876799d unet_diffusion_tts7 2022-02-20 15:22:38 -07:00
James Betker
0872e17e60 unified_voice mods 2022-02-19 20:37:35 -07:00
James Betker
7b12799370 Reformat mel_text_clip for use in eval 2022-02-19 20:37:26 -07:00
James Betker
bcba65c539 DataParallel Fix 2022-02-19 20:36:35 -07:00
James Betker
34001ad765 et 2022-02-18 18:52:33 -07:00
James Betker
baf7b65566 Attempt to make w2v play with DDP AND checkpointing 2022-02-18 18:47:11 -07:00
James Betker
f3776f1992 reset ctc loss from "mean" to "sum" 2022-02-17 22:00:58 -07:00
James Betker
2b20da679c make spec_augment a parameter 2022-02-17 20:22:05 -07:00
James Betker
a813fbed9c Update to evaluator 2022-02-17 17:30:33 -07:00
James Betker
e1d71e1bd5 w2v_wrapper: get rid of ctc attention mask 2022-02-15 20:54:40 -07:00
James Betker
79e8f36d30 Convert CLIP models into new folder 2022-02-15 20:53:07 -07:00
James Betker
8f767b8b4f ... 2022-02-15 07:08:17 -07:00
James Betker
29e07913a8 Fix 2022-02-15 06:58:11 -07:00
James Betker
dd585df772 LAMB optimizer 2022-02-15 06:48:13 -07:00
James Betker
2bdb515068 A few mods to make wav2vec2 trainable with DDP on DLAS 2022-02-15 06:28:54 -07:00
James Betker
52b61b9f77 Update scripts and attempt to figure out how UnifiedVoice could be used to produce CTC codes 2022-02-13 20:48:06 -07:00
James Betker
a4f1641eea Add & refine WER evaluator for w2v 2022-02-13 20:47:29 -07:00
James Betker
e16af944c0 BSO fix 2022-02-12 20:01:04 -07:00
James Betker
29534180b2 w2v fine tuner 2022-02-12 20:00:59 -07:00
James Betker
0c3cc5ebad use script updates to fix output size disparities 2022-02-12 20:00:46 -07:00
James Betker
15fd60aad3 Allow EMA training to be disabled 2022-02-12 20:00:23 -07:00
James Betker
3252972057 ctc_code_gen mods 2022-02-12 19:59:54 -07:00
James Betker
35170c77b3 fix sweep 2022-02-11 11:43:11 -07:00
James Betker
c6b6d120fe fix ranking 2022-02-11 11:34:57 -07:00
James Betker
095944569c deep_update dicts 2022-02-11 11:32:25 -07:00
James Betker
ab1f6e8ac6 deepcopy map 2022-02-11 11:29:32 -07:00
James Betker
496fb81997 use fork instead 2022-02-11 11:22:25 -07:00
James Betker
4abc094b47 fix train bug 2022-02-11 11:18:15 -07:00
James Betker
006add64c5 sweep fix 2022-02-11 11:17:08 -07:00
James Betker
102142d1eb f 2022-02-11 11:05:13 -07:00
James Betker
40b08a52d0 dafuk 2022-02-11 11:01:31 -07:00
James Betker
f6a7f12cad Remove broken evaluator 2022-02-11 11:00:29 -07:00
James Betker
46b97049dc Fix eval 2022-02-11 10:59:32 -07:00
James Betker
5175b7d91a training sweeper checkin 2022-02-11 10:46:37 -07:00
James Betker
302ac8652d Undo mask during training 2022-02-11 09:35:12 -07:00
James Betker
618a20412a new rev of ctc_code_gen with surrogate LM loss 2022-02-10 23:09:57 -07:00
James Betker
d1d1ae32a1 audio diffusion frechet distance measurement! 2022-02-10 22:55:46 -07:00
James Betker
23a310b488 Fix BSO 2022-02-10 20:54:51 -07:00
James Betker
1e28e02f98 BSO improvement to make it work with distributed optimizers 2022-02-10 09:53:13 -07:00
James Betker
836eb08afb Update BSO to use the proper step size 2022-02-10 09:44:15 -07:00
James Betker
820a29f81e ctc code gen mods 2022-02-10 09:44:01 -07:00
James Betker
ac9417b956 ctc_code_gen: mask out all padding tokens 2022-02-09 17:26:30 -07:00
James Betker
a930f2576e Begin a migration to specifying training rate on megasamples instead of arbitrary "steps"
This should help me greatly in tuning models.  It's also necessary now that batch size isn't really
respected; we simply step once the gradient direction becomes unstable.
2022-02-09 17:25:05 -07:00
James Betker
93ca619267 script updates 2022-02-09 14:26:52 -07:00
James Betker
ddb77ef502 ctc_code_gen: use a mean() on the ConditioningEncoder 2022-02-09 14:26:44 -07:00
James Betker
3d946356f8 batch_size_optimizer works. sweet! no more tuning batch sizes. 2022-02-09 14:26:23 -07:00
James Betker
18938248e4 Add batch_size_optimizer support 2022-02-08 23:51:31 -07:00
James Betker
9e9ae328f2 mild updates 2022-02-08 23:51:17 -07:00
James Betker
ff35d13b99 Use non-uniform noise in diffusion_tts6 2022-02-08 07:27:41 -07:00
James Betker
f44b064c5e Update scripts 2022-02-07 19:43:18 -07:00
James Betker
34fbb78671 Straight CtcCodeGenerator as an encoder 2022-02-07 15:46:46 -07:00
James Betker
c24682c668 Record load times in fast_paired_dataset 2022-02-07 15:45:38 -07:00
James Betker
65a546c4d7 Fix for tts6 2022-02-05 16:00:14 -07:00
James Betker
5ae816bead ctc gen checkin 2022-02-05 15:59:53 -07:00
James Betker
bb3d1ab03d More cleanup 2022-02-04 11:06:17 -07:00
James Betker
5cc342de66 Clean up 2022-02-04 11:00:42 -07:00
James Betker
8fb147e8ab add an autoregressive ctc code generator 2022-02-04 11:00:15 -07:00
James Betker
7f4fc55344 Update SR model 2022-02-03 21:42:53 -07:00
James Betker
de1a1d501a Move audio injectors into their own file 2022-02-03 21:42:37 -07:00
James Betker
687393de59 Add a better split_on_silence (processing_pipeline)
Going to extend this a bit more going forwards to support the entire pipeline.
2022-02-03 20:00:26 -07:00
James Betker
1d29999648 Uupdates to the TTS production scripts 2022-02-03 20:00:01 -07:00