1789 Commits

Author SHA1 Message Date
James Betker
c1bdb4f9a1 degrade gumbel softmax over time 2022-05-17 16:23:04 -06:00
James Betker
3853f37257 stable layernorm 2022-05-17 16:07:03 -06:00
James Betker
6a2c29f596 Fix inverted logic 2022-05-17 15:39:07 -06:00
James Betker
519151d83f m2v 2022-05-17 15:37:59 -06:00
James Betker
d1de94d75c Stash mel2vec work (gonna throw it all away..) 2022-05-17 12:35:01 -06:00
James Betker
8202b9f39c some stuff 2022-05-15 21:50:54 -06:00
James Betker
ab5acead0e add exp loss for diffusion models 2022-05-15 21:50:38 -06:00
James Betker
ee218ab9b7 uv3 2022-05-13 17:57:47 -06:00
James Betker
eb64d18075 Fix phoneme tokenizer 2022-05-13 17:56:26 -06:00
James Betker
51f8c1bced phonetic dataset 2022-05-12 11:57:28 -06:00
James Betker
3d7e2a2846 fix collection 2022-05-11 21:50:05 -06:00
James Betker
ba2b71796a k 2022-05-11 21:20:06 -06:00
James Betker
efa737b685 re-add distributed collect to clvp 2022-05-11 21:14:18 -06:00
James Betker
545453077e uv3 2022-05-09 15:36:22 -06:00
James Betker
96a5cc66ee uv3 2022-05-09 15:35:51 -06:00
James Betker
b42b4e18de clean up unified voice
- remove unused code
- fix inference model to use the terms "prior" and "posterior" to properly define the modeling order (they were inverted before)
- default some settings I never intend to change in the future
2022-05-09 14:45:49 -06:00
James Betker
9118f58849 uncomment music projector.. 2022-05-09 09:19:26 -06:00
James Betker
74dd095326 a 2022-05-08 18:54:09 -06:00
James Betker
1177c35dec music fid updates 2022-05-08 18:49:39 -06:00
James Betker
7812c23c7a revert fill_gaps back to old masking behavior 2022-05-08 00:10:19 -06:00
James Betker
58ed27d7a8 new gap_filler 2022-05-07 12:44:23 -06:00
James Betker
6c8032b4be more work 2022-05-06 21:56:49 -06:00
James Betker
f541610256 contrastive_audio 2022-05-06 16:37:22 -06:00
James Betker
79543e5488 Simpler form of the wavegen model 2022-05-06 16:37:04 -06:00
James Betker
d8925ccde5 few things with gap filling 2022-05-06 14:33:44 -06:00
James Betker
b83b53cf84 norm mel 2022-05-06 00:49:54 -06:00
James Betker
b13d983c24 and mel_head 2022-05-06 00:25:27 -06:00
James Betker
d5fb79564a remove mel_pred 2022-05-06 00:24:05 -06:00
James Betker
e9bb692490 fixed aligned_latent 2022-05-06 00:20:21 -06:00
James Betker
1609101a42 musical gap filler 2022-05-05 16:47:08 -06:00
James Betker
d66ab2d28c Remove unused waveform_gens 2022-05-04 21:06:54 -06:00
James Betker
47662b9ec5 some random crap 2022-05-04 20:29:23 -06:00
James Betker
6655f7845a add pixel shuffling for 1d cases 2022-05-04 08:03:09 -06:00
James Betker
c42c53e75a Add a trainable network for converting a normal distribution into a latent space 2022-05-02 09:47:30 -06:00
James Betker
e402089556 abstractify 2022-05-02 00:11:26 -06:00
James Betker
ab219fbefb output variance 2022-05-02 00:10:33 -06:00
James Betker
3b074aac34 add checkpointing 2022-05-02 00:07:42 -06:00
James Betker
ae5f934ea1 diffwave 2022-05-02 00:05:04 -06:00
James Betker
f4254609c1 MDF
around and around in circles........
2022-05-01 23:04:56 -06:00
James Betker
b712d3b72b break out get_conditioning_latent from unified_voice 2022-05-01 23:04:44 -06:00
James Betker
afa2df57c9 gen3 2022-04-30 10:41:38 -06:00
James Betker
64c7582bf5 full pipeline 2022-04-28 22:47:26 -06:00
James Betker
8aa6651fc7 fix surrogate loss return in waveform_gen2 2022-04-28 10:10:11 -06:00
James Betker
e208d9fb80 gate augmentations with a flag 2022-04-28 10:09:22 -06:00
James Betker
3f67cb2023 music diffusion fid adjustments 2022-04-28 10:08:55 -06:00
James Betker
ab8176b217 audio prep misc 2022-04-28 10:08:38 -06:00
James Betker
f02b01bd9d reverse univnet classifier 2022-04-20 21:37:55 -06:00
James Betker
9df85c902e New gen2
Which is basically an autoencoder with a giant diffusion appendage attached
2022-04-20 21:37:34 -06:00
James Betker
b1c2c48720 music diffusion fid 2022-04-20 00:28:03 -06:00
James Betker
084b1c1527 file splitter 2022-04-20 00:27:49 -06:00
James Betker
b4549eed9f uv2 fix 2022-04-20 00:27:38 -06:00
James Betker
24fdafd855 fix2 2022-04-20 00:03:29 -06:00
James Betker
0af0051399 fix 2022-04-20 00:01:57 -06:00
James Betker
419f4d37bd gen2 music 2022-04-19 23:38:37 -06:00
James Betker
c85ab738c5 paired fix 2022-04-16 23:41:57 -06:00
James Betker
8fe0dff33c support tts typing 2022-04-16 23:36:57 -06:00
James Betker
48cb6a5abd misc 2022-04-16 20:28:04 -06:00
James Betker
147478a148 cvvp 2022-04-16 20:27:46 -06:00
James Betker
546ecd5aeb music! 2022-04-15 21:21:37 -06:00
James Betker
254357724d gradprop 2022-04-15 09:37:20 -06:00
James Betker
fbf1f4f637 update 2022-04-15 09:34:44 -06:00
James Betker
82aad335ba add distributed logic for loss 2022-04-15 09:31:48 -06:00
James Betker
efe12cb816 Update clvp to add masking probabilities in conditioning and to support code inputs 2022-04-15 09:11:23 -06:00
James Betker
3cad1b8114 more fixes 2022-04-11 15:18:44 -06:00
James Betker
6dea7da7a8 another fix 2022-04-11 12:29:43 -06:00
James Betker
f2c172291f fix audio_diffusion_fid for autoregressive latent inputs 2022-04-11 12:08:15 -06:00
James Betker
8ea5c307fb Fixes for training the diffusion model on autoregressive inputs 2022-04-11 11:02:44 -06:00
James Betker
a3622462c1 Change latent_conditioner back 2022-04-11 09:00:13 -06:00
James Betker
03d0b90bda fixes 2022-04-10 21:02:12 -06:00
James Betker
19ca5b26c1 Remove flat0 and move it into flat 2022-04-10 21:01:59 -06:00
James Betker
81c952a00a undo relative 2022-04-08 16:32:52 -06:00
James Betker
944b4c3335 more undos 2022-04-08 16:31:08 -06:00
James Betker
032983e2ed fix bug and allow position encodings to be trained separately from the rest of the model 2022-04-08 16:26:01 -06:00
James Betker
09ab1aa9bc revert rotary embeddings work
I'm not really sure that this is going to work. I'd rather explore re-using what I've already trained
2022-04-08 16:18:35 -06:00
James Betker
2fb9ffb0aa Align autoregressive text using start and stop tokens 2022-04-08 09:41:59 -06:00
James Betker
628569af7b Another fix 2022-04-08 09:41:18 -06:00
James Betker
423293e518 fix xtransformers bug 2022-04-08 09:12:46 -06:00
James Betker
048f6f729a remove lightweight_gan 2022-04-07 23:12:08 -07:00
James Betker
e634996a9c autoregressive_codegen: support key_value caching for faster inference 2022-04-07 23:08:46 -07:00
James Betker
d05e162f95 reformat x_transformers 2022-04-07 23:08:03 -07:00
James Betker
7c578eb59b Fix inference in new autoregressive_codegen 2022-04-07 21:22:46 -06:00
James Betker
3f8d7955ef unified_voice with rotary embeddings 2022-04-07 20:11:14 -06:00
James Betker
573e5552b9 CLVP v1 2022-04-07 20:10:57 -06:00
James Betker
71b73db044 clean up 2022-04-07 11:34:10 -06:00
James Betker
6fc4f49e86 some dumb stuff 2022-04-07 11:32:34 -06:00
James Betker
e6387c7613 Fix eval logic to not run immediately 2022-04-07 11:29:57 -06:00
James Betker
305dc95e4b cg2 2022-04-06 21:24:36 -06:00
James Betker
e011166dd6 autoregressive_codegen r3 2022-04-06 21:04:23 -06:00
James Betker
33ef17e9e5 fix context 2022-04-06 00:45:42 -06:00
James Betker
37bdfe82b2 Modify x_transformers to do checkpointing and use relative positional biases 2022-04-06 00:35:29 -06:00
James Betker
09879b434d bring in x_transformers 2022-04-06 00:21:58 -06:00
James Betker
3d916e7687 Fix evaluation when using multiple batch sizes 2022-04-05 07:51:09 -06:00
James Betker
572d137589 track iteration rate 2022-04-04 12:33:25 -06:00
James Betker
4cdb0169d0 update training data encountered when using force_start_step 2022-04-04 12:25:00 -06:00
James Betker
cdd12ff46c Add code validation to autoregressive_codegen 2022-04-04 09:51:41 -06:00
James Betker
99de63a922 man I'm really on it tonight.... 2022-04-02 22:01:33 -06:00
James Betker
a4bdc80933 moikmadsf 2022-04-02 21:59:50 -06:00
James Betker
1cf20b7337 sdfds 2022-04-02 21:58:09 -06:00
James Betker
b6afc4d542 dsfa 2022-04-02 21:57:00 -06:00
James Betker
4c6bdfc9e2 get rid of relative position embeddings, which do not work with DDP & checkpointing 2022-04-02 21:55:32 -06:00