Commit Graph

119 Commits

Author SHA1 Message Date
James Betker
48e3ee9a5b Shuffle conditioning inputs along the positional axis to reduce fitting on prosody and other positional information
The mels should still retain some short-range positional information the model can use
for tone and frequencies, for example.
2021-12-20 19:05:56 -07:00
James Betker
53858b2055 Fix gpt_tts_hf inference 2021-12-20 17:45:26 -07:00
James Betker
712d746e9b gpt_tts: format conditioning inputs more for contextual voice clues and less for prosidy
also support single conditional inputs
2021-12-19 17:42:29 -07:00
James Betker
c813befd53 Remove dedicated positioning embeddings 2021-12-19 09:01:31 -07:00
James Betker
b4ddcd7111 More inference improvements 2021-12-19 09:01:19 -07:00
James Betker
f9c45d70f0 Fix mel terminator 2021-12-18 17:18:06 -07:00
James Betker
937045cb63 Fixes 2021-12-18 16:45:38 -07:00
James Betker
9b9f7ea61b GptTtsHf: Make the input/target placement easier to reason about 2021-12-17 10:24:14 -07:00
James Betker
2fb4213a3e More lossy fixes 2021-12-17 10:01:42 -07:00
James Betker
9e8a9bf6ca Various fixes to gpt_tts_hf 2021-12-16 23:28:44 -07:00
James Betker
62c8ed9a29 move speech utils 2021-12-16 20:47:37 -07:00
James Betker
4f8c4d130c gpt_tts_hf: pad mel tokens with an <end_of_sequence> token. 2021-12-12 20:04:50 -07:00
James Betker
8917c02a4d gpt_tts_hf inference first pass 2021-12-12 19:51:44 -07:00
James Betker
5a664aa56e misc 2021-12-11 08:17:26 -07:00
James Betker
6ccff3f49f Record codes more often 2021-12-07 09:22:45 -07:00
James Betker
d0b2f931bf Add feature to diffusion vocoder where the spectrogram conditioning layers can be re-trained apart from the rest of the model 2021-12-07 09:22:30 -07:00
James Betker
662920bde3 Log codes when simply fetching codebook_indices 2021-12-06 09:21:43 -07:00
James Betker
380a5d5475 gdi.. 2021-12-03 08:53:09 -07:00
James Betker
101a01f744 Fix dvae codes issue 2021-12-02 23:28:36 -07:00
James Betker
07b0124712 GptTtsHf! 2021-12-02 21:48:42 -07:00
James Betker
85542ec547 One last fix for gpt_asr_hf2 2021-12-02 21:19:28 -07:00
James Betker
04454ee63a Add evaluation logic for gpt_asr_hf2 2021-12-02 21:04:36 -07:00
James Betker
5956eb757c ffffff 2021-11-24 00:19:47 -07:00
James Betker
f1ed0588e3 another fix 2021-11-24 00:11:21 -07:00
James Betker
7a3c4a4fc6 Fix lr quantizer decode 2021-11-24 00:01:26 -07:00
James Betker
3f6ecfe0db q fix 2021-11-23 23:50:27 -07:00
James Betker
d9747fe623 Integrate with lr_quantizer 2021-11-23 19:48:22 -07:00
James Betker
82d0e7720e Add choke to lucidrains_dvae 2021-11-23 18:53:37 -07:00
James Betker
934395d4b8 A few fixes for gpt_asr_hf2 2021-11-23 09:29:29 -07:00
James Betker
01e635168b whoops 2021-11-22 17:24:13 -07:00
James Betker
973f47c525 misc nonfunctional 2021-11-22 17:16:39 -07:00
James Betker
3125ca38f5 Further wandb logs 2021-11-22 16:40:19 -07:00
James Betker
0604060580 Finish up mods for next version of GptAsrHf 2021-11-20 21:33:49 -07:00
James Betker
14f3155ec4 misc 2021-11-20 17:45:14 -07:00
James Betker
555b7e52ad Add rev2 of GptAsrHf 2021-11-18 20:02:24 -07:00
James Betker
1287915f3c Fix dvae test failure 2021-11-18 00:58:36 -07:00
James Betker
019acfa4c5 Allow flat dvae 2021-11-18 00:53:42 -07:00
James Betker
f3db41f125 Fix code logging 2021-11-18 00:34:37 -07:00
James Betker
c584320cf3 Fix gpt_asr_hf distillation 2021-11-07 21:53:21 -07:00
James Betker
a367ea3fda Add script for computing attention for gpt_asr 2021-11-07 18:42:06 -07:00
James Betker
756b4dad09 Working gpt_asr_hf inference - and it's a beast! 2021-11-06 21:47:15 -06:00
James Betker
596a62fe01 Apply fix to gpt_asr_hf and prep it for inference
Fix is that we were predicting two characters in advance, not next character
2021-11-04 10:09:24 -06:00
James Betker
993bd52d42 Add spec_augment injector 2021-11-01 18:43:11 -06:00
James Betker
4cff774b0e Reduce complexity of the encoder for gpt_asr_hf 2021-11-01 17:02:28 -06:00
James Betker
da55ca0438 gpt_asr using the huggingfaces transformer 2021-11-01 17:00:22 -06:00
James Betker
83cccef9d8 Condition on full signal 2021-10-30 19:58:34 -06:00
James Betker
df45a9dec2 Fix inference mode for lucidrains_gpt 2021-10-30 16:59:18 -06:00
James Betker
92fe8b4dd9 ffffpt2 2021-10-29 17:29:49 -06:00
James Betker
95ca88efce Fix feedforward 2021-10-29 17:27:51 -06:00
James Betker
b476516340 Check in backing changes (which may have broken something?) 2021-10-29 17:22:33 -06:00