Commit Graph

191 Commits

Author SHA1 Message Date
James Betker
1f6a5310b8 More fixes to use_gpt_tts 2022-01-07 22:30:55 -07:00
James Betker
65ffe38fce misc 2022-01-06 22:16:17 -07:00
James Betker
61cd351b71 update unified 2022-01-06 09:48:11 -07:00
James Betker
10fd1110be Fix (?) use_gpt_tts for unified_voice 2022-01-05 20:09:31 -07:00
James Betker
17fb934575 wer update 2021-12-31 16:21:39 -07:00
James Betker
f2cd6a7f08 For loading conditional clips, default to falling back to loading the clip itself 2021-12-30 09:10:14 -07:00
James Betker
8a02ba5935 Transit s2s clips back to CPU memory after processing 2021-12-29 08:54:07 -07:00
James Betker
af6d5cd526 Add resume into speech-speech 2021-12-29 08:50:49 -07:00
James Betker
0e4bcc33ab Additional debugging 2021-12-29 00:23:27 -07:00
James Betker
b24a51f0aa Check in speech2speech CLIP inference tool 2021-12-29 00:19:44 -07:00
James Betker
c1bef01dfa GptAsrHf2 checkin 2021-12-28 20:48:38 -07:00
James Betker
a5b4bee719 Improve asr_eval 2021-12-28 11:45:15 -07:00
James Betker
4a32949b0e update inference mode for unified 2021-12-26 15:33:21 -07:00
James Betker
b595c62893 One way decoder for decoding from mel codes 2021-12-25 12:18:00 -07:00
James Betker
ab9cafa572 Make tokenization configs more configurable 2021-12-25 12:17:50 -07:00
James Betker
8e26400ce2 Add inference for unified gpt 2021-12-24 13:27:06 -07:00
James Betker
a42b94ab72 gpt_tts_hf inference fixes 2021-12-22 13:22:15 -07:00
James Betker
53858b2055 Fix gpt_tts_hf inference 2021-12-20 17:45:26 -07:00
James Betker
b4ddcd7111 More inference improvements 2021-12-19 09:01:19 -07:00
James Betker
f9c45d70f0 Fix mel terminator 2021-12-18 17:18:06 -07:00
James Betker
937045cb63 Fixes 2021-12-18 16:45:38 -07:00
James Betker
dee34f096c Add use_gpt_tts script 2021-12-16 23:28:54 -07:00
James Betker
62c8ed9a29 move speech utils 2021-12-16 20:47:37 -07:00
James Betker
aa7cfd1edf Add support for mel norms across the channel dim 2021-12-12 19:52:08 -07:00
James Betker
63bf135b93 Support norms 2021-12-11 08:30:49 -07:00
James Betker
959979086d fix 2021-12-11 08:18:00 -07:00
James Betker
5a664aa56e misc 2021-12-11 08:17:26 -07:00
James Betker
d610540ce5 mel norm computation script 2021-12-11 08:16:50 -07:00
James Betker
b2d8fbcfc0 build a better speech synthesis toolset 2021-12-09 22:59:56 -07:00
James Betker
32cfcf3684 Turn off optimization in find_faulty_files 2021-12-09 09:02:09 -07:00
James Betker
a66a2bf91b Update find_faulty_files 2021-12-09 09:00:00 -07:00
James Betker
04454ee63a Add evaluation logic for gpt_asr_hf2 2021-12-02 21:04:36 -07:00
James Betker
82d0e7720e Add choke to lucidrains_dvae 2021-11-23 18:53:37 -07:00
James Betker
973f47c525 misc nonfunctional 2021-11-22 17:16:39 -07:00
James Betker
3125ca38f5 Further wandb logs 2021-11-22 16:40:19 -07:00
James Betker
687e0746b3 Add Torch-derived MelSpectrogramInjector 2021-11-18 20:02:45 -07:00
James Betker
c30a38cdf1 Undo baseline GDI changes 2021-11-18 20:02:09 -07:00
James Betker
9b693b0a54 Fixes to filter_clips_hifreq 2021-11-07 18:42:22 -07:00
James Betker
a367ea3fda Add script for computing attention for gpt_asr 2021-11-07 18:42:06 -07:00
James Betker
3c0f2fbb21 Add filtration script for finding resampled clips (or phone calls) 2021-11-07 14:16:11 -07:00
James Betker
756b4dad09 Working gpt_asr_hf inference - and it's a beast! 2021-11-06 21:47:15 -06:00
James Betker
596a62fe01 Apply fix to gpt_asr_hf and prep it for inference (Fix is that we were predicting two characters in advance, not next character) 2021-11-04 10:09:24 -06:00
James Betker
36ed28913a Fix two scripts 2021-10-30 17:00:06 -06:00
James Betker
466b9fbcaa classify 2021-10-29 20:22:40 -06:00
James Betker
986fc9628d Check in GPT with new inference methods (but not the backing code..) 2021-10-29 17:21:40 -06:00
James Betker
579f0a70ee Move UnsupervisedAudioDataset to use my new mp3 loader 2021-10-28 22:33:12 -06:00
James Betker
bb0a0c8264 classify_into_folders script 2021-10-27 14:56:16 -06:00
James Betker
d91dcbd404 Make classifier inference script more open 2021-10-27 13:18:54 -06:00
James Betker
5d714bc566 Add deepspeech model and support for decoding with it 2021-10-27 13:09:46 -06:00
James Betker
15437b2fc3 WER script 2021-10-26 13:30:29 -06:00