Commit Graph

130 Commits

Author SHA1 Message Date
James Betker
3f244f6a68 add mel_norm to std injector 2022-03-15 22:16:59 -06:00
James Betker
f563a8dd41 fixes 2022-03-15 21:43:00 -06:00
James Betker
1e3a8554a1 updates to audio_diffusion_fid 2022-03-15 11:35:09 -06:00
James Betker
9c6f776980 Add univnet vocoder 2022-03-15 11:34:51 -06:00
James Betker
7929fd89de Refactor audio-style models into the audio folder 2022-03-15 11:06:25 -06:00
James Betker
f95d3d2b82 move waveglow to audio/vocoders 2022-03-15 11:03:07 -06:00
James Betker
0419a64107 misc 2022-03-15 10:36:34 -06:00
James Betker
eecbc0e678 Use wider spectrogram when asked 2022-03-15 10:35:11 -06:00
James Betker
896accb71f data and prep improvements 2022-03-12 15:10:11 -07:00
James Betker
7dabc17626 phase2 filter initial commit 2022-03-08 15:51:55 -07:00
James Betker
b3def182de move processing pipeline to "phase_1" 2022-03-08 15:49:51 -07:00
James Betker
2134f06516 Implement conditioning-free diffusion at the eval level 2022-02-27 15:11:42 -07:00
James Betker
e6824e398f Load dvae to cpu 2022-02-23 21:21:45 -07:00
James Betker
68726eac74 . 2022-02-23 17:58:07 -07:00
James Betker
58f6c9805b adf 2022-02-22 23:12:58 -07:00
James Betker
52b61b9f77 Update scripts and attempt to figure out how UnifiedVoice could be used to produce CTC codes 2022-02-13 20:48:06 -07:00
James Betker
0c3cc5ebad use script updates to fix output size disparities 2022-02-12 20:00:46 -07:00
James Betker
d1d1ae32a1 audio diffusion frechet distance measurement! 2022-02-10 22:55:46 -07:00
James Betker
93ca619267 script updates 2022-02-09 14:26:52 -07:00
James Betker
9e9ae328f2 mild updates 2022-02-08 23:51:17 -07:00
James Betker
f44b064c5e Update scripts 2022-02-07 19:43:18 -07:00
James Betker
5ae816bead ctc gen checkin 2022-02-05 15:59:53 -07:00
James Betker
bb3d1ab03d More cleanup 2022-02-04 11:06:17 -07:00
James Betker
7f4fc55344 Update SR model 2022-02-03 21:42:53 -07:00
James Betker
687393de59 Add a better split_on_silence (processing_pipeline). Going to extend this a bit more going forwards to support the entire pipeline. 2022-02-03 20:00:26 -07:00
James Betker
1d29999648 Updates to the TTS production scripts 2022-02-03 20:00:01 -07:00
James Betker
fbea6e8eac Adjustments to diffusion networks 2022-01-30 16:14:06 -07:00
James Betker
e0e36ed98c Update use_diffuse_tts 2022-01-27 19:57:28 -07:00
James Betker
7badbf1b4d update usage scripts 2022-01-25 17:57:26 -07:00
James Betker
e2ed0adbd8 use_diffuse_tts updates 2022-01-24 14:31:28 -07:00
James Betker
8f48848f91 misc 2022-01-22 08:23:29 -07:00
James Betker
ed35cfe393 Update inference scripts 2022-01-20 11:28:50 -07:00
James Betker
8e2439f50d Decrease resolution requirements to 2048 2022-01-20 11:27:49 -07:00
James Betker
ac13bfefe8 use_diffuse_tts 2022-01-19 00:35:24 -07:00
James Betker
dc9cd8c206 Update use_gpt_tts to be usable with unified_voice2 2022-01-18 21:14:17 -07:00
James Betker
7b4544b83a Add an experimental unet_diffusion_tts to perform experiments on 2022-01-18 08:38:24 -07:00
James Betker
b398ecca01 wer fix 2022-01-15 17:28:17 -07:00
James Betker
87c83e4957 update wer script 2022-01-13 17:08:49 -07:00
James Betker
d4e27ccf62 misc updates 2022-01-11 16:25:40 -07:00
James Betker
91f28580e2 fix unified_voice 2022-01-10 16:17:31 -07:00
James Betker
136744dc1d Fixes 2022-01-10 14:32:04 -07:00
James Betker
1f6a5310b8 More fixes to use_gpt_tts 2022-01-07 22:30:55 -07:00
James Betker
65ffe38fce misc 2022-01-06 22:16:17 -07:00
James Betker
61cd351b71 update unified 2022-01-06 09:48:11 -07:00
James Betker
10fd1110be Fix (?) use_gpt_tts for unified_voice 2022-01-05 20:09:31 -07:00
James Betker
17fb934575 wer update 2021-12-31 16:21:39 -07:00
James Betker
f2cd6a7f08 For loading conditional clips, default to falling back to loading the clip itself 2021-12-30 09:10:14 -07:00
James Betker
8a02ba5935 Transit s2s clips back to CPU memory after processing 2021-12-29 08:54:07 -07:00
James Betker
af6d5cd526 Add resume into speech-speech 2021-12-29 08:50:49 -07:00
James Betker
0e4bcc33ab Additional debugging 2021-12-29 00:23:27 -07:00