James Betker
|
58f6c9805b
|
adf
|
2022-02-22 23:12:58 -07:00 |
|
James Betker
|
52b61b9f77
|
Update scripts and attempt to figure out how UnifiedVoice could be used to produce CTC codes
|
2022-02-13 20:48:06 -07:00 |
|
James Betker
|
0c3cc5ebad
|
use script updates to fix output size disparities
|
2022-02-12 20:00:46 -07:00 |
|
James Betker
|
d1d1ae32a1
|
audio diffusion frechet distance measurement!
|
2022-02-10 22:55:46 -07:00 |
|
James Betker
|
93ca619267
|
script updates
|
2022-02-09 14:26:52 -07:00 |
|
James Betker
|
9e9ae328f2
|
mild updates
|
2022-02-08 23:51:17 -07:00 |
|
James Betker
|
f44b064c5e
|
Update scripts
|
2022-02-07 19:43:18 -07:00 |
|
James Betker
|
5ae816bead
|
ctc gen checkin
|
2022-02-05 15:59:53 -07:00 |
|
James Betker
|
bb3d1ab03d
|
More cleanup
|
2022-02-04 11:06:17 -07:00 |
|
James Betker
|
7f4fc55344
|
Update SR model
|
2022-02-03 21:42:53 -07:00 |
|
James Betker
|
687393de59
|
Add a better split_on_silence (processing_pipeline)
Going to extend this a bit more going forwards to support the entire pipeline.
|
2022-02-03 20:00:26 -07:00 |
|
James Betker
|
1d29999648
|
Updates to the TTS production scripts
|
2022-02-03 20:00:01 -07:00 |
|
James Betker
|
fbea6e8eac
|
Adjustments to diffusion networks
|
2022-01-30 16:14:06 -07:00 |
|
James Betker
|
e0e36ed98c
|
Update use_diffuse_tts
|
2022-01-27 19:57:28 -07:00 |
|
James Betker
|
7badbf1b4d
|
update usage scripts
|
2022-01-25 17:57:26 -07:00 |
|
James Betker
|
e2ed0adbd8
|
use_diffuse_tts updates
|
2022-01-24 14:31:28 -07:00 |
|
James Betker
|
8f48848f91
|
misc
|
2022-01-22 08:23:29 -07:00 |
|
James Betker
|
ed35cfe393
|
Update inference scripts
|
2022-01-20 11:28:50 -07:00 |
|
James Betker
|
8e2439f50d
|
Decrease resolution requirements to 2048
|
2022-01-20 11:27:49 -07:00 |
|
James Betker
|
ac13bfefe8
|
use_diffuse_tts
|
2022-01-19 00:35:24 -07:00 |
|
James Betker
|
dc9cd8c206
|
Update use_gpt_tts to be usable with unified_voice2
|
2022-01-18 21:14:17 -07:00 |
|
James Betker
|
7b4544b83a
|
Add an experimental unet_diffusion_tts to perform experiments on
|
2022-01-18 08:38:24 -07:00 |
|
James Betker
|
b398ecca01
|
wer fix
|
2022-01-15 17:28:17 -07:00 |
|
James Betker
|
87c83e4957
|
update wer script
|
2022-01-13 17:08:49 -07:00 |
|
James Betker
|
d4e27ccf62
|
misc updates
|
2022-01-11 16:25:40 -07:00 |
|
James Betker
|
91f28580e2
|
fix unified_voice
|
2022-01-10 16:17:31 -07:00 |
|
James Betker
|
136744dc1d
|
Fixes
|
2022-01-10 14:32:04 -07:00 |
|
James Betker
|
1f6a5310b8
|
More fixes to use_gpt_tts
|
2022-01-07 22:30:55 -07:00 |
|
James Betker
|
65ffe38fce
|
misc
|
2022-01-06 22:16:17 -07:00 |
|
James Betker
|
61cd351b71
|
update unified
|
2022-01-06 09:48:11 -07:00 |
|
James Betker
|
10fd1110be
|
Fix (?) use_gpt_tts for unified_voice
|
2022-01-05 20:09:31 -07:00 |
|
James Betker
|
17fb934575
|
wer update
|
2021-12-31 16:21:39 -07:00 |
|
James Betker
|
f2cd6a7f08
|
For loading conditional clips, default to falling back to loading the clip itself
|
2021-12-30 09:10:14 -07:00 |
|
James Betker
|
8a02ba5935
|
Transit s2s clips back to CPU memory after processing
|
2021-12-29 08:54:07 -07:00 |
|
James Betker
|
af6d5cd526
|
Add resume into speech-speech
|
2021-12-29 08:50:49 -07:00 |
|
James Betker
|
0e4bcc33ab
|
Additional debugging
|
2021-12-29 00:23:27 -07:00 |
|
James Betker
|
b24a51f0aa
|
Check in speech2speech CLIP inference tool
|
2021-12-29 00:19:44 -07:00 |
|
James Betker
|
c1bef01dfa
|
GptAsrHf2 checkin
|
2021-12-28 20:48:38 -07:00 |
|
James Betker
|
a5b4bee719
|
Improve asr_eval
|
2021-12-28 11:45:15 -07:00 |
|
James Betker
|
4a32949b0e
|
update inference mode for unified
|
2021-12-26 15:33:21 -07:00 |
|
James Betker
|
b595c62893
|
One way decoder for decoding from mel codes
|
2021-12-25 12:18:00 -07:00 |
|
James Betker
|
ab9cafa572
|
Make tokenization configs more configurable
|
2021-12-25 12:17:50 -07:00 |
|
James Betker
|
8e26400ce2
|
Add inference for unified gpt
|
2021-12-24 13:27:06 -07:00 |
|
James Betker
|
a42b94ab72
|
gpt_tts_hf inference fixes
|
2021-12-22 13:22:15 -07:00 |
|
James Betker
|
53858b2055
|
Fix gpt_tts_hf inference
|
2021-12-20 17:45:26 -07:00 |
|
James Betker
|
b4ddcd7111
|
More inference improvements
|
2021-12-19 09:01:19 -07:00 |
|
James Betker
|
f9c45d70f0
|
Fix mel terminator
|
2021-12-18 17:18:06 -07:00 |
|
James Betker
|
937045cb63
|
Fixes
|
2021-12-18 16:45:38 -07:00 |
|
James Betker
|
dee34f096c
|
Add use_gpt_tts script
|
2021-12-16 23:28:54 -07:00 |
|
James Betker
|
62c8ed9a29
|
move speech utils
|
2021-12-16 20:47:37 -07:00 |
|
James Betker
|
aa7cfd1edf
|
Add support for mel norms across the channel dim
|
2021-12-12 19:52:08 -07:00 |
|
James Betker
|
63bf135b93
|
Support norms
|
2021-12-11 08:30:49 -07:00 |
|
James Betker
|
959979086d
|
fix
|
2021-12-11 08:18:00 -07:00 |
|
James Betker
|
5a664aa56e
|
misc
|
2021-12-11 08:17:26 -07:00 |
|
James Betker
|
d610540ce5
|
mel norm computation script
|
2021-12-11 08:16:50 -07:00 |
|
James Betker
|
b2d8fbcfc0
|
build a better speech synthesis toolset
|
2021-12-09 22:59:56 -07:00 |
|
James Betker
|
04454ee63a
|
Add evaluation logic for gpt_asr_hf2
|
2021-12-02 21:04:36 -07:00 |
|
James Betker
|
973f47c525
|
misc nonfunctional
|
2021-11-22 17:16:39 -07:00 |
|
James Betker
|
3125ca38f5
|
Further wandb logs
|
2021-11-22 16:40:19 -07:00 |
|
James Betker
|
687e0746b3
|
Add Torch-derived MelSpectrogramInjector
|
2021-11-18 20:02:45 -07:00 |
|
James Betker
|
9b693b0a54
|
Fixes to filter_clips_hifreq
|
2021-11-07 18:42:22 -07:00 |
|
James Betker
|
a367ea3fda
|
Add script for computing attention for gpt_asr
|
2021-11-07 18:42:06 -07:00 |
|
James Betker
|
3c0f2fbb21
|
Add filtration script for finding resampled clips (or phone calls)
|
2021-11-07 14:16:11 -07:00 |
|
James Betker
|
756b4dad09
|
Working gpt_asr_hf inference - and it's a beast!
|
2021-11-06 21:47:15 -06:00 |
|
James Betker
|
596a62fe01
|
Apply fix to gpt_asr_hf and prep it for inference
Fix is that we were predicting two characters in advance, not next character
|
2021-11-04 10:09:24 -06:00 |
|
James Betker
|
986fc9628d
|
Check in GPT with new inference methods (but not the backing code..)
|
2021-10-29 17:21:40 -06:00 |
|
James Betker
|
d91dcbd404
|
Make classifier inference script more open
|
2021-10-27 13:18:54 -06:00 |
|
James Betker
|
5d714bc566
|
Add deepspeech model and support for decoding with it
|
2021-10-27 13:09:46 -06:00 |
|
James Betker
|
15437b2fc3
|
WER script
|
2021-10-26 13:30:29 -06:00 |
|
James Betker
|
c861054218
|
Restore spleeter_splitter
The mods don't help - in TF mode, everything is done on the GPU anyways. Something else
is going to have to be done to fix this.
|
2021-10-09 23:55:42 -06:00 |
|
James Betker
|
32ba496632
|
More fixes
|
2021-10-09 23:27:14 -06:00 |
|
James Betker
|
932ea29a83
|
Add multiprocessing to the spleeter splitter script to try and improve performance further
|
2021-10-09 23:15:36 -06:00 |
|
James Betker
|
b94e587f46
|
Improvements to spleeter_filter_noisy_clips
|
2021-10-07 21:28:00 -06:00 |
|
James Betker
|
bb891a3a53
|
Add partitioning and improved resuming to the spleeter filtering
|
2021-10-06 17:10:12 -06:00 |
|
James Betker
|
fc8ae4679a
|
Work on spleeter filtering script
|
2021-09-29 09:24:56 -06:00 |
|
James Betker
|
ac57cdc794
|
Add scheduling to quantizer, enable cudnn_benchmarking to be disabled
|
2021-09-24 17:01:36 -06:00 |
|
James Betker
|
6833048bf7
|
Alterations to diffusion_dvae so it can be used directly on spectrograms
|
2021-09-23 15:56:25 -06:00 |
|
James Betker
|
97ea329a59
|
Make spleeter filter simpler (and hopefully much faster)
|
2021-09-17 15:29:42 -06:00 |
|
James Betker
|
f78ce9d924
|
Get diffusion_dvae ready for prime time!
|
2021-09-16 22:43:10 -06:00 |
|
James Betker
|
1197ae1928
|
Misc
|
2021-09-16 10:53:56 -06:00 |
|
James Betker
|
4334a67924
|
Spleeter mods
|
2021-09-14 17:43:40 -06:00 |
|
James Betker
|
bc603c3231
|
Script adjustments and fixes
|
2021-09-12 21:26:45 -06:00 |
|
James Betker
|
76e2c497f7
|
Improvements to splitter
|
2021-09-09 23:34:56 -06:00 |
|
James Betker
|
742f9b4010
|
Batch spleeter cleaner using GPU
|
2021-09-09 23:14:32 -06:00 |
|
James Betker
|
73b930c0f6
|
Add diffusion_dvae
Increase split_on_silence interval
|
2021-09-09 16:22:05 -06:00 |
|
James Betker
|
b8f2e0f452
|
mydvae
|
2021-09-06 17:45:30 -06:00 |
|
James Betker
|
ed6eae407f
|
More scripts for splitting and formatting audio
|
2021-08-30 21:20:52 -06:00 |
|
James Betker
|
d05cc1f46c
|
Misc
|
2021-08-24 17:12:04 -06:00 |
|
James Betker
|
b521d94b01
|
Make gpt-asr more configurable
|
2021-08-19 16:33:41 -06:00 |
|
James Betker
|
570ed327ed
|
Stop dataset - attempt #2
|
2021-08-18 18:29:38 -06:00 |
|
James Betker
|
8332923f5c
|
Two more tools to test the audio segmentor
|
2021-08-17 09:09:11 -06:00 |
|
James Betker
|
7c086d0c2c
|
libritts - only write on successful check
|
2021-08-16 22:52:55 -06:00 |
|
James Betker
|
1fede41b7b
|
Audio segmentor
|
2021-08-16 22:51:53 -06:00 |
|
James Betker
|
3580c52eac
|
Fix up wavfile_dataset to be able to provide a full clip
|
2021-08-15 20:53:26 -06:00 |
|
James Betker
|
a523c4f932
|
Auto-normalize wav files by data type
|
2021-08-15 09:09:51 -06:00 |
|
James Betker
|
c28f657ab8
|
Allow usage of pre-rendered mels saved to npy files
|
2021-08-14 23:38:15 -06:00 |
|
James Betker
|
d6a73acaed
|
Allow processing of multiple audio sources at once from nv_tacotron_dataset
|
2021-08-14 16:04:05 -06:00 |
|
James Betker
|
007976082b
|
GPT_asr for inference
|
2021-08-14 14:37:17 -06:00 |
|
James Betker
|
81e91c99de
|
Misc
|
2021-08-13 13:58:59 -06:00 |
|
James Betker
|
d0c74278bf
|
Enable multiple wavfile paths to be specified, fix eps bug in mp3 splitter
|
2021-08-11 08:46:02 -06:00 |
|