James Betker | 7b4544b83a | Add an experimental unet_diffusion_tts to perform experiments on | 2022-01-18 08:38:24 -07:00
James Betker | b398ecca01 | wer fix | 2022-01-15 17:28:17 -07:00
James Betker | 87c83e4957 | update wer script | 2022-01-13 17:08:49 -07:00
James Betker | d4e27ccf62 | misc updates | 2022-01-11 16:25:40 -07:00
James Betker | 91f28580e2 | fix unified_voice | 2022-01-10 16:17:31 -07:00
James Betker | 136744dc1d | Fixes | 2022-01-10 14:32:04 -07:00
James Betker | 1f6a5310b8 | More fixes to use_gpt_tts | 2022-01-07 22:30:55 -07:00
James Betker | 65ffe38fce | misc | 2022-01-06 22:16:17 -07:00
James Betker | 61cd351b71 | update unified | 2022-01-06 09:48:11 -07:00
James Betker | 10fd1110be | Fix (?) use_gpt_tts for unified_voice | 2022-01-05 20:09:31 -07:00
James Betker | 17fb934575 | wer update | 2021-12-31 16:21:39 -07:00
James Betker | f2cd6a7f08 | For loading conditional clips, default to falling back to loading the clip itself | 2021-12-30 09:10:14 -07:00
James Betker | 8a02ba5935 | Transit s2s clips back to CPU memory after processing | 2021-12-29 08:54:07 -07:00
James Betker | af6d5cd526 | Add resume into speech-speech | 2021-12-29 08:50:49 -07:00
James Betker | 0e4bcc33ab | Additional debugging | 2021-12-29 00:23:27 -07:00
James Betker | b24a51f0aa | Check in speech2speech CLIP inference tool | 2021-12-29 00:19:44 -07:00
James Betker | c1bef01dfa | GptAsrHf2 checkin | 2021-12-28 20:48:38 -07:00
James Betker | a5b4bee719 | Improve asr_eval | 2021-12-28 11:45:15 -07:00
James Betker | 4a32949b0e | update inference mode for unified | 2021-12-26 15:33:21 -07:00
James Betker | b595c62893 | One way decoder for decoding from mel codes | 2021-12-25 12:18:00 -07:00
James Betker | ab9cafa572 | Make tokenization configs more configurable | 2021-12-25 12:17:50 -07:00
James Betker | 8e26400ce2 | Add inference for unified gpt | 2021-12-24 13:27:06 -07:00
James Betker | a42b94ab72 | gpt_tts_hf inference fixes | 2021-12-22 13:22:15 -07:00
James Betker | 53858b2055 | Fix gpt_tts_hf inference | 2021-12-20 17:45:26 -07:00
James Betker | b4ddcd7111 | More inference improvements | 2021-12-19 09:01:19 -07:00
James Betker | f9c45d70f0 | Fix mel terminator | 2021-12-18 17:18:06 -07:00
James Betker | 937045cb63 | Fixes | 2021-12-18 16:45:38 -07:00
James Betker | dee34f096c | Add use_gpt_tts script | 2021-12-16 23:28:54 -07:00
James Betker | 62c8ed9a29 | move speech utils | 2021-12-16 20:47:37 -07:00
James Betker | aa7cfd1edf | Add support for mel norms across the channel dim | 2021-12-12 19:52:08 -07:00
James Betker | 63bf135b93 | Support norms | 2021-12-11 08:30:49 -07:00
James Betker | 959979086d | fix | 2021-12-11 08:18:00 -07:00
James Betker | 5a664aa56e | misc | 2021-12-11 08:17:26 -07:00
James Betker | d610540ce5 | mel norm computation script | 2021-12-11 08:16:50 -07:00
James Betker | b2d8fbcfc0 | build a better speech synthesis toolset | 2021-12-09 22:59:56 -07:00
James Betker | 32cfcf3684 | Turn off optimization in find_faulty_files | 2021-12-09 09:02:09 -07:00
James Betker | a66a2bf91b | Update find_faulty_files | 2021-12-09 09:00:00 -07:00
James Betker | 04454ee63a | Add evaluation logic for gpt_asr_hf2 | 2021-12-02 21:04:36 -07:00
James Betker | 82d0e7720e | Add choke to lucidrains_dvae | 2021-11-23 18:53:37 -07:00
James Betker | 973f47c525 | misc nonfunctional | 2021-11-22 17:16:39 -07:00
James Betker | 3125ca38f5 | Further wandb logs | 2021-11-22 16:40:19 -07:00
James Betker | 687e0746b3 | Add Torch-derived MelSpectrogramInjector | 2021-11-18 20:02:45 -07:00
James Betker | c30a38cdf1 | Undo baseline GDI changes | 2021-11-18 20:02:09 -07:00
James Betker | 9b693b0a54 | Fixes to filter_clips_hifreq | 2021-11-07 18:42:22 -07:00
James Betker | a367ea3fda | Add script for computing attention for gpt_asr | 2021-11-07 18:42:06 -07:00
James Betker | 3c0f2fbb21 | Add filtration script for finding resampled clips (or phone calls) | 2021-11-07 14:16:11 -07:00
James Betker | 756b4dad09 | Working gpt_asr_hf inference - and it's a beast! | 2021-11-06 21:47:15 -06:00
James Betker | 596a62fe01 | Apply fix to gpt_asr_hf and prep it for inference. Fix is that we were predicting two characters in advance, not next character | 2021-11-04 10:09:24 -06:00
James Betker | 36ed28913a | Fix two scripts | 2021-10-30 17:00:06 -06:00
James Betker | 466b9fbcaa | classify | 2021-10-29 20:22:40 -06:00
James Betker | 986fc9628d | Check in GPT with new inference methods (but not the backing code..) | 2021-10-29 17:21:40 -06:00
James Betker | 579f0a70ee | Move UnsupervisedAudioDataset to use my new mp3 loader | 2021-10-28 22:33:12 -06:00
James Betker | bb0a0c8264 | classify_into_folders script | 2021-10-27 14:56:16 -06:00
James Betker | d91dcbd404 | Make classifier inference script more open | 2021-10-27 13:18:54 -06:00
James Betker | 5d714bc566 | Add deepspeech model and support for decoding with it | 2021-10-27 13:09:46 -06:00
James Betker | 15437b2fc3 | WER script | 2021-10-26 13:30:29 -06:00
James Betker | ba6e46c02a | Further simplify diffusion_vocoder and make noise_surfer work | 2021-10-26 08:54:30 -06:00
James Betker | f2a31702b5 | Clean stuff up, move more things into arch_util | 2021-10-20 21:19:25 -06:00
James Betker | d016a2fbad | Go back to vanilla flavor of diffusion | 2021-10-17 17:32:46 -06:00
James Betker | c861054218 | Restore spleeter_splitter. The mods don't help - in TF mode, everything is done on the GPU anyways. Something else is going to have to be done to fix this. | 2021-10-09 23:55:42 -06:00
James Betker | 32ba496632 | More fixes | 2021-10-09 23:27:14 -06:00
James Betker | 932ea29a83 | Add multiprocessing to the spleeter splitter script to try and improve performance further | 2021-10-09 23:15:36 -06:00
James Betker | b94e587f46 | Improvements to spleeter_filter_noisy_clips | 2021-10-07 21:28:00 -06:00
James Betker | bb891a3a53 | Add partitioning and improved resuming to the spleeter filtering | 2021-10-06 17:10:12 -06:00
James Betker | 4914c526dc | More cleanup | 2021-09-29 14:24:49 -06:00
James Betker | fc8ae4679a | Work on spleeter filtering script | 2021-09-29 09:24:56 -06:00
James Betker | 55b58fb67f | Clean up codebase. Remove stuff that I'm likely not going to use again (or generally failed experiments) | 2021-09-29 09:21:44 -06:00
James Betker | ac57cdc794 | Add scheduling to quantizer, enable cudnn_benchmarking to be disabled | 2021-09-24 17:01:36 -06:00
James Betker | 6833048bf7 | Alterations to diffusion_dvae so it can be used directly on spectrograms | 2021-09-23 15:56:25 -06:00
James Betker | 97ea329a59 | Make spleeter filter simpler (and hopefully much faster) | 2021-09-17 15:29:42 -06:00
James Betker | f78ce9d924 | Get diffusion_dvae ready for prime time! | 2021-09-16 22:43:10 -06:00
James Betker | 1197ae1928 | Misc | 2021-09-16 10:53:56 -06:00
James Betker | 4334a67924 | Spleeter mods | 2021-09-14 17:43:40 -06:00
James Betker | bc603c3231 | Script adjustments and fixes | 2021-09-12 21:26:45 -06:00
James Betker | 76e2c497f7 | Improvements to splitter | 2021-09-09 23:34:56 -06:00
James Betker | 742f9b4010 | Batch spleeter cleaner using GPU | 2021-09-09 23:14:32 -06:00
James Betker | 73b930c0f6 | Add diffusion_dvae. Increase split_on_silence interval | 2021-09-09 16:22:05 -06:00
James Betker | b8f2e0f452 | mydvae | 2021-09-06 17:45:30 -06:00
James Betker | 92e7e57f81 | Update diffusion_noise_surfer to support audio | 2021-09-01 08:34:47 -06:00
James Betker | 274d352e6f | dug | 2021-08-30 21:45:58 -06:00
James Betker | f1a0c21fb2 | asr_eval | 2021-08-30 21:41:34 -06:00
James Betker | ed6eae407f | More scripts for splitting and formatting audio | 2021-08-30 21:20:52 -06:00
James Betker | 909754cc27 | Add find_faulty_files.py | 2021-08-25 18:00:43 -06:00
James Betker | d05cc1f46c | Misc | 2021-08-24 17:12:04 -06:00
James Betker | b521d94b01 | Make gpt-asr more configurable | 2021-08-19 16:33:41 -06:00
James Betker | 570ed327ed | Stop dataset - attempt #2 | 2021-08-18 18:29:38 -06:00
James Betker | 8332923f5c | Two more tools to test the audio segmentor | 2021-08-17 09:09:11 -06:00
James Betker | 7c086d0c2c | libritts - only write on successful check | 2021-08-16 22:52:55 -06:00
James Betker | 1fede41b7b | Audio segmentor | 2021-08-16 22:51:53 -06:00
James Betker | 3580c52eac | Fix up wavfile_dataset to be able to provide a full clip | 2021-08-15 20:53:26 -06:00
James Betker | a523c4f932 | Auto-normalize wav files by data type | 2021-08-15 09:09:51 -06:00
James Betker | c28f657ab8 | Allow usage of pre-rendered mels saved to npy files | 2021-08-14 23:38:15 -06:00
James Betker | d6a73acaed | Allow processing of multiple audio sources at once from nv_tacotron_dataset | 2021-08-14 16:04:05 -06:00
James Betker | 007976082b | GPT_asr for inference | 2021-08-14 14:37:17 -06:00
James Betker | 81e91c99de | Misc | 2021-08-13 13:58:59 -06:00
James Betker | d0c74278bf | Enable multiple wavfile paths to be specified, fix eps bug in mp3 splitter | 2021-08-11 08:46:02 -06:00
James Betker | e19c00398e | More improvements to random_mp3_splitter | 2021-08-09 21:31:12 -06:00
James Betker | 4100469902 | Add a tool to split mp3 files into arbitrary chunks of wav files | 2021-08-08 23:23:13 -06:00
James Betker | 690d7e86d3 | Fix nv_tacotron_dataset bug which incorrectly mapped filenames. dammit.. | 2021-08-08 11:38:52 -06:00
James Betker | a2afb25e42 | Fix inference, always flow full text tokens through transformer | 2021-08-07 20:11:10 -06:00