110 Commits

Author SHA1 Message Date
James Betker
f4484fd155 Add "dataset_debugger" support
This allows the datasets themselves to compile statistics and report them
via tensorboard and wandb.
2022-01-06 12:38:20 -07:00
James Betker
f3cab45658 Revise audio datasets to include interesting statistics in batch
Stats include:
- How many indices were skipped to retrieve a given index
- Whether or not a conditioning input was actually the file itself
2022-01-06 11:15:16 -07:00
James Betker
06c1093090 Remove collating from paired_voice_audio_dataset
This will now be done at the model level, which is more efficient
2022-01-06 10:29:39 -07:00
James Betker
5e1d1da2e9 Clean paired_voice 2022-01-06 10:26:53 -07:00
James Betker
0fe34f57d1 Use torch resampler 2022-01-05 15:47:22 -07:00
James Betker
d5a5111890 Fix collating on by default on grand_conjoined 2022-01-01 10:30:15 -07:00
James Betker
4d9ba4a48a can i has fix now 2022-01-01 00:48:27 -07:00
James Betker
56752f1dbc Fix collator bug 2022-01-01 00:33:31 -07:00
James Betker
c28d8770c7 fix tensor lengths 2022-01-01 00:23:46 -07:00
James Betker
bbacffb790 dataset improvements and fix to unified_voice_Bilevel 2022-01-01 00:16:30 -07:00
James Betker
17fb934575 wer update 2021-12-31 16:21:39 -07:00
James Betker
f0c4cd6317 Taking another stab at a BPE tokenizer 2021-12-30 13:41:24 -07:00
James Betker
f2cd6a7f08 For loading conditional clips, default to falling back to loading the clip itself 2021-12-30 09:10:14 -07:00
James Betker
51ce1b5007 Add conditioning clips features to grand_conjoined 2021-12-29 14:44:32 -07:00
James Betker
c6ef0eef0b asdf 2021-12-29 10:07:39 -07:00
James Betker
53784ec806 grand conjoined dataset: support collating 2021-12-29 09:44:37 -07:00
James Betker
07c2b9907c Add voice2voice clip model 2021-12-28 16:18:12 -07:00
James Betker
746392f35c Fix DS 2021-12-25 15:28:59 -07:00
James Betker
736c2626ee build in character tokenizer 2021-12-25 15:21:01 -07:00
James Betker
52410fd9d9 256-bpe tokenizer 2021-12-25 08:52:08 -07:00
James Betker
ead2a74bf0 Add debug_failures flag 2021-12-23 16:12:16 -07:00
James Betker
8b19c37409 UnifiedGptVoice! 2021-12-23 15:20:26 -07:00
James Betker
5bc9772cb0 grand: support validation mode 2021-12-23 15:03:20 -07:00
James Betker
e55d949855 GrandConjoinedDataset 2021-12-23 14:32:33 -07:00
James Betker
b9de8a8eda More fixes 2021-12-22 19:21:29 -07:00
James Betker
191e0130ee Another fix 2021-12-22 18:30:50 -07:00
James Betker
6c6daa5795 Build a bigger, better tokenizer 2021-12-22 17:46:18 -07:00
James Betker
c737632eae Train and use a bespoke tokenizer 2021-12-22 15:06:14 -07:00
James Betker
a9629f7022 Try out using the GPT tokenizer rather than nv_tacotron
This results in a significant compression of the text domain; I'm curious what the
effect on speech quality will be.
2021-12-22 14:03:18 -07:00
James Betker
ced81a760b restore nv_tacotron 2021-12-22 13:48:53 -07:00
James Betker
7bf4f9f580 duplicate nvtacotron 2021-12-22 13:48:30 -07:00
James Betker
9e8a9bf6ca Various fixes to gpt_tts_hf 2021-12-16 23:28:44 -07:00
James Betker
31fc693a8a dafsdf 2021-12-02 22:55:36 -07:00
James Betker
040d998922 maasd 2021-12-02 22:53:48 -07:00
James Betker
cc10e7e7e8 Add tsv loader 2021-12-02 22:43:07 -07:00
James Betker
702607556d nv_tacotron_dataset: allow it to load conditioning signals 2021-12-02 22:14:44 -07:00
James Betker
0604060580 Finish up mods for next version of GptAsrHf 2021-11-20 21:33:49 -07:00
James Betker
18b1de9b2c Add exclusion_lists to unsupervised_audio_dataset 2021-11-07 18:46:47 -07:00
James Betker
fd14746bf8 badtimes 2021-11-03 00:33:38 -06:00
James Betker
2fa80486de tacotron_dataset: recover gracefully 2021-11-03 00:31:50 -06:00
James Betker
af51d00dee Load wav files from voxpopuli instead of oggs 2021-11-02 09:32:26 -06:00
James Betker
f7d0901ce6 Decouple MEL from nv_tacotron_dataset 2021-10-31 15:01:38 -06:00
James Betker
b8b268b5f6 Misc 2021-10-31 14:29:23 -06:00
James Betker
579f0a70ee Move UnsupervisedAudioDataset to use my new mp3 loader 2021-10-28 22:33:12 -06:00
James Betker
5d714bc566 Add deepspeech model and support for decoding with it 2021-10-27 13:09:46 -06:00
James Betker
21b6daa0ed Introduce clip resampling 2021-10-26 10:42:23 -06:00
James Betker
c3421b7f6d Dataset work for audio quality processor 2021-10-24 09:09:34 -06:00
James Betker
06ea6191a9 Initial implementation of audio_with_noise dataset 2021-10-21 16:45:19 -06:00
James Betker
d016a2fbad Go back to vanilla flavor of diffusion 2021-10-17 17:32:46 -06:00
James Betker
6833048bf7 Alterations to diffusion_dvae so it can be used directly on spectrograms 2021-09-23 15:56:25 -06:00