DL-Art-School

Author	SHA1	Message	Date
James Betker	c85ab738c5	paired fix	2022-04-16 23:41:57 -06:00
James Betker	8fe0dff33c	support tts typing	2022-04-16 23:36:57 -06:00
James Betker	7929fd89de	Refactor audio-style models into the audio folder	2022-03-15 11:06:25 -06:00
James Betker	30ddac69aa	lots of bad entries	2022-03-05 23:15:59 -07:00
James Betker	dcf98df0c2	++	2022-03-05 23:12:34 -07:00
James Betker	64d764ccd7	fml	2022-03-05 23:11:10 -07:00
James Betker	ef63ff84e2	pvd2	2022-03-05 23:08:39 -07:00
James Betker	1a05712764	pvd	2022-03-05 23:05:29 -07:00
James Betker	03752c1cd6	Report NaN	2022-02-22 23:09:37 -07:00
James Betker	af50afe222	pairedvoice: error out if clip is too short	2022-02-21 19:11:10 -07:00
James Betker	2b36ca5f8e	Revert paired back	2022-01-16 21:10:46 -07:00
James Betker	7331862755	Updated paired to randomly index data, offsetting memory costs and speeding up initialization	2022-01-16 21:09:22 -07:00
James Betker	37e4e737b5	a few fixes	2022-01-16 15:17:17 -07:00
James Betker	35db5ebf41	paired_voice_audio_dataset - aligned codes support	2022-01-15 17:38:26 -07:00
James Betker	6706591d3d	Fix dataset	2022-01-06 15:24:37 -07:00
James Betker	f4484fd155	Add "dataset_debugger" support. This allows the datasets themselves to compile statistics and report them via tensorboard and wandb.	2022-01-06 12:38:20 -07:00
James Betker	f3cab45658	Revise audio datasets to include interesting statistics in batch. Stats include: (1) how many indices were skipped to retrieve a given index; (2) whether or not a conditioning input was actually the file itself.	2022-01-06 11:15:16 -07:00
James Betker	06c1093090	Remove collating from paired_voice_audio_dataset. This will now be done at the model level, which is more efficient.	2022-01-06 10:29:39 -07:00
James Betker	5e1d1da2e9	Clean paired_voice	2022-01-06 10:26:53 -07:00
James Betker	0fe34f57d1	Use torch resampler	2022-01-05 15:47:22 -07:00
James Betker	bbacffb790	dataset improvements and fix to unified_voice_Bilevel	2022-01-01 00:16:30 -07:00
James Betker	f0c4cd6317	Taking another stab at a BPE tokenizer	2021-12-30 13:41:24 -07:00
James Betker	51ce1b5007	Add conditioning clips features to grand_conjoined	2021-12-29 14:44:32 -07:00
James Betker	746392f35c	Fix DS	2021-12-25 15:28:59 -07:00
James Betker	736c2626ee	build in character tokenizer	2021-12-25 15:21:01 -07:00
James Betker	52410fd9d9	256-bpe tokenizer	2021-12-25 08:52:08 -07:00
James Betker	ead2a74bf0	Add debug_failures flag	2021-12-23 16:12:16 -07:00
James Betker	e55d949855	GrandConjoinedDataset	2021-12-23 14:32:33 -07:00
James Betker	b9de8a8eda	More fixes	2021-12-22 19:21:29 -07:00
James Betker	191e0130ee	Another fix	2021-12-22 18:30:50 -07:00
James Betker	6c6daa5795	Build a bigger, better tokenizer	2021-12-22 17:46:18 -07:00
James Betker	c737632eae	Train and use a bespoke tokenizer	2021-12-22 15:06:14 -07:00
James Betker	a9629f7022	Try out using the GPT tokenizer rather than nv_tacotron. This results in a significant compression of the text domain; I'm curious what the effect on speech quality will be.	2021-12-22 14:03:18 -07:00
James Betker	7bf4f9f580	duplicate nvtacotron	2021-12-22 13:48:30 -07:00

34 Commits