42 Commits

Author SHA1 Message Date
James Betker
d4a6298658 more debugging 2022-01-01 14:25:27 -07:00
James Betker
53784ec806 grand conjoined dataset: support collating 2021-12-29 09:44:37 -07:00
James Betker
e55d949855 GrandConjoinedDataset 2021-12-23 14:32:33 -07:00
James Betker
a9629f7022 Try out using the GPT tokenizer rather than nv_tacotron
This results in a significant compression of the text domain, I'm curious what the
effect on speech quality will be.
2021-12-22 14:03:18 -07:00
James Betker
f7d0901ce6 Decouple MEL from nv_tacotron_dataset 2021-10-31 15:01:38 -06:00
James Betker
c3421b7f6d Dataset work for audio quality processor 2021-10-24 09:09:34 -06:00
James Betker
6833048bf7 Alterations to diffusion_dvae so it can be used directly on spectrograms 2021-09-23 15:56:25 -06:00
James Betker
e513052fca Add unsupervised_audio_dataset 2021-09-14 17:43:16 -06:00
James Betker
570ed327ed Stop dataset - attempt #2 2021-08-18 18:29:38 -06:00
James Betker
d7f30232c3 Oh yeah 2021-08-16 22:52:15 -06:00
James Betker
4c76257c71 Don't require collation for nv_tacotron 2021-08-12 15:44:55 -06:00
James Betker
5037220ac7 Mods to support contrastive learning on audio files 2021-08-05 05:57:04 -06:00
James Betker
d9936df363 Add gpt_tts dataset and implement inference
- Adds a script which preprocesses quantized mels given a DVAE
- Adds a dataset which can consume preprocessed qmels
- Reworks GPT TTS to consume the outputs of that dataset (removes logic to add padding and start/end tokens)
- Adds inference to gpt_tts
2021-08-04 00:44:04 -06:00
James Betker
1ff434218e tacotron2, ready for prime time! 2021-07-08 22:13:44 -06:00
James Betker
86fd3ad7fd Initial checkin of nvidia tacotron model & dataset
These two are tested, full support for training to come.
2021-07-06 11:11:35 -06:00
James Betker
65c474eecf Various changes to fix testing 2021-06-11 15:31:10 -06:00
James Betker
6649ef2dae Add zipfilesdataset 2021-05-24 21:35:00 -06:00
James Betker
8e0e883050 Mods to support labeled datasets & random augs for those datasets 2020-12-15 17:15:56 -07:00
James Betker
26ceca68c0 BYOL with structure! 2020-12-10 15:07:35 -07:00
James Betker
66cbae8731 Add random_dataset for testing 2020-12-09 14:55:05 -07:00
James Betker
97ff25a086 BYOL!
Man, is there anything ExtensibleTrainer can't train? :)
2020-12-08 13:07:53 -07:00
James Betker
06d1c62c5a iGPT support!
Sweeeeet
2020-12-03 15:32:21 -07:00
James Betker
c963e5f2ce Add ImageFolderDataset
This one has been a long time coming.. How does torch not have something like this?
2020-12-01 17:45:37 -07:00
James Betker
2d3449d7a5 stylegan2 in ml art school! 2020-11-12 15:42:05 -07:00
James Betker
ff58c6484a Fixes to unified chunk datasets to support stereoscopic training 2020-10-26 11:12:22 -06:00
James Betker
8e5b6682bf Add PairedFrameDataset 2020-10-23 20:58:07 -06:00
James Betker
9ead2c0a08 Multiscale training in! 2020-10-17 22:54:12 -06:00
James Betker
24792bdb4f Codebase cleanup
Removed a lot of legacy stuff I have no intention of using again.
Plan is to shape this repo into something more extensible (get it? hah!)
2020-10-13 20:56:39 -06:00
James Betker
57814f18cf More features for multi-frame-dataset 2020-09-28 14:26:15 -06:00
James Betker
254cb1e915 More dataset integration work 2020-09-25 22:19:38 -06:00
James Betker
5189b11dac Add combined dataset for training across multiple datasets 2020-09-11 08:44:06 -06:00
James Betker
6226b52130 Pin memory in dataloaders by default 2020-09-04 15:30:46 -06:00
James Betker
a65b07607c Reference network 2020-08-25 11:56:59 -06:00
James Betker
67139602f5 Test modifications
Allows bifurcating large images put into the test pipeline

This code is fixed and not dynamic. Needs some fixes.
2020-05-19 09:37:58 -06:00
James Betker
585b05e66b Cap test workers at 10 2020-05-13 09:20:45 -06:00
James Betker
f994466289 Initialize test dataloader with a worker count proportional to the batch size. 2020-05-10 10:49:37 -06:00
James Betker
44b89330c2 Support inference across batches, support inference on cpu, checkpoint
This is a checkpoint of a set of long tests with reduced-complexity networks. Some takeaways:
1) A full GAN using the resnet discriminator does appear to converge, but the quality is capped.
2) Likewise, a combination GAN/feature loss does not converge. The feature loss is optimized but
    the model appears unable to fight the discriminator, so the G-loss steadily increases.

Going forwards, I want to try some bigger models. In particular, I want to change the generator
to increase complexity and capacity. I also want to add skip connections between the
disc and generator.
2020-05-04 08:48:25 -06:00
James Betker
3781ea725c Add Resnet Discriminator with BN 2020-04-29 20:51:57 -06:00
James Betker
e98d92fc77 Allow test to operate on batches 2020-04-23 23:59:09 -06:00
James Betker
12d92dc443 Add GTLQ dataset 2020-04-22 00:40:38 -06:00
James Betker
f4b33b0531 Some random fixes/adjustments 2020-04-22 00:38:53 -06:00
XintaoWang
037933ba66 mmsr 2019-08-23 21:42:47 +08:00