246 Commits

Author SHA1 Message Date
James Betker
f04a7bdf63 Bug fixes for tacotron dataset on mozilla cv
- Support a max mel length (mozilla cv has some tracks that are basically unbounded..)
- Don't fail on low sample rates (mozilla cv has some of those)
2021-08-11 16:17:03 -06:00
James Betker
2d3f0cc33c nv_tacotron_dataset - Allow training on mozilla cv 2021-08-11 13:34:31 -06:00
James Betker
d0c74278bf Enable multiple wavfile paths to be specified, fix eps bug in mp3 splitter 2021-08-11 08:46:02 -06:00
James Betker
e19c00398e More improvements to random_mp3_splitter 2021-08-09 21:31:12 -06:00
James Betker
74342b860b Revert "Undo forced text padding"
This reverts commit 83ab5e6a00.
2021-08-09 11:56:34 -06:00
James Betker
d4e33bf15f Fixes to the mp3 splitter 2021-08-09 11:55:46 -06:00
James Betker
4100469902 Add a tool to split mp3 files into arbitrary chunks of wav files 2021-08-08 23:23:13 -06:00
James Betker
83ab5e6a00 Undo forced text padding 2021-08-08 11:42:20 -06:00
James Betker
690d7e86d3 Fix nv_tacotron_dataset bug which incorrectly mapped filenames
dammit..
2021-08-08 11:38:52 -06:00
James Betker
a2afb25e42 Fix inference, always flow full text tokens through transformer 2021-08-07 20:11:10 -06:00
James Betker
b43683b772 Add lucidrains_dvae 2021-08-06 12:03:46 -06:00
James Betker
62c7570512 Constrain wav_aug a bit more 2021-08-06 08:19:38 -06:00
James Betker
f126040da2 Undo noise first 2021-08-05 23:24:38 -06:00
James Betker
908ef5495f Add noise first to audio_aug 2021-08-05 23:22:44 -06:00
James Betker
d6007c6de1 dataset fixes 2021-08-05 23:12:59 -06:00
James Betker
d120e1aa99 Add audio augmentation to wavfile_dataset, utility to test audio similarity 2021-08-05 22:14:49 -06:00
James Betker
4017236ba9 Fix up inference for gpt_tts 2021-08-05 06:46:30 -06:00
James Betker
5037220ac7 Mods to support contrastive learning on audio files 2021-08-05 05:57:04 -06:00
James Betker
341f28dd82 It works! 2021-08-04 20:07:51 -06:00
James Betker
d9936df363 Add gpt_tts dataset and implement inference
- Adds a script which preprocesses quantized mels given a DVAE
- Adds a dataset which can consume preprocessed qmels
- Reworks GPT TTS to consume the outputs of that dataset (removes logic to add padding and start/end tokens)
- Adds inference to gpt_tts
2021-08-04 00:44:04 -06:00
James Betker
dadc54795c Add gpt_tts 2021-07-27 20:33:30 -06:00
James Betker
49e3b310ea Allow audio sample rate interpolation for faster training 2021-07-26 17:44:06 -06:00
James Betker
96e90e7047 Add support for a gaussian-diffusion-based wave tacotron 2021-07-26 16:27:31 -06:00
James Betker
d81386c1be Mods to support vqvae in audio mode (1d) 2021-07-20 08:36:46 -06:00
James Betker
1ff434218e tacotron2, ready for prime time! 2021-07-08 22:13:44 -06:00
James Betker
86fd3ad7fd Initial checkin of nvidia tacotron model & dataset
These two are tested, full support for training to come.
2021-07-06 11:11:35 -06:00
James Betker
afa41f1804 Allow hq color jittering and corruptions that are not included in the corruption factor 2021-06-30 09:44:46 -06:00
James Betker
6fd16ea9c8 Add meta-anomaly detection, colorjitter augmentation 2021-06-29 13:41:55 -06:00
James Betker
46e9f62be0 Add unet with latent guide
This is a diffusion network that uses both a LQ image
and a reference sample HQ image that is compressed into
a latent vector to perform upsampling

The hope is that we can steer the upsampling network
with sample images.
2021-06-26 11:02:58 -06:00
James Betker
0ded106562 Merge remote-tracking branch 'origin/master' 2021-06-25 13:16:28 -06:00
James Betker
a57ed8e960 Various mods to support better jpeg image filtering 2021-06-25 13:16:15 -06:00
James Betker
61e7ca39cd Update image_folder_dataset.py 2021-06-25 11:48:31 -06:00
James Betker
6b32c87dcb Try to make diffusion fid more deterministic 2021-06-14 09:27:43 -06:00
James Betker
65c474eecf Various changes to fix testing 2021-06-11 15:31:10 -06:00
James Betker
6c6e82406e Pass a corruption factor through the dataset into the upsampling network
The intuition is this will help guide the network to make better informed decisions
about how it performs upsampling based on how it perceives the underlying content.

(I'm giving up on letting networks detect their own quality - I'm not convinced it is
actually feasible)
2021-06-07 09:13:54 -06:00
James Betker
fb405d9ef1 CIFAR stuff
- Extract coarse labels for the CIFAR dataset
- Add simple resnet that branches lower layers based on coarse labels
- Some other cleanup
2021-06-05 14:16:02 -06:00
James Betker
e6c537824a Allow validation for ce 2021-06-04 21:21:04 -06:00
James Betker
7c251af7a8 Support cifar100 with resnet 2021-06-04 17:29:07 -06:00
James Betker
6084915af8 Support gaussian diffusion models
Adds support for GD models, courtesy of some maths from openai.

Also:
- Fixes requirement for eval{} even when it isn't being used
- Adds support for denormalizing an imagenet norm
2021-06-02 21:47:32 -06:00
James Betker
45bc76ba92 Fixes and mods to support training classifiers on imagenet 2021-06-01 17:25:24 -06:00
James Betker
6649ef2dae Add zipfilesdataset 2021-05-24 21:35:00 -06:00
James Betker
9bbe6fc81e Get segformer to a trainable state 2021-04-25 11:45:20 -06:00
James Betker
23e01314d4 Add dataset, ui for labeling and evaluator for pointwise classification 2021-04-23 17:17:13 -06:00
James Betker
b687ef4cd0 Misc 2021-04-21 18:09:46 -06:00
James Betker
f89ea5f1c6 Mods to support lightweight_gan model 2021-03-02 20:51:48 -07:00
James Betker
784b96c059 Misc options to add support for training stylegan2-rosinality models:
- Allow image_folder_dataset to normalize inbound images
- ExtensibleTrainer can denormalize images on the output path
- Support .webp - an output from LSUN
- Support logistic GAN divergence loss
- Support stylegan2 TF weight extraction for discriminator
- New injector that produces latent noise (with separated paths)
- Modify FID evaluator to be operable with rosinality-style GANs
2021-02-08 08:09:21 -07:00
James Betker
34f8c8641f Support training imagenet classifier 2021-01-11 20:09:16 -07:00
James Betker
4119cd6240 Fix to image_folder_dataset to accommodate images with mismatched dimensions 2021-01-10 12:57:21 -07:00
James Betker
5e7ade0114 ImageFolderDataset - corrupt lq images alongside each other 2021-01-03 16:36:38 -07:00
James Betker
193cdc6636 Move discriminators to the create_model paradigm
Also cleans up a lot of old discriminator models that I have no intention
of using again.
2021-01-01 15:56:09 -07:00
James Betker
1de1fa30ac Disable refs and centers altogether in single_image_dataset
I suspect that this might be a cause of failures on parallel datasets.
Plus it is unnecessary computation.
2020-12-31 10:13:24 -07:00
James Betker
ba543d1152 Glean mods
- Fixes fixed upscale factor issues
- Refines a few ops to decrease computation & parameterization
2020-12-27 12:25:06 -07:00
James Betker
2706a84f15 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-12-26 13:50:34 -07:00
James Betker
90e2362c00 Fix bug with full_image_dataset 2020-12-26 13:50:27 -07:00
James Betker
3fd627fc62 Mods to support image classification & filtering 2020-12-26 13:49:27 -07:00
James Betker
1bbcb96ee8 Implement a few changes to support training BYOL networks 2020-12-23 10:50:23 -07:00
James Betker
e7aeb17404 ImageFolder dataset: allow intermediary downscale before corrupt
For massive upscales (ex: 8x), corruption does almost nothing when applied
at the HQ level. This patch adds support to perform corruption at a specified
intermediary scale. The dataset downscales to this level, performs the corruption,
then downscales the rest of the way to get the LQ image.
2020-12-22 15:42:21 -07:00
James Betker
7938f9f50b Fix bug with single_image_dataset which prevented it from working on multiple directories 2020-12-19 15:13:46 -07:00
James Betker
d875ca8342 More refactor changes 2020-12-18 09:24:31 -07:00
James Betker
a8179ff53c Image label work 2020-12-18 08:53:18 -07:00
James Betker
e838c6e75b Rosinality stylegan2 port 2020-12-17 14:18:46 -07:00
James Betker
12cf052889 Add an image patch labeling UI 2020-12-17 10:16:21 -07:00
James Betker
4310e66848 Fix bug in 'corrupt_before_downsize=true' 2020-12-16 09:41:59 -07:00
James Betker
8e0e883050 Mods to support labeled datasets & random augs for those datasets 2020-12-15 17:15:56 -07:00
James Betker
0a19e53df0 BYOL mods 2020-12-14 23:59:11 -07:00
James Betker
087e9280ed Add labeling feature to image_folder_dataset 2020-12-14 23:58:37 -07:00
James Betker
ec0ee25f4b Structural latents checkpoint 2020-12-11 12:01:09 -07:00
James Betker
26ceca68c0 BYOL with structure! 2020-12-10 15:07:35 -07:00
James Betker
8e4b9f42fd New BYOL dataset which uses a form of RandomCrop that lends itself to
structural guidance to the latents.
2020-12-10 09:57:18 -07:00
James Betker
66cbae8731 Add random_dataset for testing 2020-12-09 14:55:05 -07:00
James Betker
97ff25a086 BYOL!
Man, is there anything ExtensibleTrainer can't train? :)
2020-12-08 13:07:53 -07:00
James Betker
c0aeaabc31 Spinenet playground 2020-12-07 12:49:32 -07:00
James Betker
88fc049c8d spinenet latent playground! 2020-12-05 20:30:36 -07:00
James Betker
11155aead4 Directly use dataset keys
This has been a long time coming. Cleans up messy "GT" nomenclature and simplifies ExtensibleTrainer.feed_data
2020-12-04 20:14:53 -07:00
James Betker
06d1c62c5a iGPT support!
Sweeeeet
2020-12-03 15:32:21 -07:00
James Betker
c963e5f2ce Add ImageFolderDataset
This one has been a long time coming.. How does torch not have something like this?
2020-12-01 17:45:37 -07:00
James Betker
0c6d7971b9 Dataset documentation 2020-11-26 11:58:39 -07:00
James Betker
45a489110f Fix datasets 2020-11-26 11:50:38 -07:00
James Betker
205c9a5335 Learn how to functionally use srflow networks 2020-11-25 13:59:06 -07:00
James Betker
f3c1fc1bcd Dataset modifications 2020-11-24 13:20:12 -07:00
James Betker
f80acfcab6 Throw if dataset isn't going to work with force_multiple setting 2020-11-19 23:47:00 -07:00
James Betker
98eada1e4c More circular dependency fixes + unet fixes 2020-11-15 11:53:35 -07:00
James Betker
5cade6b874 Move stylegan2 around, bring in unet 2020-11-14 22:04:48 -07:00
James Betker
cdc5ac30e9 oddity 2020-11-13 20:11:57 -07:00
James Betker
2d3449d7a5 stylegan2 in ml art school! 2020-11-12 15:42:05 -07:00
James Betker
6be6c92e5d Fix yet ANOTHER OBO error in multi_frame_dataset 2020-11-06 20:38:34 -07:00
James Betker
c21088e238 Fix OBO error in multi_frame_dataset
In some datasets, this meant one frame was included in a sequence where it didn't belong. In datasets with mismatched chunk sizes, this resulted in an error.
2020-11-03 14:32:06 -07:00
James Betker
e990be0449 Improve ignore_first logic 2020-11-03 11:56:32 -07:00
James Betker
f13fdd43ed Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-02 08:47:42 -07:00
James Betker
fed16abc22 Report chunking errors 2020-11-02 08:47:18 -07:00
James Betker
3676f26d94 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-31 20:55:45 -06:00
James Betker
ea8c20c0e2 Fix bug with multiscale_dataset 2020-10-31 20:54:41 -06:00
James Betker
bb39d3efe5 Bump image corruption factor a bit 2020-10-31 20:50:24 -06:00
James Betker
7303d8c932 Add psnr approximator 2020-10-31 11:08:55 -06:00
James Betker
b24ff3c88d Fix bug that causes multiscale dataset to crash 2020-10-30 14:01:24 -06:00
James Betker
74738489b9 Fixes and additional support for progressive zoom 2020-10-30 09:59:54 -06:00
James Betker
25b007a0f5 Increase jpeg corruption & add error 2020-10-28 17:37:39 -06:00
James Betker
796659b0ac Add 'jpeg-normal' corruption 2020-10-28 16:40:47 -06:00
James Betker
31cf1ac98d Retrofit full_image_dataset to work with new arch. 2020-10-27 10:26:19 -06:00
James Betker
ff58c6484a Fixes to unified chunk datasets to support stereoscopic training 2020-10-26 11:12:22 -06:00