Commit Graph

193 Commits

Author SHA1 Message Date
James Betker
c0f61a2e15 Rework how DVAE tokens are ordered
It might make more sense to have top tokens, then bottom tokens
with top tokens having different discretized values.
2021-08-05 07:07:17 -06:00
James Betker
5037220ac7 Mods to support contrastive learning on audio files 2021-08-05 05:57:04 -06:00
James Betker
4c98b9703f Get dalle-style TTS to "work" 2021-08-03 21:08:27 -06:00
James Betker
2814307eee Alterations to support VQVAE on mel spectrograms 2021-08-01 07:54:21 -06:00
James Betker
dadc54795c Add gpt_tts 2021-07-27 20:33:30 -06:00
James Betker
96e90e7047 Add support for a gaussian-diffusion-based wave tacotron 2021-07-26 16:27:31 -06:00
James Betker
97d7cbbc34 Additional work for audio xformer (which doesnt really do a great job) 2021-07-23 10:58:14 -06:00
James Betker
d81386c1be Mods to support vqvae in audio mode (1d) 2021-07-20 08:36:46 -06:00
James Betker
1ff434218e tacotron2, ready for prime time! 2021-07-08 22:13:44 -06:00
James Betker
86fd3ad7fd Initial checkin of nvidia tacotron model & dataset
These two are tested, full support for training to come.
2021-07-06 11:11:35 -06:00
James Betker
6fd16ea9c8 Add meta-anomaly detection, colorjitter augmentation 2021-06-29 13:41:55 -06:00
James Betker
46e9f62be0 Add unet with latent guide
This is a diffusion network that uses both a LQ image
and a reference sample HQ image that is compressed into
a latent vector to perform upsampling

The hope is that we can steer the upsampling network
with sample images.
2021-06-26 11:02:58 -06:00
James Betker
a57ed8e960 Various mods to support better jpeg image filtering 2021-06-25 13:16:15 -06:00
James Betker
e7890dc0ba Misc fixes for diffusion nets 2021-06-21 10:38:07 -06:00
James Betker
6a75bd0777 Another fix 2021-06-14 09:51:44 -06:00
James Betker
54bff35171 Fix issue where eval was not being used by all ddp processes 2021-06-14 09:50:04 -06:00
James Betker
3e3ad7825f Add support for training an EMA network alongside the main networks 2021-06-12 21:01:41 -06:00
James Betker
696f320820 Get rid of feature networks 2021-06-11 20:50:07 -06:00
James Betker
65c474eecf Various changes to fix testing 2021-06-11 15:31:10 -06:00
James Betker
2ad2b56438 Don't do wandb except on rank 0 2021-06-06 16:52:07 -06:00
James Betker
692e9c417b Support diffusion unet 2021-06-06 13:57:22 -06:00
James Betker
fa908a6a15 Fix wandb import issue 2021-06-04 23:27:15 -06:00
James Betker
103a88506e Log eval to wandb 2021-06-04 23:23:20 -06:00
James Betker
6084915af8 Support gaussian diffusion models
Adds support for GD models, courtesy of some maths from openai.

Also:
- Fixes requirement for eval{} even when it isn't being used
- Adds support for denormalizing an imagenet norm
2021-06-02 21:47:32 -06:00
James Betker
45bc76ba92 Fixes and mods to support training classifiers on imagenet 2021-06-01 17:25:24 -06:00
James Betker
f129eaa39e Clean up byol a bit
- Remove option to aug in dataset (there's really no reason for this now that kornia works on GPU on windows)
- Other stufff
2021-05-24 21:35:46 -06:00
James Betker
119f17c808 Add testing capabilities for segformer & contrastive feature 2021-04-27 09:59:50 -06:00
James Betker
9bbe6fc81e Get segformer to a trainable state 2021-04-25 11:45:20 -06:00
James Betker
17555e7d07 misc adjustments for stylegan 2021-04-21 18:14:17 -06:00
James Betker
f89ea5f1c6 Mods to support lightweight_gan model 2021-03-02 20:51:48 -07:00
James Betker
784b96c059 Misc options to add support for training stylegan2-rosinality models:
- Allow image_folder_dataset to normalize inbound images
- ExtensibleTrainer can denormalize images on the output path
- Support .webp - an output from LSUN
- Support logistic GAN divergence loss
- Support stylegan2 TF weight extraction for discriminator
- New injector that produces latent noise (with separated paths)
- Modify FID evaluator to be operable with rosinality-style GANs
2021-02-08 08:09:21 -07:00
James Betker
7070142805 Make vqvae3_hard more configurable 2021-02-04 09:03:22 -07:00
James Betker
51b63b2aa6 Add switched_conv with hard routing and make vqvae use it. 2021-01-25 08:25:29 -07:00
James Betker
557cdec116 misc 2021-01-23 13:45:17 -07:00
James Betker
587a4f4050 resnet_unet_3
I'm being really lazy here - these nets are not really different from each other
except at which layer they terminate. This one terminates at 2x downsampling,
which is simply indicative of a direction I want to go for testing these pixpro networks.
2021-01-15 14:51:03 -07:00
James Betker
d1007ccfe7 Adjustments to pixpro to allow training against networks with arbitrarily large structural latents
- The pixpro latent now rescales the latent space instead of using a "coordinate vector", which
   **might** have performance implications.
- The latent against which the pixel loss is computed can now be a small, randomly sampled patch
   out of the entire latent, allowing further memory/computational discounts. Since the loss
   computation does not have a receptive field, this should not alter the loss.
- The instance projection size can now be separate from the pixel projection size.
- PixContrast removed entirely.
- ResUnet with full resolution added.
2021-01-12 09:17:45 -07:00
James Betker
34f8c8641f Support training imagenet classifier 2021-01-11 20:09:16 -07:00
James Betker
07168ecfb4 Enable vqvae to use a switched_conv variant 2021-01-09 20:53:14 -07:00
James Betker
61a86a3c1e VQVAE 2021-01-07 10:20:15 -07:00
James Betker
9fed90393f Add lucidrains pixpro trainer 2021-01-05 20:14:22 -07:00
James Betker
2c65b6b28e More mods to support styledsr 2021-01-04 11:32:28 -07:00
James Betker
4d8064c32c Modifications to allow partially trained stylegan discriminators to be used 2021-01-03 16:37:18 -07:00
James Betker
ce6524184c Do the last commit but in a better way 2021-01-02 22:24:12 -07:00
James Betker
edf9c38198 Make ExtensibleTrainer set the starting step for the LR scheduler 2021-01-02 22:22:34 -07:00
James Betker
bdbab65082 Allow optimizers to train separate param groups, add higher dimensional VGG discriminator
Did this to support training 512x512px networks off of a pretrained 256x256 network.
2021-01-02 15:10:06 -07:00
James Betker
193cdc6636 Move discriminators to the create_model paradigm
Also cleans up a lot of old discriminator models that I have no intention
of using again.
2021-01-01 15:56:09 -07:00
James Betker
9864fe4c04 Fix for train.py 2021-01-01 11:59:00 -07:00
James Betker
0eb1f4dd67 Revert "Get rid of CUDA_VISIBLE_DEVICES"
It is actually necessary for training in distributed mode. Only
do it then.
2020-12-31 10:31:40 -07:00
James Betker
1de1fa30ac Disable refs and centers altogether in single_image_dataset
I suspect that this might be a cause of failures on parallel datasets.
Plus it is unnecessary computation.
2020-12-31 10:13:24 -07:00
James Betker
8f0984cacf Add sr_fid evaluator 2020-12-30 20:18:58 -07:00