Commit Graph

241 Commits

Author SHA1 Message Date
James Betker
4c98b9703f Get dalle-style TTS to "work" 2021-08-03 21:08:27 -06:00
James Betker
2814307eee Alterations to support VQVAE on mel spectrograms 2021-08-01 07:54:21 -06:00
James Betker
dadc54795c Add gpt_tts 2021-07-27 20:33:30 -06:00
James Betker
96e90e7047 Add support for a gaussian-diffusion-based wave tacotron 2021-07-26 16:27:31 -06:00
James Betker
97d7cbbc34 Additional work for audio xformer (which doesnt really do a great job) 2021-07-23 10:58:14 -06:00
James Betker
d81386c1be Mods to support vqvae in audio mode (1d) 2021-07-20 08:36:46 -06:00
James Betker
1ff434218e tacotron2, ready for prime time! 2021-07-08 22:13:44 -06:00
James Betker
86fd3ad7fd Initial checkin of nvidia tacotron model & dataset
These two are tested, full support for training to come.
2021-07-06 11:11:35 -06:00
James Betker
6fd16ea9c8 Add meta-anomaly detection, colorjitter augmentation 2021-06-29 13:41:55 -06:00
James Betker
46e9f62be0 Add unet with latent guide
This is a diffusion network that uses both a LQ image
and a reference sample HQ image that is compressed into
a latent vector to perform upsampling

The hope is that we can steer the upsampling network
with sample images.
2021-06-26 11:02:58 -06:00
James Betker
a57ed8e960 Various mods to support better jpeg image filtering 2021-06-25 13:16:15 -06:00
James Betker
e7890dc0ba Misc fixes for diffusion nets 2021-06-21 10:38:07 -06:00
James Betker
6a75bd0777 Another fix 2021-06-14 09:51:44 -06:00
James Betker
54bff35171 Fix issue where eval was not being used by all ddp processes 2021-06-14 09:50:04 -06:00
James Betker
3e3ad7825f Add support for training an EMA network alongside the main networks 2021-06-12 21:01:41 -06:00
James Betker
696f320820 Get rid of feature networks 2021-06-11 20:50:07 -06:00
James Betker
65c474eecf Various changes to fix testing 2021-06-11 15:31:10 -06:00
James Betker
2ad2b56438 Don't do wandb except on rank 0 2021-06-06 16:52:07 -06:00
James Betker
692e9c417b Support diffusion unet 2021-06-06 13:57:22 -06:00
James Betker
fa908a6a15 Fix wandb import issue 2021-06-04 23:27:15 -06:00
James Betker
103a88506e Log eval to wandb 2021-06-04 23:23:20 -06:00
James Betker
6084915af8 Support gaussian diffusion models
Adds support for GD models, courtesy of some maths from openai.

Also:
- Fixes requirement for eval{} even when it isn't being used
- Adds support for denormalizing an imagenet norm
2021-06-02 21:47:32 -06:00
James Betker
45bc76ba92 Fixes and mods to support training classifiers on imagenet 2021-06-01 17:25:24 -06:00
James Betker
f129eaa39e Clean up byol a bit
- Remove option to aug in dataset (there's really no reason for this now that kornia works on GPU on windows)
- Other stufff
2021-05-24 21:35:46 -06:00
James Betker
119f17c808 Add testing capabilities for segformer & contrastive feature 2021-04-27 09:59:50 -06:00
James Betker
9bbe6fc81e Get segformer to a trainable state 2021-04-25 11:45:20 -06:00
James Betker
17555e7d07 misc adjustments for stylegan 2021-04-21 18:14:17 -06:00
James Betker
f89ea5f1c6 Mods to support lightweight_gan model 2021-03-02 20:51:48 -07:00
James Betker
784b96c059 Misc options to add support for training stylegan2-rosinality models:
- Allow image_folder_dataset to normalize inbound images
- ExtensibleTrainer can denormalize images on the output path
- Support .webp - an output from LSUN
- Support logistic GAN divergence loss
- Support stylegan2 TF weight extraction for discriminator
- New injector that produces latent noise (with separated paths)
- Modify FID evaluator to be operable with rosinality-style GANs
2021-02-08 08:09:21 -07:00
James Betker
7070142805 Make vqvae3_hard more configurable 2021-02-04 09:03:22 -07:00
James Betker
51b63b2aa6 Add switched_conv with hard routing and make vqvae use it. 2021-01-25 08:25:29 -07:00
James Betker
557cdec116 misc 2021-01-23 13:45:17 -07:00
James Betker
587a4f4050 resnet_unet_3
I'm being really lazy here - these nets are not really different from each other
except at which layer they terminate. This one terminates at 2x downsampling,
which is simply indicative of a direction I want to go for testing these pixpro networks.
2021-01-15 14:51:03 -07:00
James Betker
d1007ccfe7 Adjustments to pixpro to allow training against networks with arbitrarily large structural latents
- The pixpro latent now rescales the latent space instead of using a "coordinate vector", which
   **might** have performance implications.
- The latent against which the pixel loss is computed can now be a small, randomly sampled patch
   out of the entire latent, allowing further memory/computational discounts. Since the loss
   computation does not have a receptive field, this should not alter the loss.
- The instance projection size can now be separate from the pixel projection size.
- PixContrast removed entirely.
- ResUnet with full resolution added.
2021-01-12 09:17:45 -07:00
James Betker
34f8c8641f Support training imagenet classifier 2021-01-11 20:09:16 -07:00
James Betker
07168ecfb4 Enable vqvae to use a switched_conv variant 2021-01-09 20:53:14 -07:00
James Betker
61a86a3c1e VQVAE 2021-01-07 10:20:15 -07:00
James Betker
9fed90393f Add lucidrains pixpro trainer 2021-01-05 20:14:22 -07:00
James Betker
2c65b6b28e More mods to support styledsr 2021-01-04 11:32:28 -07:00
James Betker
4d8064c32c Modifications to allow partially trained stylegan discriminators to be used 2021-01-03 16:37:18 -07:00
James Betker
ce6524184c Do the last commit but in a better way 2021-01-02 22:24:12 -07:00
James Betker
edf9c38198 Make ExtensibleTrainer set the starting step for the LR scheduler 2021-01-02 22:22:34 -07:00
James Betker
bdbab65082 Allow optimizers to train separate param groups, add higher dimensional VGG discriminator
Did this to support training 512x512px networks off of a pretrained 256x256 network.
2021-01-02 15:10:06 -07:00
James Betker
193cdc6636 Move discriminators to the create_model paradigm
Also cleans up a lot of old discriminator models that I have no intention
of using again.
2021-01-01 15:56:09 -07:00
James Betker
9864fe4c04 Fix for train.py 2021-01-01 11:59:00 -07:00
James Betker
0eb1f4dd67 Revert "Get rid of CUDA_VISIBLE_DEVICES"
It is actually necessary for training in distributed mode. Only
do it then.
2020-12-31 10:31:40 -07:00
James Betker
1de1fa30ac Disable refs and centers altogether in single_image_dataset
I suspect that this might be a cause of failures on parallel datasets.
Plus it is unnecessary computation.
2020-12-31 10:13:24 -07:00
James Betker
8f0984cacf Add sr_fid evaluator 2020-12-30 20:18:58 -07:00
James Betker
a777c1e4f9 Misc script fixes 2020-12-29 20:25:09 -07:00
James Betker
3fd627fc62 Mods to support image classification & filtering 2020-12-26 13:49:27 -07:00
James Betker
29db7c7a02 Further mods to BYOL 2020-12-24 09:28:41 -07:00
James Betker
1bbcb96ee8 Implement a few changes to support training BYOL networks 2020-12-23 10:50:23 -07:00
James Betker
2437b33e74 Fix srflow_latent_space_playground bug 2020-12-22 15:42:38 -07:00
James Betker
e82f4552db Update other docs with dumb config options 2020-12-18 16:21:28 -07:00
James Betker
5640e4efe4 More refactoring 2020-12-18 09:18:34 -07:00
James Betker
a8179ff53c Image label work 2020-12-18 08:53:18 -07:00
James Betker
fb2cfc795b Update requirements, add image_patch_classifier tool 2020-12-16 09:42:50 -07:00
James Betker
e5a3e6b9b5 srflow latent space misc 2020-12-14 23:59:49 -07:00
James Betker
ec0ee25f4b Structural latents checkpoint 2020-12-11 12:01:09 -07:00
James Betker
a5630d282f Get rid of 2nd trainer 2020-12-10 09:57:38 -07:00
James Betker
11155aead4 Directly use dataset keys
This has been a long time coming. Cleans up messy "GT" nomenclature and simplifies ExtensibleTraner.feed_data
2020-12-04 20:14:53 -07:00
James Betker
8a83b1c716 Go back to apex DDP, fix distributed bugs 2020-12-04 16:39:21 -07:00
James Betker
8a00f15746 Implement FlowGaussianNll evaluator 2020-12-02 14:09:54 -07:00
James Betker
2e0bbda640 Remove unused archs 2020-12-01 11:10:48 -07:00
James Betker
da604752e6 Misc RRDB changes 2020-11-29 12:21:31 -07:00
James Betker
a1d4c9f83c multires rrdb work 2020-11-28 14:35:46 -07:00
James Betker
ef8d5f88c1 Bring split gaussian nll out of split so it can be computed accurately with the rest of the nll component 2020-11-27 13:30:21 -07:00
James Betker
fd356580c0 Play with lambdas 2020-11-26 20:30:55 -07:00
James Betker
45a489110f Fix datasets 2020-11-26 11:50:38 -07:00
James Betker
f6098155cd Mods to tecogan to allow use of embeddings as input 2020-11-24 09:24:02 -07:00
James Betker
b10bcf6436 Rework stylegan_for_sr to incorporate structure as an adain block 2020-11-23 11:31:11 -07:00
James Betker
5ccdbcefe3 srflow_orig integration 2020-11-19 23:47:24 -07:00
James Betker
d7877d0a36 Fixes to teco losses and translational losses 2020-11-19 11:35:05 -07:00
James Betker
6b679e2b51 Make grad_penalty available to classical discs 2020-11-17 18:31:40 -07:00
James Betker
8a19c9ae15 Add additive mode to rrdb 2020-11-16 20:45:09 -07:00
James Betker
125cb16dce Add a FID evaluator for stylegan with structural guidance 2020-11-14 20:16:07 -07:00
James Betker
ec621c69b5 Fix train bug 2020-11-14 09:29:08 -07:00
James Betker
a07e1a7292 Add separate Evaluator module and FID evaluator 2020-11-13 11:03:54 -07:00
James Betker
fc55bdb24e Mods to how wandb are integrated 2020-11-12 15:45:25 -07:00
James Betker
db9e9e28a0 Fix an issue where GPU0 was always being used in non-ddp
Frankly, I don't understand how this has ever worked. WTF.
2020-11-12 15:43:01 -07:00
James Betker
88f349bdf1 Enable usage of wandb 2020-11-11 21:48:56 -07:00
James Betker
6a2fd5f7d0 Lots of new discriminator nets 2020-11-10 16:06:54 -07:00
James Betker
0cf52ef52c latent work 2020-11-06 20:38:23 -07:00
James Betker
74738489b9 Fixes and additional support for progressive zoom 2020-10-30 09:59:54 -06:00
James Betker
607ff3c67c RRDB with bypass 2020-10-29 09:39:45 -06:00
James Betker
da53090ce6 More adjustments to support distributed training with teco & on multi_modal_train 2020-10-27 20:58:03 -06:00
James Betker
2a3eec8fd7 Fix some distributed training snafus 2020-10-27 15:24:05 -06:00
James Betker
ff58c6484a Fixes to unified chunk datasets to support stereoscopic training 2020-10-26 11:12:22 -06:00
James Betker
9c3d059ef0 Updates to be able to train flownet2 in ExtensibleTrainer
Only supports basic losses for now, though.
2020-10-24 11:56:39 -06:00
James Betker
8636492db0 Copy train.py mods to train2 2020-10-22 17:16:36 -06:00
James Betker
e9c0b9f0fd More adjustments to support multi-modal training
Specifically - looks like at least MSE loss cannot handle autocasted tensors
2020-10-22 16:49:34 -06:00
James Betker
76789a456f Class-ify train.py and workon multi-modal trainer 2020-10-22 16:15:31 -06:00
James Betker
3e3d2af1f3 Add multi-modal trainer 2020-10-22 13:27:32 -06:00
James Betker
5753e77d67 ChainedGen: Output debugging information on blocks 2020-10-21 16:36:23 -06:00
James Betker
d8c6a4bbb8 Misc 2020-10-20 12:56:52 -06:00
James Betker
331c40f0c8 Allow starting step to be forced
Useful for testing purposes or to force a validation.
2020-10-19 15:23:04 -06:00
James Betker
981d64413b Support validation over a custom injector
Also re-enable PSNR
2020-10-19 11:01:56 -06:00
James Betker
9ead2c0a08 Multiscale training in! 2020-10-17 22:54:12 -06:00
James Betker
d856378b2e Add ChainedGenWithStructure 2020-10-16 20:44:36 -06:00
James Betker
6f8705e8cb SSGSimpler network 2020-10-15 17:18:44 -06:00