Commit Graph

792 Commits

Author SHA1 Message Date
James Betker
025a5867c4 Use syncbatchnorm instead 2021-02-04 22:26:36 -07:00
James Betker
bb79fafb89 Fix groupnorm specification 2021-02-04 22:15:38 -07:00
James Betker
43da1f9c4b Convert lambda coupler to use groupnorm instead of batchnorm 2021-02-04 21:59:44 -07:00
James Betker
7070142805 Make vqvae3_hard more configurable 2021-02-04 09:03:22 -07:00
James Betker
b980028ca8 Add get_debug_values for vqvae_3_hardswitch 2021-02-03 14:12:24 -07:00
James Betker
1405ff06b8 Fix SwitchedConvHardRoutingFunction for current cuda router 2021-02-03 14:11:55 -07:00
James Betker
d7bec392dd ... 2021-02-02 23:50:25 -07:00
James Betker
b0a8fa00bc Visual dbg in vqvae3hs 2021-02-02 23:50:01 -07:00
James Betker
f5f91850fd hardswitch variant of vqvae3 2021-02-02 21:00:04 -07:00
James Betker
320edbaa3c Move switched_conv logic around a bit 2021-02-02 20:41:24 -07:00
James Betker
0dca36946f Hard Routing mods
- Turns out my custom convolution was RIDDLED with backwards bugs, which is
   why the existing implementation wasn't working so well.
- Implements the switch logic from both Mixture of Experts and Switch Transformers
  for testing purposes.
2021-02-02 20:35:58 -07:00
James Betker
29c1c3bede Register vqvae3 2021-01-29 15:26:28 -07:00
James Betker
bc20b4739e vqvae3
Changes VQVAE as so:
- Reverts back to smaller codebook
- Adds an additional conv layer at the highest resolution for both the encoder & decoder
- Uses LeakyReLU on trunk
2021-01-29 15:24:26 -07:00
James Betker
96bc80313c Add switch norm, up dropout rate, detach selector 2021-01-26 09:31:53 -07:00
James Betker
2cdac6bd09 Add PWCNet for human optical flow 2021-01-25 08:25:44 -07:00
James Betker
51b63b2aa6 Add switched_conv with hard routing and make vqvae use it. 2021-01-25 08:25:29 -07:00
James Betker
ae4ff4a1e7 Enable lambda visualization 2021-01-23 15:53:27 -07:00
James Betker
10ec6bda1d lambda nets in switched_conv and a vqvae to use it 2021-01-23 14:57:57 -07:00
James Betker
b374dcdd46 update vqvae to double codebook size for bottom quantizer 2021-01-23 13:47:07 -07:00
James Betker
1b8a26db93 New switched_conv 2021-01-23 13:46:30 -07:00
James Betker
d919ae7148 Add VQVAE with no Conv2dTranspose 2021-01-18 08:49:59 -07:00
James Betker
587a4f4050 resnet_unet_3
I'm being really lazy here - these nets are not really different from each other
except at which layer they terminate. This one terminates at 2x downsampling,
which is simply indicative of a direction I want to go for testing these pixpro networks.
2021-01-15 14:51:03 -07:00
James Betker
038b8654b6 Pixpro: unwrap losses 2021-01-13 11:54:25 -07:00
James Betker
8990801a3f Fix pixpro stochastic sampling bugs 2021-01-13 11:34:24 -07:00
James Betker
19475a072f Pixpro: Rather than using a latent square for pixpro, use an entirely stochastic sampling of the pixels 2021-01-13 11:26:51 -07:00
James Betker
d1007ccfe7 Adjustments to pixpro to allow training against networks with arbitrarily large structural latents
- The pixpro latent now rescales the latent space instead of using a "coordinate vector", which
   **might** have performance implications.
- The latent against which the pixel loss is computed can now be a small, randomly sampled patch
   out of the entire latent, allowing further memory/computational discounts. Since the loss
   computation does not have a receptive field, this should not alter the loss.
- The instance projection size can now be separate from the pixel projection size.
- PixContrast removed entirely.
- ResUnet with full resolution added.
2021-01-12 09:17:45 -07:00
James Betker
34f8c8641f Support training imagenet classifier 2021-01-11 20:09:16 -07:00
James Betker
f3db381fa1 Allow uresnet to use pretrained resnet50 2021-01-10 12:57:31 -07:00
James Betker
07168ecfb4 Enable vqvae to use a switched_conv variant 2021-01-09 20:53:14 -07:00
James Betker
5a8156026a Did anyone ask for k-means clustering?
This is so cool...
2021-01-07 22:37:41 -07:00
James Betker
de10c7246a Add injected noise into bypass maps 2021-01-07 16:31:12 -07:00
James Betker
61a86a3c1e VQVAE 2021-01-07 10:20:15 -07:00
James Betker
01a589e712 Adjustments to pixpro & resnet-unet
I'm not really satisfied with what I got out of these networks on round 1.
Lets try again..
2021-01-06 15:00:46 -07:00
James Betker
2f2f87bbea Styled SR fixes 2021-01-05 20:14:39 -07:00
James Betker
9fed90393f Add lucidrains pixpro trainer 2021-01-05 20:14:22 -07:00
James Betker
ade2732c82 Transfer learning for styleSR
This is a concept from "Lifelong Learning GAN", although I'm skeptical of it's novelty -
basically you scale and shift the weights for the generator and discriminator of a pretrained
GAN to "shift" into new modalities, e.g. faces->birds or whatever. There are some interesting
applications of this that I would like to try out.
2021-01-04 20:10:48 -07:00
James Betker
2c65b6b28e More mods to support styledsr 2021-01-04 11:32:28 -07:00
James Betker
2225fe6ac2 Undo lucidrains changes for new discriminator
This "new" code will live in the styledsr directory from now on.
2021-01-04 10:57:09 -07:00
James Betker
40ec71da81 Move styled_sr into its own folder 2021-01-04 10:54:34 -07:00
James Betker
5916f5f7d4 Misc fixes 2021-01-04 10:53:53 -07:00
James Betker
4d8064c32c Modifications to allow partially trained stylegan discriminators to be used 2021-01-03 16:37:18 -07:00
James Betker
bdbab65082 Allow optimizers to train separate param groups, add higher dimensional VGG discriminator
Did this to support training 512x512px networks off of a pretrained 256x256 network.
2021-01-02 15:10:06 -07:00
James Betker
193cdc6636 Move discriminators to the create_model paradigm
Also cleans up a lot of old discriminator models that I have no intention
of using again.
2021-01-01 15:56:09 -07:00
James Betker
f39179e85a styled_sr: fix bug when using initial_stride 2021-01-01 12:13:21 -07:00
James Betker
913fc3b75e Need init to pick up styled_sr 2021-01-01 12:10:32 -07:00
James Betker
e992e18767 Add initial_stride term to style_sr
Also fix fid and a networks.py issue.
2021-01-01 11:59:36 -07:00
James Betker
e214e6ce33 Styled SR model 2020-12-31 20:54:18 -07:00
James Betker
b1fb82476b Add gp debug (fix) 2020-12-30 15:26:54 -07:00
James Betker
63cf3d3126 Injector auto-registration
I love it!
2020-12-29 20:58:02 -07:00
James Betker
a777c1e4f9 Misc script fixes 2020-12-29 20:25:09 -07:00
James Betker
ba543d1152 Glean mods
- Fixes fixed upscale factor issues
- Refines a few ops to decrease computation & parameterization
2020-12-27 12:25:06 -07:00
James Betker
f9be049adb GLEAN mod to support custom initial strides 2020-12-26 13:51:14 -07:00
James Betker
3fd627fc62 Mods to support image classification & filtering 2020-12-26 13:49:27 -07:00
James Betker
10fdfa1563 Migrate generators to dynamic model registration 2020-12-24 23:02:10 -07:00
James Betker
29db7c7a02 Further mods to BYOL 2020-12-24 09:28:41 -07:00
James Betker
036684893e Add LARS optimizer & support for BYOL idiosyncrasies
- Added LARS and SGD optimizer variants that support turning off certain
  features for BN and bias layers
- Added a variant of pytorch's resnet model that supports gradient checkpointing.
- Modify the trainer infrastructure to support above
- Fix bug with BYOL (should have been nonfunctional)
2020-12-23 20:33:43 -07:00
James Betker
1bbcb96ee8 Implement a few changes to support training BYOL networks 2020-12-23 10:50:23 -07:00
James Betker
ae666dc520 Fix bugs with srflow after refactor 2020-12-19 10:28:23 -07:00
James Betker
4328c2f713 Change default ReLU slope to .2 BREAKS COMPATIBILITY
This conforms my ConvGnLelu implementation with the generally accepted negative_slope=.2. I have no idea where I got .1. This will break backwards compatibility with some older models but will likely improve their performance when freshly trained. I did some auditing to find what these models might be, and I am not actively using any of them, so probably OK.
2020-12-19 08:28:03 -07:00
James Betker
9377d34ac3 glean mods 2020-12-19 08:26:07 -07:00
James Betker
92f9a129f7 GLEAN! 2020-12-18 16:04:19 -07:00
James Betker
c717765bcb Notes for lucidrains converter. 2020-12-18 09:55:38 -07:00
James Betker
b4720ea377 Move stylegan to new location 2020-12-18 09:52:36 -07:00
James Betker
1708136b55 Commit my attempt at "conforming" the lucidrains stylegan implementation to the reference spec. Not working. will probably be abandoned. 2020-12-18 09:51:48 -07:00
James Betker
209332292a Rosinality stylegan fix 2020-12-18 09:50:41 -07:00
James Betker
d875ca8342 More refactor changes 2020-12-18 09:24:31 -07:00
James Betker
5640e4efe4 More refactoring 2020-12-18 09:18:34 -07:00
James Betker
b905b108da Large cleanup
Removed a lot of old code that I won't be touching again. Refactored some
code elements into more logical places.
2020-12-18 09:10:44 -07:00
James Betker
3074f41877 Get rosinality model converter to work
Mostly, just needed to remove the custom cuda ops, not so bueno on Windows.
2020-12-17 16:03:39 -07:00
James Betker
e838c6e75b Rosinality stylegan2 port 2020-12-17 14:18:46 -07:00
James Betker
49327b99fe SRFlow outputs RRDB output 2020-12-16 10:28:02 -07:00
James Betker
c25b49bb12 Clean up of SRFlowNet_arch 2020-12-16 10:27:38 -07:00
James Betker
42ac8e3eeb Remove unnecessary comment from SRFlowNet 2020-12-16 09:43:07 -07:00
James Betker
09de3052ac Add softmax to spinenet classification head 2020-12-16 09:42:15 -07:00
James Betker
8661207d57 Merge branch 'gan_lab' of https://github.com/neonbjb/DL-Art-School into gan_lab 2020-12-15 17:16:48 -07:00
James Betker
fc376d34b2 Spinenet with logits head 2020-12-15 17:16:19 -07:00
James Betker
0a19e53df0 BYOL mods 2020-12-14 23:59:11 -07:00
James Betker
ef7eabf457 Allow RRDB to upscale 8x 2020-12-14 23:58:52 -07:00
James Betker
ec0ee25f4b Structural latents checkpoint 2020-12-11 12:01:09 -07:00
James Betker
26ceca68c0 BYOL with structure! 2020-12-10 15:07:35 -07:00
James Betker
c203cee31e Allow swapping to torch DDP as needed in code 2020-12-09 15:03:59 -07:00
James Betker
97ff25a086 BYOL!
Man, is there anything ExtensibleTrainer can't train? :)
2020-12-08 13:07:53 -07:00
James Betker
bca59ed98a Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-12-07 12:51:04 -07:00
James Betker
ea56eb61f0 Fix DDP errors for discriminator
- Don't define training_net in define_optimizers - this drops the shell and leads to problems downstream
- Get rid of support for multiple training nets per opt. This was half baked and needs a better solution if needed
   downstream.
2020-12-07 12:50:57 -07:00
James Betker
88fc049c8d spinenet latent playground! 2020-12-05 20:30:36 -07:00
James Betker
11155aead4 Directly use dataset keys
This has been a long time coming. Cleans up messy "GT" nomenclature and simplifies ExtensibleTraner.feed_data
2020-12-04 20:14:53 -07:00
James Betker
8a83b1c716 Go back to apex DDP, fix distributed bugs 2020-12-04 16:39:21 -07:00
James Betker
7a81d4e2f4 Revert gaussian loss changes 2020-12-04 12:49:20 -07:00
James Betker
711780126e Cleanup 2020-12-03 23:42:51 -07:00
James Betker
ac7256d4a3 Do tqdm reporting when calculating flow_gaussian_nll 2020-12-03 23:42:29 -07:00
James Betker
dc9ff8e05b Allow the majority of the srflow steps to checkpoint 2020-12-03 23:41:57 -07:00
James Betker
06d1c62c5a iGPT support!
Sweeeeet
2020-12-03 15:32:21 -07:00
James Betker
c18adbd606 Delete mdcn & panet
Garbage, all of it.
2020-12-02 22:25:57 -07:00
James Betker
f2880b33c9 Get rid of mean shift from MDCN 2020-12-02 14:18:33 -07:00
James Betker
8a00f15746 Implement FlowGaussianNll evaluator 2020-12-02 14:09:54 -07:00
James Betker
edf408508c Fix discriminator 2020-12-01 17:45:56 -07:00
James Betker
9a421a41f4 SRFlow: accomodate mismatches between global scale and flow_scale 2020-12-01 11:11:51 -07:00
James Betker
e343722d37 Add stepped rrdb 2020-12-01 11:11:15 -07:00
James Betker
2e0bbda640 Remove unused archs 2020-12-01 11:10:48 -07:00
James Betker
a1c8300052 Add mdcn 2020-11-30 16:14:21 -07:00
James Betker
1e0f69e34b extra_conv in gn discriminator, multiframe support in rrdb. 2020-11-29 15:39:50 -07:00
James Betker
da604752e6 Misc RRDB changes 2020-11-29 12:21:31 -07:00
James Betker
a1d4c9f83c multires rrdb work 2020-11-28 14:35:46 -07:00
James Betker
929cd45c05 Fix for RRDB scale 2020-11-27 21:37:10 -07:00
James Betker
71fa532356 Adjustments to how flow networks set size and scale 2020-11-27 21:37:00 -07:00
James Betker
6f958bb150 Maybe this is necessary after all? 2020-11-27 15:21:13 -07:00
James Betker
ef8d5f88c1 Bring split gaussian nll out of split so it can be computed accurately with the rest of the nll component 2020-11-27 13:30:21 -07:00
James Betker
4ab49b0d69 RRDB disc work 2020-11-27 12:03:08 -07:00
James Betker
6de4dabb73 Remove srflow (modified version)
Starting from orig and re-working from there.
2020-11-27 12:02:06 -07:00
James Betker
fd356580c0 Play with lambdas 2020-11-26 20:30:55 -07:00
James Betker
cb045121b3 Expose srflow rrdb 2020-11-24 13:20:20 -07:00
James Betker
f6098155cd Mods to tecogan to allow use of embeddings as input 2020-11-24 09:24:02 -07:00
James Betker
b10bcf6436 Rework stylegan_for_sr to incorporate structure as an adain block 2020-11-23 11:31:11 -07:00
James Betker
519ba6f10c Support 2x RRDB with 4x srflow 2020-11-21 14:46:15 -07:00
James Betker
cad92bada8 Report logp and logdet for srflow 2020-11-21 10:13:05 -07:00
James Betker
c37d3faa58 More adjustments to srflow_orig 2020-11-20 19:38:33 -07:00
James Betker
d51d12a41a Adjustments to srflow to (maybe?) fix training 2020-11-20 14:44:24 -07:00
James Betker
6c8c35ac47 Support training RRDB encoder [srflow] 2020-11-20 10:03:06 -07:00
James Betker
5ccdbcefe3 srflow_orig integration 2020-11-19 23:47:24 -07:00
James Betker
2b2d754d8e Bring in an original SRFlow implementation for reference 2020-11-19 21:42:39 -07:00
James Betker
1e0d7be3ce "Clean up" SRFlow 2020-11-19 21:42:24 -07:00
James Betker
d7877d0a36 Fixes to teco losses and translational losses 2020-11-19 11:35:05 -07:00
James Betker
5c10264538 Remove pyramid_disc hard dependencies 2020-11-17 18:34:11 -07:00
James Betker
6b679e2b51 Make grad_penalty available to classical discs 2020-11-17 18:31:40 -07:00
James Betker
8a19c9ae15 Add additive mode to rrdb 2020-11-16 20:45:09 -07:00
James Betker
2a507987df Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-15 16:16:30 -07:00
James Betker
931ed903c1 Allow combined additive loss 2020-11-15 16:16:18 -07:00
James Betker
4b68116977 import fix 2020-11-15 16:15:42 -07:00
James Betker
98eada1e4c More circular dependency fixes + unet fixes 2020-11-15 11:53:35 -07:00
James Betker
e587d549f7 Fix circular imports 2020-11-15 11:32:35 -07:00
James Betker
99f0cfaab5 Rework stylegan2 divergence losses
Notably: include unet loss
2020-11-15 11:26:44 -07:00
James Betker
ea94b93a37 Fixes for unet 2020-11-15 10:38:33 -07:00
James Betker
89f56b2091 Fix another import 2020-11-14 22:10:45 -07:00
James Betker
9af049c671 Import fix for unet 2020-11-14 22:09:18 -07:00
James Betker
5cade6b874 Move stylegan2 around, bring in unet 2020-11-14 22:04:48 -07:00
James Betker
125cb16dce Add a FID evaluator for stylegan with structural guidance 2020-11-14 20:16:07 -07:00
James Betker
c9258e2da3 Alter how structural guidance is given to stylegan 2020-11-14 20:15:48 -07:00
James Betker
3397c83447 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-14 09:30:09 -07:00
James Betker
423ee7cb90 Allow attention to be specified for stylegan2 2020-11-14 09:29:53 -07:00
James Betker
f406a5dd4c Mods to support stylegan2 in SR mode 2020-11-13 20:11:50 -07:00
James Betker
9c3d0b7560 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-13 20:10:47 -07:00
James Betker
67bf55495b Allow hq_batched_key to be specified 2020-11-13 20:10:12 -07:00
James Betker
0b96811611 Fix another issue with gpu ids getting thrown all over hte place 2020-11-13 20:05:52 -07:00
James Betker
a07e1a7292 Add separate Evaluator module and FID evaluator 2020-11-13 11:03:54 -07:00
James Betker
080ad61be4 Add option to work with nonrandom latents 2020-11-12 21:23:50 -07:00
James Betker
566b99ca75 GP adjustments for stylegan2 2020-11-12 16:44:51 -07:00
James Betker
44a19cd37c ExtensibleTrainer mods to support advanced checkpointing for stylegan2
Basically: stylegan2 makes use of gradient-based normalizers. These
make it so that I cannot use gradient checkpointing. But I love gradient
checkpointing. It makes things really, really fast and memory conscious.

So - only don't checkpoint when we run the regularizer loss. This is a
bit messy, but speeds up training by at least 20%.

Also: pytorch: please make checkpointing a first class citizen.
2020-11-12 15:45:07 -07:00
James Betker
db9e9e28a0 Fix an issue where GPU0 was always being used in non-ddp
Frankly, I don't understand how this has ever worked. WTF.
2020-11-12 15:43:01 -07:00
James Betker
2d3449d7a5 stylegan2 in ml art school! 2020-11-12 15:42:05 -07:00
James Betker
fd97573085 Fixes 2020-11-11 21:49:06 -07:00