Commit Graph

341 Commits

Author SHA1 Message Date
James Betker
40dc2938e8 Fix multifaceted chain gen 2020-10-22 13:27:06 -06:00
James Betker
1ef559d7ca Add a ChainedEmbeddingGen which can be simueltaneously used with multiple training paradigms 2020-10-21 22:21:51 -06:00
James Betker
5753e77d67 ChainedGen: Output debugging information on blocks 2020-10-21 16:36:23 -06:00
James Betker
dca5cddb3b Add bypass to ChainedEmbeddingGen 2020-10-21 11:07:45 -06:00
James Betker
a63bf2ea2f Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-19 15:26:11 -06:00
James Betker
76e4f0c086 Restore test.py for use as standalone validator 2020-10-19 15:26:07 -06:00
James Betker
1b1ca297f8 Fix recurrent=None bug in ChainedEmbeddingGen 2020-10-19 15:25:12 -06:00
James Betker
7df378a944 Remove separated vgg discriminator
Checkpointing happens inline instead. Was a dumb idea..

Also fixes some loss reporting issues.
2020-10-18 12:10:24 -06:00
James Betker
552e70a032 Get rid of excessive checkpointed disc params 2020-10-18 10:09:37 -06:00
James Betker
6a0d5f4813 Add a checkpointable discriminator 2020-10-18 09:57:47 -06:00
James Betker
9ead2c0a08 Multiscale training in! 2020-10-17 22:54:12 -06:00
James Betker
e706911c83 Fix spinenet bug 2020-10-17 20:20:36 -06:00
James Betker
b008a27d39 Spinenet should allow bypassing the initial conv
This makes feeding in references for recurrence easier.
2020-10-17 20:16:47 -06:00
James Betker
c1c9c5681f Swap recurrence 2020-10-17 08:40:28 -06:00
James Betker
6141aa1110 More recurrence fixes for chainedgen 2020-10-17 08:35:46 -06:00
James Betker
fc4c064867 Add recurrent support to chainedgenwithstructure 2020-10-17 08:31:34 -06:00
James Betker
d4a3e11ab2 Don't use several stages of spinenet_arch
These are used for lower outputs which I am not using
2020-10-17 08:28:37 -06:00
James Betker
d856378b2e Add ChainedGenWithStructure 2020-10-16 20:44:36 -06:00
James Betker
617d97e19d Add ChainedEmbeddingGen 2020-10-15 23:18:08 -06:00
James Betker
c4543ce124 Set post_transform_block to None where applicable 2020-10-15 17:20:42 -06:00
James Betker
6f8705e8cb SSGSimpler network 2020-10-15 17:18:44 -06:00
James Betker
920865defb Arch work 2020-10-15 10:13:06 -06:00
James Betker
1f20d59c31 Revert big switch back 2020-10-14 11:03:34 -06:00
James Betker
17d78195ee Mods to SRG to support returning switch logits 2020-10-13 20:46:37 -06:00
James Betker
cc915303a5 Fix SPSR calls into SwitchComputer 2020-10-13 10:14:47 -06:00
James Betker
9a5d6162e9 Add the "BigSwitch" 2020-10-13 10:11:10 -06:00
James Betker
ca523215c6 Fix recurrent std in arch 2020-10-12 17:42:32 -06:00
James Betker
597b6e92d6 Add ssgr1 recurrence 2020-10-12 17:18:19 -06:00
James Betker
ce163ad4a9 Update SSGdeep 2020-10-12 10:22:08 -06:00
James Betker
3409d88a1c Add PANet arch 2020-10-12 10:20:55 -06:00
James Betker
e785029936 Mods needed to support SPSR archs with teco gan 2020-10-10 22:39:55 -06:00
James Betker
fe50d6f9d0 Fix attention images 2020-10-09 19:21:55 -06:00
James Betker
58d8bf8f69 Add network architecture built for teco 2020-10-09 08:40:14 -06:00
James Betker
afe6af88af Fix attention print issue 2020-10-08 18:34:00 -06:00
James Betker
4c85ee51a4 Converge SSG architectures into unified switching base class
Also adds attention norm histogram to logging
2020-10-08 17:23:21 -06:00
James Betker
fba29d7dcc Move to apex distributeddataparallel and add switch all_reduce
Torch's distributed_data_parallel is missing "delay_allreduce", which is
necessary to get gradient checkpointing to work with recurrent models.
2020-10-08 11:20:05 -06:00
James Betker
969bcd9021 Use local checkpoint in SSG 2020-10-08 08:54:46 -06:00
James Betker
c96f5b2686 Import switched_conv as a submodule 2020-10-07 23:10:54 -06:00
James Betker
c352c8bce4 More tecogan fixes 2020-10-07 12:41:17 -06:00
James Betker
2f2e3f33f8 StackedSwitchedGenerator_5lyr 2020-10-06 20:39:32 -06:00
James Betker
6217b48e3f Fix spsr_arch bug 2020-10-06 20:38:47 -06:00
James Betker
e4b89a172f Reduce spsr7 memory usage 2020-10-05 22:05:56 -06:00
James Betker
4111942ada Support attention deferral in deep ssgr 2020-10-05 19:35:55 -06:00
James Betker
2875822024 SPSR9 arch
takes some of the stuff I learned with SGSR yesterday and applies it to spsr
2020-10-05 08:47:51 -06:00
James Betker
51044929af Don't compute attention statistics on multiple generator invocations of the same data 2020-10-05 00:34:29 -06:00
James Betker
c3ef8a4a31 Stacked switches - return a tuple 2020-10-04 21:02:24 -06:00
James Betker
ffd069fd97 Lots of SSG work
- Checkpointed pretty much the entire model - enabling recurrent inputs
- Added two new models for test - adding depth (again) and removing SPSR (in lieu of the new losses)
2020-10-04 20:48:08 -06:00
James Betker
aca2c7ab41 Full checkpoint-ize SSG1 2020-10-04 18:24:52 -06:00
James Betker
e3294939b0 Revert "SSG: offer option to use BN-based attention normalization"
Didn't work. Oh well.

This reverts commit 5cd2b37591.
2020-10-03 17:54:53 -06:00
James Betker
5cd2b37591 SSG: offer option to use BN-based attention normalization
Not sure how this is going to work, lets try it.
2020-10-03 16:16:19 -06:00
James Betker
9b4ed82093 Get rid of unused convs in spsr7 2020-10-03 11:36:26 -06:00
James Betker
19a4075e1e Allow checkpointing to be disabled in the options file
Also makes options a global variable for usage in utils.
2020-10-03 11:03:28 -06:00
James Betker
146a9125f2 Modify geometric & translational losses so they can be used with embeddings 2020-10-02 20:40:13 -06:00
James Betker
e30a1443cd Change sw2 refs 2020-10-02 09:01:18 -06:00
James Betker
e38716925f Fix spsr8 class init 2020-10-02 09:00:18 -06:00
James Betker
35469f08e2 Spsr 8 2020-10-02 08:58:15 -06:00
James Betker
d9ae970fd9 SSG update 2020-10-01 11:27:51 -06:00
James Betker
e3053e4e55 Exchange SpsrNet for SpsrNetSimplified 2020-09-30 17:01:04 -06:00
James Betker
896b4f5be2 Revert "spsr7 adjustments"
This reverts commit 9fee1cec71.
2020-09-29 18:30:41 -06:00
James Betker
9fee1cec71 spsr7 adjustments 2020-09-29 17:19:59 -06:00
James Betker
0b5a033503 spsr7 + cleanup
SPSR7 adds ref onto spsr6, makes more "common sense" mods.
2020-09-29 16:59:26 -06:00
James Betker
db52bec4ab spsr6
This is meant to be a variant of SPSR5 that harkens
back to the simpler earlier architectures that do not
have embeddings or ref_ inputs, but do have deep
multiplexers. It does, however, use some of the new
conjoin mechanisms.
2020-09-28 22:09:27 -06:00
James Betker
aeaf185314 Add RCAN 2020-09-27 16:00:41 -06:00
James Betker
4d29b7729e Model arch cleanup 2020-09-27 11:18:45 -06:00
James Betker
d8621e611a BackboneSpineNoHead takes ref 2020-09-26 21:25:04 -06:00
James Betker
5a27187c59 More mods to accomodate new dataset 2020-09-25 22:45:57 -06:00
James Betker
ce4613ecb9 Finish up single_image_dataset work
Sweet!
2020-09-25 16:37:54 -06:00
James Betker
ea565b7eaf More fixes 2020-09-24 17:51:52 -06:00
James Betker
553917a8d1 Fix torchvision import bug 2020-09-24 17:38:34 -06:00
James Betker
58886109d4 Update how spsr arches do attention to conform with sgsr 2020-09-24 16:53:54 -06:00
James Betker
9a50a7966d SiLU doesnt support inplace 2020-09-23 21:09:13 -06:00
James Betker
eda0eadba2 Use custom SiLU
Torch didnt have this before 1.7
2020-09-23 21:05:06 -06:00
James Betker
05963157c1 Several things
- Fixes to 'after' and 'before' defs for steps (turns out they werent working)
- Feature nets take in a list of layers to extract. Not fully implemented yet.
- Fixes bugs with RAGAN
- Allows real input into generator gan to not be detached by param
2020-09-23 11:56:36 -06:00
James Betker
f40beb5460 Add 'before' and 'after' defs to injections, steps and optimizers 2020-09-22 17:03:22 -06:00
James Betker
419f77ec19 Some new backbones 2020-09-21 12:36:49 -06:00
James Betker
9429544a60 Spinenet: implementation without 4x downsampling right off the bat 2020-09-21 12:36:30 -06:00
James Betker
53a5657850 Fix SSGR 2020-09-20 19:07:15 -06:00
James Betker
fe82785ba5 Add some new architectures to ssg 2020-09-19 21:47:10 -06:00
James Betker
b83f097082 Get rid of get_debug_values from RRDB, rectify outputs 2020-09-19 21:46:36 -06:00
James Betker
9a17ade550 Some convenience adjustments to ExtensibleTrainer 2020-09-17 21:05:32 -06:00
James Betker
723754c133 Update attention debugger outputting for SSG 2020-09-16 13:09:46 -06:00
James Betker
0918430572 SSG network
This branches off of SPSR. It is identical but substantially reduced
in complexity. It's intended to be my long term working arch.
2020-09-15 20:59:24 -06:00
James Betker
6deab85b9b Add BackboneEncoderNoRef 2020-09-15 16:55:38 -06:00
James Betker
ccf8438001 SPSR5
This is SPSR4, but the multiplexers have access to the output of the transformations
for making their decision.
2020-09-13 20:10:24 -06:00
James Betker
4e44bca611 SPSR4
aka - return of the backbone! I'm tired of massively overparameterized generators
with pile-of-shit multiplexers. Let's give this another try..
2020-09-11 22:55:37 -06:00
James Betker
19896abaea Clean up old SwitchedSpsr arch
It didn't work anyways, so why not?
2020-09-11 16:09:28 -06:00
James Betker
1086f0476b Fix ref branch using fixed filters 2020-09-11 08:58:35 -06:00
James Betker
8c469b8286 Enable memory checkpointing 2020-09-11 08:44:29 -06:00
James Betker
313424d7b5 Add new referencing discriminator
Also extend the way losses work so that you can pass
parameters into the discriminator from the config file
2020-09-10 21:35:29 -06:00
James Betker
9e5aa166de Report the standard deviation of ref branches
This patch also ups the contribution
2020-09-10 16:34:41 -06:00
James Betker
668bfbff6d Back to best arch for spsr3 2020-09-10 14:58:14 -06:00
James Betker
992b0a8d98 spsr3 with conjoin stage as part of the switch 2020-09-10 09:11:37 -06:00
James Betker
e0fc5eb50c Temporary commit - noise 2020-09-09 17:12:52 -06:00
James Betker
00da69d450 Temporary commit - ref 2020-09-09 17:09:44 -06:00
James Betker
df59d6c99d More spsr3 mods
- Most branches get their own noise vector now.
- First attention branch has the intended sole purpose of raw image processing
- Remove norms from joiner block
2020-09-09 16:46:38 -06:00
James Betker
747ded2bf7 Fixes to the spsr3
Some lessons learned:
- Biases are fairly important as a relief valve. They dont need to be everywhere, but
  most computationally heavy branches should have a bias.
- GroupNorm in SPSR is not a great idea. Since image gradients are represented
   in this model, normal means and standard deviations are not applicable. (imggrad
   has a high representation of 0).
- Don't fuck with the mainline of any generative model. As much as possible, all
   additions should be done through residual connections. Never pollute the mainline
   with reference data, do that in branches. It basically leaves the mode untrainable.
2020-09-09 15:28:14 -06:00
James Betker
0ffac391c1 SPSR with ref joining 2020-09-09 11:17:07 -06:00
James Betker
c04f244802 More mods 2020-09-08 20:36:27 -06:00
James Betker
dffbfd2ec4 Allow SRG checkpointing to be toggled 2020-09-08 15:14:43 -06:00
James Betker
e6207d4c50 SPSR3 work
SPSR3 is meant to fix whatever is causing the switching units
inside of the newer SPSR architectures to fail and basically
not use the multiplexers.
2020-09-08 15:14:23 -06:00