Commit Graph

495 Commits

Author SHA1 Message Date
James Betker
a890e3a9c0 Fix geometric loss not handling 0 index 2020-10-04 21:05:01 -06:00
James Betker
c3ef8a4a31 Stacked switches - return a tuple 2020-10-04 21:02:24 -06:00
James Betker
13f97e1e97 Add recursive loss 2020-10-04 20:48:15 -06:00
James Betker
ffd069fd97 Lots of SSG work
- Checkpointed pretty much the entire model - enabling recurrent inputs
- Added two new models for test - adding depth (again) and removing SPSR (in lieu of the new losses)
2020-10-04 20:48:08 -06:00
James Betker
aca2c7ab41 Full checkpoint-ize SSG1 2020-10-04 18:24:52 -06:00
James Betker
e3294939b0 Revert "SSG: offer option to use BN-based attention normalization"
Didn't work. Oh well.

This reverts commit 5cd2b37591.
2020-10-03 17:54:53 -06:00
James Betker
5cd2b37591 SSG: offer option to use BN-based attention normalization
Not sure how this is going to work, let's try it.
2020-10-03 16:16:19 -06:00
James Betker
9b4ed82093 Get rid of unused convs in spsr7 2020-10-03 11:36:26 -06:00
James Betker
3561cc164d Fix up fea_loss calculator (for validation)
Not sure how this was working in regular training mode, but it
was failing in DDP.
2020-10-03 11:19:20 -06:00
James Betker
6c9718ad64 Don't log if you aren't 0 rank 2020-10-03 11:14:13 -06:00
James Betker
922b1d76df Don't record visuals when not on rank 0 2020-10-03 11:10:03 -06:00
James Betker
8197fd646f Don't accumulate losses for metrics when the loss isn't a tensor 2020-10-03 11:03:55 -06:00
James Betker
19a4075e1e Allow checkpointing to be disabled in the options file
Also makes options a global variable for usage in utils.
2020-10-03 11:03:28 -06:00
James Betker
dd9d7b27ac Add more sophisticated mechanism for balancing GAN losses 2020-10-02 22:53:42 -06:00
James Betker
39865ca3df TOTAL_loss, dumbo 2020-10-02 21:06:10 -06:00
James Betker
4e44fcd655 Loss accumulator fix 2020-10-02 20:55:33 -06:00
James Betker
567b4d50a4 ExtensibleTrainer - don't compute backward when there is no loss 2020-10-02 20:54:06 -06:00
James Betker
146a9125f2 Modify geometric & translational losses so they can be used with embeddings 2020-10-02 20:40:13 -06:00
James Betker
e30a1443cd Change sw2 refs 2020-10-02 09:01:18 -06:00
James Betker
e38716925f Fix spsr8 class init 2020-10-02 09:00:18 -06:00
James Betker
35469f08e2 Spsr 8 2020-10-02 08:58:15 -06:00
James Betker
aa4fd89018 resnext with groupnorm 2020-10-01 15:49:28 -06:00
James Betker
8beaa47933 resnext discriminator 2020-10-01 11:48:14 -06:00
James Betker
55f2764fef Allow fixup50 to be used as a discriminator 2020-10-01 11:28:18 -06:00
James Betker
7986185fcb Change 'mod_step' to 'every' 2020-10-01 11:28:06 -06:00
James Betker
d9ae970fd9 SSG update 2020-10-01 11:27:51 -06:00
James Betker
e3053e4e55 Exchange SpsrNet for SpsrNetSimplified 2020-09-30 17:01:04 -06:00
James Betker
66d4512029 Fix up translational equivariance loss so it's ready for prime time 2020-09-30 12:01:00 -06:00
James Betker
896b4f5be2 Revert "spsr7 adjustments"
This reverts commit 9fee1cec71.
2020-09-29 18:30:41 -06:00
James Betker
9fee1cec71 spsr7 adjustments 2020-09-29 17:19:59 -06:00
James Betker
dc8f3b24de Don't let duplicate keys be used for injectors and losses 2020-09-29 16:59:44 -06:00
James Betker
0b5a033503 spsr7 + cleanup
SPSR7 adds ref onto spsr6, makes more "common sense" mods.
2020-09-29 16:59:26 -06:00
James Betker
f9b83176f1 Fix bugs in extensibletrainer 2020-09-28 22:09:42 -06:00
James Betker
db52bec4ab spsr6
This is meant to be a variant of SPSR5 that harkens
back to the simpler earlier architectures that do not
have embeddings or ref_ inputs, but do have deep
multiplexers. It does, however, use some of the new
conjoin mechanisms.
2020-09-28 22:09:27 -06:00
James Betker
7e240f2fed Recurrent / teco work 2020-09-28 22:06:56 -06:00
James Betker
aeaf185314 Add RCAN 2020-09-27 16:00:41 -06:00
James Betker
4d29b7729e Model arch cleanup 2020-09-27 11:18:45 -06:00
James Betker
31641d7f63 Add ImagePatchInjector and TranslationalLoss 2020-09-26 21:25:32 -06:00
James Betker
d8621e611a BackboneSpineNoHead takes ref 2020-09-26 21:25:04 -06:00
James Betker
5a27187c59 More mods to accommodate new dataset 2020-09-25 22:45:57 -06:00
James Betker
6d0490a0e6 Tecogan implementation work 2020-09-25 16:38:23 -06:00
James Betker
ce4613ecb9 Finish up single_image_dataset work
Sweet!
2020-09-25 16:37:54 -06:00
James Betker
ea565b7eaf More fixes 2020-09-24 17:51:52 -06:00
James Betker
553917a8d1 Fix torchvision import bug 2020-09-24 17:38:34 -06:00
James Betker
58886109d4 Update how spsr arches do attention to conform with sgsr 2020-09-24 16:53:54 -06:00
James Betker
9a50a7966d SiLU doesn't support inplace 2020-09-23 21:09:13 -06:00
James Betker
eda0eadba2 Use custom SiLU
Torch didn't have this before 1.7
2020-09-23 21:05:06 -06:00
James Betker
05963157c1 Several things
- Fixes to 'after' and 'before' defs for steps (turns out they weren't working)
- Feature nets take in a list of layers to extract. Not fully implemented yet.
- Fixes bugs with RAGAN
- Allows real input into generator gan to not be detached by param
2020-09-23 11:56:36 -06:00
James Betker
4ab989e015 try again.. 2020-09-22 18:27:52 -06:00
James Betker
3b6c957194 Fix? it again? 2020-09-22 18:25:59 -06:00
James Betker
7b60d9e0d8 Fix? cosine loss 2020-09-22 18:18:35 -06:00
James Betker
2e18c4c22d Add CosineEmbeddingLoss to F 2020-09-22 17:10:29 -06:00
James Betker
f40beb5460 Add 'before' and 'after' defs to injections, steps and optimizers 2020-09-22 17:03:22 -06:00
James Betker
419f77ec19 Some new backbones 2020-09-21 12:36:49 -06:00
James Betker
9429544a60 Spinenet: implementation without 4x downsampling right off the bat 2020-09-21 12:36:30 -06:00
James Betker
53a5657850 Fix SSGR 2020-09-20 19:07:15 -06:00
James Betker
17c569ea62 Add geometric loss 2020-09-20 16:24:23 -06:00
James Betker
17dd99b29b Fix bug with discriminator noise addition
It wasn't using the scale and was applying the noise to the
underlying state variable.
2020-09-20 12:00:27 -06:00
James Betker
3138f98fbc Allow discriminator noise to be injected at the loss level, cleans up configs 2020-09-19 21:47:52 -06:00
James Betker
e9a39bfa14 Recursively detach all outputs, even if they are nested in data structures 2020-09-19 21:47:34 -06:00
James Betker
fe82785ba5 Add some new architectures to ssg 2020-09-19 21:47:10 -06:00
James Betker
b83f097082 Get rid of get_debug_values from RRDB, rectify outputs 2020-09-19 21:46:36 -06:00
James Betker
e0bd68efda Add ImageFlowInjector 2020-09-19 10:07:00 -06:00
James Betker
e2a146abc7 Add in experiments hook 2020-09-19 10:05:25 -06:00
James Betker
9a17ade550 Some convenience adjustments to ExtensibleTrainer 2020-09-17 21:05:32 -06:00
James Betker
9963b37200 Add a new script for loading a discriminator network and using it to filter images 2020-09-17 13:30:32 -06:00
James Betker
723754c133 Update attention debugger outputting for SSG 2020-09-16 13:09:46 -06:00
James Betker
0918430572 SSG network
This branches off of SPSR. It is identical but substantially reduced
in complexity. It's intended to be my long term working arch.
2020-09-15 20:59:24 -06:00
James Betker
6deab85b9b Add BackboneEncoderNoRef 2020-09-15 16:55:38 -06:00
James Betker
d0321ca5de Don't load amp state dict if amp is disabled 2020-09-14 15:21:42 -06:00
James Betker
ccf8438001 SPSR5
This is SPSR4, but the multiplexers have access to the output of the transformations
for making their decision.
2020-09-13 20:10:24 -06:00
James Betker
5b85f891af Only log the name of the first network in the total_loss training set 2020-09-12 16:07:09 -06:00
James Betker
fb595e72a4 Supporting infrastructure in ExtensibleTrainer to train spsr4
Need to be able to train 2 nets in one step: the backbone will be entirely separate
with its own optimizer (for an extremely low LR).

This functionality was already present, just not implemented correctly.
2020-09-11 22:57:06 -06:00
James Betker
4e44bca611 SPSR4
aka - return of the backbone! I'm tired of massively overparameterized generators
with pile-of-shit multiplexers. Let's give this another try..
2020-09-11 22:55:37 -06:00
James Betker
19896abaea Clean up old SwitchedSpsr arch
It didn't work anyways, so why not?
2020-09-11 16:09:28 -06:00
James Betker
50ca17bb0a Feature mode -> back to LR fea 2020-09-11 13:09:55 -06:00
James Betker
1086f0476b Fix ref branch using fixed filters 2020-09-11 08:58:35 -06:00
James Betker
8c469b8286 Enable memory checkpointing 2020-09-11 08:44:29 -06:00
James Betker
5189b11dac Add combined dataset for training across multiple datasets 2020-09-11 08:44:06 -06:00
James Betker
313424d7b5 Add new referencing discriminator
Also extend the way losses work so that you can pass
parameters into the discriminator from the config file
2020-09-10 21:35:29 -06:00
James Betker
9e5aa166de Report the standard deviation of ref branches
This patch also ups the contribution
2020-09-10 16:34:41 -06:00
James Betker
668bfbff6d Back to best arch for spsr3 2020-09-10 14:58:14 -06:00
James Betker
992b0a8d98 spsr3 with conjoin stage as part of the switch 2020-09-10 09:11:37 -06:00
James Betker
e0fc5eb50c Temporary commit - noise 2020-09-09 17:12:52 -06:00
James Betker
00da69d450 Temporary commit - ref 2020-09-09 17:09:44 -06:00
James Betker
df59d6c99d More spsr3 mods
- Most branches get their own noise vector now.
- First attention branch has the intended sole purpose of raw image processing
- Remove norms from joiner block
2020-09-09 16:46:38 -06:00
James Betker
747ded2bf7 Fixes to the spsr3
Some lessons learned:
- Biases are fairly important as a relief valve. They don't need to be everywhere, but
  most computationally heavy branches should have a bias.
- GroupNorm in SPSR is not a great idea. Since image gradients are represented
   in this model, normal means and standard deviations are not applicable. (imggrad
   has a high representation of 0).
- Don't fuck with the mainline of any generative model. As much as possible, all
   additions should be done through residual connections. Never pollute the mainline
   with reference data; do that in branches. It basically leaves the model untrainable.
2020-09-09 15:28:14 -06:00
James Betker
0ffac391c1 SPSR with ref joining 2020-09-09 11:17:07 -06:00
James Betker
3027e6e27d Enable amp to be disabled 2020-09-09 10:45:59 -06:00
James Betker
c04f244802 More mods 2020-09-08 20:36:27 -06:00
James Betker
dffbfd2ec4 Allow SRG checkpointing to be toggled 2020-09-08 15:14:43 -06:00
James Betker
e6207d4c50 SPSR3 work
SPSR3 is meant to fix whatever is causing the switching units
inside of the newer SPSR architectures to fail and basically
not use the multiplexers.
2020-09-08 15:14:23 -06:00
James Betker
5606e8b0ee Fix SRGAN_model/fullimgdataset compatibility 1 2020-09-08 11:34:35 -06:00
James Betker
22c98f1567 Move MultiConvBlock to arch_util 2020-09-08 08:17:27 -06:00
James Betker
146ace0859 CSNLN changes (removed because it doesn't train well) 2020-09-08 08:04:16 -06:00
James Betker
f43df7f5f7 Make ExtensibleTrainer compatible with process_video 2020-09-08 08:03:41 -06:00
James Betker
a18ece62ee Add updated spsr net for test 2020-09-07 17:01:48 -06:00
James Betker
55475d2ac1 Clean up unused archs 2020-09-07 11:38:11 -06:00
James Betker
e8613041c0 Add novograd optimizer 2020-09-06 17:27:08 -06:00
James Betker
b1238d29cb Fix trainable not applying to discriminators 2020-09-05 20:31:26 -06:00
James Betker
21ae135f23 Allow Novograd to be used as an optimizer 2020-09-05 16:50:13 -06:00
James Betker
912a4d9fea Fix srg computer bug 2020-09-05 07:59:54 -06:00
James Betker
0dfd8eaf3b Support injectors that run in eval only 2020-09-05 07:59:45 -06:00
James Betker
44c75f7642 Undo SRG change 2020-09-04 17:32:16 -06:00
James Betker
6657a406ac Mods needed to support training a corruptor again:
- Allow original SPSRNet to have a specifiable block increment
- Cleanup
- Bug fixes in code that hasn't been touched in a while.
2020-09-04 15:33:39 -06:00
James Betker
bfdfaab911 Checkpoint RRDB
Greatly reduces memory consumption with a low performance penalty
2020-09-04 15:32:00 -06:00
James Betker
696242064c Use tensor checkpointing to drastically reduce memory usage
This comes at the expense of computation, but since we can use much larger
batches, it results in a net speedup.
2020-09-03 11:33:36 -06:00
James Betker
365813bde3 Add InterpolateInjector 2020-09-03 11:32:47 -06:00
James Betker
d90c96e55e Fix greyscale injector 2020-09-02 10:29:40 -06:00
James Betker
8b52d46847 Interpreted feature loss to extensibletrainer 2020-09-02 10:08:24 -06:00
James Betker
886d59d5df Misc fixes & adjustments 2020-09-01 07:58:11 -06:00
James Betker
0a9b85f239 Fix vgg_gn input_img_factor 2020-08-31 09:50:30 -06:00
James Betker
4b4d08bdec Enable testing in ExtensibleTrainer, fix it in SRGAN_model
Also compute fea loss for this.
2020-08-31 09:41:48 -06:00
James Betker
b2091cb698 feamod fix 2020-08-30 08:08:49 -06:00
James Betker
a56e906f9c train HR feature trainer 2020-08-29 09:27:48 -06:00
James Betker
0e859a8082 4x spsr ref (not working) 2020-08-29 09:27:18 -06:00
James Betker
25832930db Update loss with lr crossgan 2020-08-26 17:57:22 -06:00
James Betker
cbd5e7a986 Support old school crossgan in extensibletrainer 2020-08-26 17:52:35 -06:00
James Betker
8a6a2e6e2e Rev3 of the full image ref arch 2020-08-26 17:11:01 -06:00
James Betker
f35b3ad28f Fix val behavior for ExtensibleTrainer 2020-08-26 08:44:22 -06:00
James Betker
434ed70a9a Wrap vgg disc 2020-08-25 18:14:45 -06:00
James Betker
83f2f8d239 more debugging 2020-08-25 18:12:12 -06:00
James Betker
3f60281da7 Print when wrapping 2020-08-25 18:08:46 -06:00
James Betker
bae18c05e6 wrap disc grad 2020-08-25 17:58:20 -06:00
James Betker
f85f1e21db Turns out, can't do that 2020-08-25 17:18:52 -06:00
James Betker
935a735327 More dohs 2020-08-25 17:05:16 -06:00
James Betker
53e67bdb9c Distribute get_grad_no_padding 2020-08-25 17:03:18 -06:00
James Betker
2f706b7d93 I am inept. 2020-08-25 16:42:59 -06:00
James Betker
8bae0de769 ffffffffffffffffff 2020-08-25 16:41:01 -06:00
James Betker
1fe16f71dd Fix bug reporting spsr gan weight 2020-08-25 16:37:45 -06:00
James Betker
96586d6592 Fix distributed d_grad 2020-08-25 16:06:27 -06:00
James Betker
09a9079e17 Check rank before doing image logging. 2020-08-25 16:00:49 -06:00
James Betker
a1800f45ef Fix for referencingmultiplexer 2020-08-25 15:43:12 -06:00
James Betker
a65b07607c Reference network 2020-08-25 11:56:59 -06:00
James Betker
f9276007a8 More fixes to corrupt_fea 2020-08-23 17:52:18 -06:00
James Betker
0005c56cd4 dbg 2020-08-23 17:43:03 -06:00
James Betker
4bb5b3c981 corfea debugging 2020-08-23 17:39:02 -06:00
James Betker
7713cb8df5 Corrupted features in srgan 2020-08-23 17:32:03 -06:00
James Betker
dffc15184d More ExtensibleTrainer work
It runs now, just need to debug it to reach performance parity with SRGAN. Sweet.
2020-08-23 17:22:45 -06:00
James Betker
afdd93fbe9 Grey feature 2020-08-22 13:41:38 -06:00
James Betker
e59e712e39 More ExtensibleTrainer work 2020-08-22 13:08:33 -06:00
James Betker
f40545f235 ExtensibleTrainer work 2020-08-22 08:24:34 -06:00
James Betker
a498d7b1b3 Report l_g_gan_grad before weight multiplication 2020-08-20 11:57:53 -06:00
James Betker
9d77a4db2e Allow initial temperature to be specified to SPSR net for inference 2020-08-20 11:57:34 -06:00
James Betker
24bdcc1181 Let SwitchedSpsr transform count be specified 2020-08-18 09:10:25 -06:00
James Betker
74cdaa2226 Some work on extensible trainer 2020-08-18 08:49:32 -06:00
James Betker
868d0aa442 Undo early dim reduction on grad branch for SPSR_arch 2020-08-14 16:23:42 -06:00
James Betker
2d205f52ac Unite spsr_arch switched gens
Found a pretty good basis model.
2020-08-12 17:04:45 -06:00
James Betker
3d0ece804b SPSR LR2 2020-08-12 08:45:49 -06:00
James Betker
ab04ca1778 Extensible trainer (in progress) 2020-08-12 08:45:23 -06:00