James Betker
696242064c
Use tensor checkpointing to drastically reduce memory usage
...
This comes at the expense of computation, but since we can use much larger
batches, it results in a net speedup.
2020-09-03 11:33:36 -06:00
James Betker
8a6a2e6e2e
Rev3 of the full image ref arch
2020-08-26 17:11:01 -06:00
James Betker
f35b3ad28f
Fix val behavior for ExtensibleTrainer
2020-08-26 08:44:22 -06:00
James Betker
a1800f45ef
Fix for referencingmultiplexer
2020-08-25 15:43:12 -06:00
James Betker
a65b07607c
Reference network
2020-08-25 11:56:59 -06:00
James Betker
3d0ece804b
SPSR LR2
2020-08-12 08:45:49 -06:00
James Betker
59aba1daa7
LR switched SPSR arch
...
This variant doesn't do conv processing at HR, which should save
a ton of memory in inference. Lets see how it works.
2020-08-10 13:03:36 -06:00
James Betker
3ab39f0d22
Several new spsr nets
2020-08-05 10:01:24 -06:00
James Betker
f33ed578a2
Update how attention_maps are created
2020-08-01 20:23:46 -06:00
James Betker
8dd44182e6
Fix scale torch warning
2020-07-31 16:56:04 -06:00
James Betker
b06e1784e1
Fix SRG4 & switch disc
...
"fix". hehe.
2020-07-25 17:16:54 -06:00
James Betker
e6e91a1d75
Add SRG4
...
Back to the idea that maybe what we need is a hybrid
approach between pure switches and RDB.
2020-07-24 20:32:49 -06:00
James Betker
106b8da315
Assert that temperature is set properly in eval mode.
2020-07-22 20:50:59 -06:00
James Betker
47a525241f
Make attention norm optional
2020-07-18 07:24:02 -06:00
James Betker
3e7a83896b
Fix pixgan debugging issues
2020-07-16 11:45:19 -06:00
James Betker
0c4c388e15
Remove dualoutputsrg
...
Good idea, didn't pan out.
2020-07-16 10:09:24 -06:00
James Betker
4bcc409fc7
Fix loadSRG2 typo
2020-07-14 10:20:53 -06:00
James Betker
1e4083a35b
Apply temperature mods to all SRG models
...
(Honestly this needs to be base classed at this point)
2020-07-14 10:19:35 -06:00
James Betker
7659bd6818
Fix temperature equation
2020-07-14 10:17:14 -06:00
James Betker
853468ef82
Allow legacy state_dicts in srg2
2020-07-14 10:03:45 -06:00
James Betker
1b1431133b
Add DualOutputSRG
...
Also removes the old multi-return mechanism that Generators support.
Also fixes AttentionNorm.
2020-07-14 09:28:24 -06:00
James Betker
a2285ff2ee
Scale anorm by transform count
2020-07-13 08:49:09 -06:00
James Betker
dd0bbd9a7c
Enable AttentionNorm on SRG2
2020-07-13 08:38:17 -06:00
James Betker
4c0f770f2a
Fix inverted temperature curve bug
2020-07-12 11:02:50 -06:00
James Betker
14d23b9d20
Fixes, do fake swaps less often in pixgan discriminator
2020-07-11 21:22:11 -06:00
James Betker
33ca3832e1
Move ExpansionBlock to arch_util
...
Also makes all processing blocks have a conformant signature.
Alters ExpansionBlock to perform a processing conv on the passthrough
before the conjoin operation - this will break backwards compatibilty with SRG2.
2020-07-10 15:53:41 -06:00
James Betker
5e8b52f34c
Misc changes
2020-07-10 09:45:48 -06:00
James Betker
5f2c722a10
SRG2 revival
...
Big update to SRG2 architecture to pull in a lot of things that have been learned:
- Use group norm instead of batch norm
- Initialize the weights on the transformations low like is done in RRDB rather than using the scalar. Models live or die by their early stages, and this ones early stage is pretty weak
- Transform multiplexer to use u-net like architecture.
- Just use one set of configuration variables instead of a list - flat networks performed fine in this regard.
2020-07-09 17:34:51 -06:00
James Betker
8a4eb8241d
SRG3 work
...
Operates on top of a pre-trained SpineNET backbone (trained on CoCo 2017 with RetinaNet)
This variant is extremely shallow.
2020-07-07 13:46:40 -06:00
James Betker
0acad81035
More SRG2 adjustments..
2020-07-06 22:40:40 -06:00
James Betker
086b2f0570
More bugs
2020-07-06 22:28:07 -06:00
James Betker
d4d4f85fc0
Bug fixes
2020-07-06 22:25:40 -06:00
James Betker
3c31bea1ac
SRG2 architectural changes
2020-07-06 22:22:29 -06:00
James Betker
d0957bd7d4
Alter weight initialization for transformation blocks
2020-07-05 17:32:46 -06:00
James Betker
16d1bf6dd7
Replace ConvBnRelus in SRG2 with Silus
2020-07-05 17:29:20 -06:00
James Betker
510b2f887d
Remove RDB from srg2
...
Doesnt seem to work so great.
2020-07-03 22:31:20 -06:00
James Betker
703dec4472
Add SpineNet & integrate with SRG
...
New version of SRG uses SpineNet for a switch backbone.
2020-07-03 12:07:31 -06:00
James Betker
e9ee67ff10
Integrate RDB into SRG
...
The last RDB for each cluster is switched.
2020-07-01 17:19:55 -06:00
James Betker
6ac6c95177
Fix scaling bug
2020-07-01 16:42:27 -06:00
James Betker
30653181ba
Experiment: get rid of post_switch_conv
2020-07-01 16:30:40 -06:00
James Betker
17191de836
Experiment: bring initialize_weights back again
...
Something really strange going on here..
2020-07-01 15:58:13 -06:00
James Betker
d1d573de07
Experiment: new init and post-switch-conv
2020-07-01 15:25:54 -06:00
James Betker
e2398ac83c
Experiment: revert initialization changes
2020-07-01 12:08:09 -06:00
James Betker
78276afcaa
Experiment: Back to lelu
2020-07-01 11:43:25 -06:00
James Betker
b945021c90
SRG v2 - Move to Relu, rely on Module-based initialization
2020-07-01 11:33:32 -06:00
James Betker
604763be68
NSG r7
...
Converts the switching trunk to a VGG-style network to make it more comparable
to SRG architectures.
2020-07-01 09:54:29 -06:00
James Betker
87f1e9c56f
Invert ResGen2 to operate in LR space
2020-06-30 20:57:40 -06:00
James Betker
3ce1a1878d
NSG improvements (r5)
...
- Get rid of forwards(), it makes numeric_stability.py not work properly.
- Do stability auditing across layers.
- Upsample last instead of first, work in much higher dimensionality for transforms.
2020-06-30 16:59:57 -06:00
James Betker
75f148022d
Even more NSG improvements (r4)
2020-06-30 13:52:47 -06:00
James Betker
773753073f
More NSG improvements (v3)
...
Move to a fully fixup residual network for the switch (no
batch norms). Fix a bunch of other small bugs. Add in a
temporary latent feed-forward from the bottom of the
switch. Fix several initialization issues.
2020-06-29 20:26:51 -06:00