James Betker
969bcd9021
Use local checkpoint in SSG
2020-10-08 08:54:46 -06:00
James Betker
c93dd623d7
Tecogan losses work
2020-10-07 23:11:58 -06:00
James Betker
29bf78d791
Update switched_conv submodule
2020-10-07 23:11:50 -06:00
James Betker
c96f5b2686
Import switched_conv as a submodule
2020-10-07 23:10:54 -06:00
James Betker
c352c8bce4
More tecogan fixes
2020-10-07 12:41:17 -06:00
James Betker
a62a5dbb5f
Clone and detach in recursively_detach
2020-10-07 12:41:00 -06:00
James Betker
1c44d395af
Tecogan work
...
Its training! There's still probably plenty of bugs though..
2020-10-07 09:03:30 -06:00
James Betker
e9d7371a61
Add concatenate injector
2020-10-07 09:02:42 -06:00
James Betker
8a7e993aea
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-06 20:41:58 -06:00
James Betker
b2c4b2a16d
Move gpu_ids out of if statement
2020-10-06 20:40:20 -06:00
James Betker
1e415b249b
Add tag that can be applied to prevent parameter training
2020-10-06 20:39:49 -06:00
James Betker
2f2e3f33f8
StackedSwitchedGenerator_5lyr
2020-10-06 20:39:32 -06:00
James Betker
6217b48e3f
Fix spsr_arch bug
2020-10-06 20:38:47 -06:00
James Betker
4290918359
Add distributed_checkpoint for more efficient checkpoints
2020-10-06 20:38:38 -06:00
James Betker
cffc596141
Integrate flownet2 into codebase, add teco visual debugs
2020-10-06 20:35:39 -06:00
James Betker
e4b89a172f
Reduce spsr7 memory usage
2020-10-05 22:05:56 -06:00
James Betker
4111942ada
Support attention deferral in deep ssgr
2020-10-05 19:35:55 -06:00
James Betker
840927063a
Work on tecogan losses
2020-10-05 19:35:28 -06:00
James Betker
0e3ea63a14
Misc
2020-10-05 18:01:50 -06:00
James Betker
2875822024
SPSR9 arch
...
takes some of the stuff I learned with SGSR yesterday and applies it to spsr
2020-10-05 08:47:51 -06:00
James Betker
51044929af
Don't compute attention statistics on multiple generator invocations of the same data
2020-10-05 00:34:29 -06:00
James Betker
e760658fdb
Another fix..
2020-10-04 21:08:00 -06:00
James Betker
a890e3a9c0
Fix geometric loss not handling 0 index
2020-10-04 21:05:01 -06:00
James Betker
c3ef8a4a31
Stacked switches - return a tuple
2020-10-04 21:02:24 -06:00
James Betker
13f97e1e97
Add recursive loss
2020-10-04 20:48:15 -06:00
James Betker
ffd069fd97
Lots of SSG work
...
- Checkpointed pretty much the entire model - enabling recurrent inputs
- Added two new models for test - adding depth (again) and removing SPSR (in lieu of the new losses)
2020-10-04 20:48:08 -06:00
James Betker
aca2c7ab41
Full checkpoint-ize SSG1
2020-10-04 18:24:52 -06:00
James Betker
fc396baf1a
Move loaded_options to util
...
Doesn't seem to work with python 3.6
2020-10-03 20:29:06 -06:00
James Betker
2d8e9a9d30
Options fix?
2020-10-03 20:27:12 -06:00
James Betker
e3294939b0
Revert "SSG: offer option to use BN-based attention normalization"
...
Didn't work. Oh well.
This reverts commit 5cd2b37591
.
2020-10-03 17:54:53 -06:00
James Betker
43c6c67fd1
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-03 16:17:31 -06:00
James Betker
5cd2b37591
SSG: offer option to use BN-based attention normalization
...
Not sure how this is going to work, lets try it.
2020-10-03 16:16:19 -06:00
James Betker
c896939523
Fix recursive checkpoint
2020-10-03 16:15:52 -06:00
James Betker
3cbb9ecd45
Misc
2020-10-03 16:15:42 -06:00
James Betker
35731502c3
Fix checkpoint recursion
2020-10-03 12:52:50 -06:00
James Betker
9b4ed82093
Get rid of unused convs in spsr7
2020-10-03 11:36:26 -06:00
James Betker
b2b81b13a4
Remove recursive utils import
2020-10-03 11:30:05 -06:00
James Betker
3561cc164d
Fix up fea_loss calculator (for validation)
...
Not sure how this was working in regular training mode, but it
was failing in DDP.
2020-10-03 11:19:20 -06:00
James Betker
21d3bb83b2
Use tqdm reporting with validation
2020-10-03 11:16:39 -06:00
James Betker
6c9718ad64
Don't log if you aren't 0 rank
2020-10-03 11:14:13 -06:00
James Betker
922b1d76df
Don't record visuals when not on rank 0
2020-10-03 11:10:03 -06:00
James Betker
8197fd646f
Don't accumulate losses for metrics when the loss isn't a tensor
2020-10-03 11:03:55 -06:00
James Betker
19a4075e1e
Allow checkpointing to be disabled in the options file
...
Also makes options a global variable for usage in utils.
2020-10-03 11:03:28 -06:00
James Betker
dd9d7b27ac
Add more sophisticated mechanism for balancing GAN losses
2020-10-02 22:53:42 -06:00
James Betker
39865ca3df
TOTAL_loss, dumbo
2020-10-02 21:06:10 -06:00
James Betker
4e44fcd655
Loss accumulator fix
2020-10-02 20:55:33 -06:00
James Betker
567b4d50a4
ExtensibleTrainer - don't compute backward when there is no loss
2020-10-02 20:54:06 -06:00
James Betker
146a9125f2
Modify geometric & translational losses so they can be used with embeddings
2020-10-02 20:40:13 -06:00
James Betker
e30a1443cd
Change sw2 refs
2020-10-02 09:01:18 -06:00
James Betker
e38716925f
Fix spsr8 class init
2020-10-02 09:00:18 -06:00
James Betker
efbf6b737b
Update validate_data to work with SingleImageDataset
2020-10-02 08:58:34 -06:00
James Betker
35469f08e2
Spsr 8
2020-10-02 08:58:15 -06:00
James Betker
c9a9e5c525
Prompt user for gpu_id if multiple gpus are detected
2020-10-01 17:24:50 -06:00
James Betker
aa4fd89018
resnext with groupnorm
2020-10-01 15:49:28 -06:00
James Betker
8beaa47933
resnext discriminator
2020-10-01 11:48:14 -06:00
James Betker
55f2764fef
Allow fixup50 to be used as a discriminator
2020-10-01 11:28:18 -06:00
James Betker
7986185fcb
Change 'mod_step' to 'every'
2020-10-01 11:28:06 -06:00
James Betker
d9ae970fd9
SSG update
2020-10-01 11:27:51 -06:00
James Betker
e3053e4e55
Exchange SpsrNet for SpsrNetSimplified
2020-09-30 17:01:04 -06:00
James Betker
66d4512029
Fix up translational equivariance loss so it's ready for prime time
2020-09-30 12:01:00 -06:00
James Betker
896b4f5be2
Revert "spsr7 adjustments"
...
This reverts commit 9fee1cec71
.
2020-09-29 18:30:41 -06:00
James Betker
9fee1cec71
spsr7 adjustments
2020-09-29 17:19:59 -06:00
James Betker
dc8f3b24de
Don't let duplicate keys be used for injectors and losses
2020-09-29 16:59:44 -06:00
James Betker
0b5a033503
spsr7 + cleanup
...
SPSR7 adds ref onto spsr6, makes more "common sense" mods.
2020-09-29 16:59:26 -06:00
James Betker
f9b83176f1
Fix bugs in extensibletrainer
2020-09-28 22:09:42 -06:00
James Betker
db52bec4ab
spsr6
...
This is meant to be a variant of SPSR5 that harkens
back to the simpler earlier architectures that do not
have embeddings or ref_ inputs, but do have deep
multiplexers. It does, however, use some of the new
conjoin mechanisms.
2020-09-28 22:09:27 -06:00
James Betker
7e240f2fed
Recurrent / teco work
2020-09-28 22:06:56 -06:00
James Betker
57814f18cf
More features for multi-frame-dataset
2020-09-28 14:26:15 -06:00
James Betker
aeaf185314
Add RCAN
2020-09-27 16:00:41 -06:00
James Betker
4d29b7729e
Model arch cleanup
2020-09-27 11:18:45 -06:00
James Betker
7dff802144
Add MultiFrameDataset
...
Retrieves video sequence patches rather than single images.
2020-09-27 11:13:06 -06:00
James Betker
d8c3fc9327
Fix random noise corruptor
...
It was functioning as a color shift
2020-09-27 11:12:24 -06:00
James Betker
c85da79697
Move many dataset functions into a base class
2020-09-27 11:11:58 -06:00
James Betker
eb12b5f887
Misc
2020-09-26 21:27:17 -06:00
James Betker
31641d7f63
Add ImagePatchInjector and TranslationalLoss
2020-09-26 21:25:32 -06:00
James Betker
d8621e611a
BackboneSpineNoHead takes ref
2020-09-26 21:25:04 -06:00
James Betker
5a27187c59
More mods to accomodate new dataset
2020-09-25 22:45:57 -06:00
James Betker
254cb1e915
More dataset integration work
2020-09-25 22:19:38 -06:00
James Betker
6d0490a0e6
Tecogan implementation work
2020-09-25 16:38:23 -06:00
James Betker
ce4613ecb9
Finish up single_image_dataset work
...
Sweet!
2020-09-25 16:37:54 -06:00
James Betker
1cf73c2cce
Fix dataset for a val set that includes lq
2020-09-24 18:01:07 -06:00
James Betker
ea565b7eaf
More fixes
2020-09-24 17:51:52 -06:00
James Betker
553917a8d1
Fix torchvision import bug
2020-09-24 17:38:34 -06:00
James Betker
58886109d4
Update how spsr arches do attention to conform with sgsr
2020-09-24 16:53:54 -06:00
James Betker
9a50a7966d
SiLU doesnt support inplace
2020-09-23 21:09:13 -06:00
James Betker
eda0eadba2
Use custom SiLU
...
Torch didnt have this before 1.7
2020-09-23 21:05:06 -06:00
James Betker
05963157c1
Several things
...
- Fixes to 'after' and 'before' defs for steps (turns out they werent working)
- Feature nets take in a list of layers to extract. Not fully implemented yet.
- Fixes bugs with RAGAN
- Allows real input into generator gan to not be detached by param
2020-09-23 11:56:36 -06:00
James Betker
4ab989e015
try again..
2020-09-22 18:27:52 -06:00
James Betker
3b6c957194
Fix? it again?
2020-09-22 18:25:59 -06:00
James Betker
7b60d9e0d8
Fix? cosine loss
2020-09-22 18:18:35 -06:00
James Betker
2e18c4c22d
Add CosineEmbeddingLoss to F
2020-09-22 17:10:29 -06:00
James Betker
f40beb5460
Add 'before' and 'after' defs to injections, steps and optimizers
2020-09-22 17:03:22 -06:00
James Betker
419f77ec19
Some new backbones
2020-09-21 12:36:49 -06:00
James Betker
9429544a60
Spinenet: implementation without 4x downsampling right off the bat
2020-09-21 12:36:30 -06:00
James Betker
384e3d54cc
Extract images into jpg, have a multiplier & size threshold
2020-09-21 12:36:03 -06:00
James Betker
bde35ced47
Fix recursive detach
2020-09-20 19:08:13 -06:00
James Betker
53a5657850
Fix SSGR
2020-09-20 19:07:15 -06:00
James Betker
17c569ea62
Add geometric loss
2020-09-20 16:24:23 -06:00
James Betker
17dd99b29b
Fix bug with discriminator noise addition
...
It wasn't using the scale and was applying the noise to the
underlying state variable.
2020-09-20 12:00:27 -06:00
James Betker
dab8ab8a8f
Offer option to configure the size of the normal distribution that the target size is drawn from
2020-09-20 11:59:31 -06:00
James Betker
3138f98fbc
Allow discriminator noise to be injected at the loss level, cleans up configs
2020-09-19 21:47:52 -06:00
James Betker
e9a39bfa14
Recursively detach all outputs, even if they are nested in data structures
2020-09-19 21:47:34 -06:00
James Betker
fe82785ba5
Add some new architectures to ssg
2020-09-19 21:47:10 -06:00
James Betker
b83f097082
Get rid of get_debug_values from RRDB, rectify outputs
2020-09-19 21:46:36 -06:00
James Betker
e0bd68efda
Add ImageFlowInjector
2020-09-19 10:07:00 -06:00
James Betker
e2a146abc7
Add in experiments hook
2020-09-19 10:05:25 -06:00
James Betker
4f75cf0f02
Revert "Full image dataset operates on lists"
...
Going with an entirely new dataset instead..
This reverts commit 36ec32bf11
.
2020-09-18 09:50:43 -06:00
James Betker
36ec32bf11
Full image dataset operates on lists
2020-09-18 09:50:26 -06:00
James Betker
3cb2a9a9d3
New dataset, initial work
2020-09-18 09:49:13 -06:00
James Betker
9a17ade550
Some convenience adjustments to ExtensibleTrainer
2020-09-17 21:05:32 -06:00
James Betker
57fc3f490c
Add script for extracting image tiles with reference images
2020-09-17 13:30:51 -06:00
James Betker
9963b37200
Add a new script for loading a discriminator network and using it to filter images
2020-09-17 13:30:32 -06:00
James Betker
f5cd23e2d5
Further patch size adjustments
2020-09-16 16:50:35 -06:00
James Betker
723754c133
Update attention debugger outputting for SSG
2020-09-16 13:09:46 -06:00
James Betker
0b047e5f80
Increase scale of the patch selector random distribution
...
This will cause larger slices of an image to appear more frequently,
increasing the difficulty of the generator.
2020-09-16 08:27:42 -06:00
James Betker
f211575e9d
Save models before validation
...
Validation often fails with OOM, wasting hours of training time.
Save models first.
2020-09-16 08:17:17 -06:00
James Betker
0918430572
SSG network
...
This branches off of SPSR. It is identical but substantially reduced
in complexity. It's intended to be my long term working arch.
2020-09-15 20:59:24 -06:00
James Betker
c833bd1eac
Misc changes
2020-09-15 20:57:59 -06:00
James Betker
6deab85b9b
Add BackboneEncoderNoRef
2020-09-15 16:55:38 -06:00
James Betker
d0321ca5de
Don't load amp state dict if amp is disabled
2020-09-14 15:21:42 -06:00
James Betker
94deab2792
Fix error serving gt_fullsize_ref in full_image_dataset
2020-09-14 15:05:44 -06:00
James Betker
ccf8438001
SPSR5
...
This is SPSR4, but the multiplexers have access to the output of the transformations
for making their decision.
2020-09-13 20:10:24 -06:00
James Betker
5b85f891af
Only log the name of the first network in the total_loss training set
2020-09-12 16:07:09 -06:00
James Betker
fb595e72a4
Supporting infrastructure in ExtensibleTrainer to train spsr4
...
Need to be able to train 2 nets in one step: the backbone will be entirely separate
with its own optimizer (for an extremely low LR).
This functionality was already present, just not implemented correctly.
2020-09-11 22:57:06 -06:00
James Betker
4e44bca611
SPSR4
...
aka - return of the backbone! I'm tired of massively overparameterized generators
with pile-of-shit multiplexers. Let's give this another try..
2020-09-11 22:55:37 -06:00
James Betker
19896abaea
Clean up old SwitchedSpsr arch
...
It didn't work anyways, so why not?
2020-09-11 16:09:28 -06:00
James Betker
4c2ee66fe4
Fix video processor
2020-09-11 13:10:14 -06:00
James Betker
50ca17bb0a
Feature mode -> back to LR fea
2020-09-11 13:09:55 -06:00
James Betker
1086f0476b
Fix ref branch using fixed filters
2020-09-11 08:58:35 -06:00
James Betker
8c469b8286
Enable memory checkpointing
2020-09-11 08:44:29 -06:00
James Betker
5189b11dac
Add combined dataset for training across multiple datasets
2020-09-11 08:44:06 -06:00
James Betker
313424d7b5
Add new referencing discriminator
...
Also extend the way losses work so that you can pass
parameters into the discriminator from the config file
2020-09-10 21:35:29 -06:00
James Betker
9e5aa166de
Report the standard deviation of ref branches
...
This patch also ups the contribution
2020-09-10 16:34:41 -06:00
James Betker
668bfbff6d
Back to best arch for spsr3
2020-09-10 14:58:14 -06:00
James Betker
992b0a8d98
spsr3 with conjoin stage as part of the switch
2020-09-10 09:11:37 -06:00
James Betker
e0fc5eb50c
Temporary commit - noise
2020-09-09 17:12:52 -06:00
James Betker
00da69d450
Temporary commit - ref
2020-09-09 17:09:44 -06:00
James Betker
df59d6c99d
More spsr3 mods
...
- Most branches get their own noise vector now.
- First attention branch has the intended sole purpose of raw image processing
- Remove norms from joiner block
2020-09-09 16:46:38 -06:00
James Betker
747ded2bf7
Fixes to the spsr3
...
Some lessons learned:
- Biases are fairly important as a relief valve. They dont need to be everywhere, but
most computationally heavy branches should have a bias.
- GroupNorm in SPSR is not a great idea. Since image gradients are represented
in this model, normal means and standard deviations are not applicable. (imggrad
has a high representation of 0).
- Don't fuck with the mainline of any generative model. As much as possible, all
additions should be done through residual connections. Never pollute the mainline
with reference data, do that in branches. It basically leaves the mode untrainable.
2020-09-09 15:28:14 -06:00
James Betker
0ffac391c1
SPSR with ref joining
2020-09-09 11:17:07 -06:00
James Betker
c41dc9a48c
Add missing requirements
2020-09-09 10:46:08 -06:00
James Betker
3027e6e27d
Enable amp to be disabled
2020-09-09 10:45:59 -06:00
James Betker
c04f244802
More mods
2020-09-08 20:36:27 -06:00
James Betker
dffbfd2ec4
Allow SRG checkpointing to be toggled
2020-09-08 15:14:43 -06:00
James Betker
e6207d4c50
SPSR3 work
...
SPSR3 is meant to fix whatever is causing the switching units
inside of the newer SPSR architectures to fail and basically
not use the multiplexers.
2020-09-08 15:14:23 -06:00
James Betker
5606e8b0ee
Fix SRGAN_model/fullimgdataset compatibility 1
2020-09-08 11:34:35 -06:00
James Betker
22c98f1567
Move MultiConvBlock to arch_util
2020-09-08 08:17:27 -06:00
James Betker
146ace0859
CSNLN changes (removed because it doesnt train well)
2020-09-08 08:04:16 -06:00
James Betker
f43df7f5f7
Make ExtensibleTrainer compatible with process_video
2020-09-08 08:03:41 -06:00
James Betker
a18ece62ee
Add updated spsr net for test
2020-09-07 17:01:48 -06:00
James Betker
55475d2ac1
Clean up unused archs
2020-09-07 11:38:11 -06:00
James Betker
e8613041c0
Add novograd optimizer
2020-09-06 17:27:08 -06:00
James Betker
a5c2388368
Use lower LQ image size when it is being fed in
2020-09-06 17:26:32 -06:00
James Betker
b1238d29cb
Fix trainable not applying to discriminators
2020-09-05 20:31:26 -06:00
James Betker
21ae135f23
Allow Novograd to be used as an optimizer
2020-09-05 16:50:13 -06:00
James Betker
912a4d9fea
Fix srg computer bug
2020-09-05 07:59:54 -06:00
James Betker
0dfd8eaf3b
Support injectors that run in eval only
2020-09-05 07:59:45 -06:00
James Betker
17aa205e96
New dataset that reads from lmdb
2020-09-04 17:32:57 -06:00
James Betker
44c75f7642
Undo SRG change
2020-09-04 17:32:16 -06:00
James Betker
6657a406ac
Mods needed to support training a corruptor again:
...
- Allow original SPSRNet to have a specifiable block increment
- Cleanup
- Bug fixes in code that hasnt been touched in awhile.
2020-09-04 15:33:39 -06:00
James Betker
bfdfaab911
Checkpoint RRDB
...
Greatly reduces memory consumption with a low performance penalty
2020-09-04 15:32:00 -06:00
James Betker
8580490a85
Reduce usage of resize operations when not needed in dataloaders.
2020-09-04 15:31:24 -06:00
James Betker
6226b52130
Pin memory in dataloaders by default
2020-09-04 15:30:46 -06:00
James Betker
64a24503f6
Add extract_subimages_with_ref_lmdb for generating lmdb with reference images
2020-09-04 15:30:34 -06:00
James Betker
696242064c
Use tensor checkpointing to drastically reduce memory usage
...
This comes at the expense of computation, but since we can use much larger
batches, it results in a net speedup.
2020-09-03 11:33:36 -06:00
James Betker
365813bde3
Add InterpolateInjector
2020-09-03 11:32:47 -06:00
James Betker
d90c96e55e
Fix greyscale injector
2020-09-02 10:29:40 -06:00
James Betker
8b52d46847
Interpreted feature loss to extensibletrainer
2020-09-02 10:08:24 -06:00
James Betker
886d59d5df
Misc fixes & adjustments
2020-09-01 07:58:11 -06:00
James Betker
0a9b85f239
Fix vgg_gn input_img_factor
2020-08-31 09:50:30 -06:00
James Betker
4b4d08bdec
Enable testing in ExtensibleTrainer, fix it in SRGAN_model
...
Also compute fea loss for this.
2020-08-31 09:41:48 -06:00
James Betker
b2091cb698
feamod fix
2020-08-30 08:08:49 -06:00
James Betker
a56e906f9c
train HR feature trainer
2020-08-29 09:27:48 -06:00
James Betker
0e859a8082
4x spsr ref (not workin)
2020-08-29 09:27:18 -06:00
James Betker
623f3b99b2
Stupid pathing..
2020-08-26 17:58:24 -06:00
James Betker
25832930db
Update loss with lr crossgan
2020-08-26 17:57:22 -06:00
James Betker
80aa83bfd2
Try copytree for tb_logger again.
2020-08-26 17:55:02 -06:00
James Betker
cbd5e7a986
Support old school crossgan in extensibletrainer
2020-08-26 17:52:35 -06:00
James Betker
b593d8e7c3
Save tb_logger to alt_path
2020-08-26 17:45:07 -06:00
James Betker
8a6a2e6e2e
Rev3 of the full image ref arch
2020-08-26 17:11:01 -06:00
James Betker
f35b3ad28f
Fix val behavior for ExtensibleTrainer
2020-08-26 08:44:22 -06:00
James Betker
434ed70a9a
Wrap vgg disc
2020-08-25 18:14:45 -06:00
James Betker
83f2f8d239
more debugging
2020-08-25 18:12:12 -06:00
James Betker
3f60281da7
Print when wrapping
2020-08-25 18:08:46 -06:00
James Betker
bae18c05e6
wrap disc grad
2020-08-25 17:58:20 -06:00
James Betker
f85f1e21db
Turns out, can't do that
2020-08-25 17:18:52 -06:00
James Betker
935a735327
More dohs
2020-08-25 17:05:16 -06:00
James Betker
53e67bdb9c
Distribute get_grad_no_padding
2020-08-25 17:03:18 -06:00
James Betker
2f706b7d93
I an inept.
2020-08-25 16:42:59 -06:00
James Betker
8bae0de769
ffffffffffffffffff
2020-08-25 16:41:01 -06:00
James Betker
1fe16f71dd
Fix bug reporting spsr gan weight
2020-08-25 16:37:45 -06:00
James Betker
96586d6592
Fix distributed d_grad
2020-08-25 16:06:27 -06:00
James Betker
09a9079e17
Check rank before doing image logging.
2020-08-25 16:00:49 -06:00
James Betker
a1800f45ef
Fix for referencingmultiplexer
2020-08-25 15:43:12 -06:00
James Betker
19487d9bbd
Fix distributed launch for large distributed runs
2020-08-25 15:42:59 -06:00
James Betker
03eb29a4d9
Fix LQGT dataset
2020-08-25 11:57:25 -06:00
James Betker
a65b07607c
Reference network
2020-08-25 11:56:59 -06:00
James Betker
f224907603
Fix LQGT_dataset, add full_image_dataset
2020-08-24 17:12:43 -06:00
James Betker
5ec04aedc8
Let noise be configurable
...
LQ noise is not currently configurable for some reason..
2020-08-24 15:00:14 -06:00
James Betker
f9276007a8
More fixes to corrupt_fea
2020-08-23 17:52:18 -06:00
James Betker
0005c56cd4
dbg
2020-08-23 17:43:03 -06:00
James Betker
4bb5b3c981
corfea debugging
2020-08-23 17:39:02 -06:00
James Betker
7713cb8df5
Corrupted features in srgan
2020-08-23 17:32:03 -06:00
James Betker
dffc15184d
More ExtensibleTrainer work
...
It runs now, just need to debug it to reach performance parity with SRGAN. Sweet.
2020-08-23 17:22:45 -06:00
James Betker
afdd93fbe9
Grey feature
2020-08-22 13:41:38 -06:00
James Betker
e59e712e39
More ExtensibleTrainer work
2020-08-22 13:08:33 -06:00
James Betker
f40545f235
ExtensibleTrainer work
2020-08-22 08:24:34 -06:00
James Betker
a498d7b1b3
Report l_g_gan_grad before weight multiplication
2020-08-20 11:57:53 -06:00
James Betker
9d77a4db2e
Allow initial temperature to be specified to SPSR net for inference
2020-08-20 11:57:34 -06:00
James Betker
24bdcc1181
Let SwitchedSpsr transform count be specified
2020-08-18 09:10:25 -06:00
James Betker
40bb0597bb
misc
2020-08-18 08:50:24 -06:00
James Betker
74cdaa2226
Some work on extensible trainer
2020-08-18 08:49:32 -06:00
James Betker
0c98c61f4a
Enable start_step to be specified
2020-08-15 18:34:59 -06:00
James Betker
868d0aa442
Undo early dim reduction on grad branch for SPSR_arch
2020-08-14 16:23:42 -06:00
James Betker
2d205f52ac
Unite spsr_arch switched gens
...
Found a pretty good basis model.
2020-08-12 17:04:45 -06:00
James Betker
bdaa67deb7
Misc
2020-08-12 08:46:15 -06:00
James Betker
3d0ece804b
SPSR LR2
2020-08-12 08:45:49 -06:00
James Betker
ab04ca1778
Extensible trainer (in progress)
2020-08-12 08:45:23 -06:00
James Betker
cb316fabc7
Use LR data for image gradient prediction when HR data is disjoint
2020-08-10 15:00:28 -06:00
James Betker
f0e2816239
Denoise attention maps
2020-08-10 14:59:58 -06:00
James Betker
59aba1daa7
LR switched SPSR arch
...
This variant doesn't do conv processing at HR, which should save
a ton of memory in inference. Lets see how it works.
2020-08-10 13:03:36 -06:00
James Betker
4e972144ae
More attention fixes for switched_spsr
2020-08-07 21:11:50 -06:00
James Betker
d02509ef97
spsr_switched missing import
2020-08-07 21:05:29 -06:00
James Betker
887806ffa0
Finish up spsr_switched
2020-08-07 21:03:48 -06:00
James Betker
1d5f4f6102
Crossgan
2020-08-07 21:03:39 -06:00
James Betker
fd7b6ca0a9
Comptue gan_grad_branch....
2020-08-06 12:11:40 -06:00
James Betker
30b16d5235
Update how branch GAN grad is disseminated
2020-08-06 11:13:02 -06:00
James Betker
1f21c02f8b
Add cross-compare discriminator
2020-08-06 08:56:21 -06:00
James Betker
be272248af
More RAGAN fixes
2020-08-05 16:47:21 -06:00
James Betker
26a6a5d512
Compute grad GAN loss against both the branch and final target, simplify pixel loss
...
Also fixes a memory leak issue where we weren't detaching our loss stats when
logging them. This stabilizes memory usage substantially.
2020-08-05 12:08:15 -06:00
James Betker
299ee13988
More RAGAN fixes
2020-08-05 11:03:06 -06:00
James Betker
b8a4df0a0a
Enable RAGAN in SPSR, retrofit old RAGAN for efficiency
2020-08-05 10:34:34 -06:00
James Betker
3ab39f0d22
Several new spsr nets
2020-08-05 10:01:24 -06:00
James Betker
3c0a2d6efe
Fix grad branch debug out
2020-08-04 16:43:43 -06:00
James Betker
ec2a795d53
Fix multistep optimizer (feeding from wrong config params)
2020-08-04 16:42:58 -06:00
James Betker
4bfbdaf94f
Don't recompute generator outputs for D in standard operation
...
Should significantly improve training performance with negligible
results differences.
2020-08-04 11:28:52 -06:00
James Betker
11b227edfc
Whoops
2020-08-04 10:30:40 -06:00
James Betker
6d25bcd5df
Apply fixes to grad discriminator
2020-08-04 10:25:13 -06:00
James Betker
96d66f51c5
Update requirements
2020-08-03 16:57:56 -06:00
James Betker
c7e5d3888a
Add pix_grad_branch loss to metrics
2020-08-03 16:21:05 -06:00
James Betker
0d070b47a7
Add simplified SPSR architecture
...
Basically just cleaning up the code, removing some bad conventions,
and reducing complexity somewhat so that I can play around with
this arch a bit more easily.
2020-08-03 10:25:37 -06:00
James Betker
47e24039b5
Fix bug that makes feature loss run even when it is off
2020-08-02 20:37:51 -06:00
James Betker
328afde9c0
Integrate SPSR into SRGAN_model
...
SPSR_model really isn't that different from SRGAN_model. Rather than continuing to re-implement
everything I've done in SRGAN_model, port the new stuff from SPSR over.
This really demonstrates the need to refactor SRGAN_model a bit to make it cleaner. It is quite the
beast these days..
2020-08-02 12:55:08 -06:00
James Betker
c8da78966b
Substantial SPSR mods & fixes
...
- Added in gradient accumulation via mega-batch-factor
- Added AMP
- Added missing train hooks
- Added debug image outputs
- Cleaned up including removing GradientPenaltyLoss, custom SpectralNorm
- Removed all the custom discriminators
2020-08-02 10:45:24 -06:00
James Betker
f894ba8f98
Add SPSR_module
...
This is a port from the SPSR repo, it's going to need a lot of work to be properly integrated
but as of this commit it at least runs.
2020-08-01 22:02:54 -06:00
James Betker
f33ed578a2
Update how attention_maps are created
2020-08-01 20:23:46 -06:00
James Betker
c139f5cd17
More torch 1.6 fixes
2020-07-31 17:03:20 -06:00
James Betker
a66fbb32b6
Fix fixed_disc DataParallel issue
2020-07-31 16:59:23 -06:00
James Betker
8dd44182e6
Fix scale torch warning
2020-07-31 16:56:04 -06:00
James Betker
bcebed19b7
Fix pixdisc bugs
2020-07-31 16:38:14 -06:00
James Betker
eb11a08d1c
Enable disjoint feature networks
...
This is done by pre-training a feature net that predicts the features
of HR images from LR images. Then use the original feature network
and this new one in tandem to work only on LR/Gen images.
2020-07-31 16:29:47 -06:00
James Betker
6e086d0c20
Fix fixed_disc
2020-07-31 15:07:10 -06:00
James Betker
d5fa059594
Add capability to have old discriminators serve as feature networks
2020-07-31 14:59:54 -06:00
James Betker
6b45b35447
Allow multi_step_lr_scheduler to load a new LR schedule when restoring state
2020-07-31 11:21:11 -06:00
James Betker
e37726f302
Add feature_model for training custom feature nets
2020-07-31 11:20:39 -06:00
James Betker
7629cb0e61
Add FDPL Loss
...
New loss type that can replace PSNR loss. Works against the frequency domain
and focuses on frequency features loss during hr->lr conversion.
2020-07-30 20:47:57 -06:00
James Betker
85ee64b8d9
Turn down feadisc intensity
...
Honestly - this feature is probably going to be removed soon, so backwards
compatibility is not a huge deal anymore.
2020-07-27 15:28:55 -06:00
James Betker
ebb199e884
Get rid of safety valve (probably being encountered in val)
2020-07-26 22:51:59 -06:00
James Betker
0892d5fe99
LQGT_dataset gan debug
2020-07-26 22:48:35 -06:00
James Betker
d09ed4e5f7
Misc fixes
2020-07-26 22:44:24 -06:00
James Betker
c54784ae9e
Fix feature disc log item error
2020-07-26 22:25:59 -06:00
James Betker
9a8f227501
Allow separate dataset to pushed in for GAN-only training
2020-07-26 21:44:45 -06:00
James Betker
b06e1784e1
Fix SRG4 & switch disc
...
"fix". hehe.
2020-07-25 17:16:54 -06:00
James Betker
e6e91a1d75
Add SRG4
...
Back to the idea that maybe what we need is a hybrid
approach between pure switches and RDB.
2020-07-24 20:32:49 -06:00
James Betker
3320ad685f
Fix mega_batch_factor not set for test
2020-07-24 12:26:44 -06:00
James Betker
c50cce2a62
Add an abstract, configurabler weight scheduling class and apply it to the feature weight
2020-07-23 17:03:54 -06:00
James Betker
9ccf771629
Fix feature validation, wrong device
...
Only shows up in distributed training for some reason.
2020-07-23 10:16:34 -06:00
James Betker
a7541b6d8d
Fix illegal tb_logger use in distributed training
2020-07-23 09:14:01 -06:00
James Betker
bba283776c
Enable find_unused_parameters for DistributedDataParallel
...
attention_norm has some parameters which are not used to compute grad,
which is causing failures in the distributed case.
2020-07-23 09:08:13 -06:00
James Betker
dbf6147504
Add switched discriminator
...
The logic is that the discriminator may be incapable of providing a truly
targeted loss for all image regions since it has to be too generic
(basically the same argument for the switched generator). So add some
switches in! See how it works!
2020-07-22 20:52:59 -06:00
James Betker
8a0a1569f3
Enable force_multiple in LQ_dataset
2020-07-22 20:51:16 -06:00
James Betker
106b8da315
Assert that temperature is set properly in eval mode.
2020-07-22 20:50:59 -06:00
James Betker
c74b9ee2e4
Add a way to disable grad on portions of the generator graph to save memory
2020-07-22 11:40:42 -06:00
James Betker
e3adafbeac
Add convert_model.py and a hacky way to add extra layers to a model
2020-07-22 11:39:45 -06:00
James Betker
7f7e17e291
Update feature discriminator further
...
Move the feature/disc losses closer and add a feature computation layer.
2020-07-20 20:54:45 -06:00
James Betker
46aa776fbb
Allow feature discriminator unet to only output closest layer to feature output
2020-07-19 19:05:08 -06:00
James Betker
8a9f215653
Huge set of mods to support progressive generator growth
2020-07-18 14:18:48 -06:00
James Betker
47a525241f
Make attention norm optional
2020-07-18 07:24:02 -06:00
James Betker
ad97a6a18a
Progressive SRG first check-in
2020-07-18 07:23:26 -06:00
James Betker
b08b1cad45
Fix feature decay
2020-07-16 23:27:06 -06:00
James Betker
3e7a83896b
Fix pixgan debugging issues
2020-07-16 11:45:19 -06:00
James Betker
a1bff64d1a
More fixes
2020-07-16 10:48:48 -06:00
James Betker
240f254263
More loss fixes
2020-07-16 10:45:50 -06:00
James Betker
6cfa67d831
Fix featuredisc broadcast error
2020-07-16 10:18:30 -06:00
James Betker
8d061a2687
Add u-net discriminator with feature output
2020-07-16 10:10:09 -06:00
James Betker
0c4c388e15
Remove dualoutputsrg
...
Good idea, didn't pan out.
2020-07-16 10:09:24 -06:00
James Betker
4bcc409fc7
Fix loadSRG2 typo
2020-07-14 10:20:53 -06:00
James Betker
1e4083a35b
Apply temperature mods to all SRG models
...
(Honestly this needs to be base classed at this point)
2020-07-14 10:19:35 -06:00
James Betker
7659bd6818
Fix temperature equation
2020-07-14 10:17:14 -06:00
James Betker
853468ef82
Allow legacy state_dicts in srg2
2020-07-14 10:03:45 -06:00
James Betker
1b1431133b
Add DualOutputSRG
...
Also removes the old multi-return mechanism that Generators support.
Also fixes AttentionNorm.
2020-07-14 09:28:24 -06:00
James Betker
a2285ff2ee
Scale anorm by transform count
2020-07-13 08:49:09 -06:00
James Betker
dd0bbd9a7c
Enable AttentionNorm on SRG2
2020-07-13 08:38:17 -06:00
James Betker
4c0f770f2a
Fix inverted temperature curve bug
2020-07-12 11:02:50 -06:00
James Betker
14d23b9d20
Fixes, do fake swaps less often in pixgan discriminator
2020-07-11 21:22:11 -06:00
James Betker
ba6187859a
err5
2020-07-10 23:02:56 -06:00
James Betker
902527dfaa
err4
2020-07-10 23:00:21 -06:00
James Betker
020b3361fa
err3
2020-07-10 22:57:34 -06:00
James Betker
b3a2c21250
err2
2020-07-10 22:52:02 -06:00
James Betker
716433db1f
err1
2020-07-10 22:50:56 -06:00
James Betker
ef9f1307eb
Sometimes don't use compression artifacts
2020-07-10 22:25:53 -06:00
James Betker
0b7193392f
Implement unet disc
...
The latest discriminator architecture was already pretty much a unet. This
one makes that official and uses shared layers. It also upsamples one additional
time and throws out the lowest upsampling result.
The intent is to delete the old vgg pixdisc, but I'll keep it around for a bit since
I'm still trying out a few models with it.
2020-07-10 16:24:42 -06:00
James Betker
812c684f7d
Update pixgan swap algorithm
...
- Swap multiple blocks in the image instead of just one. The discriminator was clearly
learning that most blocks have one region that needs to be fixed.
- Relax block size constraints. This was in place to gaurantee that the discriminator
signal was clean. Instead, just downsample the "loss image" with bilinear interpolation.
The result is noisier, but this is actually probably healthy for the discriminator.
2020-07-10 15:56:14 -06:00
James Betker
33ca3832e1
Move ExpansionBlock to arch_util
...
Also makes all processing blocks have a conformant signature.
Alters ExpansionBlock to perform a processing conv on the passthrough
before the conjoin operation - this will break backwards compatibilty with SRG2.
2020-07-10 15:53:41 -06:00
James Betker
5e8b52f34c
Misc changes
2020-07-10 09:45:48 -06:00
James Betker
5f2c722a10
SRG2 revival
...
Big update to SRG2 architecture to pull in a lot of things that have been learned:
- Use group norm instead of batch norm
- Initialize the weights on the transformations low like is done in RRDB rather than using the scalar. Models live or die by their early stages, and this ones early stage is pretty weak
- Transform multiplexer to use u-net like architecture.
- Just use one set of configuration variables instead of a list - flat networks performed fine in this regard.
2020-07-09 17:34:51 -06:00
James Betker
12da993da8
More fixes...
2020-07-08 22:07:09 -06:00
James Betker
7d6eb28b87
More fixes
2020-07-08 22:00:57 -06:00
James Betker
b2507be13c
Fix up pixgan loss and pixdisc
2020-07-08 21:27:48 -06:00
James Betker
26a4a66d1c
Bug fixes and new gan mechanism
...
- Removed a bunch of unnecessary image loggers. These were just consuming space and never being viewed
- Got rid of support of artificial var_ref support. The new pixdisc is what i wanted to implement then - it's much better.
- Add pixgan GAN mechanism. This is purpose-built for the pixdisc. It is intended to promote a healthy discriminator
- Megabatchfactor was applied twice on metrics, fixed that
Adds pix_gan (untested) which swaps a portion of the fake and real image with each other, then expects the discriminator
to properly discriminate the swapped regions.
2020-07-08 17:40:26 -06:00
James Betker
4305be97b4
Update log metrics
...
They should now be universal regardless of job configuration
2020-07-07 15:33:22 -06:00
James Betker
8a4eb8241d
SRG3 work
...
Operates on top of a pre-trained SpineNET backbone (trained on CoCo 2017 with RetinaNet)
This variant is extremely shallow.
2020-07-07 13:46:40 -06:00
James Betker
0acad81035
More SRG2 adjustments..
2020-07-06 22:40:40 -06:00
James Betker
086b2f0570
More bugs
2020-07-06 22:28:07 -06:00
James Betker
d4d4f85fc0
Bug fixes
2020-07-06 22:25:40 -06:00
James Betker
3c31bea1ac
SRG2 architectural changes
2020-07-06 22:22:29 -06:00
James Betker
9a1c3241f5
Switch discriminator to groupnorm
2020-07-06 20:59:59 -06:00
James Betker
60c6352843
Misc
2020-07-06 20:44:07 -06:00
James Betker
6beefa6d0c
PixDisc - Add two more levels of losses coming from this gen at higher resolutions
2020-07-06 11:15:52 -06:00
James Betker
2636d3b620
Fix assertion error
2020-07-06 09:23:53 -06:00
James Betker
8f92c0a088
Interpolate attention well before softmax
2020-07-06 09:18:30 -06:00
James Betker
72f90cabf8
More pixdisc fixes
2020-07-05 22:03:16 -06:00
James Betker
909007ee6a
Add G_warmup
...
Let the Generator get to a point where it is at least competing with the discriminator before firing off.
Backwards from most GAN architectures, but this one is a bit different from most.
2020-07-05 21:58:35 -06:00
James Betker
a47a5dca43
Fix pixdisc bug
2020-07-05 21:57:52 -06:00
James Betker
d0957bd7d4
Alter weight initialization for transformation blocks
2020-07-05 17:32:46 -06:00
James Betker
16d1bf6dd7
Replace ConvBnRelus in SRG2 with Silus
2020-07-05 17:29:20 -06:00
James Betker
10f7e49214
Add ConvBnSilu to replace ConvBnRelu
...
Relu produced good performance gains over LeakyRelu, but
GAN performance degraded significantly. Try SiLU as an alternative
to see if it's the leaky-ness we are looking for or the smooth activation
curvature.
2020-07-05 13:39:08 -06:00
James Betker
9934e5d082
Move SRG1 to identical to new
2020-07-05 08:49:34 -06:00
James Betker
416538f31c
SRG1 conjoined except ConvBnRelu
2020-07-05 08:44:17 -06:00
James Betker
c58c2b09ca
Back to remove all biases (looks like a ConvBnRelu made its way in..)
2020-07-04 22:41:02 -06:00
James Betker
86cda86e94
Re-add biases, also add new init
...
A/B testing where we lost our GAN competitiveness.
2020-07-04 22:24:42 -06:00
James Betker
b03741f30e
Remove all biases from generator
...
Continuing to investigate loss of GAN competitiveness, this is a big difference
between "old" SRG1 and "new".
2020-07-04 22:19:55 -06:00
James Betker
726e946e79
Turn BN off in SRG1
...
This wont work well but just testing if GAN performance comes back
2020-07-04 14:51:27 -06:00
James Betker
0ee39d419b
OrderedDict not needed
2020-07-04 14:09:27 -06:00
James Betker
9048105b72
Break out SRG1 as separate network
...
Something strange is going on. These networks do not respond to
discriminator gradients properly anymore. SRG1 did, however so
reverting back to last known good state to figure out why.
2020-07-04 13:28:50 -06:00
James Betker
188de5e15a
Misc changes
2020-07-04 13:22:50 -06:00
James Betker
510b2f887d
Remove RDB from srg2
...
Doesnt seem to work so great.
2020-07-03 22:31:20 -06:00
James Betker
77d3765364
Fix new feature loss calc
2020-07-03 22:20:13 -06:00
James Betker
ed6a15e768
Add feature to dataset which allows it to force images to be a certain size.
2020-07-03 15:19:16 -06:00
James Betker
da4335c25e
Add a feature-based validation test
2020-07-03 15:18:57 -06:00
James Betker
703dec4472
Add SpineNet & integrate with SRG
...
New version of SRG uses SpineNet for a switch backbone.
2020-07-03 12:07:31 -06:00
James Betker
3ed7a2b9ab
Move ConvBnRelu/Lelu to arch_util
2020-07-03 12:06:38 -06:00
James Betker
ea9c6765ca
Move train imports into init_dist
2020-07-02 15:11:21 -06:00
James Betker
e9ee67ff10
Integrate RDB into SRG
...
The last RDB for each cluster is switched.
2020-07-01 17:19:55 -06:00
James Betker
6ac6c95177
Fix scaling bug
2020-07-01 16:42:27 -06:00
James Betker
30653181ba
Experiment: get rid of post_switch_conv
2020-07-01 16:30:40 -06:00
James Betker
17191de836
Experiment: bring initialize_weights back again
...
Something really strange going on here..
2020-07-01 15:58:13 -06:00
James Betker
d1d573de07
Experiment: new init and post-switch-conv
2020-07-01 15:25:54 -06:00
James Betker
480d1299d7
Remove RRDB with switching
...
This idea never really panned out, removing it.
2020-07-01 12:08:32 -06:00
James Betker
e2398ac83c
Experiment: revert initialization changes
2020-07-01 12:08:09 -06:00
James Betker
78276afcaa
Experiment: Back to lelu
2020-07-01 11:43:25 -06:00
James Betker
b945021c90
SRG v2 - Move to Relu, rely on Module-based initialization
2020-07-01 11:33:32 -06:00
James Betker
ee6443ad7d
Add numeric stability computation script
2020-07-01 11:30:34 -06:00
James Betker
c0bb123504
Misc changes
2020-07-01 11:28:23 -06:00
James Betker
604763be68
NSG r7
...
Converts the switching trunk to a VGG-style network to make it more comparable
to SRG architectures.
2020-07-01 09:54:29 -06:00
James Betker
87f1e9c56f
Invert ResGen2 to operate in LR space
2020-06-30 20:57:40 -06:00
James Betker
e07d8abafb
NSG rev 6
...
- Disable style passthrough
- Process multiplexers starting at base resolution
2020-06-30 20:47:26 -06:00
James Betker
3ce1a1878d
NSG improvements (r5)
...
- Get rid of forwards(), it makes numeric_stability.py not work properly.
- Do stability auditing across layers.
- Upsample last instead of first, work in much higher dimensionality for transforms.
2020-06-30 16:59:57 -06:00
James Betker
75f148022d
Even more NSG improvements (r4)
2020-06-30 13:52:47 -06:00
James Betker
773753073f
More NSG improvements (v3)
...
Move to a fully fixup residual network for the switch (no
batch norms). Fix a bunch of other small bugs. Add in a
temporary latent feed-forward from the bottom of the
switch. Fix several initialization issues.
2020-06-29 20:26:51 -06:00
James Betker
4b82d0815d
NSG improvements
...
- Just use resnet blocks for the multiplexer trunk of the generator
- Every block initializes itself, rather than everything at the end
- Cleans up some messy parts of the architecture, including unnecessary
kernel sizes and places where BN is not used properly.
2020-06-29 10:09:51 -06:00
James Betker
978036e7b3
Add NestedSwitchGenerator
...
An evolution of SwitchedResidualGenerator, this variant nests attention
modules upon themselves to extend the representative capacity of the
model significantly.
2020-06-28 21:22:05 -06:00
James Betker
6f2bc36c61
Distill_torchscript mods
...
Starts down the path of writing a custom trace that works using torch's hook mechanism.
2020-06-27 08:28:09 -06:00
James Betker
db08dedfe2
Add recover_tensorboard_log
...
Generates a tb_logger from raw console output. Useful for colab sessions
that crash.
2020-06-27 08:26:57 -06:00
James Betker
c8a670842e
Missed networks.py in last commit
2020-06-25 18:36:06 -06:00
James Betker
407224eba1
Re-work SwitchedResgen2
...
Got rid of the converged multiplexer bases but kept the configurable architecture. The
new multiplexers look a lot like the old one.
Took some queues from the transformer architecture: translate image to a higher filter-space
and stay there for the duration of the models computation. Also perform convs after each
switch to allow the model to anneal issues that arise.
2020-06-25 18:17:05 -06:00
James Betker
42a10b34ce
Re-enable batch norm on switch processing blocks
...
Found out that batch norm is causing the switches to init really poorly -
not using a significant number of transforms. Might be a great time to
re-consider using the attention norm, but for now just re-enable it.
2020-06-24 21:15:17 -06:00
James Betker
4001db1ede
Add ConfigurableSwitchComputer
2020-06-24 19:49:37 -06:00
James Betker
83c3b8b982
Add parameterized noise injection into resgen
2020-06-23 10:16:02 -06:00
James Betker
0584c3b587
Add negative_transforms switch to resgen
2020-06-23 09:41:12 -06:00
James Betker
dfcbe5f2db
Add capability to place additional conv into discriminator
...
This should allow us to support larger images sizes. May need
to add another one of these.
2020-06-23 09:40:33 -06:00
James Betker
bad33de906
Add simple resize to extract images
2020-06-23 09:39:51 -06:00
James Betker
030648f2bc
Remove batchnorms from resgen
2020-06-22 17:23:36 -06:00
James Betker
68bcab03ae
Add growth channel to switch_growths for flat networks
2020-06-22 10:40:16 -06:00
James Betker
3b81712c49
Remove BN from transforms
2020-06-19 16:52:56 -06:00
James Betker
61364ec7d0
Fix inverse temperature curve logic and add upsample factor
2020-06-19 09:18:30 -06:00
James Betker
0551139b8d
Fix resgen temperature curve below 1
...
It needs to be inverted to maintain a true linear curve
2020-06-18 16:08:07 -06:00
James Betker
efc80f041c
Save & load amp state
2020-06-18 11:38:48 -06:00
James Betker
2e3b6bad77
Log tensorboard directly into experiments directory
2020-06-18 11:33:02 -06:00
James Betker
778e7b6931
Add a double-step to attention temperature
2020-06-18 11:29:31 -06:00
James Betker
d2d5e097d5
Add profiling to SRGAN for testing timings
2020-06-18 11:29:10 -06:00
James Betker
45a900fafe
Misc
2020-06-18 11:28:55 -06:00
James Betker
59b0533b06
Fix attimage step size
2020-06-17 18:45:24 -06:00
James Betker
645d0ca767
ResidualGen mods
...
- Add filters_mid spec which allows a expansion->squeeze for the transformation layers.
- Add scale and bias AFTER the switch
- Remove identity transform (models were converging on this)
- Move attention image generation and temperature setting into new function which gets called every step with a save path
2020-06-17 17:18:28 -06:00
James Betker
6f8406fbdc
Fixed ConfigurableSwitchedGenerator bug
2020-06-16 16:53:57 -06:00
James Betker
7d541642aa
Get rid of SwitchedResidualGenerator
...
Just use the configurable one instead..
2020-06-16 16:23:29 -06:00
James Betker
379b96eb55
Output histograms with SwitchedResidualGenerator
...
This also fixes the initialization weight for the configurable generator.
2020-06-16 15:54:37 -06:00
James Betker
f8b67f134b
Get proper contiguous view for backwards compatibility
2020-06-16 14:27:16 -06:00
James Betker
2def96203e
Mods to SwitchedResidualGenerator_arch
...
- Increased processing for high-resolution switches
- Do stride=2 first in HalvingProcessingBlock
2020-06-16 14:19:12 -06:00
James Betker
70c764b9d4
Create a configurable SwichedResidualGenerator
...
Also move attention image generator out of repo
2020-06-16 13:24:07 -06:00
James Betker
df1046c318
New arch: SwitchedResidualGenerator_arch
...
The concept here is to use switching to split the generator into two functions:
interpretation and transformation. Transformation is done at the pixel level by
relatively simple conv layers, while interpretation is computed at various levels
by far more complicated conv stacks. The two are merged using the switching
mechanism.
This architecture is far less computationally intensive that RRDB.
2020-06-16 11:23:50 -06:00
James Betker
ddfd7f67a0
Get rid of biggan
...
Not really sure it's a great fit for what is being done here.
2020-06-16 11:21:44 -06:00
James Betker
0a714e8451
Fix initialization in mhead switched rrdb
2020-06-15 21:32:03 -06:00
James Betker
be7982b9ae
Add skip heads to switcher
...
These pass through the input so that it can be selected by the attention mechanism.
2020-06-14 12:46:54 -06:00
James Betker
6c27ddc9b5
Misc
2020-06-14 11:03:02 -06:00
James Betker
6c0e9f45c7
Add GPU mem tracing module
2020-06-14 11:02:54 -06:00
James Betker
48532a0a8a
Fix initial_stride on lowdim models
2020-06-14 11:02:16 -06:00
James Betker
532704af40
Multiple modifications for experimental RRDB architectures
...
- Add LowDimRRDB; essentially a "normal RRDB" but the RDB blocks process at a low dimension using PixelShuffle
- Add switching wrappers around it
- Add support for switching on top of multi-headed inputs and outputs
- Moves PixelUnshuffle to arch_util
2020-06-13 11:37:27 -06:00
James Betker
e89f28ead0
Update multirrdb to do HR fixing in the base image dimension.
2020-06-11 08:43:39 -06:00
James Betker
d3b2cbfe7c
Fix loading new state dicts for RRDB
2020-06-11 08:25:57 -06:00
James Betker
5ca53e7786
Add alternative first block for PixShuffleRRDB
2020-06-10 21:45:24 -06:00
James Betker
43b7fccc89
Fix mhead attention integration bug for RRDB
2020-06-10 12:02:33 -06:00
James Betker
12e8fad079
Add serveral new RRDB architectures
2020-06-09 13:28:55 -06:00
James Betker
296135ec18
Add doResizeLoss to dataset
...
doResizeLoss has a 50% chance to resize the LQ image to 50% size,
then back to original size. This is useful to training a generator to
recover these lost pixel values while also being able to do
repairs on higher resolution images during training.
2020-06-08 11:27:06 -06:00
James Betker
786a4288d6
Allow switched RRDBNet to record metrics and decay temperature
2020-06-08 11:10:38 -06:00
James Betker
ae3301c0ea
SwitchedRRDB work
...
Renames AttentiveRRDB to SwitchedRRDB. Moves SwitchedConv to
an external repo (neonbjb/switchedconv). Switchs RDB blocks instead
of conv blocks. Works good!
2020-06-08 08:47:34 -06:00
James Betker
93528ff8df
Merge branch 'gan_lab' of https://github.com/neonbjb/mmsr into gan_lab
2020-06-07 16:59:31 -06:00
James Betker
805bd129b7
Switched conv partial impl
2020-06-07 16:59:22 -06:00
James Betker
9e203d07c4
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-06-07 16:56:12 -06:00
James Betker
299d855b34
Enable forced learning rates
2020-06-07 16:56:05 -06:00
James Betker
efb5b3d078
Add switched_conv
2020-06-07 16:45:07 -06:00
James Betker
063719c5cc
Fix attention conv bugs
2020-06-06 18:31:02 -06:00
James Betker
cbedd6340a
Add RRDB with attention
2020-06-05 21:02:08 -06:00
James Betker
ef5d8a0ed1
Misc
2020-06-05 21:01:50 -06:00
James Betker
318a604405
Allow weighting of input data
...
This essentially allows you to give some datasets more
importance than others for the purposes of reaching a more
refined network.
2020-06-04 10:05:21 -06:00
James Betker
edf0f8582e
Fix rrdb bug
2020-06-02 11:15:55 -06:00
James Betker
dc17545083
Add RRDB Initial Stride
...
Allows downsampling immediately before processing, which reduces network complexity on
higher resolution images but keeps a higher filter count.
2020-06-02 10:47:15 -06:00
James Betker
76a38b6a53
Misc
2020-06-02 09:35:52 -06:00
James Betker
726d1913ac
Allow validating in batches, remove val size limit
2020-06-02 08:41:22 -06:00
James Betker
90125f5bed
Allow blurring to be specified
2020-06-02 08:40:52 -06:00
James Betker
8355f3d1b3
Only log discriminator data when gan is activated
2020-06-01 15:48:16 -06:00
James Betker
f1a1fd14b1
Introduce (untested) colab mode
2020-06-01 15:09:52 -06:00
James Betker
a38dd62489
Only train discriminator/gan losses when gan_w > 0
2020-06-01 15:09:10 -06:00
James Betker
1eb9c5a059
Fix grayscale
2020-05-29 22:04:50 -06:00
James Betker
2b03e40f98
Debug process_video
2020-05-29 20:44:50 -06:00
James Betker
74b313aaa9
Add grayscale downsampling option
2020-05-29 20:34:00 -06:00
James Betker
b123ed8a45
Add attention resnet
...
Not ready for prime time, but is a first draft.
2020-05-29 20:02:10 -06:00
James Betker
5e9da65d81
Fix process_video bugs
2020-05-29 12:47:22 -06:00
James Betker
beac71ad18
Allow minivids to start at any specified number
2020-05-28 20:27:42 -06:00
James Betker
57682ebee3
Separate feature extractors out, add resnet feature extractor
2020-05-28 20:26:30 -06:00
James Betker
156cee240a
Remove reliance on magick, only wait for ffmpeg at last second, fix image ordering issue
2020-05-27 23:09:46 -06:00
James Betker
b551e86adb
Encode videos to HEVC
...
MP4 doesnt support 8K
2020-05-27 20:04:45 -06:00
James Betker
bdeafad8c5
Only run 'convert' not magick
2020-05-27 17:24:10 -06:00
James Betker
41c1efbf56
Add dynamic video processing script
2020-05-27 17:09:11 -06:00
James Betker
f745be9dea
Fix vgg disc arch
2020-05-27 13:31:22 -06:00
James Betker
6962ccb306
Adjust motion blur
...
0 is invalid.
2020-05-27 13:09:46 -06:00
James Betker
f6815df58b
Misc
2020-05-27 08:04:47 -06:00
James Betker
e27a49454e
Enable vertical splitting on inference images to support very large resolutions.
2020-05-27 08:04:35 -06:00
James Betker
96ac26a8b7
Allow injection of random low-amplitude noise & motion blur into generator
2020-05-27 08:04:11 -06:00
James Betker
69cbfa2f0c
Adjust dataset mutations a bit
...
Adjusts the compression artfacts to be more aggressive, and blurring to be less aggressive.
2020-05-26 13:48:34 -06:00
James Betker
2931142458
Allow multiple gt image dirs
2020-05-25 19:21:09 -06:00
James Betker
4e44b8a1aa
Clean up video stuff
2020-05-25 19:20:49 -06:00
James Betker
8464cae168
HQ blurring doesnt actually work right - hq images arent the right size when they are blurred
...
Just revert it and blur the lq images..
2020-05-24 22:32:54 -06:00
James Betker
5fd8749cf2
More updates - need more blurring
2020-05-24 22:13:27 -06:00
James Betker
9627cc2c49
Update HR gaussian blur params
2020-05-24 18:00:31 -06:00
James Betker
2f8b0250b9
Blur HR image before downsizing, when available
2020-05-24 17:18:44 -06:00
James Betker
cc4571eb8d
Randomize blur effect
2020-05-24 12:35:41 -06:00
James Betker
27a548c019
Enable blurring via settings
2020-05-24 11:56:39 -06:00
James Betker
3c2e5a0250
Apply fixes to resgen
2020-05-24 07:43:23 -06:00
James Betker
446322754a
Support generators that don't output intermediary values.
2020-05-23 21:09:54 -06:00
James Betker
987cdad0b6
Misc mods
2020-05-23 21:09:38 -06:00
James Betker
9b44f6f5c0
Add AssistedRRDB and remove RRDBNetXL
2020-05-23 21:09:21 -06:00
James Betker
445e7e7053
Extract subimages mod
2020-05-23 21:07:41 -06:00
James Betker
90073fc761
Update LQ_dataset to support inference on split image videos
2020-05-23 21:05:49 -06:00
James Betker
74bb0fad33
Allow dataset classes to add noise internally
2020-05-23 21:04:24 -06:00
James Betker
af1968f9e5
Allow passthrough discriminator to have passthrough disabled from config
2020-05-19 09:41:16 -06:00
James Betker
67139602f5
Test modifications
...
Allows bifurcating large images put into the test pipeline
This code is fixed and not dynamic. Needs some fixes.
2020-05-19 09:37:58 -06:00
James Betker
6400607fc5
ONNX export support
2020-05-19 09:36:04 -06:00
James Betker
89c71293ce
IDEA update
2020-05-19 09:35:26 -06:00
James Betker
9cde58be80
Make RRDB usable in the current iteration
2020-05-16 18:36:30 -06:00
James Betker
b95c4087d1
Allow an alt_path for saving models and states
2020-05-16 09:10:51 -06:00
James Betker
f911ef0d3e
Add corruptor_usage_probability
...
Governs how often a corruptor is used, vs feeding uncorrupted images.
2020-05-16 09:05:43 -06:00
James Betker
635c53475f
Restore swapout models just before a checkpoint
2020-05-16 07:45:19 -06:00
James Betker
a33ec3e22b
Fix skips & images samples
...
- Makes skip connections between the generator and discriminator more
extensible by adding additional configuration options for them and supporting
1 and 0 skips.
- Places the temp/ directory with sample images from the training process appear
in the training directory instead of the codes/ directory.
2020-05-15 13:50:49 -06:00
James Betker
cdf641e3e2
Remove working options from repo
2020-05-15 07:50:55 -06:00
James Betker
bd4d478572
config changes
2020-05-15 07:41:18 -06:00
James Betker
79593803f2
biggan arch, initial work (not implemented)
2020-05-15 07:40:45 -06:00
James Betker
61ed51d9e4
Improve corruptor logic: switch corruptors randomly
2020-05-14 23:14:32 -06:00
James Betker
d72e154442
Allow more LQ than GT images in corrupt mode
2020-05-14 20:46:20 -06:00
James Betker
8a514b9645
Misc changes
2020-05-14 20:45:38 -06:00
James Betker
a946483f1c
Fix discriminator noise floor
2020-05-14 20:45:06 -06:00
James Betker
c8ab89d243
Add model swapout
...
Model swapout is a feature where, at specified intervals,
a random D and G model will be swapped in place for the
one being trained. After a short period of time, this model
is swapped back out. This is intended to increase training
diversity.
2020-05-13 16:53:38 -06:00
James Betker
c336d32fd3
Config updates
2020-05-13 15:27:49 -06:00
James Betker
5bcf187fb6
Disable LMDB support
...
It doesn't play nice with multiple dataroots and I don't
really see any need to continue support since I'm not
testing it.
2020-05-13 15:27:33 -06:00
James Betker
e36f22e14a
Allow "corruptor" network to be specified
...
This network is just a fixed (pre-trained) generator
that performs a corruption transformation that the
generator-in-training is expected to undo alongside
SR.
2020-05-13 15:26:55 -06:00
James Betker
f389025b53
Change ResGen noise feature
...
It now injects noise directly into the input filters, rather than a
pure noise filter. The pure noise filter was producing really
poor results (and I'm honestly not quite sure why).
2020-05-13 09:22:06 -06:00
James Betker
343af70a8d
Add code for compiling model to torchscript
...
I want to be able to export it to other formats too in the future.
2020-05-13 09:21:13 -06:00
James Betker
585b05e66b
Cap test workers at 10
2020-05-13 09:20:45 -06:00
James Betker
037a5a3cdb
Config updates
2020-05-13 09:20:28 -06:00
James Betker
fc3ec8e3a2
Add a noise floor to th discriminator noise factor
2020-05-13 09:19:22 -06:00
James Betker
5d1b4caabf
Allow noise to be injected at the generator inputs for resgen
2020-05-12 16:26:29 -06:00
James Betker
06d18343f7
Allow noise to be added to discriminator inputs
2020-05-12 16:25:38 -06:00
James Betker
9210a62f58
Add rotating log buffer to trainer
...
Should stabilize stats output.
2020-05-12 10:09:45 -06:00
James Betker
f217216c81
Implement ResGenv2
...
Implements a ResGenv2 architecture which slightly increases the complexity
of the final output layer but causes it to be shared across all skip outputs.
2020-05-12 10:09:15 -06:00
James Betker
1596a98493
Get rid of skip layers from vgg disc
2020-05-12 10:08:12 -06:00
James Betker
c540244789
Config file update
2020-05-12 10:07:58 -06:00
James Betker
62a97c53d1
Handle tuple-returning generators in test
2020-05-11 11:15:26 -06:00
James Betker
f994466289
Initialize test dataloader with a worker count proportional to the batch size.
2020-05-10 10:49:37 -06:00
James Betker
ef48e819aa
Allow resgen to have a conditional number of upsamples applied to it
2020-05-10 10:48:37 -06:00
James Betker
8969a3ce70
Add capability to start at arbitrary frames
2020-05-10 10:48:05 -06:00
James Betker
03351182be
Use amp in SR_model for inference
2020-05-07 21:45:33 -06:00
James Betker
dbca0d328c
Fix multi-lq bug
2020-05-06 23:16:35 -06:00
James Betker
aa0305def9
Resnet discriminator overhaul
...
It's been a tough day figuring out WTH is going on with my discriminators.
It appears the raw FixUp discriminator can get into an "defective" state where
they stop trying to learn and just predict as close to "0" D_fake and D_real as
possible. In this state they provide no feedback to the generator and never
recover. Adding batch norm back in seems to fix this so it must be some sort
of parameterization error.. Should look into fixing this in the future.
2020-05-06 17:27:30 -06:00
James Betker
602f86bfa4
Random config changes
2020-05-06 17:25:48 -06:00
James Betker
574e7e882b
Fix up OOM issues when running a disjoint D update ratio and megabatches
2020-05-06 17:25:25 -06:00
James Betker
eee9d6d9ca
Support skip connections in vgg arch discriminator.
2020-05-06 17:24:34 -06:00
James Betker
5c1832e124
Add support for multiple LQ paths
...
I want to be able to specify many different transformations onto
the target data; the model should handle them all. Do this by
allowing multiple LQ paths to be selected and the dataset class
selects one at random.
2020-05-06 17:24:17 -06:00
James Betker
3cd85f8073
Implement ResGen arch
...
This is a simpler resnet-based generator which performs mutations
on an input interspersed with interpolate-upsampling. It is a two
part generator:
1) A component that "fixes" LQ images with a long string of resnet
blocks. This component is intended to remove compression artifacts
and other noise from a LQ image.
2) A component that can double the image size. The idea is that this
component be trained so that it can work at most reasonable
resolutions, such that it can be repeatedly applied to itself to
perform multiple upsamples.
The motivation here is to simplify what is being done inside of RRDB.
I don't believe the complexity inside of that network is justified.
2020-05-05 11:59:46 -06:00
James Betker
9f4581aacb
Fix megabatch scaling, log low and med-res gen images
2020-05-05 08:34:57 -06:00
James Betker
3b4e54c4c5
Add support for passthrough disc/gen
...
Add RRDBNetXL, which performs processing at multiple image sizes.
Add DiscResnet_passthrough, which allows passthrough of image at different sizes for discrimination.
Adjust the rest of the repo to allow generators that return more than just a single image.
2020-05-04 14:01:43 -06:00
James Betker
44b89330c2
Support inference across batches, support inference on cpu, checkpoint
...
This is a checkpoint of a set of long tests with reduced-complexity networks. Some takeaways:
1) A full GAN using the resnet discriminator does appear to converge, but the quality is capped.
2) Likewise, a combination GAN/feature loss does not converge. The feature loss is optimized but
the model appears unable to fight the discriminator, so the G-loss steadily increases.
Going forwards, I want to try some bigger models. In particular, I want to change the generator
to increase complexity and capacity. I also want to add skip connections between the
disc and generator.
2020-05-04 08:48:25 -06:00
James Betker
9c7debe75c
Add colab option
2020-05-02 17:47:25 -06:00
James Betker
832f3587c5
Turn off EVDR (so we dont need the weird convs)
2020-05-02 17:47:14 -06:00
James Betker
8341bf7646
Enable megabatching
2020-05-02 17:46:59 -06:00
James Betker
61d3040cf5
Add doCrop into LQGT
2020-05-02 17:46:30 -06:00
James Betker
9e1acfe396
Fixup upconv for the next attempt!
2020-05-01 19:56:14 -06:00
James Betker
7eaabce48d
Full resnet corrupt, no BN
...
And it works! Thanks fixup..
2020-04-30 19:17:30 -06:00
James Betker
03258445bc
tblogger..
2020-04-30 12:35:51 -06:00
James Betker
b6e036147a
Add more batch norms to FlatProcessorNet_arch
2020-04-30 11:47:21 -06:00
James Betker
66e91a3d9e
Revert "Enable skip-through connections from disc to gen"
...
This reverts commit b7857f35c3
.
2020-04-30 11:45:07 -06:00
James Betker
f027e888ed
Clear out tensorboard on job restart.
2020-04-30 11:44:53 -06:00
James Betker
b7857f35c3
Enable skip-through connections from disc to gen
2020-04-30 11:30:11 -06:00
James Betker
bf634fc9fa
Make resnet w/ BN discriminator use leaky relus
2020-04-30 11:28:59 -06:00
James Betker
3781ea725c
Add Resnet Discriminator with BN
2020-04-29 20:51:57 -06:00
James Betker
a5188bb7ca
Remover fixup code from arch_util
...
Going into it's own arch.
2020-04-29 15:17:43 -06:00
James Betker
5b8a77f02c
Discriminator part 1
...
New discriminator. Includes spectral norming.
2020-04-28 23:00:29 -06:00
James Betker
2c145c39b6
Misc changes
2020-04-28 11:50:16 -06:00
James Betker
46f550e42b
Change downsample_dataset to do no image modification
...
I'm preprocessing the images myself now. There's no need to have
the dataset do this processing as well.
2020-04-28 11:50:04 -06:00
James Betker
8ab595e427
Add FlatProcessorNet
...
After doing some thinking and reading on the subject, it occurred to me that
I was treating the generator like a discriminator by focusing the network
complexity at the feature levels. It makes far more sense to process each conv
level equally for the generator, hence the FlatProcessorNet in this commit. This
network borrows some of the residual pass-through logic from RRDB which makes
the gradient path exceptionally short for pretty much all model parameters and
can be trained in O1 optimization mode without overflows again.
2020-04-28 11:49:21 -06:00
James Betker
b8f67418d4
Retool HighToLowResNet
...
The receptive field of the original was *really* low. This new one has a
receptive field of 36x36px patches. It also has some gradient issues
that need to be worked out
2020-04-26 01:13:42 -06:00
James Betker
02ff4a57fd
Enable HighToLowResNet to do a 1:1 transform
2020-04-25 21:36:32 -06:00
James Betker
35bd1ecae4
Config changes for discriminator advantage run
...
Still going from high->low, discriminator discerns on low. Next up disc works on high.
2020-04-25 11:24:28 -06:00
James Betker
d95808f4ef
Implement downsample GAN
...
This bad boy is for a workflow where you train a model on disjoint image sets to
downsample a "good" set of images like a "bad" set of images looks. You then
use that downsampler to generate a training set of paired images for supersampling.
2020-04-24 00:00:46 -06:00
James Betker
ea54c7618a
Print error when image read fails
2020-04-23 23:59:32 -06:00
James Betker
e98d92fc77
Allow test to operate on batches
2020-04-23 23:59:09 -06:00
James Betker
8ead9ae183
Lots more config files
2020-04-23 23:58:27 -06:00
James Betker
ea5f432f5a
Log total gen loss
2020-04-22 14:02:10 -06:00
James Betker
79aff886b5
Modifications that allow developer to explicitly specify a different image set for PIX and feature losses
2020-04-22 10:11:14 -06:00
James Betker
12d92dc443
Add GTLQ dataset
2020-04-22 00:40:38 -06:00
James Betker
4d269fdac6
Support independent PIX dataroot
2020-04-22 00:40:13 -06:00
James Betker
05aafef938
Support variant input sizes and scales
2020-04-22 00:39:55 -06:00
James Betker
ebda70fcba
Fix AMP
2020-04-22 00:39:31 -06:00
James Betker
f4b33b0531
Some random fixes/adjustments
2020-04-22 00:38:53 -06:00
James Betker
2538ca9f33
Add my own configs
2020-04-22 00:37:54 -06:00
James Betker
af5dfaa90d
Change GT_size to target_size
2020-04-22 00:37:41 -06:00
James Betker
cc834bd5a3
Support >128px image squares
2020-04-21 16:32:59 -06:00
James Betker
4f6d3f0dfb
Enable AMP optimizations & write sample train images to folder.
2020-04-21 16:28:06 -06:00
PRAGMA
1fb12871fd
Create requirements.txt
2019-11-24 07:48:52 +00:00
XintaoWang
a25ee9464d
test w/o GT
2019-09-01 22:20:10 +08:00
XintaoWang
0098663b6b
SRGAN model supprots dist training
2019-09-01 22:14:29 +08:00
XintaoWang
866a858e59
add deform_conv_cuda_kernel.cu
2019-08-27 17:49:12 +08:00
XintaoWang
037933ba66
mmsr
2019-08-23 21:42:47 +08:00