Commit Graph

92 Commits

Author SHA1 Message Date
James Betker
6f8705e8cb SSGSimpler network 2020-10-15 17:18:44 -06:00
James Betker
24792bdb4f Codebase cleanup
Removed a lot of legacy stuff I have no intent on using again.
Plan is to shape this repo into something more extensible (get it? hah!)
2020-10-13 20:56:39 -06:00
James Betker
17d78195ee Mods to SRG to support returning switch logits 2020-10-13 20:46:37 -06:00
James Betker
2bc5701b10 misc 2020-10-12 10:21:25 -06:00
James Betker
b2c4b2a16d Move gpu_ids out of if statement 2020-10-06 20:40:20 -06:00
James Betker
0e3ea63a14 Misc 2020-10-05 18:01:50 -06:00
James Betker
ffd069fd97 Lots of SSG work
- Checkpointed pretty much the entire model - enabling recurrent inputs
- Added two new models for test - adding depth (again) and removing SPSR (in lieu of the new losses)
2020-10-04 20:48:08 -06:00
James Betker
fc396baf1a Move loaded_options to util
Doesn't seem to work with python 3.6
2020-10-03 20:29:06 -06:00
James Betker
3cbb9ecd45 Misc 2020-10-03 16:15:42 -06:00
James Betker
21d3bb83b2 Use tqdm reporting with validation 2020-10-03 11:16:39 -06:00
James Betker
6c9718ad64 Don't log if you aren't 0 rank 2020-10-03 11:14:13 -06:00
James Betker
19a4075e1e Allow checkpointing to be disabled in the options file
Also makes options a global variable for usage in utils.
2020-10-03 11:03:28 -06:00
James Betker
c9a9e5c525 Prompt user for gpu_id if multiple gpus are detected 2020-10-01 17:24:50 -06:00
James Betker
0b5a033503 spsr7 + cleanup
SPSR7 adds ref onto spsr6, makes more "common sense" mods.
2020-09-29 16:59:26 -06:00
James Betker
eb12b5f887 Misc 2020-09-26 21:27:17 -06:00
James Betker
254cb1e915 More dataset integration work 2020-09-25 22:19:38 -06:00
James Betker
f211575e9d Save models before validation
Validation often fails with OOM, wasting hours of training time.
Save models first.
2020-09-16 08:17:17 -06:00
James Betker
c833bd1eac Misc changes 2020-09-15 20:57:59 -06:00
James Betker
747ded2bf7 Fixes to the spsr3
Some lessons learned:
- Biases are fairly important as a relief valve. They dont need to be everywhere, but
  most computationally heavy branches should have a bias.
- GroupNorm in SPSR is not a great idea. Since image gradients are represented
   in this model, normal means and standard deviations are not applicable. (imggrad
   has a high representation of 0).
- Don't fuck with the mainline of any generative model. As much as possible, all
   additions should be done through residual connections. Never pollute the mainline
   with reference data, do that in branches. It basically leaves the mode untrainable.
2020-09-09 15:28:14 -06:00
James Betker
c04f244802 More mods 2020-09-08 20:36:27 -06:00
James Betker
e6207d4c50 SPSR3 work
SPSR3 is meant to fix whatever is causing the switching units
inside of the newer SPSR architectures to fail and basically
not use the multiplexers.
2020-09-08 15:14:23 -06:00
James Betker
a18ece62ee Add updated spsr net for test 2020-09-07 17:01:48 -06:00
James Betker
e8613041c0 Add novograd optimizer 2020-09-06 17:27:08 -06:00
James Betker
6657a406ac Mods needed to support training a corruptor again:
- Allow original SPSRNet to have a specifiable block increment
- Cleanup
- Bug fixes in code that hasnt been touched in awhile.
2020-09-04 15:33:39 -06:00
James Betker
886d59d5df Misc fixes & adjustments 2020-09-01 07:58:11 -06:00
James Betker
0a9b85f239 Fix vgg_gn input_img_factor 2020-08-31 09:50:30 -06:00
James Betker
4b4d08bdec Enable testing in ExtensibleTrainer, fix it in SRGAN_model
Also compute fea loss for this.
2020-08-31 09:41:48 -06:00
James Betker
623f3b99b2 Stupid pathing.. 2020-08-26 17:58:24 -06:00
James Betker
80aa83bfd2 Try copytree for tb_logger again. 2020-08-26 17:55:02 -06:00
James Betker
b593d8e7c3 Save tb_logger to alt_path 2020-08-26 17:45:07 -06:00
James Betker
f35b3ad28f Fix val behavior for ExtensibleTrainer 2020-08-26 08:44:22 -06:00
James Betker
19487d9bbd Fix distributed launch for large distributed runs 2020-08-25 15:42:59 -06:00
James Betker
a65b07607c Reference network 2020-08-25 11:56:59 -06:00
James Betker
f9276007a8 More fixes to corrupt_fea 2020-08-23 17:52:18 -06:00
James Betker
dffc15184d More ExtensibleTrainer work
It runs now, just need to debug it to reach performance parity with SRGAN. Sweet.
2020-08-23 17:22:45 -06:00
James Betker
afdd93fbe9 Grey feature 2020-08-22 13:41:38 -06:00
James Betker
40bb0597bb misc 2020-08-18 08:50:24 -06:00
James Betker
0c98c61f4a Enable start_step to be specified 2020-08-15 18:34:59 -06:00
James Betker
2d205f52ac Unite spsr_arch switched gens
Found a pretty good basis model.
2020-08-12 17:04:45 -06:00
James Betker
bdaa67deb7 Misc 2020-08-12 08:46:15 -06:00
James Betker
1d5f4f6102 Crossgan 2020-08-07 21:03:39 -06:00
James Betker
3ab39f0d22 Several new spsr nets 2020-08-05 10:01:24 -06:00
James Betker
328afde9c0 Integrate SPSR into SRGAN_model
SPSR_model really isn't that different from SRGAN_model. Rather than continuing to re-implement
everything I've done in SRGAN_model, port the new stuff from SPSR over.

This really demonstrates the need to refactor SRGAN_model a bit to make it cleaner. It is quite the
beast these days..
2020-08-02 12:55:08 -06:00
James Betker
c8da78966b Substantial SPSR mods & fixes
- Added in gradient accumulation via mega-batch-factor
- Added AMP
- Added missing train hooks
- Added debug image outputs
- Cleaned up including removing GradientPenaltyLoss, custom SpectralNorm
- Removed all the custom discriminators
2020-08-02 10:45:24 -06:00
James Betker
f894ba8f98 Add SPSR_module
This is a port from the SPSR repo, it's going to need a lot of work to be properly integrated
but as of this commit it at least runs.
2020-08-01 22:02:54 -06:00
James Betker
eb11a08d1c Enable disjoint feature networks
This is done by pre-training a feature net that predicts the features
of HR images from LR images. Then use the original feature network
and this new one in tandem to work only on LR/Gen images.
2020-07-31 16:29:47 -06:00
James Betker
e37726f302 Add feature_model for training custom feature nets 2020-07-31 11:20:39 -06:00
James Betker
a7541b6d8d Fix illegal tb_logger use in distributed training 2020-07-23 09:14:01 -06:00
James Betker
dbf6147504 Add switched discriminator
The logic is that the discriminator may be incapable of providing a truly
targeted loss for all image regions since it has to be too generic
(basically the same argument for the switched generator). So add some
switches in! See how it works!
2020-07-22 20:52:59 -06:00
James Betker
8a9f215653 Huge set of mods to support progressive generator growth 2020-07-18 14:18:48 -06:00