James Betker
fd97573085
Fixes
2020-11-11 21:49:06 -07:00
James Betker
88f349bdf1
Enable usage of wandb
2020-11-11 21:48:56 -07:00
James Betker
1c065c41b4
Revert "..."
...
This reverts commit 4b92191880
.
2020-11-11 17:24:27 -07:00
James Betker
4b92191880
...
2020-11-11 14:12:40 -07:00
James Betker
12b57bbd03
Add residual blocks to pyramid disc
2020-11-11 13:56:45 -07:00
James Betker
b4136d766a
Back to pyramids, no rrdb
2020-11-11 13:40:24 -07:00
James Betker
42a97de756
Convert PyramidRRDBDisc to RRDBDisc
...
Had numeric stability issues. This probably makes more sense anyways.
2020-11-11 12:14:14 -07:00
James Betker
72762f200c
PyramidRRDB net
2020-11-11 11:25:49 -07:00
James Betker
a1760f8969
Adapt srg2 for video
2020-11-10 16:16:41 -07:00
James Betker
b742d1e5a5
When skipping steps via "every", still run nontrainable injection points
2020-11-10 16:09:17 -07:00
James Betker
91d27372e4
rrdb with adain latent
2020-11-10 16:08:54 -07:00
James Betker
6a2fd5f7d0
Lots of new discriminator nets
2020-11-10 16:06:54 -07:00
James Betker
4e5ba61ae7
SRG2classic further re-integration
2020-11-10 16:06:14 -07:00
James Betker
9e2c96ad5d
More latent work
2020-11-07 20:38:56 -07:00
James Betker
0cf52ef52c
latent work
2020-11-06 20:38:23 -07:00
James Betker
34d319585c
Add srflow arch
2020-11-06 20:38:04 -07:00
James Betker
4469d2e661
More work on RRDB with latent
2020-11-05 22:13:05 -07:00
James Betker
62d3b6496b
Latent work checkpoint
2020-11-05 13:31:34 -07:00
James Betker
fd6cdba88f
RRDB with latent
2020-11-05 10:04:17 -07:00
James Betker
df47d6cbbb
More work in support of training flow networks in tandem with generators
2020-11-04 18:07:48 -07:00
James Betker
658a267bab
More work on SSIM/PSNR approximators
...
- Add a network that accomodates this style of approximator while retaining structure
- Migrate to SSIM approximation
- Add a tool to visualize how these approximators are working
- Fix some issues that came up while doign this work
2020-11-03 08:09:58 -07:00
James Betker
a51daacde2
Fix reporting of d_fake_diff for generators
2020-11-02 08:45:46 -07:00
James Betker
dcfe994fee
Add standalone srg2_classic
...
Trying to investigate how I was so misguided. I *thought* srg2 was considerably
better than RRDB in performance but am not actually seeing that.
2020-10-31 20:55:34 -06:00
James Betker
eb7df63592
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-31 11:09:32 -06:00
James Betker
c2866ad8d2
Disable debugging of comparable pingpong generations
2020-10-31 11:09:10 -06:00
James Betker
7303d8c932
Add psnr approximator
2020-10-31 11:08:55 -06:00
James Betker
565517814e
Restore SRG2
...
Going to try to figure out where SRG lost competitiveness to RRDB..
2020-10-30 14:01:56 -06:00
James Betker
74738489b9
Fixes and additional support for progressive zoom
2020-10-30 09:59:54 -06:00
James Betker
a3918fa808
Tecogan & other fixes
2020-10-30 00:19:58 -06:00
James Betker
b316078a15
Fix tecogan_losses fp16
2020-10-29 23:02:20 -06:00
James Betker
3791f95ad0
Enable RRDB to take in reference inputs
2020-10-29 11:07:40 -06:00
James Betker
7d38381d46
Add scaling to rrdb
2020-10-29 09:48:10 -06:00
James Betker
607ff3c67c
RRDB with bypass
2020-10-29 09:39:45 -06:00
James Betker
1655b9e242
Fix fast_forward teco loss bug
2020-10-28 17:49:54 -06:00
James Betker
515905e904
Add a min_loss that is DDP compatible
2020-10-28 15:46:59 -06:00
James Betker
f133243ac8
Extra logging for teco_resgen
2020-10-28 15:21:22 -06:00
James Betker
2ab5054d4c
Add noise to teco disc
2020-10-27 22:48:23 -06:00
James Betker
4dc16d5889
Upgrade tecogan_losses for speed
2020-10-27 22:40:15 -06:00
James Betker
ac3da0c5a6
Make tecogen functional
2020-10-27 21:08:59 -06:00
James Betker
10da206db6
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-27 20:59:59 -06:00
James Betker
9848f4c6cb
Add teco_resgen
2020-10-27 20:59:55 -06:00
James Betker
543c384a91
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-27 20:59:16 -06:00
James Betker
da53090ce6
More adjustments to support distributed training with teco & on multi_modal_train
2020-10-27 20:58:03 -06:00
James Betker
00bb568956
further checkpointify spsr_arch
2020-10-27 17:54:28 -06:00
James Betker
c2727a0150
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-27 15:24:19 -06:00
James Betker
2a3eec8fd7
Fix some distributed training snafus
2020-10-27 15:24:05 -06:00
James Betker
d923a62ed3
Allow SPSR to checkpoint
2020-10-27 15:23:20 -06:00
James Betker
11a9e223a6
Retrofit SPSR_arch so it is capable of accepting a ref
2020-10-27 11:14:36 -06:00
James Betker
8202ee72b9
Re-add original SPSR_arch
2020-10-27 11:00:38 -06:00
James Betker
231137ab0a
Revert RRDB back to original model
2020-10-27 10:25:31 -06:00
James Betker
1ce863849a
Remove temporary base_model change
2020-10-26 11:13:01 -06:00
James Betker
54accfa693
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-26 11:12:37 -06:00
James Betker
ff58c6484a
Fixes to unified chunk datasets to support stereoscopic training
2020-10-26 11:12:22 -06:00
James Betker
f857eb00a8
Allow tecogan losses to compute at 32px
2020-10-26 11:09:55 -06:00
James Betker
629b968901
ChainedGen 4x alteration
...
Increases conv window for teco_recurrent in the 4x case so all data
can be used.
base_model changes should be temporary.
2020-10-26 10:54:51 -06:00
James Betker
85c07f85d9
Update flownet submodule
2020-10-24 11:59:00 -06:00
James Betker
9c3d059ef0
Updates to be able to train flownet2 in ExtensibleTrainer
...
Only supports basic losses for now, though.
2020-10-24 11:56:39 -06:00
James Betker
1dbcbfbac8
Restore ChainedEmbeddingGenWithStructure
...
Still using this guy, after all
2020-10-24 11:54:52 -06:00
James Betker
7a75d10784
Arch cleanup
2020-10-23 09:35:33 -06:00
James Betker
646d6a621a
Support 4x zoom on ChainedEmbeddingGen
2020-10-23 09:25:58 -06:00
James Betker
e9c0b9f0fd
More adjustments to support multi-modal training
...
Specifically - looks like at least MSE loss cannot handle autocasted tensors
2020-10-22 16:49:34 -06:00
James Betker
76789a456f
Class-ify train.py and workon multi-modal trainer
2020-10-22 16:15:31 -06:00
James Betker
15e00e9014
Finish integration with autocast
...
Note: autocast is broken when also using checkpoint(). Overcome this by modifying
torch's checkpoint() function in place to also use autocast.
2020-10-22 14:39:19 -06:00
James Betker
d7ee14f721
Move to torch.cuda.amp (not working)
...
Running into OOM errors, needs diagnosing. Checkpointing here.
2020-10-22 13:58:05 -06:00
James Betker
3e3d2af1f3
Add multi-modal trainer
2020-10-22 13:27:32 -06:00
James Betker
40dc2938e8
Fix multifaceted chain gen
2020-10-22 13:27:06 -06:00
James Betker
43c4f92123
Collapse progressive zoom candidates into the batch dimension
...
This contributes a significant speedup to training this type of network
since losses can operate on the entire prediction spectrum at once.
2020-10-21 22:37:23 -06:00
James Betker
680d635420
Enable ExtensibleTrainer to skip steps when state keys are missing
2020-10-21 22:22:28 -06:00
James Betker
d1175f0de1
Add FFT injector
2020-10-21 22:22:00 -06:00
James Betker
1ef559d7ca
Add a ChainedEmbeddingGen which can be simueltaneously used with multiple training paradigms
2020-10-21 22:21:51 -06:00
James Betker
931aa65dd0
Allow recurrent losses to be weighted
2020-10-21 16:59:44 -06:00
James Betker
5753e77d67
ChainedGen: Output debugging information on blocks
2020-10-21 16:36:23 -06:00
James Betker
3c6e600e48
Add capacity for models to self-report visuals
2020-10-21 11:08:03 -06:00
James Betker
dca5cddb3b
Add bypass to ChainedEmbeddingGen
2020-10-21 11:07:45 -06:00
James Betker
a63bf2ea2f
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-19 15:26:11 -06:00
James Betker
76e4f0c086
Restore test.py for use as standalone validator
2020-10-19 15:26:07 -06:00
James Betker
1b1ca297f8
Fix recurrent=None bug in ChainedEmbeddingGen
2020-10-19 15:25:12 -06:00
James Betker
b28e4d9cc7
Add spread loss
...
Experimental loss that peaks around 0.
2020-10-19 11:31:19 -06:00
James Betker
981d64413b
Support validation over a custom injector
...
Also re-enable PSNR
2020-10-19 11:01:56 -06:00
James Betker
668cafa798
Push correct patch of recurrent embedding to upstream image, rather than whole thing
2020-10-18 22:39:52 -06:00
James Betker
7df378a944
Remove separated vgg discriminator
...
Checkpointing happens inline instead. Was a dumb idea..
Also fixes some loss reporting issues.
2020-10-18 12:10:24 -06:00
James Betker
c709d38cd5
Fix memory leak with recurrent loss
2020-10-18 10:22:10 -06:00
James Betker
552e70a032
Get rid of excessive checkpointed disc params
2020-10-18 10:09:37 -06:00
James Betker
6a0d5f4813
Add a checkpointable discriminator
2020-10-18 09:57:47 -06:00
James Betker
9ead2c0a08
Multiscale training in!
2020-10-17 22:54:12 -06:00
James Betker
e706911c83
Fix spinenet bug
2020-10-17 20:20:36 -06:00
James Betker
b008a27d39
Spinenet should allow bypassing the initial conv
...
This makes feeding in references for recurrence easier.
2020-10-17 20:16:47 -06:00
James Betker
c1c9c5681f
Swap recurrence
2020-10-17 08:40:28 -06:00
James Betker
6141aa1110
More recurrence fixes for chainedgen
2020-10-17 08:35:46 -06:00
James Betker
cf8118a85b
Allow recurrence to specified for chainedgen
2020-10-17 08:32:29 -06:00
James Betker
fc4c064867
Add recurrent support to chainedgenwithstructure
2020-10-17 08:31:34 -06:00
James Betker
d4a3e11ab2
Don't use several stages of spinenet_arch
...
These are used for lower outputs which I am not using
2020-10-17 08:28:37 -06:00
James Betker
d1c63ae339
Go back to torch's DDP
...
Apex was having some weird crashing issues.
2020-10-16 20:47:35 -06:00
James Betker
d856378b2e
Add ChainedGenWithStructure
2020-10-16 20:44:36 -06:00
James Betker
617d97e19d
Add ChainedEmbeddingGen
2020-10-15 23:18:08 -06:00
James Betker
c4543ce124
Set post_transform_block to None where applicable
2020-10-15 17:20:42 -06:00
James Betker
6f8705e8cb
SSGSimpler network
2020-10-15 17:18:44 -06:00
James Betker
eda75c9779
Cleanup fixes
2020-10-15 10:13:17 -06:00
James Betker
920865defb
Arch work
2020-10-15 10:13:06 -06:00
James Betker
1f20d59c31
Revert big switch back
2020-10-14 11:03:34 -06:00