Commit Graph

877 Commits

Author SHA1 Message Date
James Betker
ef8d5f88c1 Bring split gaussian nll out of split so it can be computed accurately with the rest of the nll component 2020-11-27 13:30:21 -07:00
James Betker
11d2b70bdd Latent space playground work 2020-11-27 12:03:16 -07:00
James Betker
4ab49b0d69 RRDB disc work 2020-11-27 12:03:08 -07:00
James Betker
6de4dabb73 Remove srflow (modified version)
Starting from orig and re-working from there.
2020-11-27 12:02:06 -07:00
James Betker
5f5420ff4a Update to srflow_latent_space_playground 2020-11-26 20:31:21 -07:00
James Betker
fd356580c0 Play with lambdas 2020-11-26 20:30:55 -07:00
James Betker
0c6d7971b9 Dataset documentation 2020-11-26 11:58:39 -07:00
James Betker
45a489110f Fix datasets 2020-11-26 11:50:38 -07:00
James Betker
5edaf085e0 Adjustments to latent_space_playground 2020-11-25 15:52:36 -07:00
James Betker
205c9a5335 Learn how to functionally use srflow networks 2020-11-25 13:59:06 -07:00
James Betker
cb045121b3 Expose srflow rrdb 2020-11-24 13:20:20 -07:00
James Betker
f3c1fc1bcd Dataset modifications 2020-11-24 13:20:12 -07:00
James Betker
f6098155cd Mods to tecogan to allow use of embeddings as input 2020-11-24 09:24:02 -07:00
James Betker
b10bcf6436 Rework stylegan_for_sr to incorporate structure as an adain block 2020-11-23 11:31:11 -07:00
James Betker
519ba6f10c Support 2x RRDB with 4x srflow 2020-11-21 14:46:15 -07:00
James Betker
cad92bada8 Report logp and logdet for srflow 2020-11-21 10:13:05 -07:00
James Betker
c37d3faa58 More adjustments to srflow_orig 2020-11-20 19:38:33 -07:00
James Betker
d51d12a41a Adjustments to srflow to (maybe?) fix training 2020-11-20 14:44:24 -07:00
James Betker
6c8c35ac47 Support training RRDB encoder [srflow] 2020-11-20 10:03:06 -07:00
James Betker
5ccdbcefe3 srflow_orig integration 2020-11-19 23:47:24 -07:00
James Betker
f80acfcab6 Throw if dataset isn't going to work with force_multiple setting 2020-11-19 23:47:00 -07:00
James Betker
2b2d754d8e Bring in an original SRFlow implementation for reference 2020-11-19 21:42:39 -07:00
James Betker
1e0d7be3ce "Clean up" SRFlow 2020-11-19 21:42:24 -07:00
James Betker
d7877d0a36 Fixes to teco losses and translational losses 2020-11-19 11:35:05 -07:00
James Betker
b2a05465fc Fix missing requirements 2020-11-18 10:16:39 -07:00
James Betker
5c10264538 Remove pyramid_disc hard dependencies 2020-11-17 18:34:11 -07:00
James Betker
6b679e2b51 Make grad_penalty available to classical discs 2020-11-17 18:31:40 -07:00
James Betker
8a19c9ae15 Add additive mode to rrdb 2020-11-16 20:45:09 -07:00
James Betker
2a507987df Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-15 16:16:30 -07:00
James Betker
931ed903c1 Allow combined additive loss 2020-11-15 16:16:18 -07:00
James Betker
4b68116977 import fix 2020-11-15 16:15:42 -07:00
James Betker
98eada1e4c More circular dependency fixes + unet fixes 2020-11-15 11:53:35 -07:00
James Betker
e587d549f7 Fix circular imports 2020-11-15 11:32:35 -07:00
James Betker
99f0cfaab5 Rework stylegan2 divergence losses
Notably: include unet loss
2020-11-15 11:26:44 -07:00
James Betker
ea94b93a37 Fixes for unet 2020-11-15 10:38:33 -07:00
James Betker
89f56b2091 Fix another import 2020-11-14 22:10:45 -07:00
James Betker
9af049c671 Import fix for unet 2020-11-14 22:09:18 -07:00
James Betker
5cade6b874 Move stylegan2 around, bring in unet 2020-11-14 22:04:48 -07:00
James Betker
4c6b14a3f8 Allow extract_square_images to work on multiple images 2020-11-14 20:24:05 -07:00
James Betker
125cb16dce Add a FID evaluator for stylegan with structural guidance 2020-11-14 20:16:07 -07:00
James Betker
c9258e2da3 Alter how structural guidance is given to stylegan 2020-11-14 20:15:48 -07:00
James Betker
3397c83447 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-14 09:30:09 -07:00
James Betker
423ee7cb90 Allow attention to be specified for stylegan2 2020-11-14 09:29:53 -07:00
James Betker
ec621c69b5 Fix train bug 2020-11-14 09:29:08 -07:00
James Betker
cdc5ac30e9 oddity 2020-11-13 20:11:57 -07:00
James Betker
f406a5dd4c Mods to support stylegan2 in SR mode 2020-11-13 20:11:50 -07:00
James Betker
9c3d0b7560 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-13 20:10:47 -07:00
James Betker
67bf55495b Allow hq_batched_key to be specified 2020-11-13 20:10:12 -07:00
James Betker
0b96811611 Fix another issue with gpu ids getting thrown all over hte place 2020-11-13 20:05:52 -07:00
James Betker
c47925ae34 New image extractor utility 2020-11-13 11:04:03 -07:00
James Betker
a07e1a7292 Add separate Evaluator module and FID evaluator 2020-11-13 11:03:54 -07:00
James Betker
080ad61be4 Add option to work with nonrandom latents 2020-11-12 21:23:50 -07:00
James Betker
566b99ca75 GP adjustments for stylegan2 2020-11-12 16:44:51 -07:00
James Betker
fc55bdb24e Mods to how wandb are integrated 2020-11-12 15:45:25 -07:00
James Betker
44a19cd37c ExtensibleTrainer mods to support advanced checkpointing for stylegan2
Basically: stylegan2 makes use of gradient-based normalizers. These
make it so that I cannot use gradient checkpointing. But I love gradient
checkpointing. It makes things really, really fast and memory conscious.

So - only don't checkpoint when we run the regularizer loss. This is a
bit messy, but speeds up training by at least 20%.

Also: pytorch: please make checkpointing a first class citizen.
2020-11-12 15:45:07 -07:00
James Betker
db9e9e28a0 Fix an issue where GPU0 was always being used in non-ddp
Frankly, I don't understand how this has ever worked. WTF.
2020-11-12 15:43:01 -07:00
James Betker
2d3449d7a5 stylegan2 in ml art school! 2020-11-12 15:42:05 -07:00
James Betker
fd97573085 Fixes 2020-11-11 21:49:06 -07:00
James Betker
88f349bdf1 Enable usage of wandb 2020-11-11 21:48:56 -07:00
James Betker
1c065c41b4 Revert "..."
This reverts commit 4b92191880.
2020-11-11 17:24:27 -07:00
James Betker
4b92191880 ... 2020-11-11 14:12:40 -07:00
James Betker
12b57bbd03 Add residual blocks to pyramid disc 2020-11-11 13:56:45 -07:00
James Betker
b4136d766a Back to pyramids, no rrdb 2020-11-11 13:40:24 -07:00
James Betker
42a97de756 Convert PyramidRRDBDisc to RRDBDisc
Had numeric stability issues. This probably makes more sense anyways.
2020-11-11 12:14:14 -07:00
James Betker
72762f200c PyramidRRDB net 2020-11-11 11:25:49 -07:00
James Betker
a1760f8969 Adapt srg2 for video 2020-11-10 16:16:41 -07:00
James Betker
b742d1e5a5 When skipping steps via "every", still run nontrainable injection points 2020-11-10 16:09:17 -07:00
James Betker
91d27372e4 rrdb with adain latent 2020-11-10 16:08:54 -07:00
James Betker
6a2fd5f7d0 Lots of new discriminator nets 2020-11-10 16:06:54 -07:00
James Betker
4e5ba61ae7 SRG2classic further re-integration 2020-11-10 16:06:14 -07:00
James Betker
9e2c96ad5d More latent work 2020-11-07 20:38:56 -07:00
James Betker
6be6c92e5d Fix yet ANOTHER OBO error in multi_frame_dataset 2020-11-06 20:38:34 -07:00
James Betker
0cf52ef52c latent work 2020-11-06 20:38:23 -07:00
James Betker
34d319585c Add srflow arch 2020-11-06 20:38:04 -07:00
James Betker
4469d2e661 More work on RRDB with latent 2020-11-05 22:13:05 -07:00
James Betker
62d3b6496b Latent work checkpoint 2020-11-05 13:31:34 -07:00
James Betker
fd6cdba88f RRDB with latent 2020-11-05 10:04:17 -07:00
James Betker
df47d6cbbb More work in support of training flow networks in tandem with generators 2020-11-04 18:07:48 -07:00
James Betker
c21088e238 Fix OBO error in multi_frame_dataset
In some datasets, this meant one frame was included in a sequence where it didn't belong. In datasets with mismatched chunk sizes, this resulted in an error.
2020-11-03 14:32:06 -07:00
James Betker
e990be0449 Improve ignore_first logic 2020-11-03 11:56:32 -07:00
James Betker
658a267bab More work on SSIM/PSNR approximators
- Add a network that accomodates this style of approximator while retaining structure
- Migrate to SSIM approximation
- Add a tool to visualize how these approximators are working
- Fix some issues that came up while doign this work
2020-11-03 08:09:58 -07:00
James Betker
85c545835c Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-02 08:48:15 -07:00
James Betker
f13fdd43ed Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-11-02 08:47:42 -07:00
James Betker
fed16abc22 Report chunking errors 2020-11-02 08:47:18 -07:00
James Betker
a51daacde2 Fix reporting of d_fake_diff for generators 2020-11-02 08:45:46 -07:00
James Betker
3676f26d94 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-31 20:55:45 -06:00
James Betker
dcfe994fee Add standalone srg2_classic
Trying to investigate how I was so misguided. I *thought* srg2 was considerably
better than RRDB in performance but am not actually seeing that.
2020-10-31 20:55:34 -06:00
James Betker
ea8c20c0e2 Fix bug with multiscale_dataset 2020-10-31 20:54:41 -06:00
James Betker
bb39d3efe5 Bump image corruption factor a bit 2020-10-31 20:50:24 -06:00
James Betker
eb7df63592 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-31 11:09:32 -06:00
James Betker
c2866ad8d2 Disable debugging of comparable pingpong generations 2020-10-31 11:09:10 -06:00
James Betker
7303d8c932 Add psnr approximator 2020-10-31 11:08:55 -06:00
James Betker
565517814e Restore SRG2
Going to try to figure out where SRG lost competitiveness to RRDB..
2020-10-30 14:01:56 -06:00
James Betker
b24ff3c88d Fix bug that causes multiscale dataset to crash 2020-10-30 14:01:24 -06:00
James Betker
74738489b9 Fixes and additional support for progressive zoom 2020-10-30 09:59:54 -06:00
James Betker
a3918fa808 Tecogan & other fixes 2020-10-30 00:19:58 -06:00
James Betker
b316078a15 Fix tecogan_losses fp16 2020-10-29 23:02:20 -06:00
James Betker
3791f95ad0 Enable RRDB to take in reference inputs 2020-10-29 11:07:40 -06:00
James Betker
7d38381d46 Add scaling to rrdb 2020-10-29 09:48:10 -06:00
James Betker
607ff3c67c RRDB with bypass 2020-10-29 09:39:45 -06:00
James Betker
1655b9e242 Fix fast_forward teco loss bug 2020-10-28 17:49:54 -06:00
James Betker
25b007a0f5 Increase jpeg corruption & add error 2020-10-28 17:37:39 -06:00
James Betker
796659b0ac Add 'jpeg-normal' corruption 2020-10-28 16:40:47 -06:00
James Betker
515905e904 Add a min_loss that is DDP compatible 2020-10-28 15:46:59 -06:00
James Betker
f133243ac8 Extra logging for teco_resgen 2020-10-28 15:21:22 -06:00
James Betker
2ab5054d4c Add noise to teco disc 2020-10-27 22:48:23 -06:00
James Betker
4dc16d5889 Upgrade tecogan_losses for speed 2020-10-27 22:40:15 -06:00
James Betker
ac3da0c5a6 Make tecogen functional 2020-10-27 21:08:59 -06:00
James Betker
10da206db6 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-27 20:59:59 -06:00
James Betker
9848f4c6cb Add teco_resgen 2020-10-27 20:59:55 -06:00
James Betker
543c384a91 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-27 20:59:16 -06:00
James Betker
da53090ce6 More adjustments to support distributed training with teco & on multi_modal_train 2020-10-27 20:58:03 -06:00
James Betker
00bb568956 further checkpointify spsr_arch 2020-10-27 17:54:28 -06:00
James Betker
c2727a0150 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-27 15:24:19 -06:00
James Betker
2a3eec8fd7 Fix some distributed training snafus 2020-10-27 15:24:05 -06:00
James Betker
d923a62ed3 Allow SPSR to checkpoint 2020-10-27 15:23:20 -06:00
James Betker
11a9e223a6 Retrofit SPSR_arch so it is capable of accepting a ref 2020-10-27 11:14:36 -06:00
James Betker
8202ee72b9 Re-add original SPSR_arch 2020-10-27 11:00:38 -06:00
James Betker
31cf1ac98d Retrofit full_image_dataset to work with new arch. 2020-10-27 10:26:19 -06:00
James Betker
ade0a129da Include psnr in test.py 2020-10-27 10:25:42 -06:00
James Betker
231137ab0a Revert RRDB back to original model 2020-10-27 10:25:31 -06:00
James Betker
1ce863849a Remove temporary base_model change 2020-10-26 11:13:01 -06:00
James Betker
54accfa693 Merge remote-tracking branch 'origin/gan_lab' into gan_lab 2020-10-26 11:12:37 -06:00
James Betker
ff58c6484a Fixes to unified chunk datasets to support stereoscopic training 2020-10-26 11:12:22 -06:00
James Betker
b2f803588b Fix multi_modal_train.py 2020-10-26 11:10:22 -06:00
James Betker
f857eb00a8 Allow tecogan losses to compute at 32px 2020-10-26 11:09:55 -06:00
James Betker
629b968901 ChainedGen 4x alteration
Increases conv window for teco_recurrent in the 4x case so all data
can be used.

base_model changes should be temporary.
2020-10-26 10:54:51 -06:00
James Betker
85c07f85d9 Update flownet submodule 2020-10-24 11:59:00 -06:00
James Betker
327cdbe110 Support configurable multi-modal training 2020-10-24 11:57:39 -06:00
James Betker
9c3d059ef0 Updates to be able to train flownet2 in ExtensibleTrainer
Only supports basic losses for now, though.
2020-10-24 11:56:39 -06:00
James Betker
1dbcbfbac8 Restore ChainedEmbeddingGenWithStructure
Still using this guy, after all
2020-10-24 11:54:52 -06:00
James Betker
8e5b6682bf Add PairedFrameDataset 2020-10-23 20:58:07 -06:00
James Betker
7a75d10784 Arch cleanup 2020-10-23 09:35:33 -06:00
James Betker
646d6a621a Support 4x zoom on ChainedEmbeddingGen 2020-10-23 09:25:58 -06:00
James Betker
8636492db0 Copy train.py mods to train2 2020-10-22 17:16:36 -06:00
James Betker
e9c0b9f0fd More adjustments to support multi-modal training
Specifically - looks like at least MSE loss cannot handle autocasted tensors
2020-10-22 16:49:34 -06:00
James Betker
76789a456f Class-ify train.py and workon multi-modal trainer 2020-10-22 16:15:31 -06:00
James Betker
15e00e9014 Finish integration with autocast
Note: autocast is broken when also using checkpoint(). Overcome this by modifying
torch's checkpoint() function in place to also use autocast.
2020-10-22 14:39:19 -06:00
James Betker
d7ee14f721 Move to torch.cuda.amp (not working)
Running into OOM errors, needs diagnosing. Checkpointing here.
2020-10-22 13:58:05 -06:00
James Betker
3e3d2af1f3 Add multi-modal trainer 2020-10-22 13:27:32 -06:00
James Betker
40dc2938e8 Fix multifaceted chain gen 2020-10-22 13:27:06 -06:00
James Betker
f9dc472f63 Misc nonfunctional mods to datasets 2020-10-22 10:16:17 -06:00
James Betker
43c4f92123 Collapse progressive zoom candidates into the batch dimension
This contributes a significant speedup to training this type of network
since losses can operate on the entire prediction spectrum at once.
2020-10-21 22:37:23 -06:00
James Betker
680d635420 Enable ExtensibleTrainer to skip steps when state keys are missing 2020-10-21 22:22:28 -06:00
James Betker
d1175f0de1 Add FFT injector 2020-10-21 22:22:00 -06:00
James Betker
1ef559d7ca Add a ChainedEmbeddingGen which can be simueltaneously used with multiple training paradigms 2020-10-21 22:21:51 -06:00
James Betker
931aa65dd0 Allow recurrent losses to be weighted 2020-10-21 16:59:44 -06:00
James Betker
5753e77d67 ChainedGen: Output debugging information on blocks 2020-10-21 16:36:23 -06:00
James Betker
b54de69153 Misc 2020-10-21 11:08:21 -06:00
James Betker
71c3820d2d Fix process_video 2020-10-21 11:08:12 -06:00