DL-Art-School

Author	SHA1	Message	Date
James Betker	17555e7d07	misc adjustments for stylegan	2021-04-21 18:14:17 -06:00
James Betker	b687ef4cd0	Misc	2021-04-21 18:09:46 -06:00
James Betker	9fc3df3f5b	Switched conv: add conversion function with allowlist	2021-03-13 10:44:56 -07:00
James Betker	cf9a6da889	Fix some bugs, checkin work on vqvae3	2021-03-02 20:56:19 -07:00
James Betker	f89ea5f1c6	Mods to support lightweight_gan model	2021-03-02 20:51:48 -07:00
James Betker	39fd755baa	New benchmark numbers	2021-02-08 08:09:41 -07:00
James Betker	784b96c059	Misc options to add support for training stylegan2-rosinality models: - Allow image_folder_dataset to normalize inbound images - ExtensibleTrainer can denormalize images on the output path - Support .webp - an output from LSUN - Support logistic GAN divergence loss - Support stylegan2 TF weight extraction for discriminator - New injector that produces latent noise (with separated paths) - Modify FID evaluator to be operable with rosinality-style GANs	2021-02-08 08:09:21 -07:00
James Betker	e7be4bdff3	Revert	2021-02-05 08:43:07 -07:00
James Betker	6dec1f5968	Back to groupnorm	2021-02-05 08:42:11 -07:00
James Betker	336f807c8e	lambda2	2021-02-05 00:00:24 -07:00
James Betker	025a5867c4	Use syncbatchnorm instead	2021-02-04 22:26:36 -07:00
James Betker	bb79fafb89	Fix groupnorm specification	2021-02-04 22:15:38 -07:00
James Betker	43da1f9c4b	Convert lambda coupler to use groupnorm instead of batchnorm	2021-02-04 21:59:44 -07:00
James Betker	7070142805	Make vqvae3_hard more configurable	2021-02-04 09:03:22 -07:00
James Betker	b980028ca8	Add get_debug_values for vqvae_3_hardswitch	2021-02-03 14:12:24 -07:00
James Betker	1405ff06b8	Fix SwitchedConvHardRoutingFunction for current cuda router	2021-02-03 14:11:55 -07:00
James Betker	d7bec392dd	...	2021-02-02 23:50:25 -07:00
James Betker	b0a8fa00bc	Visual dbg in vqvae3hs	2021-02-02 23:50:01 -07:00
James Betker	f5f91850fd	hardswitch variant of vqvae3	2021-02-02 21:00:04 -07:00
James Betker	320edbaa3c	Move switched_conv logic around a bit	2021-02-02 20:41:24 -07:00
James Betker	0dca36946f	Hard Routing mods - Turns out my custom convolution was RIDDLED with backwards bugs, which is why the existing implementation wasn't working so well. - Implements the switch logic from both Mixture of Experts and Switch Transformers for testing purposes.	2021-02-02 20:35:58 -07:00
James Betker	29c1c3bede	Register vqvae3	2021-01-29 15:26:28 -07:00
James Betker	bc20b4739e	vqvae3 Changes VQVAE as so: - Reverts back to smaller codebook - Adds an additional conv layer at the highest resolution for both the encoder & decoder - Uses LeakyReLU on trunk	2021-01-29 15:24:26 -07:00
James Betker	96bc80313c	Add switch norm, up dropout rate, detach selector	2021-01-26 09:31:53 -07:00
James Betker	2cdac6bd09	Add PWCNet for human optical flow	2021-01-25 08:25:44 -07:00
James Betker	51b63b2aa6	Add switched_conv with hard routing and make vqvae use it.	2021-01-25 08:25:29 -07:00
James Betker	ae4ff4a1e7	Enable lambda visualization	2021-01-23 15:53:27 -07:00
James Betker	10ec6bda1d	lambda nets in switched_conv and a vqvae to use it	2021-01-23 14:57:57 -07:00
James Betker	b374dcdd46	update vqvae to double codebook size for bottom quantizer	2021-01-23 13:47:07 -07:00
James Betker	1b8a26db93	New switched_conv	2021-01-23 13:46:30 -07:00
James Betker	d919ae7148	Add VQVAE with no Conv2dTranspose	2021-01-18 08:49:59 -07:00
James Betker	587a4f4050	resnet_unet_3 I'm being really lazy here - these nets are not really different from each other except at which layer they terminate. This one terminates at 2x downsampling, which is simply indicative of a direction I want to go for testing these pixpro networks.	2021-01-15 14:51:03 -07:00
James Betker	038b8654b6	Pixpro: unwrap losses	2021-01-13 11:54:25 -07:00
James Betker	8990801a3f	Fix pixpro stochastic sampling bugs	2021-01-13 11:34:24 -07:00
James Betker	19475a072f	Pixpro: Rather than using a latent square for pixpro, use an entirely stochastic sampling of the pixels	2021-01-13 11:26:51 -07:00
James Betker	d1007ccfe7	Adjustments to pixpro to allow training against networks with arbitrarily large structural latents - The pixpro latent now rescales the latent space instead of using a "coordinate vector", which might have performance implications. - The latent against which the pixel loss is computed can now be a small, randomly sampled patch out of the entire latent, allowing further memory/computational discounts. Since the loss computation does not have a receptive field, this should not alter the loss. - The instance projection size can now be separate from the pixel projection size. - PixContrast removed entirely. - ResUnet with full resolution added.	2021-01-12 09:17:45 -07:00
James Betker	34f8c8641f	Support training imagenet classifier	2021-01-11 20:09:16 -07:00
James Betker	f3db381fa1	Allow uresnet to use pretrained resnet50	2021-01-10 12:57:31 -07:00
James Betker	07168ecfb4	Enable vqvae to use a switched_conv variant	2021-01-09 20:53:14 -07:00
James Betker	5a8156026a	Did anyone ask for k-means clustering? This is so cool...	2021-01-07 22:37:41 -07:00
James Betker	de10c7246a	Add injected noise into bypass maps	2021-01-07 16:31:12 -07:00
James Betker	61a86a3c1e	VQVAE	2021-01-07 10:20:15 -07:00
James Betker	01a589e712	Adjustments to pixpro & resnet-unet I'm not really satisfied with what I got out of these networks on round 1. Let's try again..	2021-01-06 15:00:46 -07:00
James Betker	2f2f87bbea	Styled SR fixes	2021-01-05 20:14:39 -07:00
James Betker	9fed90393f	Add lucidrains pixpro trainer	2021-01-05 20:14:22 -07:00
James Betker	ade2732c82	Transfer learning for styleSR This is a concept from "Lifelong Learning GAN", although I'm skeptical of its novelty - basically you scale and shift the weights for the generator and discriminator of a pretrained GAN to "shift" into new modalities, e.g. faces->birds or whatever. There are some interesting applications of this that I would like to try out.	2021-01-04 20:10:48 -07:00
James Betker	2c65b6b28e	More mods to support styledsr	2021-01-04 11:32:28 -07:00
James Betker	2225fe6ac2	Undo lucidrains changes for new discriminator This "new" code will live in the styledsr directory from now on.	2021-01-04 10:57:09 -07:00
James Betker	40ec71da81	Move styled_sr into its own folder	2021-01-04 10:54:34 -07:00
James Betker	5916f5f7d4	Misc fixes	2021-01-04 10:53:53 -07:00
James Betker	4d8064c32c	Modifications to allow partially trained stylegan discriminators to be used	2021-01-03 16:37:18 -07:00
James Betker	bdbab65082	Allow optimizers to train separate param groups, add higher dimensional VGG discriminator Did this to support training 512x512px networks off of a pretrained 256x256 network.	2021-01-02 15:10:06 -07:00
James Betker	193cdc6636	Move discriminators to the create_model paradigm Also cleans up a lot of old discriminator models that I have no intention of using again.	2021-01-01 15:56:09 -07:00
James Betker	f39179e85a	styled_sr: fix bug when using initial_stride	2021-01-01 12:13:21 -07:00
James Betker	913fc3b75e	Need init to pick up styled_sr	2021-01-01 12:10:32 -07:00
James Betker	e992e18767	Add initial_stride term to style_sr Also fix fid and a networks.py issue.	2021-01-01 11:59:36 -07:00
James Betker	e214e6ce33	Styled SR model	2020-12-31 20:54:18 -07:00
James Betker	b1fb82476b	Add gp debug (fix)	2020-12-30 15:26:54 -07:00
James Betker	63cf3d3126	Injector auto-registration I love it!	2020-12-29 20:58:02 -07:00
James Betker	a777c1e4f9	Misc script fixes	2020-12-29 20:25:09 -07:00
James Betker	ba543d1152	Glean mods - Fixes fixed upscale factor issues - Refines a few ops to decrease computation & parameterization	2020-12-27 12:25:06 -07:00
James Betker	f9be049adb	GLEAN mod to support custom initial strides	2020-12-26 13:51:14 -07:00
James Betker	3fd627fc62	Mods to support image classification & filtering	2020-12-26 13:49:27 -07:00
James Betker	10fdfa1563	Migrate generators to dynamic model registration	2020-12-24 23:02:10 -07:00
James Betker	29db7c7a02	Further mods to BYOL	2020-12-24 09:28:41 -07:00
James Betker	036684893e	Add LARS optimizer & support for BYOL idiosyncrasies - Added LARS and SGD optimizer variants that support turning off certain features for BN and bias layers - Added a variant of pytorch's resnet model that supports gradient checkpointing. - Modify the trainer infrastructure to support above - Fix bug with BYOL (should have been nonfunctional)	2020-12-23 20:33:43 -07:00
James Betker	1bbcb96ee8	Implement a few changes to support training BYOL networks	2020-12-23 10:50:23 -07:00
James Betker	ae666dc520	Fix bugs with srflow after refactor	2020-12-19 10:28:23 -07:00
James Betker	4328c2f713	Change default ReLU slope to .2 BREAKS COMPATIBILITY This conforms my ConvGnLelu implementation with the generally accepted negative_slope=.2. I have no idea where I got .1. This will break backwards compatibility with some older models but will likely improve their performance when freshly trained. I did some auditing to find what these models might be, and I am not actively using any of them, so probably OK.	2020-12-19 08:28:03 -07:00
James Betker	9377d34ac3	glean mods	2020-12-19 08:26:07 -07:00
James Betker	92f9a129f7	GLEAN!	2020-12-18 16:04:19 -07:00
James Betker	c717765bcb	Notes for lucidrains converter.	2020-12-18 09:55:38 -07:00
James Betker	b4720ea377	Move stylegan to new location	2020-12-18 09:52:36 -07:00
James Betker	1708136b55	Commit my attempt at "conforming" the lucidrains stylegan implementation to the reference spec. Not working. will probably be abandoned.	2020-12-18 09:51:48 -07:00
James Betker	209332292a	Rosinality stylegan fix	2020-12-18 09:50:41 -07:00
James Betker	d875ca8342	More refactor changes	2020-12-18 09:24:31 -07:00
James Betker	5640e4efe4	More refactoring	2020-12-18 09:18:34 -07:00
James Betker	b905b108da	Large cleanup Removed a lot of old code that I won't be touching again. Refactored some code elements into more logical places.	2020-12-18 09:10:44 -07:00
James Betker	3074f41877	Get rosinality model converter to work Mostly, just needed to remove the custom cuda ops, not so bueno on Windows.	2020-12-17 16:03:39 -07:00
James Betker	e838c6e75b	Rosinality stylegan2 port	2020-12-17 14:18:46 -07:00
James Betker	49327b99fe	SRFlow outputs RRDB output	2020-12-16 10:28:02 -07:00
James Betker	c25b49bb12	Clean up of SRFlowNet_arch	2020-12-16 10:27:38 -07:00
James Betker	42ac8e3eeb	Remove unnecessary comment from SRFlowNet	2020-12-16 09:43:07 -07:00
James Betker	09de3052ac	Add softmax to spinenet classification head	2020-12-16 09:42:15 -07:00
James Betker	8661207d57	Merge branch 'gan_lab' of https://github.com/neonbjb/DL-Art-School into gan_lab	2020-12-15 17:16:48 -07:00
James Betker	fc376d34b2	Spinenet with logits head	2020-12-15 17:16:19 -07:00
James Betker	0a19e53df0	BYOL mods	2020-12-14 23:59:11 -07:00
James Betker	ef7eabf457	Allow RRDB to upscale 8x	2020-12-14 23:58:52 -07:00
James Betker	ec0ee25f4b	Structural latents checkpoint	2020-12-11 12:01:09 -07:00
James Betker	26ceca68c0	BYOL with structure!	2020-12-10 15:07:35 -07:00
James Betker	c203cee31e	Allow swapping to torch DDP as needed in code	2020-12-09 15:03:59 -07:00
James Betker	97ff25a086	BYOL! Man, is there anything ExtensibleTrainer can't train? :)	2020-12-08 13:07:53 -07:00
James Betker	bca59ed98a	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-12-07 12:51:04 -07:00
James Betker	ea56eb61f0	Fix DDP errors for discriminator - Don't define training_net in define_optimizers - this drops the shell and leads to problems downstream - Get rid of support for multiple training nets per opt. This was half baked and needs a better solution if needed downstream.	2020-12-07 12:50:57 -07:00
James Betker	88fc049c8d	spinenet latent playground!	2020-12-05 20:30:36 -07:00
James Betker	11155aead4	Directly use dataset keys This has been a long time coming. Cleans up messy "GT" nomenclature and simplifies ExtensibleTrainer.feed_data	2020-12-04 20:14:53 -07:00
James Betker	8a83b1c716	Go back to apex DDP, fix distributed bugs	2020-12-04 16:39:21 -07:00
James Betker	7a81d4e2f4	Revert gaussian loss changes	2020-12-04 12:49:20 -07:00
James Betker	711780126e	Cleanup	2020-12-03 23:42:51 -07:00
James Betker	ac7256d4a3	Do tqdm reporting when calculating flow_gaussian_nll	2020-12-03 23:42:29 -07:00
James Betker	dc9ff8e05b	Allow the majority of the srflow steps to checkpoint	2020-12-03 23:41:57 -07:00
James Betker	06d1c62c5a	iGPT support! Sweeeeet	2020-12-03 15:32:21 -07:00
James Betker	c18adbd606	Delete mdcn & panet Garbage, all of it.	2020-12-02 22:25:57 -07:00
James Betker	f2880b33c9	Get rid of mean shift from MDCN	2020-12-02 14:18:33 -07:00
James Betker	8a00f15746	Implement FlowGaussianNll evaluator	2020-12-02 14:09:54 -07:00
James Betker	edf408508c	Fix discriminator	2020-12-01 17:45:56 -07:00
James Betker	9a421a41f4	SRFlow: accommodate mismatches between global scale and flow_scale	2020-12-01 11:11:51 -07:00
James Betker	e343722d37	Add stepped rrdb	2020-12-01 11:11:15 -07:00
James Betker	2e0bbda640	Remove unused archs	2020-12-01 11:10:48 -07:00
James Betker	a1c8300052	Add mdcn	2020-11-30 16:14:21 -07:00
James Betker	1e0f69e34b	extra_conv in gn discriminator, multiframe support in rrdb.	2020-11-29 15:39:50 -07:00
James Betker	da604752e6	Misc RRDB changes	2020-11-29 12:21:31 -07:00
James Betker	a1d4c9f83c	multires rrdb work	2020-11-28 14:35:46 -07:00
James Betker	929cd45c05	Fix for RRDB scale	2020-11-27 21:37:10 -07:00
James Betker	71fa532356	Adjustments to how flow networks set size and scale	2020-11-27 21:37:00 -07:00
James Betker	6f958bb150	Maybe this is necessary after all?	2020-11-27 15:21:13 -07:00
James Betker	ef8d5f88c1	Bring split gaussian nll out of split so it can be computed accurately with the rest of the nll component	2020-11-27 13:30:21 -07:00
James Betker	4ab49b0d69	RRDB disc work	2020-11-27 12:03:08 -07:00
James Betker	6de4dabb73	Remove srflow (modified version) Starting from orig and re-working from there.	2020-11-27 12:02:06 -07:00
James Betker	fd356580c0	Play with lambdas	2020-11-26 20:30:55 -07:00
James Betker	cb045121b3	Expose srflow rrdb	2020-11-24 13:20:20 -07:00
James Betker	f6098155cd	Mods to tecogan to allow use of embeddings as input	2020-11-24 09:24:02 -07:00
James Betker	b10bcf6436	Rework stylegan_for_sr to incorporate structure as an adain block	2020-11-23 11:31:11 -07:00
James Betker	519ba6f10c	Support 2x RRDB with 4x srflow	2020-11-21 14:46:15 -07:00
James Betker	cad92bada8	Report logp and logdet for srflow	2020-11-21 10:13:05 -07:00
James Betker	c37d3faa58	More adjustments to srflow_orig	2020-11-20 19:38:33 -07:00
James Betker	d51d12a41a	Adjustments to srflow to (maybe?) fix training	2020-11-20 14:44:24 -07:00
James Betker	6c8c35ac47	Support training RRDB encoder [srflow]	2020-11-20 10:03:06 -07:00
James Betker	5ccdbcefe3	srflow_orig integration	2020-11-19 23:47:24 -07:00
James Betker	2b2d754d8e	Bring in an original SRFlow implementation for reference	2020-11-19 21:42:39 -07:00
James Betker	1e0d7be3ce	"Clean up" SRFlow	2020-11-19 21:42:24 -07:00
James Betker	d7877d0a36	Fixes to teco losses and translational losses	2020-11-19 11:35:05 -07:00
James Betker	5c10264538	Remove pyramid_disc hard dependencies	2020-11-17 18:34:11 -07:00
James Betker	6b679e2b51	Make grad_penalty available to classical discs	2020-11-17 18:31:40 -07:00
James Betker	8a19c9ae15	Add additive mode to rrdb	2020-11-16 20:45:09 -07:00
James Betker	2a507987df	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-11-15 16:16:30 -07:00
James Betker	931ed903c1	Allow combined additive loss	2020-11-15 16:16:18 -07:00
James Betker	4b68116977	import fix	2020-11-15 16:15:42 -07:00
James Betker	98eada1e4c	More circular dependency fixes + unet fixes	2020-11-15 11:53:35 -07:00
James Betker	e587d549f7	Fix circular imports	2020-11-15 11:32:35 -07:00
James Betker	99f0cfaab5	Rework stylegan2 divergence losses Notably: include unet loss	2020-11-15 11:26:44 -07:00
James Betker	ea94b93a37	Fixes for unet	2020-11-15 10:38:33 -07:00
James Betker	89f56b2091	Fix another import	2020-11-14 22:10:45 -07:00
James Betker	9af049c671	Import fix for unet	2020-11-14 22:09:18 -07:00
James Betker	5cade6b874	Move stylegan2 around, bring in unet	2020-11-14 22:04:48 -07:00
James Betker	125cb16dce	Add a FID evaluator for stylegan with structural guidance	2020-11-14 20:16:07 -07:00
James Betker	c9258e2da3	Alter how structural guidance is given to stylegan	2020-11-14 20:15:48 -07:00
James Betker	3397c83447	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-11-14 09:30:09 -07:00
James Betker	423ee7cb90	Allow attention to be specified for stylegan2	2020-11-14 09:29:53 -07:00
James Betker	f406a5dd4c	Mods to support stylegan2 in SR mode	2020-11-13 20:11:50 -07:00
James Betker	9c3d0b7560	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-11-13 20:10:47 -07:00
James Betker	67bf55495b	Allow hq_batched_key to be specified	2020-11-13 20:10:12 -07:00
James Betker	0b96811611	Fix another issue with gpu ids getting thrown all over the place	2020-11-13 20:05:52 -07:00
James Betker	a07e1a7292	Add separate Evaluator module and FID evaluator	2020-11-13 11:03:54 -07:00
James Betker	080ad61be4	Add option to work with nonrandom latents	2020-11-12 21:23:50 -07:00
James Betker	566b99ca75	GP adjustments for stylegan2	2020-11-12 16:44:51 -07:00
James Betker	44a19cd37c	ExtensibleTrainer mods to support advanced checkpointing for stylegan2 Basically: stylegan2 makes use of gradient-based normalizers. These make it so that I cannot use gradient checkpointing. But I love gradient checkpointing. It makes things really, really fast and memory conscious. So - only don't checkpoint when we run the regularizer loss. This is a bit messy, but speeds up training by at least 20%. Also: pytorch: please make checkpointing a first class citizen.	2020-11-12 15:45:07 -07:00
James Betker	db9e9e28a0	Fix an issue where GPU0 was always being used in non-ddp Frankly, I don't understand how this has ever worked. WTF.	2020-11-12 15:43:01 -07:00
James Betker	2d3449d7a5	stylegan2 in ml art school!	2020-11-12 15:42:05 -07:00
James Betker	fd97573085	Fixes	2020-11-11 21:49:06 -07:00
James Betker	88f349bdf1	Enable usage of wandb	2020-11-11 21:48:56 -07:00
James Betker	1c065c41b4	Revert "..." This reverts commit `4b92191880`.	2020-11-11 17:24:27 -07:00
James Betker	4b92191880	...	2020-11-11 14:12:40 -07:00
James Betker	12b57bbd03	Add residual blocks to pyramid disc	2020-11-11 13:56:45 -07:00
James Betker	b4136d766a	Back to pyramids, no rrdb	2020-11-11 13:40:24 -07:00
James Betker	42a97de756	Convert PyramidRRDBDisc to RRDBDisc Had numeric stability issues. This probably makes more sense anyways.	2020-11-11 12:14:14 -07:00
James Betker	72762f200c	PyramidRRDB net	2020-11-11 11:25:49 -07:00
James Betker	a1760f8969	Adapt srg2 for video	2020-11-10 16:16:41 -07:00
James Betker	b742d1e5a5	When skipping steps via "every", still run nontrainable injection points	2020-11-10 16:09:17 -07:00
James Betker	91d27372e4	rrdb with adain latent	2020-11-10 16:08:54 -07:00
James Betker	6a2fd5f7d0	Lots of new discriminator nets	2020-11-10 16:06:54 -07:00
James Betker	4e5ba61ae7	SRG2classic further re-integration	2020-11-10 16:06:14 -07:00
James Betker	9e2c96ad5d	More latent work	2020-11-07 20:38:56 -07:00
James Betker	0cf52ef52c	latent work	2020-11-06 20:38:23 -07:00
James Betker	34d319585c	Add srflow arch	2020-11-06 20:38:04 -07:00
James Betker	4469d2e661	More work on RRDB with latent	2020-11-05 22:13:05 -07:00
James Betker	62d3b6496b	Latent work checkpoint	2020-11-05 13:31:34 -07:00
James Betker	fd6cdba88f	RRDB with latent	2020-11-05 10:04:17 -07:00
James Betker	df47d6cbbb	More work in support of training flow networks in tandem with generators	2020-11-04 18:07:48 -07:00
James Betker	658a267bab	More work on SSIM/PSNR approximators - Add a network that accommodates this style of approximator while retaining structure - Migrate to SSIM approximation - Add a tool to visualize how these approximators are working - Fix some issues that came up while doing this work	2020-11-03 08:09:58 -07:00
James Betker	a51daacde2	Fix reporting of d_fake_diff for generators	2020-11-02 08:45:46 -07:00
James Betker	dcfe994fee	Add standalone srg2_classic Trying to investigate how I was so misguided. I thought srg2 was considerably better than RRDB in performance but am not actually seeing that.	2020-10-31 20:55:34 -06:00
James Betker	eb7df63592	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-31 11:09:32 -06:00
James Betker	c2866ad8d2	Disable debugging of comparable pingpong generations	2020-10-31 11:09:10 -06:00
James Betker	7303d8c932	Add psnr approximator	2020-10-31 11:08:55 -06:00
James Betker	565517814e	Restore SRG2 Going to try to figure out where SRG lost competitiveness to RRDB..	2020-10-30 14:01:56 -06:00
James Betker	74738489b9	Fixes and additional support for progressive zoom	2020-10-30 09:59:54 -06:00
James Betker	a3918fa808	Tecogan & other fixes	2020-10-30 00:19:58 -06:00
James Betker	b316078a15	Fix tecogan_losses fp16	2020-10-29 23:02:20 -06:00
James Betker	3791f95ad0	Enable RRDB to take in reference inputs	2020-10-29 11:07:40 -06:00
James Betker	7d38381d46	Add scaling to rrdb	2020-10-29 09:48:10 -06:00
James Betker	607ff3c67c	RRDB with bypass	2020-10-29 09:39:45 -06:00
James Betker	1655b9e242	Fix fast_forward teco loss bug	2020-10-28 17:49:54 -06:00
James Betker	515905e904	Add a min_loss that is DDP compatible	2020-10-28 15:46:59 -06:00
James Betker	f133243ac8	Extra logging for teco_resgen	2020-10-28 15:21:22 -06:00
James Betker	2ab5054d4c	Add noise to teco disc	2020-10-27 22:48:23 -06:00
James Betker	4dc16d5889	Upgrade tecogan_losses for speed	2020-10-27 22:40:15 -06:00
James Betker	ac3da0c5a6	Make tecogen functional	2020-10-27 21:08:59 -06:00
James Betker	10da206db6	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-27 20:59:59 -06:00
James Betker	9848f4c6cb	Add teco_resgen	2020-10-27 20:59:55 -06:00
James Betker	543c384a91	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-27 20:59:16 -06:00
James Betker	da53090ce6	More adjustments to support distributed training with teco & on multi_modal_train	2020-10-27 20:58:03 -06:00
James Betker	00bb568956	further checkpointify spsr_arch	2020-10-27 17:54:28 -06:00
James Betker	c2727a0150	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-27 15:24:19 -06:00
James Betker	2a3eec8fd7	Fix some distributed training snafus	2020-10-27 15:24:05 -06:00
James Betker	d923a62ed3	Allow SPSR to checkpoint	2020-10-27 15:23:20 -06:00
James Betker	11a9e223a6	Retrofit SPSR_arch so it is capable of accepting a ref	2020-10-27 11:14:36 -06:00
James Betker	8202ee72b9	Re-add original SPSR_arch	2020-10-27 11:00:38 -06:00
James Betker	231137ab0a	Revert RRDB back to original model	2020-10-27 10:25:31 -06:00
James Betker	1ce863849a	Remove temporary base_model change	2020-10-26 11:13:01 -06:00
James Betker	54accfa693	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-26 11:12:37 -06:00
James Betker	ff58c6484a	Fixes to unified chunk datasets to support stereoscopic training	2020-10-26 11:12:22 -06:00
James Betker	f857eb00a8	Allow tecogan losses to compute at 32px	2020-10-26 11:09:55 -06:00
James Betker	629b968901	ChainedGen 4x alteration Increases conv window for teco_recurrent in the 4x case so all data can be used. base_model changes should be temporary.	2020-10-26 10:54:51 -06:00
James Betker	85c07f85d9	Update flownet submodule	2020-10-24 11:59:00 -06:00
James Betker	9c3d059ef0	Updates to be able to train flownet2 in ExtensibleTrainer Only supports basic losses for now, though.	2020-10-24 11:56:39 -06:00
James Betker	1dbcbfbac8	Restore ChainedEmbeddingGenWithStructure Still using this guy, after all	2020-10-24 11:54:52 -06:00
James Betker	7a75d10784	Arch cleanup	2020-10-23 09:35:33 -06:00
James Betker	646d6a621a	Support 4x zoom on ChainedEmbeddingGen	2020-10-23 09:25:58 -06:00
James Betker	e9c0b9f0fd	More adjustments to support multi-modal training Specifically - looks like at least MSE loss cannot handle autocasted tensors	2020-10-22 16:49:34 -06:00
James Betker	76789a456f	Class-ify train.py and work on multi-modal trainer	2020-10-22 16:15:31 -06:00
James Betker	15e00e9014	Finish integration with autocast Note: autocast is broken when also using checkpoint(). Overcome this by modifying torch's checkpoint() function in place to also use autocast.	2020-10-22 14:39:19 -06:00
James Betker	d7ee14f721	Move to torch.cuda.amp (not working) Running into OOM errors, needs diagnosing. Checkpointing here.	2020-10-22 13:58:05 -06:00
James Betker	3e3d2af1f3	Add multi-modal trainer	2020-10-22 13:27:32 -06:00
James Betker	40dc2938e8	Fix multifaceted chain gen	2020-10-22 13:27:06 -06:00
James Betker	43c4f92123	Collapse progressive zoom candidates into the batch dimension This contributes a significant speedup to training this type of network since losses can operate on the entire prediction spectrum at once.	2020-10-21 22:37:23 -06:00
James Betker	680d635420	Enable ExtensibleTrainer to skip steps when state keys are missing	2020-10-21 22:22:28 -06:00
James Betker	d1175f0de1	Add FFT injector	2020-10-21 22:22:00 -06:00
James Betker	1ef559d7ca	Add a ChainedEmbeddingGen which can be simultaneously used with multiple training paradigms	2020-10-21 22:21:51 -06:00
James Betker	931aa65dd0	Allow recurrent losses to be weighted	2020-10-21 16:59:44 -06:00
James Betker	5753e77d67	ChainedGen: Output debugging information on blocks	2020-10-21 16:36:23 -06:00
James Betker	3c6e600e48	Add capacity for models to self-report visuals	2020-10-21 11:08:03 -06:00
James Betker	dca5cddb3b	Add bypass to ChainedEmbeddingGen	2020-10-21 11:07:45 -06:00
James Betker	a63bf2ea2f	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-19 15:26:11 -06:00
James Betker	76e4f0c086	Restore test.py for use as standalone validator	2020-10-19 15:26:07 -06:00
James Betker	1b1ca297f8	Fix recurrent=None bug in ChainedEmbeddingGen	2020-10-19 15:25:12 -06:00
James Betker	b28e4d9cc7	Add spread loss Experimental loss that peaks around 0.	2020-10-19 11:31:19 -06:00
James Betker	981d64413b	Support validation over a custom injector Also re-enable PSNR	2020-10-19 11:01:56 -06:00
James Betker	668cafa798	Push correct patch of recurrent embedding to upstream image, rather than whole thing	2020-10-18 22:39:52 -06:00
James Betker	7df378a944	Remove separated vgg discriminator Checkpointing happens inline instead. Was a dumb idea.. Also fixes some loss reporting issues.	2020-10-18 12:10:24 -06:00
James Betker	c709d38cd5	Fix memory leak with recurrent loss	2020-10-18 10:22:10 -06:00
James Betker	552e70a032	Get rid of excessive checkpointed disc params	2020-10-18 10:09:37 -06:00
James Betker	6a0d5f4813	Add a checkpointable discriminator	2020-10-18 09:57:47 -06:00
James Betker	9ead2c0a08	Multiscale training in!	2020-10-17 22:54:12 -06:00
James Betker	e706911c83	Fix spinenet bug	2020-10-17 20:20:36 -06:00
James Betker	b008a27d39	Spinenet should allow bypassing the initial conv This makes feeding in references for recurrence easier.	2020-10-17 20:16:47 -06:00
James Betker	c1c9c5681f	Swap recurrence	2020-10-17 08:40:28 -06:00
James Betker	6141aa1110	More recurrence fixes for chainedgen	2020-10-17 08:35:46 -06:00
James Betker	cf8118a85b	Allow recurrence to be specified for chainedgen	2020-10-17 08:32:29 -06:00
James Betker	fc4c064867	Add recurrent support to chainedgenwithstructure	2020-10-17 08:31:34 -06:00
James Betker	d4a3e11ab2	Don't use several stages of spinenet_arch These are used for lower outputs which I am not using	2020-10-17 08:28:37 -06:00
James Betker	d1c63ae339	Go back to torch's DDP Apex was having some weird crashing issues.	2020-10-16 20:47:35 -06:00
James Betker	d856378b2e	Add ChainedGenWithStructure	2020-10-16 20:44:36 -06:00
James Betker	617d97e19d	Add ChainedEmbeddingGen	2020-10-15 23:18:08 -06:00
James Betker	c4543ce124	Set post_transform_block to None where applicable	2020-10-15 17:20:42 -06:00
James Betker	6f8705e8cb	SSGSimpler network	2020-10-15 17:18:44 -06:00
James Betker	eda75c9779	Cleanup fixes	2020-10-15 10:13:17 -06:00
James Betker	920865defb	Arch work	2020-10-15 10:13:06 -06:00
James Betker	1f20d59c31	Revert big switch back	2020-10-14 11:03:34 -06:00
James Betker	24792bdb4f	Codebase cleanup Removed a lot of legacy stuff I have no intention of using again. Plan is to shape this repo into something more extensible (get it? hah!)	2020-10-13 20:56:39 -06:00
James Betker	e620fc05ba	Mods to support video processing with teco networks	2020-10-13 20:47:05 -06:00
James Betker	17d78195ee	Mods to SRG to support returning switch logits	2020-10-13 20:46:37 -06:00
James Betker	cc915303a5	Fix SPSR calls into SwitchComputer	2020-10-13 10:14:47 -06:00
James Betker	bdf4c38899	Merge remote-tracking branch 'origin/gan_lab' into gan_lab # Conflicts: # codes/models/archs/SwitchedResidualGenerator_arch.py	2020-10-13 10:12:26 -06:00
James Betker	9a5d6162e9	Add the "BigSwitch"	2020-10-13 10:11:10 -06:00
James Betker	8014f050ac	Clear metrics properly Holy cow, what a PITA bug.	2020-10-13 10:07:49 -06:00
James Betker	4d52374e60	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-12 17:43:51 -06:00
James Betker	731700ab2c	checkpoint in ssg	2020-10-12 17:43:28 -06:00
James Betker	ca523215c6	Fix recurrent std in arch	2020-10-12 17:42:32 -06:00
James Betker	05377973bf	Allow initial recurrent input to be specified (optionally)	2020-10-12 17:36:43 -06:00
James Betker	597b6e92d6	Add ssgr1 recurrence	2020-10-12 17:18:19 -06:00
James Betker	d7d7590f3e	Fix constant injector - wasn't working in test	2020-10-12 10:36:30 -06:00
James Betker	ce163ad4a9	Update SSGdeep	2020-10-12 10:22:08 -06:00
James Betker	3409d88a1c	Add PANet arch	2020-10-12 10:20:55 -06:00
James Betker	a9c2e97391	Constant injector and teco fixes	2020-10-11 08:20:07 -06:00
James Betker	e785029936	Mods needed to support SPSR archs with teco gan	2020-10-10 22:39:55 -06:00
James Betker	120072d464	Add constant injector	2020-10-10 21:50:23 -06:00
James Betker	f99812e14d	Fix tecogan_losses errors	2020-10-10 20:30:14 -06:00
James Betker	3a5b23b9f7	Alter teco_losses to feed a recurrent input in as separate	2020-10-10 20:21:09 -06:00
James Betker	0d30d18a3d	Add MarginRemoval injector	2020-10-09 20:35:56 -06:00
James Betker	0011d445c8	Fix loss indexing	2020-10-09 20:20:51 -06:00
James Betker	202eb11fdc	For element loss added	2020-10-09 19:51:44 -06:00
James Betker	fe50d6f9d0	Fix attention images	2020-10-09 19:21:55 -06:00
James Betker	7e777ea34c	Allow tecogan to be used in process_video	2020-10-09 19:21:43 -06:00
James Betker	58d8bf8f69	Add network architecture built for teco	2020-10-09 08:40:14 -06:00
James Betker	afe6af88af	Fix attention print issue	2020-10-08 18:34:00 -06:00
James Betker	4c85ee51a4	Converge SSG architectures into unified switching base class Also adds attention norm histogram to logging	2020-10-08 17:23:21 -06:00
James Betker	1eb516d686	Fix more distributed bugs	2020-10-08 14:32:45 -06:00
James Betker	fba29d7dcc	Move to apex distributeddataparallel and add switch all_reduce Torch's distributed_data_parallel is missing "delay_allreduce", which is necessary to get gradient checkpointing to work with recurrent models.	2020-10-08 11:20:05 -06:00
James Betker	c174ac0fd5	Allow tecogan to support generators that only output a tensor (instead of a list)	2020-10-08 09:26:25 -06:00
James Betker	969bcd9021	Use local checkpoint in SSG	2020-10-08 08:54:46 -06:00
James Betker	c93dd623d7	Tecogan losses work	2020-10-07 23:11:58 -06:00
James Betker	c96f5b2686	Import switched_conv as a submodule	2020-10-07 23:10:54 -06:00
James Betker	c352c8bce4	More tecogan fixes	2020-10-07 12:41:17 -06:00
James Betker	1c44d395af	Tecogan work It's training! There are probably still plenty of bugs though..	2020-10-07 09:03:30 -06:00
James Betker	e9d7371a61	Add concatenate injector	2020-10-07 09:02:42 -06:00
James Betker	8a7e993aea	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-10-06 20:41:58 -06:00
James Betker	1e415b249b	Add tag that can be applied to prevent parameter training	2020-10-06 20:39:49 -06:00
James Betker	2f2e3f33f8	StackedSwitchedGenerator_5lyr	2020-10-06 20:39:32 -06:00
James Betker	6217b48e3f	Fix spsr_arch bug	2020-10-06 20:38:47 -06:00
James Betker	cffc596141	Integrate flownet2 into codebase, add teco visual debugs	2020-10-06 20:35:39 -06:00
James Betker	e4b89a172f	Reduce spsr7 memory usage	2020-10-05 22:05:56 -06:00
James Betker	4111942ada	Support attention deferral in deep ssgr	2020-10-05 19:35:55 -06:00
James Betker	840927063a	Work on tecogan losses	2020-10-05 19:35:28 -06:00
James Betker	2875822024	SPSR9 arch takes some of the stuff I learned with SGSR yesterday and applies it to spsr	2020-10-05 08:47:51 -06:00
James Betker	51044929af	Don't compute attention statistics on multiple generator invocations of the same data	2020-10-05 00:34:29 -06:00
James Betker	e760658fdb	Another fix..	2020-10-04 21:08:00 -06:00
James Betker	a890e3a9c0	Fix geometric loss not handling 0 index	2020-10-04 21:05:01 -06:00
James Betker	c3ef8a4a31	Stacked switches - return a tuple	2020-10-04 21:02:24 -06:00
James Betker	13f97e1e97	Add recursive loss	2020-10-04 20:48:15 -06:00
James Betker	ffd069fd97	Lots of SSG work - Checkpointed pretty much the entire model - enabling recurrent inputs - Added two new models for test - adding depth (again) and removing SPSR (in lieu of the new losses)	2020-10-04 20:48:08 -06:00
James Betker	aca2c7ab41	Full checkpoint-ize SSG1	2020-10-04 18:24:52 -06:00
James Betker	e3294939b0	Revert "SSG: offer option to use BN-based attention normalization" Didn't work. Oh well. This reverts commit `5cd2b37591`.	2020-10-03 17:54:53 -06:00
James Betker	5cd2b37591	SSG: offer option to use BN-based attention normalization. Not sure how this is going to work, let's try it.	2020-10-03 16:16:19 -06:00
James Betker	9b4ed82093	Get rid of unused convs in spsr7	2020-10-03 11:36:26 -06:00
James Betker	3561cc164d	Fix up fea_loss calculator (for validation). Not sure how this was working in regular training mode, but it was failing in DDP.	2020-10-03 11:19:20 -06:00
James Betker	6c9718ad64	Don't log if you aren't 0 rank	2020-10-03 11:14:13 -06:00
James Betker	922b1d76df	Don't record visuals when not on rank 0	2020-10-03 11:10:03 -06:00
James Betker	8197fd646f	Don't accumulate losses for metrics when the loss isn't a tensor	2020-10-03 11:03:55 -06:00
James Betker	19a4075e1e	Allow checkpointing to be disabled in the options file. Also makes options a global variable for usage in utils.	2020-10-03 11:03:28 -06:00
James Betker	dd9d7b27ac	Add more sophisticated mechanism for balancing GAN losses	2020-10-02 22:53:42 -06:00
James Betker	39865ca3df	TOTAL_loss, dumbo	2020-10-02 21:06:10 -06:00
James Betker	4e44fcd655	Loss accumulator fix	2020-10-02 20:55:33 -06:00
James Betker	567b4d50a4	ExtensibleTrainer - don't compute backward when there is no loss	2020-10-02 20:54:06 -06:00
James Betker	146a9125f2	Modify geometric & translational losses so they can be used with embeddings	2020-10-02 20:40:13 -06:00
James Betker	e30a1443cd	Change sw2 refs	2020-10-02 09:01:18 -06:00
James Betker	e38716925f	Fix spsr8 class init	2020-10-02 09:00:18 -06:00
James Betker	35469f08e2	Spsr 8	2020-10-02 08:58:15 -06:00
James Betker	aa4fd89018	resnext with groupnorm	2020-10-01 15:49:28 -06:00
James Betker	8beaa47933	resnext discriminator	2020-10-01 11:48:14 -06:00
James Betker	55f2764fef	Allow fixup50 to be used as a discriminator	2020-10-01 11:28:18 -06:00
James Betker	7986185fcb	Change 'mod_step' to 'every'	2020-10-01 11:28:06 -06:00
James Betker	d9ae970fd9	SSG update	2020-10-01 11:27:51 -06:00
James Betker	e3053e4e55	Exchange SpsrNet for SpsrNetSimplified	2020-09-30 17:01:04 -06:00
James Betker	66d4512029	Fix up translational equivariance loss so it's ready for prime time	2020-09-30 12:01:00 -06:00
James Betker	896b4f5be2	Revert "spsr7 adjustments" This reverts commit `9fee1cec71`.	2020-09-29 18:30:41 -06:00
James Betker	9fee1cec71	spsr7 adjustments	2020-09-29 17:19:59 -06:00
James Betker	dc8f3b24de	Don't let duplicate keys be used for injectors and losses	2020-09-29 16:59:44 -06:00
James Betker	0b5a033503	spsr7 + cleanup. SPSR7 adds ref onto spsr6, makes more "common sense" mods.	2020-09-29 16:59:26 -06:00
James Betker	f9b83176f1	Fix bugs in extensibletrainer	2020-09-28 22:09:42 -06:00
James Betker	db52bec4ab	spsr6. This is meant to be a variant of SPSR5 that harkens back to the simpler earlier architectures that do not have embeddings or ref_ inputs, but do have deep multiplexers. It does, however, use some of the new conjoin mechanisms.	2020-09-28 22:09:27 -06:00
James Betker	7e240f2fed	Recurrent / teco work	2020-09-28 22:06:56 -06:00
James Betker	aeaf185314	Add RCAN	2020-09-27 16:00:41 -06:00
James Betker	4d29b7729e	Model arch cleanup	2020-09-27 11:18:45 -06:00
James Betker	31641d7f63	Add ImagePatchInjector and TranslationalLoss	2020-09-26 21:25:32 -06:00
James Betker	d8621e611a	BackboneSpineNoHead takes ref	2020-09-26 21:25:04 -06:00
James Betker	5a27187c59	More mods to accommodate new dataset	2020-09-25 22:45:57 -06:00
James Betker	6d0490a0e6	Tecogan implementation work	2020-09-25 16:38:23 -06:00
James Betker	ce4613ecb9	Finish up single_image_dataset work. Sweet!	2020-09-25 16:37:54 -06:00
James Betker	ea565b7eaf	More fixes	2020-09-24 17:51:52 -06:00
James Betker	553917a8d1	Fix torchvision import bug	2020-09-24 17:38:34 -06:00
James Betker	58886109d4	Update how spsr arches do attention to conform with sgsr	2020-09-24 16:53:54 -06:00
James Betker	9a50a7966d	SiLU doesn't support inplace	2020-09-23 21:09:13 -06:00
James Betker	eda0eadba2	Use custom SiLU. Torch didn't have this before 1.7	2020-09-23 21:05:06 -06:00
James Betker	05963157c1	Several things - Fixes to 'after' and 'before' defs for steps (turns out they weren't working) - Feature nets take in a list of layers to extract. Not fully implemented yet. - Fixes bugs with RAGAN - Allows real input into generator gan to not be detached by param	2020-09-23 11:56:36 -06:00
James Betker	4ab989e015	try again..	2020-09-22 18:27:52 -06:00
James Betker	3b6c957194	Fix? it again?	2020-09-22 18:25:59 -06:00
James Betker	7b60d9e0d8	Fix? cosine loss	2020-09-22 18:18:35 -06:00
James Betker	2e18c4c22d	Add CosineEmbeddingLoss to F	2020-09-22 17:10:29 -06:00
James Betker	f40beb5460	Add 'before' and 'after' defs to injections, steps and optimizers	2020-09-22 17:03:22 -06:00
James Betker	419f77ec19	Some new backbones	2020-09-21 12:36:49 -06:00
James Betker	9429544a60	Spinenet: implementation without 4x downsampling right off the bat	2020-09-21 12:36:30 -06:00
James Betker	53a5657850	Fix SSGR	2020-09-20 19:07:15 -06:00
James Betker	17c569ea62	Add geometric loss	2020-09-20 16:24:23 -06:00
James Betker	17dd99b29b	Fix bug with discriminator noise addition. It wasn't using the scale and was applying the noise to the underlying state variable.	2020-09-20 12:00:27 -06:00
James Betker	3138f98fbc	Allow discriminator noise to be injected at the loss level, cleans up configs	2020-09-19 21:47:52 -06:00
James Betker	e9a39bfa14	Recursively detach all outputs, even if they are nested in data structures	2020-09-19 21:47:34 -06:00
James Betker	fe82785ba5	Add some new architectures to ssg	2020-09-19 21:47:10 -06:00
James Betker	b83f097082	Get rid of get_debug_values from RRDB, rectify outputs	2020-09-19 21:46:36 -06:00
James Betker	e0bd68efda	Add ImageFlowInjector	2020-09-19 10:07:00 -06:00
James Betker	e2a146abc7	Add in experiments hook	2020-09-19 10:05:25 -06:00
James Betker	9a17ade550	Some convenience adjustments to ExtensibleTrainer	2020-09-17 21:05:32 -06:00
James Betker	9963b37200	Add a new script for loading a discriminator network and using it to filter images	2020-09-17 13:30:32 -06:00
James Betker	723754c133	Update attention debugger outputting for SSG	2020-09-16 13:09:46 -06:00
James Betker	0918430572	SSG network. This branches off of SPSR. It is identical but substantially reduced in complexity. It's intended to be my long term working arch.	2020-09-15 20:59:24 -06:00
James Betker	6deab85b9b	Add BackboneEncoderNoRef	2020-09-15 16:55:38 -06:00
James Betker	d0321ca5de	Don't load amp state dict if amp is disabled	2020-09-14 15:21:42 -06:00
James Betker	ccf8438001	SPSR5. This is SPSR4, but the multiplexers have access to the output of the transformations for making their decision.	2020-09-13 20:10:24 -06:00
James Betker	5b85f891af	Only log the name of the first network in the total_loss training set	2020-09-12 16:07:09 -06:00
James Betker	fb595e72a4	Supporting infrastructure in ExtensibleTrainer to train spsr4. Need to be able to train 2 nets in one step: the backbone will be entirely separate with its own optimizer (for an extremely low LR). This functionality was already present, just not implemented correctly.	2020-09-11 22:57:06 -06:00
James Betker	4e44bca611	SPSR4 aka - return of the backbone! I'm tired of massively overparameterized generators with pile-of-shit multiplexers. Let's give this another try..	2020-09-11 22:55:37 -06:00
James Betker	19896abaea	Clean up old SwitchedSpsr arch. It didn't work anyways, so why not?	2020-09-11 16:09:28 -06:00
James Betker	50ca17bb0a	Feature mode -> back to LR fea	2020-09-11 13:09:55 -06:00
James Betker	1086f0476b	Fix ref branch using fixed filters	2020-09-11 08:58:35 -06:00
James Betker	8c469b8286	Enable memory checkpointing	2020-09-11 08:44:29 -06:00
James Betker	5189b11dac	Add combined dataset for training across multiple datasets	2020-09-11 08:44:06 -06:00
James Betker	313424d7b5	Add new referencing discriminator. Also extend the way losses work so that you can pass parameters into the discriminator from the config file	2020-09-10 21:35:29 -06:00
James Betker	9e5aa166de	Report the standard deviation of ref branches. This patch also ups the contribution	2020-09-10 16:34:41 -06:00
James Betker	668bfbff6d	Back to best arch for spsr3	2020-09-10 14:58:14 -06:00
James Betker	992b0a8d98	spsr3 with conjoin stage as part of the switch	2020-09-10 09:11:37 -06:00
James Betker	e0fc5eb50c	Temporary commit - noise	2020-09-09 17:12:52 -06:00
James Betker	00da69d450	Temporary commit - ref	2020-09-09 17:09:44 -06:00
James Betker	df59d6c99d	More spsr3 mods - Most branches get their own noise vector now. - First attention branch has the intended sole purpose of raw image processing - Remove norms from joiner block	2020-09-09 16:46:38 -06:00
James Betker	747ded2bf7	Fixes to the spsr3. Some lessons learned: - Biases are fairly important as a relief valve. They don't need to be everywhere, but most computationally heavy branches should have a bias. - GroupNorm in SPSR is not a great idea. Since image gradients are represented in this model, normal means and standard deviations are not applicable. (imggrad has a high representation of 0). - Don't fuck with the mainline of any generative model. As much as possible, all additions should be done through residual connections. Never pollute the mainline with reference data, do that in branches. It basically leaves the model untrainable.	2020-09-09 15:28:14 -06:00
James Betker	0ffac391c1	SPSR with ref joining	2020-09-09 11:17:07 -06:00
James Betker	3027e6e27d	Enable amp to be disabled	2020-09-09 10:45:59 -06:00
James Betker	c04f244802	More mods	2020-09-08 20:36:27 -06:00
James Betker	dffbfd2ec4	Allow SRG checkpointing to be toggled	2020-09-08 15:14:43 -06:00
James Betker	e6207d4c50	SPSR3 work. SPSR3 is meant to fix whatever is causing the switching units inside of the newer SPSR architectures to fail and basically not use the multiplexers.	2020-09-08 15:14:23 -06:00
James Betker	5606e8b0ee	Fix SRGAN_model/fullimgdataset compatibility 1	2020-09-08 11:34:35 -06:00
James Betker	22c98f1567	Move MultiConvBlock to arch_util	2020-09-08 08:17:27 -06:00
James Betker	146ace0859	CSNLN changes (removed because it doesn't train well)	2020-09-08 08:04:16 -06:00
James Betker	f43df7f5f7	Make ExtensibleTrainer compatible with process_video	2020-09-08 08:03:41 -06:00
James Betker	a18ece62ee	Add updated spsr net for test	2020-09-07 17:01:48 -06:00
James Betker	55475d2ac1	Clean up unused archs	2020-09-07 11:38:11 -06:00
James Betker	e8613041c0	Add novograd optimizer	2020-09-06 17:27:08 -06:00
James Betker	b1238d29cb	Fix trainable not applying to discriminators	2020-09-05 20:31:26 -06:00
James Betker	21ae135f23	Allow Novograd to be used as an optimizer	2020-09-05 16:50:13 -06:00
James Betker	912a4d9fea	Fix srg computer bug	2020-09-05 07:59:54 -06:00
James Betker	0dfd8eaf3b	Support injectors that run in eval only	2020-09-05 07:59:45 -06:00
James Betker	44c75f7642	Undo SRG change	2020-09-04 17:32:16 -06:00
James Betker	6657a406ac	Mods needed to support training a corruptor again: - Allow original SPSRNet to have a specifiable block increment - Cleanup - Bug fixes in code that hasn't been touched in a while.	2020-09-04 15:33:39 -06:00
James Betker	bfdfaab911	Checkpoint RRDB. Greatly reduces memory consumption with a low performance penalty	2020-09-04 15:32:00 -06:00
James Betker	696242064c	Use tensor checkpointing to drastically reduce memory usage. This comes at the expense of computation, but since we can use much larger batches, it results in a net speedup.	2020-09-03 11:33:36 -06:00
James Betker	365813bde3	Add InterpolateInjector	2020-09-03 11:32:47 -06:00
James Betker	d90c96e55e	Fix greyscale injector	2020-09-02 10:29:40 -06:00
James Betker	8b52d46847	Interpreted feature loss to extensibletrainer	2020-09-02 10:08:24 -06:00
James Betker	886d59d5df	Misc fixes & adjustments	2020-09-01 07:58:11 -06:00
James Betker	0a9b85f239	Fix vgg_gn input_img_factor	2020-08-31 09:50:30 -06:00
James Betker	4b4d08bdec	Enable testing in ExtensibleTrainer, fix it in SRGAN_model. Also compute fea loss for this.	2020-08-31 09:41:48 -06:00
James Betker	b2091cb698	feamod fix	2020-08-30 08:08:49 -06:00
James Betker	a56e906f9c	train HR feature trainer	2020-08-29 09:27:48 -06:00
James Betker	0e859a8082	4x spsr ref (not workin)	2020-08-29 09:27:18 -06:00
James Betker	25832930db	Update loss with lr crossgan	2020-08-26 17:57:22 -06:00
James Betker	cbd5e7a986	Support old school crossgan in extensibletrainer	2020-08-26 17:52:35 -06:00
James Betker	8a6a2e6e2e	Rev3 of the full image ref arch	2020-08-26 17:11:01 -06:00
James Betker	f35b3ad28f	Fix val behavior for ExtensibleTrainer	2020-08-26 08:44:22 -06:00
James Betker	434ed70a9a	Wrap vgg disc	2020-08-25 18:14:45 -06:00
James Betker	83f2f8d239	more debugging	2020-08-25 18:12:12 -06:00
James Betker	3f60281da7	Print when wrapping	2020-08-25 18:08:46 -06:00
James Betker	bae18c05e6	wrap disc grad	2020-08-25 17:58:20 -06:00
James Betker	f85f1e21db	Turns out, can't do that	2020-08-25 17:18:52 -06:00
James Betker	935a735327	More dohs	2020-08-25 17:05:16 -06:00
James Betker	53e67bdb9c	Distribute get_grad_no_padding	2020-08-25 17:03:18 -06:00
James Betker	2f706b7d93	I am inept.	2020-08-25 16:42:59 -06:00
James Betker	8bae0de769	ffffffffffffffffff	2020-08-25 16:41:01 -06:00
James Betker	1fe16f71dd	Fix bug reporting spsr gan weight	2020-08-25 16:37:45 -06:00
James Betker	96586d6592	Fix distributed d_grad	2020-08-25 16:06:27 -06:00
James Betker	09a9079e17	Check rank before doing image logging.	2020-08-25 16:00:49 -06:00
James Betker	a1800f45ef	Fix for referencingmultiplexer	2020-08-25 15:43:12 -06:00
James Betker	a65b07607c	Reference network	2020-08-25 11:56:59 -06:00
James Betker	f9276007a8	More fixes to corrupt_fea	2020-08-23 17:52:18 -06:00
James Betker	0005c56cd4	dbg	2020-08-23 17:43:03 -06:00
James Betker	4bb5b3c981	corfea debugging	2020-08-23 17:39:02 -06:00
James Betker	7713cb8df5	Corrupted features in srgan	2020-08-23 17:32:03 -06:00
James Betker	dffc15184d	More ExtensibleTrainer work. It runs now, just need to debug it to reach performance parity with SRGAN. Sweet.	2020-08-23 17:22:45 -06:00
James Betker	afdd93fbe9	Grey feature	2020-08-22 13:41:38 -06:00
James Betker	e59e712e39	More ExtensibleTrainer work	2020-08-22 13:08:33 -06:00
James Betker	f40545f235	ExtensibleTrainer work	2020-08-22 08:24:34 -06:00
James Betker	a498d7b1b3	Report l_g_gan_grad before weight multiplication	2020-08-20 11:57:53 -06:00
James Betker	9d77a4db2e	Allow initial temperature to be specified to SPSR net for inference	2020-08-20 11:57:34 -06:00
James Betker	24bdcc1181	Let SwitchedSpsr transform count be specified	2020-08-18 09:10:25 -06:00
James Betker	74cdaa2226	Some work on extensible trainer	2020-08-18 08:49:32 -06:00
James Betker	868d0aa442	Undo early dim reduction on grad branch for SPSR_arch	2020-08-14 16:23:42 -06:00
James Betker	2d205f52ac	Unite spsr_arch switched gens. Found a pretty good basis model.	2020-08-12 17:04:45 -06:00
James Betker	3d0ece804b	SPSR LR2	2020-08-12 08:45:49 -06:00
James Betker	ab04ca1778	Extensible trainer (in progress)	2020-08-12 08:45:23 -06:00
James Betker	cb316fabc7	Use LR data for image gradient prediction when HR data is disjoint	2020-08-10 15:00:28 -06:00
James Betker	f0e2816239	Denoise attention maps	2020-08-10 14:59:58 -06:00
James Betker	59aba1daa7	LR switched SPSR arch. This variant doesn't do conv processing at HR, which should save a ton of memory in inference. Let's see how it works.	2020-08-10 13:03:36 -06:00
James Betker	4e972144ae	More attention fixes for switched_spsr	2020-08-07 21:11:50 -06:00
James Betker	d02509ef97	spsr_switched missing import	2020-08-07 21:05:29 -06:00
James Betker	887806ffa0	Finish up spsr_switched	2020-08-07 21:03:48 -06:00
James Betker	1d5f4f6102	Crossgan	2020-08-07 21:03:39 -06:00
James Betker	fd7b6ca0a9	Compute gan_grad_branch....	2020-08-06 12:11:40 -06:00
James Betker	30b16d5235	Update how branch GAN grad is disseminated	2020-08-06 11:13:02 -06:00
James Betker	1f21c02f8b	Add cross-compare discriminator	2020-08-06 08:56:21 -06:00
James Betker	be272248af	More RAGAN fixes	2020-08-05 16:47:21 -06:00
James Betker	26a6a5d512	Compute grad GAN loss against both the branch and final target, simplify pixel loss. Also fixes a memory leak issue where we weren't detaching our loss stats when logging them. This stabilizes memory usage substantially.	2020-08-05 12:08:15 -06:00
James Betker	299ee13988	More RAGAN fixes	2020-08-05 11:03:06 -06:00
James Betker	b8a4df0a0a	Enable RAGAN in SPSR, retrofit old RAGAN for efficiency	2020-08-05 10:34:34 -06:00
James Betker	3ab39f0d22	Several new spsr nets	2020-08-05 10:01:24 -06:00
James Betker	3c0a2d6efe	Fix grad branch debug out	2020-08-04 16:43:43 -06:00
James Betker	ec2a795d53	Fix multistep optimizer (feeding from wrong config params)	2020-08-04 16:42:58 -06:00
James Betker	4bfbdaf94f	Don't recompute generator outputs for D in standard operation. Should significantly improve training performance with negligible results differences.	2020-08-04 11:28:52 -06:00
James Betker	11b227edfc	Whoops	2020-08-04 10:30:40 -06:00
James Betker	6d25bcd5df	Apply fixes to grad discriminator	2020-08-04 10:25:13 -06:00
James Betker	c7e5d3888a	Add pix_grad_branch loss to metrics	2020-08-03 16:21:05 -06:00
James Betker	0d070b47a7	Add simplified SPSR architecture. Basically just cleaning up the code, removing some bad conventions, and reducing complexity somewhat so that I can play around with this arch a bit more easily.	2020-08-03 10:25:37 -06:00
James Betker	47e24039b5	Fix bug that makes feature loss run even when it is off	2020-08-02 20:37:51 -06:00
James Betker	328afde9c0	Integrate SPSR into SRGAN_model. SPSR_model really isn't that different from SRGAN_model. Rather than continuing to re-implement everything I've done in SRGAN_model, port the new stuff from SPSR over. This really demonstrates the need to refactor SRGAN_model a bit to make it cleaner. It is quite the beast these days..	2020-08-02 12:55:08 -06:00
James Betker	c8da78966b	Substantial SPSR mods & fixes - Added in gradient accumulation via mega-batch-factor - Added AMP - Added missing train hooks - Added debug image outputs - Cleaned up including removing GradientPenaltyLoss, custom SpectralNorm - Removed all the custom discriminators	2020-08-02 10:45:24 -06:00
James Betker	f894ba8f98	Add SPSR_module. This is a port from the SPSR repo, it's going to need a lot of work to be properly integrated but as of this commit it at least runs.	2020-08-01 22:02:54 -06:00
James Betker	f33ed578a2	Update how attention_maps are created	2020-08-01 20:23:46 -06:00
James Betker	c139f5cd17	More torch 1.6 fixes	2020-07-31 17:03:20 -06:00
James Betker	a66fbb32b6	Fix fixed_disc DataParallel issue	2020-07-31 16:59:23 -06:00
James Betker	8dd44182e6	Fix scale torch warning	2020-07-31 16:56:04 -06:00
James Betker	bcebed19b7	Fix pixdisc bugs	2020-07-31 16:38:14 -06:00
James Betker	eb11a08d1c	Enable disjoint feature networks. This is done by pre-training a feature net that predicts the features of HR images from LR images. Then use the original feature network and this new one in tandem to work only on LR/Gen images.	2020-07-31 16:29:47 -06:00
James Betker	6e086d0c20	Fix fixed_disc	2020-07-31 15:07:10 -06:00
James Betker	d5fa059594	Add capability to have old discriminators serve as feature networks	2020-07-31 14:59:54 -06:00
James Betker	6b45b35447	Allow multi_step_lr_scheduler to load a new LR schedule when restoring state	2020-07-31 11:21:11 -06:00
James Betker	e37726f302	Add feature_model for training custom feature nets	2020-07-31 11:20:39 -06:00
James Betker	7629cb0e61	Add FDPL Loss. New loss type that can replace PSNR loss. Works against the frequency domain and focuses on frequency features lost during hr->lr conversion.	2020-07-30 20:47:57 -06:00
James Betker	85ee64b8d9	Turn down feadisc intensity. Honestly - this feature is probably going to be removed soon, so backwards compatibility is not a huge deal anymore.	2020-07-27 15:28:55 -06:00
James Betker	ebb199e884	Get rid of safety valve (probably being encountered in val)	2020-07-26 22:51:59 -06:00
James Betker	d09ed4e5f7	Misc fixes	2020-07-26 22:44:24 -06:00
James Betker	c54784ae9e	Fix feature disc log item error	2020-07-26 22:25:59 -06:00
James Betker	9a8f227501	Allow separate dataset to be pushed in for GAN-only training	2020-07-26 21:44:45 -06:00
James Betker	b06e1784e1	Fix SRG4 & switch disc "fix". hehe.	2020-07-25 17:16:54 -06:00
James Betker	e6e91a1d75	Add SRG4. Back to the idea that maybe what we need is a hybrid approach between pure switches and RDB.	2020-07-24 20:32:49 -06:00
James Betker	3320ad685f	Fix mega_batch_factor not set for test	2020-07-24 12:26:44 -06:00
James Betker	c50cce2a62	Add an abstract, configurable weight scheduling class and apply it to the feature weight	2020-07-23 17:03:54 -06:00
James Betker	9ccf771629	Fix feature validation, wrong device. Only shows up in distributed training for some reason.	2020-07-23 10:16:34 -06:00
James Betker	bba283776c	Enable find_unused_parameters for DistributedDataParallel. attention_norm has some parameters which are not used to compute grad, which is causing failures in the distributed case.	2020-07-23 09:08:13 -06:00
James Betker	dbf6147504	Add switched discriminator. The logic is that the discriminator may be incapable of providing a truly targeted loss for all image regions since it has to be too generic (basically the same argument for the switched generator). So add some switches in! See how it works!	2020-07-22 20:52:59 -06:00
James Betker	106b8da315	Assert that temperature is set properly in eval mode.	2020-07-22 20:50:59 -06:00
James Betker	c74b9ee2e4	Add a way to disable grad on portions of the generator graph to save memory	2020-07-22 11:40:42 -06:00
James Betker	e3adafbeac	Add convert_model.py and a hacky way to add extra layers to a model	2020-07-22 11:39:45 -06:00
James Betker	7f7e17e291	Update feature discriminator further. Move the feature/disc losses closer and add a feature computation layer.	2020-07-20 20:54:45 -06:00
James Betker	46aa776fbb	Allow feature discriminator unet to only output closest layer to feature output	2020-07-19 19:05:08 -06:00
James Betker	8a9f215653	Huge set of mods to support progressive generator growth	2020-07-18 14:18:48 -06:00
James Betker	47a525241f	Make attention norm optional	2020-07-18 07:24:02 -06:00
James Betker	ad97a6a18a	Progressive SRG first check-in	2020-07-18 07:23:26 -06:00
James Betker	b08b1cad45	Fix feature decay	2020-07-16 23:27:06 -06:00
James Betker	3e7a83896b	Fix pixgan debugging issues	2020-07-16 11:45:19 -06:00
James Betker	a1bff64d1a	More fixes	2020-07-16 10:48:48 -06:00
James Betker	240f254263	More loss fixes	2020-07-16 10:45:50 -06:00
James Betker	6cfa67d831	Fix featuredisc broadcast error	2020-07-16 10:18:30 -06:00
James Betker	8d061a2687	Add u-net discriminator with feature output	2020-07-16 10:10:09 -06:00
James Betker	0c4c388e15	Remove dualoutputsrg. Good idea, didn't pan out.	2020-07-16 10:09:24 -06:00
James Betker	4bcc409fc7	Fix loadSRG2 typo	2020-07-14 10:20:53 -06:00
James Betker	1e4083a35b	Apply temperature mods to all SRG models (Honestly this needs to be base classed at this point)	2020-07-14 10:19:35 -06:00
James Betker	7659bd6818	Fix temperature equation	2020-07-14 10:17:14 -06:00
James Betker	853468ef82	Allow legacy state_dicts in srg2	2020-07-14 10:03:45 -06:00
James Betker	1b1431133b	Add DualOutputSRG. Also removes the old multi-return mechanism that Generators support. Also fixes AttentionNorm.	2020-07-14 09:28:24 -06:00
James Betker	a2285ff2ee	Scale anorm by transform count	2020-07-13 08:49:09 -06:00
James Betker	dd0bbd9a7c	Enable AttentionNorm on SRG2	2020-07-13 08:38:17 -06:00
James Betker	4c0f770f2a	Fix inverted temperature curve bug	2020-07-12 11:02:50 -06:00
James Betker	14d23b9d20	Fixes, do fake swaps less often in pixgan discriminator	2020-07-11 21:22:11 -06:00
James Betker	ba6187859a	err5	2020-07-10 23:02:56 -06:00
James Betker	902527dfaa	err4	2020-07-10 23:00:21 -06:00
James Betker	020b3361fa	err3	2020-07-10 22:57:34 -06:00
James Betker	b3a2c21250	err2	2020-07-10 22:52:02 -06:00
James Betker	716433db1f	err1	2020-07-10 22:50:56 -06:00
James Betker	0b7193392f	Implement unet disc. The latest discriminator architecture was already pretty much a unet. This one makes that official and uses shared layers. It also upsamples one additional time and throws out the lowest upsampling result. The intent is to delete the old vgg pixdisc, but I'll keep it around for a bit since I'm still trying out a few models with it.	2020-07-10 16:24:42 -06:00
James Betker	812c684f7d	Update pixgan swap algorithm - Swap multiple blocks in the image instead of just one. The discriminator was clearly learning that most blocks have one region that needs to be fixed. - Relax block size constraints. This was in place to guarantee that the discriminator signal was clean. Instead, just downsample the "loss image" with bilinear interpolation. The result is noisier, but this is actually probably healthy for the discriminator.	2020-07-10 15:56:14 -06:00
James Betker	33ca3832e1	Move ExpansionBlock to arch_util. Also makes all processing blocks have a conformant signature. Alters ExpansionBlock to perform a processing conv on the passthrough before the conjoin operation - this will break backwards compatibility with SRG2.	2020-07-10 15:53:41 -06:00
James Betker	5e8b52f34c	Misc changes	2020-07-10 09:45:48 -06:00
James Betker	5f2c722a10	SRG2 revival. Big update to SRG2 architecture to pull in a lot of things that have been learned: - Use group norm instead of batch norm - Initialize the weights on the transformations low like is done in RRDB rather than using the scalar. Models live or die by their early stages, and this one's early stage is pretty weak - Transform multiplexer to use u-net like architecture. - Just use one set of configuration variables instead of a list - flat networks performed fine in this regard.	2020-07-09 17:34:51 -06:00
James Betker	12da993da8	More fixes...	2020-07-08 22:07:09 -06:00
James Betker	7d6eb28b87	More fixes	2020-07-08 22:00:57 -06:00
James Betker	b2507be13c	Fix up pixgan loss and pixdisc	2020-07-08 21:27:48 -06:00
James Betker	26a4a66d1c	Bug fixes and new gan mechanism - Removed a bunch of unnecessary image loggers. These were just consuming space and never being viewed - Got rid of artificial var_ref support. The new pixdisc is what I wanted to implement then - it's much better. - Add pixgan GAN mechanism. This is purpose-built for the pixdisc. It is intended to promote a healthy discriminator - Megabatchfactor was applied twice on metrics, fixed that. Adds pix_gan (untested) which swaps a portion of the fake and real image with each other, then expects the discriminator to properly discriminate the swapped regions.	2020-07-08 17:40:26 -06:00
James Betker	4305be97b4	Update log metrics. They should now be universal regardless of job configuration	2020-07-07 15:33:22 -06:00
James Betker	8a4eb8241d	SRG3 work. Operates on top of a pre-trained SpineNet backbone (trained on COCO 2017 with RetinaNet). This variant is extremely shallow.	2020-07-07 13:46:40 -06:00
James Betker	0acad81035	More SRG2 adjustments..	2020-07-06 22:40:40 -06:00
James Betker	086b2f0570	More bugs	2020-07-06 22:28:07 -06:00
James Betker	d4d4f85fc0	Bug fixes	2020-07-06 22:25:40 -06:00
James Betker	3c31bea1ac	SRG2 architectural changes	2020-07-06 22:22:29 -06:00

1202 Commits