DL-Art-School

Author	SHA1	Message	Date
James Betker	607ff3c67c	RRDB with bypass	2020-10-29 09:39:45 -06:00
James Betker	da53090ce6	More adjustments to support distributed training with teco & on multi_modal_train	2020-10-27 20:58:03 -06:00
James Betker	2a3eec8fd7	Fix some distributed training snafus	2020-10-27 15:24:05 -06:00
James Betker	ff58c6484a	Fixes to unified chunk datasets to support stereoscopic training	2020-10-26 11:12:22 -06:00
James Betker	9c3d059ef0	Updates to be able to train flownet2 in ExtensibleTrainer. Only supports basic losses for now, though.	2020-10-24 11:56:39 -06:00
James Betker	8636492db0	Copy train.py mods to train2	2020-10-22 17:16:36 -06:00
James Betker	e9c0b9f0fd	More adjustments to support multi-modal training. Specifically, it looks like at least MSE loss cannot handle autocasted tensors.	2020-10-22 16:49:34 -06:00
James Betker	76789a456f	Class-ify train.py and work on multi-modal trainer	2020-10-22 16:15:31 -06:00
James Betker	3e3d2af1f3	Add multi-modal trainer	2020-10-22 13:27:32 -06:00
James Betker	5753e77d67	ChainedGen: Output debugging information on blocks	2020-10-21 16:36:23 -06:00
James Betker	d8c6a4bbb8	Misc	2020-10-20 12:56:52 -06:00
James Betker	331c40f0c8	Allow starting step to be forced. Useful for testing purposes or to force a validation.	2020-10-19 15:23:04 -06:00
James Betker	981d64413b	Support validation over a custom injector. Also re-enable PSNR.	2020-10-19 11:01:56 -06:00
James Betker	9ead2c0a08	Multiscale training in!	2020-10-17 22:54:12 -06:00
James Betker	d856378b2e	Add ChainedGenWithStructure	2020-10-16 20:44:36 -06:00
James Betker	6f8705e8cb	SSGSimpler network	2020-10-15 17:18:44 -06:00
James Betker	24792bdb4f	Codebase cleanup. Removed a lot of legacy stuff I have no intention of using again. Plan is to shape this repo into something more extensible (get it? hah!)	2020-10-13 20:56:39 -06:00
James Betker	17d78195ee	Mods to SRG to support returning switch logits	2020-10-13 20:46:37 -06:00
James Betker	2bc5701b10	misc	2020-10-12 10:21:25 -06:00
James Betker	b2c4b2a16d	Move gpu_ids out of if statement	2020-10-06 20:40:20 -06:00
James Betker	0e3ea63a14	Misc	2020-10-05 18:01:50 -06:00
James Betker	ffd069fd97	Lots of SSG work: checkpointed pretty much the entire model; enabling recurrent inputs; added two new models for test; adding depth (again) and removing SPSR (in lieu of the new losses)	2020-10-04 20:48:08 -06:00
James Betker	fc396baf1a	Move loaded_options to util. Doesn't seem to work with Python 3.6.	2020-10-03 20:29:06 -06:00
James Betker	3cbb9ecd45	Misc	2020-10-03 16:15:42 -06:00
James Betker	21d3bb83b2	Use tqdm reporting with validation	2020-10-03 11:16:39 -06:00
James Betker	6c9718ad64	Don't log if you aren't 0 rank	2020-10-03 11:14:13 -06:00
James Betker	19a4075e1e	Allow checkpointing to be disabled in the options file. Also makes options a global variable for use in utils.	2020-10-03 11:03:28 -06:00
James Betker	c9a9e5c525	Prompt user for gpu_id if multiple gpus are detected	2020-10-01 17:24:50 -06:00
James Betker	0b5a033503	spsr7 + cleanup. SPSR7 adds ref onto spsr6, makes more "common sense" mods.	2020-09-29 16:59:26 -06:00
James Betker	eb12b5f887	Misc	2020-09-26 21:27:17 -06:00
James Betker	254cb1e915	More dataset integration work	2020-09-25 22:19:38 -06:00
James Betker	f211575e9d	Save models before validation. Validation often fails with OOM, wasting hours of training time. Save models first.	2020-09-16 08:17:17 -06:00
James Betker	c833bd1eac	Misc changes	2020-09-15 20:57:59 -06:00
James Betker	747ded2bf7	Fixes to the spsr3. Some lessons learned: - Biases are fairly important as a relief valve. They don't need to be everywhere, but most computationally heavy branches should have a bias. - GroupNorm in SPSR is not a great idea. Since image gradients are represented in this model, normal means and standard deviations are not applicable (imggrad has a high representation of 0). - Don't fuck with the mainline of any generative model. As much as possible, all additions should be done through residual connections. Never pollute the mainline with reference data; do that in branches. Polluting it basically leaves the model untrainable.	2020-09-09 15:28:14 -06:00
James Betker	c04f244802	More mods	2020-09-08 20:36:27 -06:00
James Betker	e6207d4c50	SPSR3 work. SPSR3 is meant to fix whatever is causing the switching units inside of the newer SPSR architectures to fail and basically not use the multiplexers.	2020-09-08 15:14:23 -06:00
James Betker	a18ece62ee	Add updated spsr net for test	2020-09-07 17:01:48 -06:00
James Betker	e8613041c0	Add novograd optimizer	2020-09-06 17:27:08 -06:00
James Betker	6657a406ac	Mods needed to support training a corruptor again: - Allow original SPSRNet to have a specifiable block increment - Cleanup - Bug fixes in code that hasn't been touched in a while.	2020-09-04 15:33:39 -06:00
James Betker	886d59d5df	Misc fixes & adjustments	2020-09-01 07:58:11 -06:00
James Betker	0a9b85f239	Fix vgg_gn input_img_factor	2020-08-31 09:50:30 -06:00
James Betker	4b4d08bdec	Enable testing in ExtensibleTrainer, fix it in SRGAN_model. Also compute fea loss for this.	2020-08-31 09:41:48 -06:00
James Betker	623f3b99b2	Stupid pathing..	2020-08-26 17:58:24 -06:00
James Betker	80aa83bfd2	Try copytree for tb_logger again.	2020-08-26 17:55:02 -06:00
James Betker	b593d8e7c3	Save tb_logger to alt_path	2020-08-26 17:45:07 -06:00
James Betker	f35b3ad28f	Fix val behavior for ExtensibleTrainer	2020-08-26 08:44:22 -06:00
James Betker	19487d9bbd	Fix distributed launch for large distributed runs	2020-08-25 15:42:59 -06:00
James Betker	a65b07607c	Reference network	2020-08-25 11:56:59 -06:00
James Betker	f9276007a8	More fixes to corrupt_fea	2020-08-23 17:52:18 -06:00
James Betker	dffc15184d	More ExtensibleTrainer work. It runs now, just need to debug it to reach performance parity with SRGAN. Sweet.	2020-08-23 17:22:45 -06:00
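
Commit f211575e9d above notes that validation often fails with OOM and that models should therefore be saved first. A minimal sketch of that ordering follows. The function name `train_loop` and the callables `train_step`, `save_checkpoint`, and `validate` are hypothetical placeholders for illustration and are not taken from the repository.

```python
def train_loop(steps, val_every, train_step, save_checkpoint, validate):
    """Run `steps` training steps, checkpointing before every validation pass.

    All four callables are hypothetical stand-ins; each receives the
    current step number.
    """
    for step in range(1, steps + 1):
        train_step(step)
        if step % val_every == 0:
            # Save first: if validate() raises (for example an
            # out-of-memory error), the trained weights are already saved.
            save_checkpoint(step)
            validate(step)
```

With this ordering, an exception raised inside `validate` still propagates to the caller, but the checkpoint for that step has already been written, so the run can be resumed from it.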

107 Commits