Commit Graph

27 Commits

Author SHA1 Message Date
James Betker
6f8705e8cb SSGSimpler network 2020-10-15 17:18:44 -06:00
James Betker
9a5d6162e9 Add the "BigSwitch" 2020-10-13 10:11:10 -06:00
James Betker
ca523215c6 Fix recurrent std in arch 2020-10-12 17:42:32 -06:00
James Betker
597b6e92d6 Add ssgr1 recurrence 2020-10-12 17:18:19 -06:00
James Betker
ce163ad4a9 Update SSGdeep 2020-10-12 10:22:08 -06:00
James Betker
58d8bf8f69 Add network architecture built for teco 2020-10-09 08:40:14 -06:00
James Betker
4c85ee51a4 Converge SSG architectures into unified switching base class
Also adds attention norm histogram to logging
2020-10-08 17:23:21 -06:00
James Betker
fba29d7dcc Move to apex distributeddataparallel and add switch all_reduce
Torch's distributed_data_parallel is missing "delay_allreduce", which is
necessary to get gradient checkpointing to work with recurrent models.
2020-10-08 11:20:05 -06:00
James Betker
969bcd9021 Use local checkpoint in SSG 2020-10-08 08:54:46 -06:00
James Betker
c96f5b2686 Import switched_conv as a submodule 2020-10-07 23:10:54 -06:00
James Betker
c352c8bce4 More tecogan fixes 2020-10-07 12:41:17 -06:00
James Betker
2f2e3f33f8 StackedSwitchedGenerator_5lyr 2020-10-06 20:39:32 -06:00
James Betker
4111942ada Support attention deferral in deep ssgr 2020-10-05 19:35:55 -06:00
James Betker
51044929af Don't compute attention statistics on multiple generator invocations of the same data 2020-10-05 00:34:29 -06:00
James Betker
c3ef8a4a31 Stacked switches - return a tuple 2020-10-04 21:02:24 -06:00
James Betker
ffd069fd97 Lots of SSG work
- Checkpointed pretty much the entire model - enabling recurrent inputs
- Added two new models for test - adding depth (again) and removing SPSR (in lieu of the new losses)
2020-10-04 20:48:08 -06:00
James Betker
aca2c7ab41 Full checkpoint-ize SSG1 2020-10-04 18:24:52 -06:00
James Betker
e3294939b0 Revert "SSG: offer option to use BN-based attention normalization"
Didn't work. Oh well.

This reverts commit 5cd2b37591.
2020-10-03 17:54:53 -06:00
James Betker
5cd2b37591 SSG: offer option to use BN-based attention normalization
Not sure how this is going to work, lets try it.
2020-10-03 16:16:19 -06:00
James Betker
19a4075e1e Allow checkpointing to be disabled in the options file
Also makes options a global variable for usage in utils.
2020-10-03 11:03:28 -06:00
James Betker
d9ae970fd9 SSG update 2020-10-01 11:27:51 -06:00
James Betker
f40beb5460 Add 'before' and 'after' defs to injections, steps and optimizers 2020-09-22 17:03:22 -06:00
James Betker
53a5657850 Fix SSGR 2020-09-20 19:07:15 -06:00
James Betker
fe82785ba5 Add some new architectures to ssg 2020-09-19 21:47:10 -06:00
James Betker
9a17ade550 Some convenience adjustments to ExtensibleTrainer 2020-09-17 21:05:32 -06:00
James Betker
723754c133 Update attention debugger outputting for SSG 2020-09-16 13:09:46 -06:00
James Betker
0918430572 SSG network
This branches off of SPSR. It is identical but substantially reduced
in complexity. It's intended to be my long term working arch.
2020-09-15 20:59:24 -06:00