James Betker
1eb516d686
Fix more distributed bugs
2020-10-08 14:32:45 -06:00
James Betker
fba29d7dcc
Move to apex distributeddataparallel and add switch all_reduce
...
Torch's distributed_data_parallel is missing "delay_allreduce", which is
necessary to get gradient checkpointing to work with recurrent models.
2020-10-08 11:20:05 -06:00
James Betker
c174ac0fd5
Allow tecogan to support generators that only output a tensor (instead of a list)
2020-10-08 09:26:25 -06:00
James Betker
969bcd9021
Use local checkpoint in SSG
2020-10-08 08:54:46 -06:00
James Betker
c93dd623d7
Tecogan losses work
2020-10-07 23:11:58 -06:00
James Betker
c96f5b2686
Import switched_conv as a submodule
2020-10-07 23:10:54 -06:00
James Betker
c352c8bce4
More tecogan fixes
2020-10-07 12:41:17 -06:00
James Betker
1c44d395af
Tecogan work
...
Its training! There's still probably plenty of bugs though..
2020-10-07 09:03:30 -06:00
James Betker
e9d7371a61
Add concatenate injector
2020-10-07 09:02:42 -06:00
James Betker
8a7e993aea
Merge remote-tracking branch 'origin/gan_lab' into gan_lab
2020-10-06 20:41:58 -06:00
James Betker
1e415b249b
Add tag that can be applied to prevent parameter training
2020-10-06 20:39:49 -06:00
James Betker
2f2e3f33f8
StackedSwitchedGenerator_5lyr
2020-10-06 20:39:32 -06:00
James Betker
6217b48e3f
Fix spsr_arch bug
2020-10-06 20:38:47 -06:00
James Betker
cffc596141
Integrate flownet2 into codebase, add teco visual debugs
2020-10-06 20:35:39 -06:00
James Betker
e4b89a172f
Reduce spsr7 memory usage
2020-10-05 22:05:56 -06:00
James Betker
4111942ada
Support attention deferral in deep ssgr
2020-10-05 19:35:55 -06:00
James Betker
840927063a
Work on tecogan losses
2020-10-05 19:35:28 -06:00
James Betker
2875822024
SPSR9 arch
...
takes some of the stuff I learned with SGSR yesterday and applies it to spsr
2020-10-05 08:47:51 -06:00
James Betker
51044929af
Don't compute attention statistics on multiple generator invocations of the same data
2020-10-05 00:34:29 -06:00
James Betker
e760658fdb
Another fix..
2020-10-04 21:08:00 -06:00
James Betker
a890e3a9c0
Fix geometric loss not handling 0 index
2020-10-04 21:05:01 -06:00
James Betker
c3ef8a4a31
Stacked switches - return a tuple
2020-10-04 21:02:24 -06:00
James Betker
13f97e1e97
Add recursive loss
2020-10-04 20:48:15 -06:00
James Betker
ffd069fd97
Lots of SSG work
...
- Checkpointed pretty much the entire model - enabling recurrent inputs
- Added two new models for test - adding depth (again) and removing SPSR (in lieu of the new losses)
2020-10-04 20:48:08 -06:00
James Betker
aca2c7ab41
Full checkpoint-ize SSG1
2020-10-04 18:24:52 -06:00
James Betker
e3294939b0
Revert "SSG: offer option to use BN-based attention normalization"
...
Didn't work. Oh well.
This reverts commit 5cd2b37591
.
2020-10-03 17:54:53 -06:00
James Betker
5cd2b37591
SSG: offer option to use BN-based attention normalization
...
Not sure how this is going to work, lets try it.
2020-10-03 16:16:19 -06:00
James Betker
9b4ed82093
Get rid of unused convs in spsr7
2020-10-03 11:36:26 -06:00
James Betker
3561cc164d
Fix up fea_loss calculator (for validation)
...
Not sure how this was working in regular training mode, but it
was failing in DDP.
2020-10-03 11:19:20 -06:00
James Betker
6c9718ad64
Don't log if you aren't 0 rank
2020-10-03 11:14:13 -06:00
James Betker
922b1d76df
Don't record visuals when not on rank 0
2020-10-03 11:10:03 -06:00
James Betker
8197fd646f
Don't accumulate losses for metrics when the loss isn't a tensor
2020-10-03 11:03:55 -06:00
James Betker
19a4075e1e
Allow checkpointing to be disabled in the options file
...
Also makes options a global variable for usage in utils.
2020-10-03 11:03:28 -06:00
James Betker
dd9d7b27ac
Add more sophisticated mechanism for balancing GAN losses
2020-10-02 22:53:42 -06:00
James Betker
39865ca3df
TOTAL_loss, dumbo
2020-10-02 21:06:10 -06:00
James Betker
4e44fcd655
Loss accumulator fix
2020-10-02 20:55:33 -06:00
James Betker
567b4d50a4
ExtensibleTrainer - don't compute backward when there is no loss
2020-10-02 20:54:06 -06:00
James Betker
146a9125f2
Modify geometric & translational losses so they can be used with embeddings
2020-10-02 20:40:13 -06:00
James Betker
e30a1443cd
Change sw2 refs
2020-10-02 09:01:18 -06:00
James Betker
e38716925f
Fix spsr8 class init
2020-10-02 09:00:18 -06:00
James Betker
35469f08e2
Spsr 8
2020-10-02 08:58:15 -06:00
James Betker
aa4fd89018
resnext with groupnorm
2020-10-01 15:49:28 -06:00
James Betker
8beaa47933
resnext discriminator
2020-10-01 11:48:14 -06:00
James Betker
55f2764fef
Allow fixup50 to be used as a discriminator
2020-10-01 11:28:18 -06:00
James Betker
7986185fcb
Change 'mod_step' to 'every'
2020-10-01 11:28:06 -06:00
James Betker
d9ae970fd9
SSG update
2020-10-01 11:27:51 -06:00
James Betker
e3053e4e55
Exchange SpsrNet for SpsrNetSimplified
2020-09-30 17:01:04 -06:00
James Betker
66d4512029
Fix up translational equivariance loss so it's ready for prime time
2020-09-30 12:01:00 -06:00
James Betker
896b4f5be2
Revert "spsr7 adjustments"
...
This reverts commit 9fee1cec71
.
2020-09-29 18:30:41 -06:00
James Betker
9fee1cec71
spsr7 adjustments
2020-09-29 17:19:59 -06:00