DL-Art-School/codes/models

Latest commit fba29d7dcc by James Betker (2020-10-08 11:20:05 -06:00):
Move to apex distributeddataparallel and add switch all_reduce

Torch's DistributedDataParallel is missing "delay_allreduce", which is
necessary to get gradient checkpointing to work with recurrent models.
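For context on the commit description above, a minimal sketch of wrapping a model with apex's DistributedDataParallel and delay_allreduce=True follows. The function name, model handling, and process-group setup are illustrative assumptions and are not taken from this repository's code.

    # Illustrative sketch only; the names below are assumptions, not repo code.
    import torch
    import torch.distributed as dist
    from apex.parallel import DistributedDataParallel as ApexDDP

    def wrap_model_for_distributed(model: torch.nn.Module) -> torch.nn.Module:
        # Assumes the process group was already initialized, e.g. via
        # dist.init_process_group(backend='nccl', init_method='env://').
        assert dist.is_initialized(), "call init_process_group() first"

        # delay_allreduce=True defers all gradient all-reduces to the end of
        # the backward pass instead of overlapping them with backward
        # computation. Per the commit description, torch's
        # DistributedDataParallel has no equivalent switch, which is the
        # motivation for apex here: with gradient checkpointing and recurrent
        # models, a parameter's gradient can be produced more than once per
        # backward pass, so eager per-bucket all-reduces would fire too early.
        return ApexDDP(model, delay_allreduce=True)

The trade-off is that delaying the all-reduce gives up the usual overlap of gradient communication with backward computation in exchange for correct gradient accumulation.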
archs/                 (updated in fba29d7dcc)
experiments/
flownet2/ @ 2e9e010c98 (submodule)
layers/
steps/
__init__.py
base_model.py          (updated in fba29d7dcc)
ExtensibleTrainer.py   (updated in fba29d7dcc)
feature_model.py
loss.py
lr_scheduler.py
networks.py
novograd.py
SR_model.py
SRGAN_model.py