forked from mrq/DL-Art-School
44a19cd37c
Basically: stylegan2 makes use of gradient-based normalizers. These make it so that I cannot use gradient checkpointing. But I love gradient checkpointing. It makes things really, really fast and memory conscious. So - only don't checkpoint when we run the regularizer loss. This is a bit messy, but speeds up training by at least 20%. Also: pytorch: please make checkpointing a first class citizen.
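The idea in the commit message can be sketched as follows: StyleGAN2-style regularizers (e.g. the path-length penalty) differentiate through the generator's output, which requires a second backward pass that `torch.utils.checkpoint` does not handle well, so checkpointing is switched off only for those steps. This is a minimal illustrative sketch, not the repo's actual code; the `Generator`/`Block` classes and the `do_checkpointing` flag are hypothetical stand-ins, and `use_reentrant=False` assumes a recent PyTorch.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """One toy generator stage (stand-in for a real synthesis block)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())

    def forward(self, x):
        return self.net(x)


class Generator(nn.Module):
    """Checkpoint activations on ordinary training steps, but run the
    blocks directly when a gradient-based regularizer needs to
    differentiate through the forward pass (double backward)."""
    def __init__(self, dim=16, depth=4):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x, do_checkpointing=True):
        for block in self.blocks:
            if do_checkpointing and x.requires_grad:
                # Trade compute for memory: recompute activations on backward.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                # Regularizer steps: plain forward so create_graph=True works.
                x = block(x)
        return x
```

On a regularizer step the caller would pass `do_checkpointing=False`, compute something like `torch.autograd.grad(out.sum(), x, create_graph=True)` for the penalty, and backprop through it; on all other steps checkpointing stays on, which is where the reported ~20% speedup and memory savings come from.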
archs
experiments
flownet2 @ db2b7899ea
steps
__init__.py
base_model.py
ExtensibleTrainer.py
feature_model.py
loss.py
lr_scheduler.py
networks.py