forked from ecker/DL-Art-School
Basically: stylegan2 makes use of gradient-based regularizers. These make it so that I cannot use gradient checkpointing. But I love gradient checkpointing: it makes things really, really fast and memory conscious. So: only skip checkpointing when we run the regularizer loss. This is a bit messy, but speeds up training by at least 20%. Also, pytorch: please make checkpointing a first-class citizen.
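The idea above can be sketched as a wrapper that checkpoints activations on ordinary steps but bypasses checkpointing when the gradient-based regularizer runs, since `torch.utils.checkpoint` does not support the double backward that gradient penalties (e.g. StyleGAN2's path-length or R1 terms) require. This is a minimal illustration, not the repo's actual code; the class and flag names are hypothetical.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class ConditionalCheckpointBlock(nn.Module):
    """Hypothetical wrapper: checkpoint the wrapped block during normal
    training steps, but run it directly (no checkpointing) on steps that
    compute a gradient-based regularizer, which needs double backward."""

    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block
        self.checkpointing_enabled = True  # flip off before regularizer steps

    def forward(self, x):
        if self.checkpointing_enabled and self.training:
            # use_reentrant=False is the non-reentrant checkpointing mode
            return checkpoint(self.block, x, use_reentrant=False)
        return self.block(x)


blk = ConditionalCheckpointBlock(nn.Linear(4, 4))
blk.train()
x = torch.randn(2, 4, requires_grad=True)

y = blk(x)                        # checkpointed forward
blk.checkpointing_enabled = False
y_reg = blk(x)                    # plain forward, safe for double backward
```

The training loop would toggle `checkpointing_enabled` off only on the iterations that evaluate the regularizer loss, then back on afterwards.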
Repository contents:

- archs
- experiments
- flownet2@db2b7899ea
- steps
- __init__.py
- base_model.py
- ExtensibleTrainer.py
- feature_model.py
- loss.py
- lr_scheduler.py
- networks.py