Disabling the bitsandbytes optimizations by default for now, on the off chance that they actually produce garbage (which shouldn't happen; there's no chance: if training at float16 from a model at float16 works fine, then this has to work)

pull/3/head
mrq 2023-02-23 03:22:59 +07:00
parent 918473807f
commit 58600274ac
3 changed files with 14 additions and 3 deletions

@@ -1,3 +1,14 @@
+# (QoL improvements for) Deep Learning Art School
+This fork of [neonbjb/DL-Art-School](https://github.com/neonbjb/DL-Art-School/) contains a few fixes and QoL improvements, including but not limited to:
+* sanity tidying, like:
+  - not outputting to `./DL-Art-School/experiments/`
+  - the custom module loader for networks/injectors getting fixed
+  - BitsAndBytes integration:
+    + working but output untested: Adam/AdamW
+    + toggles available in `./codes/torch_intermediary/__init__.py`
+---
 # Deep Learning Art School
 Send your Pytorch model to art class!

@@ -13,8 +13,8 @@ from torch.optim.adamw import AdamW
 OVERRIDE_LINEAR = False
 OVERRIDE_EMBEDDING = False
-OVERRIDE_ADAM = True
-OVERRIDE_ADAMW = True
+OVERRIDE_ADAM = False # True
+OVERRIDE_ADAMW = False # True
 USE_STABLE_EMBEDDING = True
 if OVERRIDE_LINEAR:
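The toggles above decide which backend the rest of the training code picks up when it imports its optimizers through this intermediary module. A minimal sketch of that pattern, assuming a conditional re-export (the actual wiring in `./codes/torch_intermediary/__init__.py` may differ; `bitsandbytes.optim.Adam`/`AdamW` and `torch.optim.Adam`/`AdamW` are the real classes involved):

```python
# Assumed sketch: gate which Adam/AdamW implementation gets re-exported.
# With both flags False (the new default), importers of this module fall
# back to the stock PyTorch optimizers.

OVERRIDE_ADAM = False   # True -> use bitsandbytes' Adam
OVERRIDE_ADAMW = False  # True -> use bitsandbytes' AdamW

if OVERRIDE_ADAM:
    from bitsandbytes.optim import Adam
else:
    from torch.optim import Adam

if OVERRIDE_ADAMW:
    from bitsandbytes.optim import AdamW
else:
    from torch.optim import AdamW
```

Under that assumption, re-enabling the bitsandbytes path is just flipping the flags back to `True`; nothing downstream needs to change.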

@@ -258,7 +258,7 @@ class Trainer:
 self.logger.info(message)
 #### save models and training states
-if self.current_step % opt['logger']['save_checkpoint_freq'] == 0:
+if self.current_step > 0 and self.current_step % opt['logger']['save_checkpoint_freq'] == 0:
     self.model.consolidate_state()
     if self.rank <= 0:
         self.logger.info('Saving models and training states.')
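The added `self.current_step > 0` guard matters because `0 % N == 0` for any frequency `N`, so the old condition also fired on step 0 and wrote a redundant checkpoint before any real progress. A tiny standalone illustration (the frequency value here is made up, not taken from any config):

```python
save_checkpoint_freq = 500  # hypothetical stand-in for opt['logger']['save_checkpoint_freq']

for current_step in range(1501):
    # Old check: `current_step % save_checkpoint_freq == 0` is also true at step 0,
    # since 0 % 500 == 0, which triggered a pointless save right at startup.
    # New check: skip step 0, save only at 500, 1000, 1500.
    if current_step > 0 and current_step % save_checkpoint_freq == 0:
        print(f"saving checkpoint at step {current_step}")
```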