a-One-Fan
27024a7b38
- Add an argument to use oneAPI when training
- Use it in the oneAPI startup script
- Set an env var when doing so
- Initialize distributed training with ccl when doing so

Intel does not and will not support non-distributed training; I think that's a good decision. Known issue: the message that training will happen with oneAPI gets printed twice. A sketch of the Python side follows.
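For context, here is a minimal sketch of what the Python side described above could look like, assuming argparse and PyTorch with Intel's oneCCL bindings (oneccl_bindings_for_pytorch). Only the --training-oneapi flag name comes from the launcher script below; the env var name and the surrounding structure are illustrative assumptions, not the actual src/main.py:

# Minimal sketch, not the real src/main.py; the env var name is hypothetical.
import argparse
import os

import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument("--training-oneapi", action="store_true",
                    help="train with Intel oneAPI (distributed via ccl)")
args = parser.parse_args()

if args.training_oneapi:
    os.environ["TRAINING_ONEAPI"] = "1"  # hypothetical env var name
    # Importing the bindings registers the "ccl" backend with torch.distributed.
    import oneccl_bindings_for_pytorch  # noqa: F401
    # Assumes the launcher (ipexrun) provides rank/world-size/master env vars.
    dist.init_process_group(backend="ccl")
    print("Training with oneAPI")  # executes once per rank

Since the ccl backend is distributed-only, this matches Intel's decision noted above; with two ranks, anything printed at startup appears twice, which would explain the duplicated message.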
#!/bin/bash

ulimit -Sn "$(ulimit -Hn)" # Raise the soft fd limit to the hard limit; ROCm is a bitch about open files

conda deactivate > /dev/null 2>&1 # oneAPI's setup can clash with an active conda env; deactivate it to avoid spam

source ./venv/bin/activate
source /opt/intel/oneapi/setvars.sh

# Launch training through Intel's ipexrun, forwarding CLI arguments.
ipexrun ./src/main.py "$@" --training-oneapi

deactivate