# ai-voice-cloning/src/train.py

import os
import sys
import argparse
import yaml
import datetime

from torch.distributed.run import main as torchrun

# I don't want this invoked from an import
if __name__ != "__main__":
    raise Exception("Do not invoke this from an import")

parser = argparse.ArgumentParser()
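# nargs='+' lets an unquoted path containing spaces through as several tokens (ugh).
# The default must be a list: joining a plain string would put a space between every character.
parser.add_argument('--yaml', type=str, nargs='+', default=['./training/voice/train.yml'], help='Path to training configuration file.')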
parser.add_argument('--launcher', choices=['none', 'pytorch'], default='none', help='Job launcher')
args = parser.parse_args()
args.yaml = " ".join(args.yaml) # absolutely disgusting
config_path = args.yaml

with open(config_path, 'r') as file:
    opt_config = yaml.safe_load(file)

# it'd be downright sugoi if I was able to install DLAS as a pip package
sys.path.insert(0, './modules/dlas/codes/')
sys.path.insert(0, './modules/dlas/')

# yucky override
if "bitsandbytes" in opt_config and not opt_config["bitsandbytes"]:
    os.environ['BITSANDBYTES_OVERRIDE_LINEAR'] = '0'
    os.environ['BITSANDBYTES_OVERRIDE_EMBEDDING'] = '0'
    os.environ['BITSANDBYTES_OVERRIDE_ADAM'] = '0'
    os.environ['BITSANDBYTES_OVERRIDE_ADAMW'] = '0'

import torch
from codes import train as tr
from utils import util, options as option

# this is effectively just copy pasted and cleaned up from the __main__ section of training.py
def train(config_path, launcher='none'):
    opt = option.parse(config_path, is_train=True)

    if launcher == 'none' and opt['gpus'] > 1:
        return torchrun([f"--nproc_per_node={opt['gpus']}", "./src/train.py", "--yaml", config_path, "--launcher=pytorch"])

    trainer = tr.Trainer()
    if launcher == 'none':
        opt['dist'] = False
        trainer.rank = -1
        if len(opt['gpu_ids']) == 1:
            torch.cuda.set_device(opt['gpu_ids'][0])
        print('Disabled distributed training.')
    else:
        opt['dist'] = True
        tr.init_dist('nccl', timeout=datetime.timedelta(seconds=5*60))
        trainer.world_size = torch.distributed.get_world_size()
        trainer.rank = torch.distributed.get_rank()
        torch.cuda.set_device(torch.distributed.get_rank())

    trainer.init(config_path, opt, launcher, '')
    trainer.do_training()

try:
    import torch_intermediary
    if torch_intermediary.OVERRIDE_ADAM:
        print("Using BitsAndBytes optimizations")
    else:
        print("NOT using BitsAndBytes optimizations")
except Exception:
    # This is only a status message; if torch_intermediary can't be imported, skip it
    pass

train(config_path, args.launcher)