forked from mrq/tortoise-tts
e650800447
Noticed that the autoregressive batch size was being set based on VRAM size. Adjusted it to scale with the VRAM capacity of 90-series GPUs; in this case, 16 -> 32 batches. Using the standard preset with ChungusVGAN, I went from 16 steps to 8. Over an average of 3 runs, I went from 294 seconds with 16 batches to 234 seconds with 32. Can't complain at a ~1.26x speed increase from functionally 2 lines of code. I restarted tortoise for each run and executed ```torch.cuda.empty_cache()``` just before loading the autoregressive model to clear the memory cache each time.
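
A minimal sketch of what the described change might look like, assuming a hypothetical helper `pick_autoregressive_batch_size` and hypothetical VRAM thresholds; the actual tortoise code will differ:

```python
import torch

def pick_autoregressive_batch_size() -> int:
    """Scale the autoregressive batch size with total GPU VRAM
    (hypothetical helper; thresholds are illustrative assumptions)."""
    if not torch.cuda.is_available():
        return 1  # CPU fallback
    vram_gb = torch.cuda.get_device_properties(0).total_memory / (1024 ** 3)
    if vram_gb >= 24:   # e.g. 3090/4090-class ("90 series") cards
        return 32
    if vram_gb >= 16:
        return 16
    if vram_gb >= 8:
        return 8
    return 4

# Clear the CUDA allocator cache just before loading the autoregressive
# model, as the commit message describes:
torch.cuda.empty_cache()
batch_size = pick_autoregressive_batch_size()
```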
__init__.py
audio.py
device.py
diffusion.py
stft.py
text.py
tokenizer.py
torch_intermediary.py
typical_sampling.py
wav2vec_alignment.py