forked from mrq/tortoise-tts
Commit e650800447
Noticed that the autoregressive batch size was being set based on VRAM size, but the scaling didn't account for the VRAM capacity of 90-series GPUs. Adjusted it so those cards scale up too: in this case, 16 -> 32 batches. Using the standard preset with ChungusVGAN, I went from 16 steps to 8. Averaged over 3 runs, generation dropped from 294 seconds with 16 batches to 234 seconds with 32. Can't complain at a roughly 1.25x speed increase from functionally 2 lines of code. I restarted tortoise for each run and executed ```torch.cuda.empty_cache()``` just before loading the autoregressive model to clear the memory cache each time.
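For reference, upstream tortoise-tts picks the batch size from GPU memory in a helper along these lines (`pick_best_batch_size_for_gpu` in api.py). Below is a minimal sketch of the kind of change described; the exact thresholds, and the new ~20 GB tier for 90-series cards, are illustrative assumptions, not the literal diff:

```python
import torch

def pick_best_batch_size_for_gpu():
    # Scale the autoregressive batch size with GPU memory.
    # The added top tier lets ~24 GB cards (3090/4090 class)
    # use 32 batches instead of capping at 16. Thresholds here
    # are assumptions for illustration.
    if torch.cuda.is_available():
        _, total = torch.cuda.mem_get_info()  # returns (free, total) in bytes
        vram_gb = total / (1024 ** 3)
        if vram_gb > 20:
            return 32
        elif vram_gb > 14:
            return 16
        elif vram_gb > 10:
            return 8
        elif vram_gb > 7:
            return 4
    return 1

# Matching the workflow described above: drop cached allocations just
# before loading the autoregressive model so the larger batch has headroom.
torch.cuda.empty_cache()
```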
Files in this directory:

- data/
- models/
- utils/
- __init__.py
- api.py
- do_tts.py
- eval.py
- get_conditioning_latents.py
- is_this_from_tortoise.py
- read.py