forked from ecker/tortoise-tts
I noticed that the autoregressive batch size is set based on VRAM size, so I adjusted it to scale up to the VRAM capacity of 90 series GPUs. In this case, the batch size goes from 16 to 32. Using the standard preset with ChungusVGAN, generation went from 16 steps to 8. Averaged over 3 runs each, it took 294 seconds with a batch size of 16 and 234 seconds with 32. Can't complain about a roughly 1.25x speedup from functionally 2 lines of code. I restarted tortoise before each run and executed ```torch.cuda.empty_cache()``` just before loading the autoregressive model to clear the memory cache each time.
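For anyone curious what this kind of change looks like, here is a rough sketch and not the exact code in this fork. The function names `pick_batch_size` and `total_vram_gb` and every threshold value are hypothetical, picked only to illustrate adding one extra tier for 24 GB class cards. Only `torch.cuda.is_available()`, `torch.cuda.empty_cache()`, and `torch.cuda.mem_get_info()` are real PyTorch calls.

```python
def pick_batch_size(vram_gb: float) -> int:
    """Map VRAM (in GiB) to an autoregressive batch size.

    All thresholds are hypothetical, for illustration only.
    """
    if vram_gb > 20:   # assumed new top tier for 24 GB class cards
        return 32
    if vram_gb > 14:
        return 16
    if vram_gb > 10:
        return 8
    if vram_gb > 7:
        return 4
    return 1


def total_vram_gb() -> float:
    """Return total VRAM of the current CUDA device in GiB, or 0.0 without CUDA."""
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    # Release cached blocks held by PyTorch's allocator before loading a model.
    torch.cuda.empty_cache()
    _free_bytes, total_bytes = torch.cuda.mem_get_info()
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    print("autoregressive batch size:", pick_batch_size(total_vram_gb()))
```

Keeping the VRAM-to-batch-size mapping separate from the CUDA query makes it easy to check the tiers on a machine without a GPU.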
| Name |
|---|
| data |
| models |
| utils |
| __init__.py |
| api.py |
| do_tts.py |
| eval.py |
| get_conditioning_latents.py |
| is_this_from_tortoise.py |
| read.py |