Commit Graph

7 Commits

Author SHA1 Message Date
deviandice
e650800447 Update 'tortoise/utils/device.py'
Noticed that the autoregressive batch size was being set off of VRAM size. Adjusted to scale for the VRAM capacity of 90 series GPUs. In this case, 16 -> 32 batches. 

Using the standard pre-set with ChungusVGAN, I went from 16 steps to 8.
Over an average of 3 runs, I achieved an average of 294 seconds with 16 batches, to 234 seconds with 32. Can't complain at a 1.2x speed increase with functionally 2 lines of code. Can't complain. 

I restarted tortoise each run, and executing ```torch.cuda.empty_cache()``` just before loading the autoregressive model to clean the memory cache each time.
2023-03-07 14:05:27 +00:00
mrq
d159346572 oops 2023-02-16 13:23:07 +00:00
mrq
eca61af016 actually for real fixed incrementing filenames because i had a regex that actually only worked if candidates or lines>1, cuda now takes priority over dml if you're a nut with both of them installed because you can just specify an override anyways 2023-02-16 01:06:32 +00:00
mrq
ec80ca632b added setting "device-override", less naively decide the number to use for results, some other thing 2023-02-15 21:51:22 +00:00
mrq
729be135ef Added option: listen path 2023-02-09 20:42:38 +00:00
mrq
3f8302a680 I didn't have to suck off a wizard for DirectML support (courtesy of https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7600 for leading the way) 2023-02-09 05:05:21 +00:00
mrq
b23d6b4b4c owari da... 2023-02-09 01:53:25 +00:00