Illegal Instruction after setting quality to anything above ultra-fast? #354

Open
opened 2023-08-28 12:44:21 +00:00 by dyharlan · 1 comment

Whenever I try to generate a sound using the fast preset or above, I now get this error:

Loading voice: CJD with model d1f79232
Loading voice: CJD
Reading from latent: ./voices/CJD//cond_latents_d1f79232.pth
Traceback (most recent call last):
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/gradio/routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1075, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/gradio/blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/gradio/helpers.py", line 587, in tracked_fn
    response = fn(*args)
               ^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/src/webui.py", line 94, in generate_proxy
    raise e
  File "/home/harlan/src/ai-voice-cloning/src/webui.py", line 88, in generate_proxy
    sample, outputs, stats = generate(**kwargs)
                             ^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/src/utils.py", line 350, in generate
    return generate_tortoise(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/src/utils.py", line 1210, in generate_tortoise
    gen, additionals = tts.tts(cut_text, **settings )
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/modules/tortoise-tts/tortoise/api.py", line 704, in tts
    codes = self.autoregressive.inference_speech(auto_conditioning, text_tokens,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/modules/tortoise-tts/tortoise/models/autoregressive.py", line 513, in inference_speech
    gen = self.inference_model.generate(inputs, bos_token_id=self.start_mel_token, pad_token_id=self.stop_mel_token, eos_token_id=self.stop_mel_token,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/transformers/generation_utils.py", line 1310, in generate
    return self.sample(
           ^^^^^^^^^^^^
  File "/home/harlan/src/ai-voice-cloning/venv/lib/python3.11/site-packages/transformers/generation_utils.py", line 1963, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: an illegal instruction was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I'm running an RTX 3060 12 GB.
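For reference, the error text itself points at the first debugging step: CUDA errors are reported asynchronously, so the traceback may blame the wrong call. Something like the following makes the failure synchronous and lets you watch VRAM at the same time (the launch command is a placeholder for however you normally start the web UI):

```shell
# Force synchronous kernel launches so the Python stack trace points
# at the op that actually failed, not a later API call.
export CUDA_LAUNCH_BLOCKING=1
# ./start.sh    # placeholder: launch the web UI as you usually do

# In a second terminal, watch VRAM once per second while generating:
# nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1
```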

Owner

mmm...

Verify what your batch size is set to in settings. If it's higher than 16, then the error message might just be a misnomer, and the true issue is that you're running out of VRAM during the pass. To be safe, keep an eye on VRAM usage as you generate, although I can't remember what each batch size correlates to, as it's been a while since I've played around with those settings.

But desu, the presets are also misnomers: the extra "quality" from more AR samples just means there are more utterances to pick the "best" from, hence the correlation with quality. You can definitely get away with 16, especially with finetuned models.
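If you want to estimate rather than guess, here's a back-of-the-envelope sketch (not part of the repo; every number is hypothetical): measure the baseline VRAM once the models are loaded and the marginal VRAM per batched sample via nvidia-smi, then compute a safe batch ceiling from your card's capacity.

```python
def max_safe_batch(total_vram_mb, baseline_mb, per_sample_mb, headroom=0.9):
    """Rough ceiling on AR batch size for a given card.

    total_vram_mb  -- card capacity (e.g. 12288 for a 12 GB 3060)
    baseline_mb    -- VRAM used once the models are loaded (measure it)
    per_sample_mb  -- marginal VRAM per batched sample (measure it)
    headroom       -- stay this fraction below the hard limit
    """
    budget = total_vram_mb * headroom - baseline_mb
    return max(1, int(budget // per_sample_mb))

# Hypothetical 12 GB card, 6 GB baseline, ~300 MB per sample:
print(max_safe_batch(12288, 6000, 300))  # -> 16
```

With those made-up numbers it lands on 16, which is about where I'd cap it anyway.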

Reference: mrq/ai-voice-cloning#354