ImportError: DLL load failed while importing torch_directml_native: The specified process was not found #260

Open
opened 2023-06-08 21:27:00 +00:00 by Milor123 · 11 comments

Hi guys, I am trying to run this project on Windows 11 22H2 with DirectML and an AMD RX 6700 XT.

>(venv) C:\Users\NoeXVanitasXJunk\ai-voice-cloning>start.bat

(venv) C:\Users\NoeXVanitasXJunk\ai-voice-cloning>call .\venv\Scripts\activate.bat
Traceback (most recent call last):
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\src\main.py", line 11, in <module>
    from utils import *
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\src\utils.py", line 40, in <module>
    from tortoise.api import TextToSpeech as TorToise_TTS, MODELS, get_model_path, pad_or_truncate
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\api.py", line 16, in <module>
    from tortoise.models.diffusion_decoder import DiffusionTts
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\models\diffusion_decoder.py", line 11, in <module>
    from tortoise.utils.device import get_device_name
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\utils\device.py", line 74, in <module>
    def get_device_vram( name=get_device_name() ):
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\utils\device.py", line 54, in get_device_name
    elif has_dml():
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\utils\device.py", line 36, in has_dml
    import torch_directml
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\venv\lib\site-packages\torch_directml\__init__.py", line 16, in <module>
    import torch_directml_native
ImportError: DLL load failed while importing torch_directml_native: No se encontró el proceso especificado.
Presione una tecla para continuar . . .

>(venv) C:\Users\NoeXVanitasXJunk\ai-voice-cloning>python --version
Python 3.9.13

Note after edit: I tested deleting the venv and reinstalling with Python 3.10.6, but I get the same error :/

Thank you very much! Please help me, I want to use this.


Did you run `setup-directml.bat`?
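
If so, it can also help to reproduce the failing import outside the launcher, just to rule out start.bat itself. A minimal sketch:

```
import traceback

try:
    import torch_directml_native  # the native extension the traceback dies on
except ImportError:
    traceback.print_exc()  # should show the same "DLL load failed" message
else:
    print("torch_directml_native imported fine")
```

If the bare import fails too, the problem is the torch-directml install itself (typically a wheel/Python version mismatch or a missing system DLL) rather than anything in this repo.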
Author

Yes bro, I've temporarily solved it using Python 3.10.10 and installing:

pip install torch==2.0.1
pip install "numpy<1.24"

And now I can download all the models, but it finally fails when I try to run do_tts.py:

python .\modules\tortoise-tts\tortoise\do_tts.py --text "Esto es una pruebita no me jodas please" --voice random --preset fasta pruebita no me jodas please" --voice random --preset fast
Hardware acceleration found: dml
KV caching requested but not supported with the DirectML backend, disabling...
Loading tokenizer JSON: C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\../tortoise/data/tokenizer.json
Loaded tokenizer
Loading autoregressive model: C:\Users\NoeXVanitasXJunk\ai-voice-cloning\models\tortoise\autoregressive.pth
Loaded autoregressive model
Loaded diffusion model
Loading vocoder model: None
Loading vocoder model: bigvgan_24khz_100band.pth
Removing weight norm...
Loaded vocoder model
Generating autoregressive samples:   0%|                                                        | 0/96 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\do_tts.py", line 57, in <module>
    gen, dbg_state = tts.tts_with_preset(args.text, k=args.candidates, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\api.py", line 574, in tts_with_preset
    return self.tts(text, **settings)
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\api.py", line 697, in tts
    codes = self.autoregressive.inference_speech(auto_conditioning, text_tokens,
  File "c:\users\noexvanitasxjunk\ai-voice-cloning\modules\tortoise-tts\tortoise\models\autoregressive.py", line 513, in inference_speech
    gen = self.inference_model.generate(inputs, bos_token_id=self.start_mel_token, pad_token_id=self.stop_mel_token, eos_token_id=self.stop_mel_token,
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\venv\lib\site-packages\transformers\generation_utils.py", line 1310, in generate
    return self.sample(
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\venv\lib\site-packages\transformers\generation_utils.py", line 1905, in sample
    unfinished_sequences = input_ids.new(input_ids.shape[0]).fill_(1)
RuntimeError: new(): expected key in DispatchKeySet(CPU, CUDA, HIP, XLA, MPS, IPU, XPU, HPU, Lazy, Meta) but got: PrivateUse1

I tried searching about this problem but didn't find a solution. What should I do, bro? I can't use any TTS with my AMD GPU, and I want to use it on my GPU. I also tried porting https://github.com/suno-ai/bark to DirectML, but couldn't get it working xD


There was either a problem with copy/paste or you're missing a quotation mark:

python .\modules\tortoise-tts\tortoise\do_tts.py --text "Esto es una pruebita no me jodas please" --voice random --preset fasta pruebita no me jodas please" --voice random --preset fast

Author

> There was either a problem with copy/paste or you're missing a quotation mark:
>
> python .\modules\tortoise-tts\tortoise\do_tts.py --text "Esto es una pruebita no me jodas please" --voice random --preset fasta pruebita no me jodas please" --voice random --preset fast

Oh no bro, sorry for the confusion, it's a visual bug of my terminal in Windows.

My input is:
python .\modules\tortoise-tts\tortoise\do_tts.py --text "Esto es una pruebita no me jodas please" --voice random --preset fast

Output:

Hardware acceleration found: dml
KV caching requested but not supported with the DirectML backend, disabling...
Loading tokenizer JSON: C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\../tortoise/data/tokenizer.json
Loaded tokenizer
Loading autoregressive model: C:\Users\NoeXVanitasXJunk\ai-voice-cloning\models\tortoise\autoregressive.pth
Loaded autoregressive model
Loaded diffusion model
Loading vocoder model: None
Loading vocoder model: bigvgan_24khz_100band.pth
Removing weight norm...
Loaded vocoder model
Generating autoregressive samples:   0%|                                                                          | 0/96 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\do_tts.py", line 57, in <module>
    gen, dbg_state = tts.tts_with_preset(args.text, k=args.candidates, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\api.py", line 574, in tts_with_preset
    return self.tts(text, **settings)
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\modules\tortoise-tts\tortoise\api.py", line 697, in tts
    codes = self.autoregressive.inference_speech(auto_conditioning, text_tokens,
  File "c:\users\noexvanitasxjunk\ai-voice-cloning\modules\tortoise-tts\tortoise\models\autoregressive.py", line 513, in inference_speech
    gen = self.inference_model.generate(inputs, bos_token_id=self.start_mel_token, pad_token_id=self.stop_mel_token, eos_token_id=self.stop_mel_token,
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\venv\lib\site-packages\transformers\generation_utils.py", line 1310, in generate
    return self.sample(
  File "C:\Users\NoeXVanitasXJunk\ai-voice-cloning\venv\lib\site-packages\transformers\generation_utils.py", line 1905, in sample
    unfinished_sequences = input_ids.new(input_ids.shape[0]).fill_(1)
RuntimeError: new(): expected key in DispatchKeySet(CPU, CUDA, HIP, XLA, MPS, IPU, XPU, HPU, Lazy, Meta) but got: PrivateUse1

Note: I've tested both CMD with the .bat and Terminal with the .PS1; don't worry about the visual bug, I really do get the same error in both cases: `RuntimeError: new(): expected key in DispatchKeySet(CPU, CUDA, HIP, XLA, MPS, IPU, XPU, HPU, Lazy, Meta) but got: PrivateUse1`


Have you also tried with the web UI? The discussion from #242 seems to imply that CLI generation isn't fully implemented.
Author

Ohh nope, but I get the same result now: Uwu
![image](/attachments/c9261778-c1db-43ae-b3f0-41455fbeeb88)


I've never run into that error, but I use GeForce+CUDA so I can't be much help here. All I can suggest is to try Linux (even WSL through Windows); [according to the Wiki](https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Installation#user-content-note-on-directml-support), DirectML support is iffy at best.
Author

> I've never run into that error, but I use GeForce+CUDA so I can't be much help here. All I can suggest is to try Linux (even WSL through Windows); according to the Wiki, DirectML support is iffy at best.

Understood, thank you very much bro. If you need a tester to try to fix it, I could help!!

Owner

You might need to use either an older version of torch-directml or transformers. Unfortunately I don't have my previous DirectML venv around anymore, but you can start with this (sourced from the frozen requirements.txt of [152334H/DL-Art-School](https://github.com/152334H/DL-Art-School/blob/master/codes/requirements_frozen_only_use_if_something_broken.txt)):

pip3 install --force-reinstall transformers==4.26.1
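
For context, the line that blows up is transformers' legacy `Tensor.new()` allocation; later releases allocate with an explicit `device=` instead, which dispatches fine on the PrivateUse1 key that torch-directml registers. Roughly the difference (a sketch, not tested against DirectML here):

```
import torch

input_ids = torch.zeros(2, 5, dtype=torch.long)  # stand-in batch of token ids

# Old path (your traceback): Tensor.new() goes through a legacy constructor
# whose accepted DispatchKeySet doesn't include PrivateUse1.
# unfinished_sequences = input_ids.new(input_ids.shape[0]).fill_(1)

# Newer path: a factory function with an explicit device.
unfinished_sequences = torch.ones(input_ids.shape[0], dtype=torch.long,
                                  device=input_ids.device)
print(unfinished_sequences)  # tensor([1, 1])
```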

It's a mess. I fixed the error by changing the PyTorch source code and rebuilding from version 2.0; you just have to add "PrivateUse1" to the DispatchKeySet. But now I get another error. I feel like this was not really tested on non-CUDA hardware.

> I feel like this was not really tested on non-CUDA hardware.

IIRC it's been tested extensively on AMD graphics cards, but with ROCm (Linux), not DirectML (Windows), [as noted on the Wiki](https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Installation#user-content-note-on-directml-support):

> PyTorch-DirectML is very, very experimental and is still not production quality. There are some headaches with the need for hairy, kludgy patches.
>
> These patches rely on transferring the tensor between the GPU and CPU as a hotfix, so performance is definitely harmed.
>
> Both the conditional latent computation and the vocoder pass have to be done entirely on the CPU because of some quirks with DirectML.
>
> On my 6800XT, VRAM usage climbs through almost the entire 16 GiB, so be wary if you somehow OOM. Low-VRAM flags may NOT have any additional impact, given the constant copying anyway.
>
> For AMD users, I still might suggest using Linux+ROCm as it's (relatively) headache free, but I had stability problems.
>
> Training is currently very, very improbable, due to how integrated it seems to be with CUDA. If you're fiending to train on your AMD card, please use Linux+ROCm, but I have not tested this myself.
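
The "transferring between the GPU and CPU" hotfix mentioned there boils down to a pattern like this (a sketch of the idea, not the project's actual patch):

```
import torch

def cpu_fallback(fn, x):
    # Run an op DirectML can't handle on the CPU,
    # then move the result back to the input's device.
    device = x.device
    return fn(x.cpu()).to(device)

# Hypothetical use, for an op that misbehaves on the DML backend:
# spec = cpu_fallback(lambda t: torch.stft(t, n_fft=512, return_complex=True), wav)
```

Every such bounce costs a full device-to-host copy, which is why the Wiki warns that performance is definitely harmed.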
