Training model #11

Open
opened 2024-04-22 13:50:37 +00:00 by Hanz · 2 comments

Hello,

I tried to quickly test the command python -m vall_e.models.ar_nar --yaml="./data/config.yaml", but it fails.

[rank0]: File "./vall_e/models/arch/llama.py", line 125, in forward
[rank0]: attn_output = torch.nn.functional.scaled_dot_product_attention(
[rank0]: RuntimeError: No available kernel. Aborting execution.

I am running CUDA 11.8 with torch 2.3.0+cu118. I also tried a different torch version, but the problem persists.

Author

Which version did you run successfully? Could you suggest a fix for this error?
Thank you.

Owner

[rank0]: attn_output = torch.nn.functional.scaled_dot_product_attention(
[rank0]: RuntimeError: No available kernel. Aborting execution.

Oops, don't know how I never caught this. I must've made too much of an assumption that sdpa is available all the time (since it works on my V100s and I consider them old).

This should be resolved in commit 949339a3fa. If it isn't, set attention: eager in ./data/config.yaml.
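For anyone curious what falling back from sdpa to eager attention looks like, here is a minimal, dependency-free sketch of the pattern. It is not necessarily what the commit above does. The names `eager_attention` and `attention_with_fallback` are hypothetical, and `fused` stands in for a fused kernel such as `torch.nn.functional.scaled_dot_product_attention`, which in real code would take tensors rather than lists.

```python
import math


def eager_attention(q, k, v):
    """Plain softmax(q k^T / sqrt(d)) v on lists of float rows (no fused kernel)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        # Scaled dot product of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax: subtract the max before exponentiating.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of the value rows.
        out.append([
            sum(w * v_row[c] for w, v_row in zip(weights, v)) / total
            for c in range(len(v[0]))
        ])
    return out


def attention_with_fallback(q, k, v, fused=None):
    """Try the fused kernel first; fall back to eager if it raises RuntimeError."""
    if fused is not None:
        try:
            return fused(q, k, v)
        except RuntimeError:
            # e.g. "No available kernel. Aborting execution."
            pass
    return eager_attention(q, k, v)
```

The fallback only catches RuntimeError, which is the exception type shown in the traceback above. Setting attention: eager in the config skips the fused path entirely instead of relying on a try/except.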

Reference: mrq/vall-e#11