github-actions bot added labels on Mar 12, 2026: documentation (Improvements or additions to documentation), ggml (changes relating to the ggml tensor library for machine learning), SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language)
I still see the error that this PR fixes on b8284 with Vulkan: "layer 0 is assigned to device Vulkan0 but the fused Gated Delta Net tensor is assigned to device CPU (usually due to missing support)"
Will there be a separate PR for that?
| Test | b8284 | This PR |
| --- | --- | --- |
| pp2000 | 208.98 | 409.55 |
| pp20000 | 43.09 | 52.21 |
| tg300 | 41.85 | 47.68 |
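To put the numbers above in relative terms, here is a minimal sketch that computes the ratio of "This PR" to b8284 for each test. It assumes higher values are better (for example, a throughput figure), which the comment does not state explicitly. The `speedup` helper and the `results` dictionary are illustrative names, not part of llama.cpp.

```python
# Figures copied from the table above, as (b8284, this PR).
# Assumption: higher is better; the units are not given in the comment.
results = {
    "pp2000": (208.98, 409.55),
    "pp20000": (43.09, 52.21),
    "tg300": (41.85, 47.68),
}


def speedup(baseline: float, candidate: float) -> float:
    """Return candidate / baseline; a value above 1 means the candidate scored higher."""
    return candidate / baseline


# Ratio of "This PR" to b8284 for every test.
speedups = {test: speedup(*pair) for test, pair in results.items()}

for test, ratio in speedups.items():
    print(f"{test}: {ratio:.2f}x")
```

Under that assumption, the largest relative gain is on the shorter prompt-processing test, with smaller gains on the long prompt and on token generation.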
Thank you for sharing the test result!
The SYCL and Vulkan backends are separate.
Vulkan needs its own PR.
Your log shows done_getting_tensors: tensor 'token_embd.weight' (q4_K) (and 0 others) cannot be used with preferred buffer type SYCL_Host, using CPU instead.
I see the same in my logs. I expect that issue is beyond the scope of this PR.
Yes, that is a separate issue.
I will look into it later.
It would be great if you could open a new issue to track it.
5 participants
Fixes issue #20423.
Adds the GATED_DELTA_NET op.
All unit test cases pass.
Updates ops.md.
All ops run on the GPU.
Here is the performance result: