
CUDA: GDN hide memory latency #20537

Merged

am17an merged 1 commit into ggml-org:master from am17an:cuda_gdn_load2 on Mar 16, 2026
Conversation

@am17an (Contributor) commented Mar 14, 2026

#20448 got closed because #20443 got merged. @IMbackK, could you please check that this does not cause regressions on HIP?

@github-actions bot added the labels Nvidia GPU (Issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) on Mar 14, 2026
@IMbackK (Collaborator) commented Mar 15, 2026

This PR makes no measurable difference in performance on CDNA:

master

| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ----- | ---- | ------ | ------- | --- | -------- | -- | ---- | --- |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 1 | 1 | pp2048 | 76.96 ± 0.22 |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 64 | 1 | pp2048 | 347.21 ± 2.68 |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 512 | 1 | pp2048 | 953.69 ± 2.77 |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | pp2048 | 1528.38 ± 4.24 |

PR

| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ----- | ---- | ------ | ------- | --- | -------- | -- | ---- | --- |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 1 | 1 | pp2048 | 77.38 ± 0.50 |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 64 | 1 | pp2048 | 348.25 ± 2.66 |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 512 | 1 | pp2048 | 954.84 ± 2.60 |
| qwen35moe 35B.A3B Q8_0 | 28.21 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | pp2048 | 1528.30 ± 4.71 |

It passes the op tests too.
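[Editor's note] The "no measurable difference" claim can be sanity-checked directly from the numbers quoted above. This is a small illustrative script (not part of the PR) that compares the master and PR throughput figures and tests whether each delta falls within the combined run-to-run noise, assuming the `±` values are independent standard deviations:

```python
# Throughput (t/s) and reported noise from the two llama-bench tables above,
# keyed by n_ubatch. Values copied verbatim from the PR conversation.
master = {1: (76.96, 0.22), 64: (347.21, 2.68), 512: (953.69, 2.77), 2048: (1528.38, 4.24)}
pr     = {1: (77.38, 0.50), 64: (348.25, 2.66), 512: (954.84, 2.60), 2048: (1528.30, 4.71)}

for ubatch in master:
    m, m_sd = master[ubatch]
    p, p_sd = pr[ubatch]
    delta_pct = 100.0 * (p - m) / m
    # Standard deviation of the difference of two independent measurements.
    noise = (m_sd**2 + p_sd**2) ** 0.5
    within = abs(p - m) <= 2 * noise
    print(f"n_ubatch={ubatch:4d}: {delta_pct:+.2f}% (within 2 sigma: {within})")
```

Every row comes out within two sigma, consistent with IMbackK's conclusion that the PR is performance-neutral on CDNA.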

@am17an am17an merged commit 34818ea into ggml-org:master Mar 16, 2026
81 of 82 checks passed