Name and Version
./build_vulkan/bin/llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV STRIX_HALO) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
version: 8183 (66d65ec) (note: this is the last good commit from my bisect; I bisected starting from HEAD, which was d63aa39 at the time)
built with GNU 15.2.1 for Linux x86_64
Operating systems
Linux
GGML backends
Vulkan
Hardware
Ryzen AI Max 395+
Models
Issue encountered with large models (in my case, specifically Qwen3.5-122B-A10B-UD-Q6_K_XL and NVIDIA-Nemotron-3-Super-120B-A12B-UD-Q6_K).
Problem description & steps to reproduce
Command line used:
./build_vulkan/bin/llama-server --temp 0.6 --top-p 0.95 --min-p 0.0 --top-k 20 -fitt 5120 -ngl 999 --no-mmap --host 192.168.1.105 --port 3334 -m ./models/Qwen3.5-122B-A10B-UD-Q6_K_XL-00001-of-00004.gguf
After the commit identified below, these models fail to load and the whole system nearly hangs; I have to run killall from a console terminal, and even that takes a while to take effect. I've seen this behavior in the past when memory usage was too high (RAM and VRAM are shared on this mini PC).
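Since RAM and VRAM are shared on this machine, one way to confirm that the near-hang is memory pressure (an assumption, not something the logs prove) is to sample available memory in a second terminal while the model loads. A minimal sketch using the standard Linux /proc/meminfo interface:

```shell
# Print currently available system memory (standard Linux /proc interface).
# Run in a second terminal while llama-server is loading the model;
# wrap in `watch -n1 ...` for a live view and watch the value collapse.
awk '/^MemAvailable:/ {printf "available: %d MiB\n", $2 / 1024}' /proc/meminfo
```

If the value drops to near zero around the point where loading stalls, that supports the shared-memory-exhaustion explanation.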
First Bad Commit
Bisected precisely to:
# first bad commit: [3191462] vulkan: improve partial offloading performance on AMD (#19976)
Relevant log output
Nothing relevant in dmesg, and journalctl only shows complaints about memory pressure. The llama-server log is no more helpful: it just stops around midway through loading the model, at the progress dots.