
llama: fix llama-model-saver #20503

Draft
JohannesGaessler wants to merge 2 commits into ggml-org:master from JohannesGaessler:llama-fix-model-saver

Conversation

@JohannesGaessler
Contributor

This PR fixes llama-model-saver and makes the --output argument of test-llama-archs functional (the models themselves are still broken though because they lack tokenizers).

The first issue fixed in this PR is that llama-model-saver is simply unmaintained: many new KV values have been added since I implemented it, and those were not being saved correctly. I went through the KV values again, added the missing ones, and checked where the corresponding information can be extracted.

The second issue fixed in this PR is that on master several archs have broken tensor names. Typically what happens is that in llama_model::load_tensors tensors are created without a corresponding entry in llm_get_tensor_names. As a consequence, LLM_TN_IMPL::str doesn't use the provided arguments to format the tensor name with e.g. the layer index, so you end up with multiple different tensors that all have names like blk.%d.attn_q. Since a GGUF context is populated by tensor name, this leads to conflicts and the model cannot be saved correctly. It is not clear to me why we have llm_get_tensor_names in the first place. I think it would make more sense to check in LLM_TN_IMPL::str() whether suffix, bid, and/or xid are set and to use them in those cases, and to add a warning in cases where the tensor name template and the provided arguments don't match. I would implement this refactor in this PR.

@github-actions bot added the "testing (Everything test related)" label on Mar 13, 2026
@CISC
Member

CISC commented Mar 13, 2026

It would be useful to have a simple little CI that checks that KV values in llama-arch.h are handled in llama-model-saver whenever updated. Perhaps also check gguf-py to ensure everything is in sync.

@JohannesGaessler
Contributor Author

I agree. I'm thinking it would make sense to implement a roundtrip like manual GGUF context -> llama_model -> tmpfile -> llama_model in test-llama-archs. #20402 could be related, I haven't reviewed it yet.

