Guard against sumq2 being 0 in IQ4_NL resulting in nan values by bartowski1182 · Pull Request #20460 · ggml-org/llama.cpp

bartowski1182 · 2026-03-12T14:58:03Z

With IQ4_NL, quantizing several recent models has hit an issue where NaN blocks are produced during quantization, which crashes the quant.

It seems to stem from cases where sumq2 is 0 for a given block, likely because there is no imatrix data for some obscure expert, or because the weights themselves are 0, as we've seen with some recent Qwen models.
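To illustrate the failure mode, here is a minimal sketch, assuming the block scale is a weighted least-squares fit of the form d = sum(w*q*x) / sum(w*q*q). The helper name `unguarded_scale` and its signature are hypothetical and are not taken from the llama.cpp source. When every importance weight in the block is 0, both sums are 0 and the division yields NaN.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not the llama.cpp source): weighted least-squares
 * scale for one block, with no protection against a zero denominator.
 *   x: original weights, q: chosen quantized values, w: importance weights */
static float unguarded_scale(const float *x, const float *q,
                             const float *w, size_t n) {
    float sumqx = 0.0f;
    float sumq2 = 0.0f;
    for (size_t j = 0; j < n; ++j) {
        sumqx += w[j] * q[j] * x[j];
        sumq2 += w[j] * q[j] * q[j];
    }
    /* If all w[j] are 0 (or all q[j] are 0), this is 0.0f / 0.0f,
     * which is NaN under IEEE 754 arithmetic. */
    return sumqx / sumq2;
}
```

Calling this with an all-zero `w` array returns NaN, which then propagates into the stored block scale.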

This change guards against dividing by 0 by setting d to 0 instead, which sets the whole block of weights to 0. That seems appropriate.
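A minimal sketch of such a guard, for illustration only: the helper name `guarded_scale` and the exact condition are assumptions, and the actual diff may be written differently.

```c
/* Hypothetical helper (not the actual diff): return the least-squares
 * scale sumqx / sumq2, or 0 when the denominator is not positive. */
static float guarded_scale(float sumqx, float sumq2) {
    /* The comparison is also false when sumq2 is NaN, so that case
     * falls through to 0 as well. */
    return sumq2 > 0.0f ? sumqx / sumq2 : 0.0f;
}
```

With d equal to 0, every dequantized value d * q in the block is 0, regardless of which quantized values were selected.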

Most recently observed in NVIDIA's Nemotron-3-Super-120B-A12B: blk.3.ffn_down_exps has a block that never sees imatrix data, which results in sumq2 being 0.

I added some debug output to look at what those weights are like (to ensure that setting them to 0 wouldn't introduce a new issue). The amax at this point is on the order of 1e-11, so they're likely rounding down to 0 during imatrix calculations anyway.
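The kind of check described above can be sketched as follows. This assumes amax means the largest absolute value in the block; the helper name `block_amax` is hypothetical and this is not the debug code that was actually used.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical debug helper: largest absolute value in a block of weights. */
static float block_amax(const float *x, size_t n) {
    float amax = 0.0f;
    for (size_t j = 0; j < n; ++j) {
        float ax = fabsf(x[j]);
        if (ax > amax) {
            amax = ax;
        }
    }
    return amax;
}
```

Printing this value for each block where sumq2 is 0 gives a quick view of how much magnitude is lost by zeroing the block.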

Guard against sumq2 being 0 in IQ4_NL resulting in nan values

0e926c3

github-actions bot added the "ggml" label (changes relating to the ggml tensor library for machine learning) Mar 12, 2026

bartowski1182 marked this pull request as ready for review March 12, 2026 15:04

bartowski1182 requested a review from ggerganov as a code owner March 12, 2026 15:04

compilade approved these changes Mar 12, 2026

bartowski1182 wants to merge 1 commit into ggml-org:master from bartowski1182:iq4_nl_fix
