Skip to content

common : rework gpt-oss parser#20393

Open
aldehir wants to merge 2 commits intoggml-org:masterfrom
aldehir:rework-gpt-oss
Open

common : rework gpt-oss parser#20393
aldehir wants to merge 2 commits intoggml-org:masterfrom
aldehir:rework-gpt-oss

Conversation

@aldehir
Copy link
Collaborator

@aldehir aldehir commented Mar 11, 2026

Rework the gpt-oss parser.

  • Tighten up the grammar, gpt-oss is very good at following its own Harmony spec.
  • Allow any sequence of analysis/preamble.
  • Clean up the trigger rules, gpt-oss may sometimes invoke a builtin function if not constrained when emitting the tool namespace.
  • Include fix from common : fix gpt-oss Jinja error with content and thinking on tool-call messages #19704.
  • Fix response_format not being enforced.
  • Remove parallel call logic, gpt-oss is trained to only emit single tool calls.
  • Removed support for reasoning-format = none. It makes no sense for gpt-oss. Users can choose to ignore reasoning_content.
  • Remove invalid test cases.

fixes #20344

@github-actions github-actions bot added the testing Everything test related label Mar 11, 2026

auto analysis = p.rule("analysis", p.literal("<|channel|>analysis<|message|>") + p.reasoning(content) + end);
auto preamble = p.rule("preamble", p.literal("<|channel|>commentary<|message|>") + p.content(content) + end);
auto final = p.rule("final", p.literal("<|channel|>final<|message|>") + p.content(content));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final is a keyword, does this work correctly? I'd change it anyway just in case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, which is why I kept it. But I'll change it. Most syntax highlighters are not semantic aware so it'll highlight it as a keyword.

Copy link
Contributor

@pwilkin pwilkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll trust you on this one :)

@pwilkin
Copy link
Contributor

pwilkin commented Mar 11, 2026

Just two issues:
-> support for reasoning_format: none should stay, since some people want just raw template output to further process with their own custom parsers / apps (I think SillyTavern does stuff like that) and not outputting the reasoning in content breaks this
-> since you're saying this fixes structured output, better add a test for structured output :)

@aldehir
Copy link
Collaborator Author

aldehir commented Mar 12, 2026

I'll add the logic back in, but it truly makes no sense. For one, the official template will throw an exception if it sees any harmony tags in the message. Ultimately, this model is incapable of not reasoning. Even when constrained, it will leak reasoning traces inside the final response if it can.

That said, if clients depend on this then I guess 🤷‍♂️

@pwilkin
Copy link
Contributor

pwilkin commented Mar 12, 2026

@aldehir reasoning_format: none is not enable_thinking: false, it just means you don't process the reasoning tags and you spill them all in the content.

@aldehir
Copy link
Collaborator Author

aldehir commented Mar 12, 2026

@aldehir reasoning_format: none is not enable_thinking: false, it just means you don't process the reasoning tags and you spill them all in the content.

I'm fully aware of the difference. See point 2. The responsibility is pushed to the client to strip the tags. Currently there is a hack to remove the exception, but then the model will in-context learn and start to emit bad harmony output, thus breaking parsing.

I've been down this road when I implemented the original parsing.

Edit: I can see my phrasing conflated the two. Ignore the second part.

@pwilkin
Copy link
Contributor

pwilkin commented Mar 12, 2026

@aldehir I know this is generally non-feasible, but there exists a small but vocal group of people who use their own parsing tools for whatever reasons and they really like to get the raw unprocessed contents :)

But I just realized you can not change it and instead approve my #20289 to satisfy them :)

@aldehir
Copy link
Collaborator Author

aldehir commented Mar 12, 2026

But I just realized you can not change it and instead approve my #20289 to satisfy them :)

Yes, I rather just give them the whole harmony output. I've seen complaints about "missing tokens" too, it's never ending!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

testing Everything test related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Misc. bug: Generating structured output according to a given JSON schema fails with a 500 server error using gpt-oss-120b

2 participants