Misc. bug: Generating structured output according to a given JSON schema fails with a 500 server error using gpt-oss-120b #20344

@henrygouk

Name and Version

$ ./llama-server --version
ggml_cuda_init: found 1 CUDA devices (Total VRAM: 122502 MiB):
  Device 0: NVIDIA GB10, compute capability 12.1, VMM: yes, VRAM: 122502 MiB (7625 MiB free)
version: 8269 (0cd4f4720)
built with GNU 13.3.0 for Linux aarch64

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

# Server
./llama-server --gpt-oss-120b-default --host 0.0.0.0


# Client
curl http://10.126.190.221:8013/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "x-api-key: 123456" \
    -d '{
        "model": "ggml-org/gpt-oss-120b-GGUF",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant that generates JSON according to a given JSON Schema."
            },
            {
                "role": "user",
                "content": "Generate a JSON object that conforms to the following JSON Schema: {\"type\": \"object\", \"properties\": {\"name\": {\"type\": \"string\"}, \"age\": {\"type\": \"integer\"}}, \"required\": [\"name\", \"age\"]}"
            }
        ],
        "response_format": {
            "type": "json_object",
            "json_schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"}
                },
                "required": ["name", "age"]
            }
        }
    }'

Problem description & steps to reproduce

When I try to generate structured output by specifying a JSON schema in response_format, the server responds with a 500 server error. This is happening on a DGX Spark running Ubuntu.

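To reproduce, build llama.cpp at the version listed above and run the server and client commands from the Command line section, replacing the IP address in the curl URL with the address of your own server.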
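For anyone who prefers to script the reproduction, here is a minimal Python sketch of the same client request using only the standard library. The helper names (build_payload, build_request, send) are hypothetical and not part of llama.cpp. The payload mirrors the curl body exactly as reported, including the response_format layout, and makes no claim about which response_format shape the server expects.

```python
import json
import urllib.error
import urllib.request

# Schema copied from the curl command in this report.
SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}


def build_payload(schema: dict) -> dict:
    """Build the same JSON body that the curl command sends."""
    return {
        "model": "ggml-org/gpt-oss-120b-GGUF",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant that generates JSON "
                           "according to a given JSON Schema.",
            },
            {
                "role": "user",
                "content": "Generate a JSON object that conforms to the "
                           "following JSON Schema: " + json.dumps(schema),
            },
        ],
        # Same layout as in the report: type "json_object" plus "json_schema".
        "response_format": {"type": "json_object", "json_schema": schema},
    }


def build_request(base_url: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request) -> tuple[int, str]:
    """Send the request and return (status, body), including for HTTP errors."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # A 500 response lands here; the error JSON is in the body.
        return err.code, err.read().decode("utf-8")
```

Calling send(build_request("http://10.126.190.221:8013", build_payload(SCHEMA), "123456")) against a running server should return the status code and raw response body, so a 500 response can be inspected rather than raised as an exception.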

The output on the client side is:

{"error":{"code":500,"message":"Failed to parse input at pos 416: <|start|>assistant<|channel|>final<|message|>{\n  \"name\": \"John Doe\",\n  \"age\": 30\n}","type":"server_error"}}
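One thing worth checking in that error body is whether the text the server failed to parse was actually valid. The sketch below is only an illustration: it takes the error body above, splits the message on the final-channel marker that appears in it, and checks the remainder against the schema from the request. The function names and the hand-written conformance check are hypothetical, and this does not identify the root cause.

```python
import json

# Error body copied verbatim from the client output in this report.
ERROR_BODY = r'''{"error":{"code":500,"message":"Failed to parse input at pos 416: <|start|>assistant<|channel|>final<|message|>{\n  \"name\": \"John Doe\",\n  \"age\": 30\n}","type":"server_error"}}'''

# Marker that precedes the model's final message in the reported error text.
FINAL_MARKER = "<|channel|>final<|message|>"


def extract_final_payload(error_body: str) -> dict:
    """Return the JSON object that follows the final-channel marker."""
    message = json.loads(error_body)["error"]["message"]
    _, marker, tail = message.partition(FINAL_MARKER)
    if not marker:
        raise ValueError("no final-channel marker in error message")
    return json.loads(tail)


def conforms(obj: object) -> bool:
    """Hand-written check for the schema in this report: name is a string, age is an integer."""
    if not isinstance(obj, dict):
        return False
    name, age = obj.get("name"), obj.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(name, str) and isinstance(age, int) and not isinstance(age, bool)


payload = extract_final_payload(ERROR_BODY)
print(payload)            # {'name': 'John Doe', 'age': 30}
print(conforms(payload))  # True
```

So the text after the marker is well-formed JSON that satisfies the requested schema, which suggests the failure is in how the server parses the surrounding output rather than in the generated object itself.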

First Bad Commit

Not clear

Relevant log output

srv    operator(): got exception: {"error":{"code":500,"message":"Failed to parse input at pos 416: <|start|>assistant<|channel|>final<|message|>{\n  \"name\": \"John Doe\",\n  \"age\": 30\n}","type":"server_error"}}
