
Console logs on stdout corrupt JSON output in benchmark scripts #498

@juanmichelini


When running benchmarks (e.g., swebenchmultimodal), scripts often expect pure JSON output on stdout to be piped to tools like jq.

However, benchmarks/utils/console_logging.py currently configures the console handler to write to stdout when rich logging is enabled, whether explicitly or by default.

This causes problems when libraries such as OpenTelemetry log errors or warnings. For example, OpenTelemetry's "Failed to detach context" error is printed to stdout, and jq then fails with:
jq: parse error: Invalid numeric literal

This happens because the log message is interleaved with the JSON output, so stdout no longer contains a single valid JSON document.
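To illustrate the failure mode without running a full benchmark, here is a minimal sketch that uses Python's json module in place of jq. The captured strings and the `parses_as_json` helper are hypothetical and only stand in for what a benchmark script might print; the real output will differ.

```python
import json

# Hypothetical stdout captures (not real benchmark output).
# In the first, a log line has leaked onto stdout ahead of the JSON payload.
mixed_stdout = 'Failed to detach context\n{"resolved": 3}\n'
clean_stdout = '{"resolved": 3}\n'


def parses_as_json(text: str) -> bool:
    """Return True if the whole text is one valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

A strict JSON consumer rejects the mixed stream at the stray log text, which is the same class of failure jq reports.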

Fix: Ensure all console logs are written to stderr, leaving stdout exclusively for the script's intended output (JSON).
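One way the proposed fix could look, sketched with the standard library only. The function names `configure_console_logging` and `emit_result` are assumptions made for illustration and are not taken from benchmarks/utils/console_logging.py, whose actual structure may differ.

```python
import json
import logging
import sys


def configure_console_logging(level: int = logging.INFO) -> logging.Handler:
    """Attach a console handler to the root logger that writes to stderr.

    Hypothetical helper: it shows the stream choice only, not the
    repository's real configuration code.
    """
    handler = logging.StreamHandler(sys.stderr)  # stderr, never stdout
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(level)
    return handler


def emit_result(result: dict) -> None:
    """Write the script's only stdout output: one JSON document."""
    json.dump(result, sys.stdout)
    sys.stdout.write("\n")
```

With this split, a pipeline that reads only stdout sees just the JSON, while warnings from third-party libraries still reach the terminal through stderr. If the console handler is rich's `RichHandler`, the analogous change would presumably be to construct it with a `Console(stderr=True)`, though that has not been checked against this repository.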
