Python: [Python] Large AgentResponse in entity set_result() causes .NET host to roll back entire entity state #3884

@droideronline

Bug Description

When a Durable Functions entity produces a large AgentResponse (e.g. from MCP tool calls returning verbose API payloads), context.set_result(result.to_dict()) in _entities.py causes the .NET Durable Functions host to fail with:

System.ArgumentException: Out of proc orchestrators must return a valid JSON schema.

Because the .NET host treats set_state() and set_result() as a single atomic transaction, this failure rolls back both, including the response that entity.run() had already written to entity state via set_state(). The HTTP polling endpoint then never finds the response, and the request times out.

Root Cause

In _entities.py line 86:

result = await entity.run(request)
context.set_result(result.to_dict())

result.to_dict() serializes the entire AgentResponse, which for MCP tool calls can include deeply nested JSON objects, large text blobs, or complex structures from external APIs. The .NET Durable Functions host receives this over gRPC and fails to process it, rejecting the entire entity execution.
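To check whether payload size or nesting correlates with the failure, the serialized result could be measured just before it is handed to set_result(). The following is a minimal diagnostic sketch, not part of the package: the helper names serialized_size_bytes and max_nesting_depth are hypothetical, and this issue does not establish any specific size or depth threshold at which the host rejects the payload.

```python
import json
from typing import Any


def serialized_size_bytes(payload: Any) -> int:
    """Return the UTF-8 byte length of the payload's JSON encoding."""
    # default=str keeps the measurement from raising on non-JSON types;
    # it is used here for diagnostics only, not for the real serialization.
    return len(json.dumps(payload, default=str).encode("utf-8"))


def max_nesting_depth(payload: Any) -> int:
    """Return how deeply dicts and lists are nested (a scalar has depth 0)."""
    if isinstance(payload, dict):
        return 1 + max((max_nesting_depth(v) for v in payload.values()), default=0)
    if isinstance(payload, list):
        return 1 + max((max_nesting_depth(v) for v in payload), default=0)
    return 0
```

Logging both values for result.to_dict() on successful and failing requests would show whether the failures cluster above some size or depth, which this report suspects but has not measured.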

Impact

  • "Say hello" and other simple queries work fine (small response → set_result() succeeds → transaction commits).
  • MCP tool calls or any agent interaction that produces a large response fails silently — the agent actually completes successfully (visible in traces), but the result is lost due to the state rollback.
  • The HTTP polling endpoint returns a timeout, giving the user no indication that the agent actually succeeded.

Steps to Reproduce

  1. Deploy an AgentFunctionApp with an agent that uses MCPStdioTool (e.g. mcp-atlassian for Jira)
  2. Send a request that triggers a tool call returning a large payload (e.g. "List all Jira projects")
  3. Observe in Application Insights:
    • Agent traces show successful execution with full MCP tool response
    • Immediately followed by: Function 'dafx-<agent> (Entity)' failed with an error. Reason: Internal error: System.ArgumentException: Out of proc orchestrators must return a valid JSON schema
  4. HTTP polling returns timeout — entity state was rolled back

Proposed Fix

Pass only a compact summary to set_result(). The HTTP trigger path does not depend on set_result(): it reads the response through read_entity_state(), which returns the state persisted by set_state(). Only orchestrations that invoke the entity via call_entity() consume the set_result() value.

result = await entity.run(request)
# Full response is already persisted via set_state() inside entity.run().
# Pass a compact result to avoid .NET host serialization failures.
compact_result = {
    "status": "ok",
    "text": (result.text or "")[:4000],
}
context.set_result(compact_result)

Trade-off: Orchestrations calling the entity via call_entity() would receive only the compact result instead of the full AgentResponse. They could be updated to read the full response from entity state if needed.
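To make the trade-off visible to call_entity() callers, the compact result could carry a flag saying whether the text was cut. This is a sketch under assumptions: build_compact_result and the "truncated" field are hypothetical additions that go beyond the proposed fix above, and the 4000-character cap is simply the value from that fix, not a documented host limit.

```python
from typing import Any, Optional

# Same cap as the proposed fix; an arbitrary value, not a documented host limit.
MAX_TEXT_CHARS = 4000


def build_compact_result(
    text: Optional[str], max_chars: int = MAX_TEXT_CHARS
) -> dict[str, Any]:
    """Build a small, flat dict that is safe to pass to set_result()."""
    full = text or ""  # result.text may be None when the agent returns no text
    return {
        "status": "ok",
        "text": full[:max_chars],
        # Hypothetical extra field: tells orchestrations that the full
        # response must be read from entity state instead.
        "truncated": len(full) > max_chars,
    }
```

In _entities.py this would be used as context.set_result(build_compact_result(result.text)). An orchestration that sees "truncated": True would know to fetch the full AgentResponse from entity state.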

Environment

  • Package: agent-framework-azurefunctions (PyPI)
  • Python: 3.11
  • Platform: Azure Functions Linux Consumption Plan
  • Durable Task Scheduler: Consumption SKU
  • File: python/packages/azurefunctions/agent_framework_azurefunctions/_entities.py, line 86

Labels

bug, python
