Add max_tokens parameter support to runWithTools#16

Open
koogunmo wants to merge 1 commit into cloudflare:main from koogunmo:feat/support-max-tokens
Conversation

@koogunmo koogunmo commented Oct 3, 2025

Description:

This adds support for controlling the maximum number of tokens in AI responses by passing an optional max_tokens parameter to runWithTools.

Problem:
The default token limit of 256 causes response truncation for longer outputs. There's currently no way to configure this.

Solution:

  • Add max_tokens parameter to runWithTools input options
  • Pass max_tokens to both AI.run() calls (initial and final response)
  • Create shared ModelName type (keyof AiModels) to replace non-existent BaseAiTextGenerationModels
  • Update type references in runWithTools.ts and utils.ts
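The passthrough described above can be sketched as follows. This is a minimal, self-contained illustration of forwarding an optional max_tokens option to the model invocation, not the actual runWithTools source; the ai stub and runWithToolsSketch function are hypothetical stand-ins for the Workers AI binding and the library internals.

```typescript
// Hypothetical shape of the options forwarded to the AI binding's run() call.
type RunOptions = {
  messages: { role: string; content: string }[];
  max_tokens?: number;
};

// Stub binding that records what it receives, standing in for env.AI.
const calls: RunOptions[] = [];
const ai = {
  run(_model: string, options: RunOptions) {
    calls.push(options);
    return { response: "ok" };
  },
};

// Sketch of the passthrough: forward max_tokens only when the caller
// supplied it, so omitting the parameter keeps the provider's default
// and existing callers are unaffected.
function runWithToolsSketch(
  model: string,
  options: { messages: RunOptions["messages"]; max_tokens?: number },
) {
  const runOptions: RunOptions = { messages: options.messages };
  if (options.max_tokens !== undefined) {
    runOptions.max_tokens = options.max_tokens;
  }
  return ai.run(model, runOptions);
}

// Without max_tokens: the binding sees no override.
runWithToolsSketch("model", { messages: [{ role: "user", content: "hi" }] });
// With max_tokens: the value is forwarded to the run() call.
runWithToolsSketch("model", {
  messages: [{ role: "user", content: "hi" }],
  max_tokens: 2048,
});
```

In the real change the same forwarding would apply to both AI.run() invocations (the initial tool-selection call and the final response call), which is why the parameter is threaded through rather than set in one place.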

Usage:

  const response = await runWithTools(
    env.AI,
    '@cf/meta/llama-3.3-70b-instruct-fp8-fast',
    {
      messages,
      tools: [searchTool],
      max_tokens: 2048,  // Now supported
    }
  );

Testing:
Tested in production with a personal application. Prevents truncation of longer AI responses while maintaining backward compatibility (the parameter is optional).
