
refactor(llm): add LiteLLM-backed provider abstraction#2363

Draft
enyst wants to merge 7 commits into main from openhands/llm-provider-abstraction

Conversation

@enyst enyst commented Mar 9, 2026

Summary

Refactor SDK LLM provider handling around an internal LLMProvider helper backed by LiteLLM, so provider-specific logic stops re-splitting raw provider/model strings across the SDK.

This PR takes the simpler direction discussed in issue #2274 and during PR review:

  • accept the full model string at the SDK boundary
  • parse it once with LiteLLM
  • use LiteLLM's parsed provider + model view for LiteLLM-facing runtime paths
  • initialize LiteLLM transport provider parsing once per LLM instance instead of refreshing provider state during each transport call
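The parse-once flow above can be sketched as follows. This is a hedged, pure-Python approximation: the names (`ParsedProvider`, `parse_model_once`) are illustrative, and LiteLLM's real `litellm.get_llm_provider` additionally infers providers for bare model names and resolves dynamic API keys and API bases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedProvider:
    """Resolved provider view, computed once at the SDK boundary."""
    provider: str              # e.g. "anthropic"
    model: str                 # provider-local model name
    api_base: Optional[str] = None


def parse_model_once(full_model: str) -> ParsedProvider:
    """Split a full model string ("provider/model") exactly once.

    Stand-in for litellm.get_llm_provider(); the real call also
    recognizes bare model names and fills in api_base defaults.
    """
    if "/" in full_model:
        provider, model = full_model.split("/", 1)
        return ParsedProvider(provider=provider, model=model)
    # Unknown/bare models stay opaque: no provider is guessed here.
    return ParsedProvider(provider="", model=full_model)
```

Because the result is immutable and computed at construction time, transport calls can reuse it for the lifetime of the LLM instance instead of re-splitting the raw string per call.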

Concretely, this PR:

  • adds openhands.sdk.llm.utils.litellm_provider.LLMProvider as a thin wrapper around LiteLLM provider parsing, provider model info lookup, API base inference, and call kwargs generation
  • keeps LLMProvider focused on LiteLLM's parsed runtime shape (provider, parsed model, resolved API base) rather than storing duplicate requested_* fields
  • updates LLM chat/responses transport calls to use provider-owned LiteLLM kwargs generation (model + custom_llm_provider, plus Bedrock-aware api_key forwarding) instead of repeatedly passing or re-splitting the full string
  • moves Bedrock-specific LiteLLM api_key handling into LLMProvider so provider-sensitive transport behavior lives alongside provider parsing
  • updates telemetry cost calculation to reuse the same parsed provider/model view
  • keeps capability and model_features lookups on the canonical model string (model_canonical_name or model) instead of introducing a second provider abstraction
  • initializes transport provider parsing once during LLM setup, so chat/responses transport paths reuse the same resolved provider metadata for the lifetime of the instance
  • removes dead provider inference wrappers that became redundant after the simplification
  • adds focused tests for the slimmer helper and the updated transport/canonical capability lookup behavior
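The kwargs-generation and Bedrock-aware api_key behavior described above might look roughly like the sketch below. All names (`LLMProvider.call_kwargs`) are hypothetical, and the Bedrock rule is an assumption: LiteLLM's Bedrock path authenticates via AWS credentials, so a configured `api_key` is presumed not to be forwarded there.

```python
from typing import Any, Optional


class LLMProvider:
    """Hypothetical sketch of a LiteLLM-facing provider helper."""

    def __init__(self, full_model: str, api_key: Optional[str] = None) -> None:
        # Parse once at construction; transport calls reuse the result.
        if "/" in full_model:
            self.custom_llm_provider, self.model = full_model.split("/", 1)
        else:
            self.custom_llm_provider, self.model = "", full_model
        self._api_key = api_key

    def call_kwargs(self) -> dict[str, Any]:
        """Kwargs forwarded to LiteLLM chat/responses transport calls."""
        kwargs: dict[str, Any] = {"model": self.model}
        if self.custom_llm_provider:
            kwargs["custom_llm_provider"] = self.custom_llm_provider
        # Assumption: Bedrock uses AWS credentials, so skip api_key there.
        if self._api_key and self.custom_llm_provider != "bedrock":
            kwargs["api_key"] = self._api_key
        return kwargs
```

Centralizing this in one helper keeps provider-sensitive transport behavior (like the Bedrock special case) next to the provider parsing, rather than scattered across call sites.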

Closes #2274.

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works? (N/A: no example changes)
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name? (N/A: internal refactor)
  • Is the GitHub CI passing? (pending)

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant  Architectures  Base Image                                  Docs / Tags
java     amd64, arm64   eclipse-temurin:17-jdk                      Link
python   amd64, arm64   nikolaik/python-nodejs:python3.13-nodejs22  Link
golang   amd64, arm64   golang:1.21-bookworm                        Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:fab6f99-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-fab6f99-python \
  ghcr.io/openhands/agent-server:fab6f99-python

All tags pushed for this build

ghcr.io/openhands/agent-server:fab6f99-golang-amd64
ghcr.io/openhands/agent-server:fab6f99-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:fab6f99-golang-arm64
ghcr.io/openhands/agent-server:fab6f99-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:fab6f99-java-amd64
ghcr.io/openhands/agent-server:fab6f99-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:fab6f99-java-arm64
ghcr.io/openhands/agent-server:fab6f99-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:fab6f99-python-amd64
ghcr.io/openhands/agent-server:fab6f99-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-amd64
ghcr.io/openhands/agent-server:fab6f99-python-arm64
ghcr.io/openhands/agent-server:fab6f99-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-arm64
ghcr.io/openhands/agent-server:fab6f99-golang
ghcr.io/openhands/agent-server:fab6f99-java
ghcr.io/openhands/agent-server:fab6f99-python

About Multi-Architecture Support

  • Each variant tag (e.g., fab6f99-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., fab6f99-python-amd64) are also available if needed

Introduce an internal LLMProvider helper that wraps LiteLLM provider parsing,
provider model info lookup, and provider-aware API base inference.

Use the helper in LLM and Telemetry so Bedrock auth handling and
cost-calculation model/provider routing stop manually splitting model
strings. Extend model_features to accept an LLMProvider so provider-aware
rules can operate on parsed provider/model data while keeping the raw model
string available.

Add focused tests for the new helper and the updated feature lookup paths.
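A feature lookup keyed on the model string, as described in this commit, can be sketched as below. The pattern table is invented for illustration; the real `model_features` rules and their matching semantics live in the SDK.

```python
import fnmatch

# Hypothetical feature table: glob patterns matched against either a
# bare model name or a canonical "provider/model" string.
FEATURE_PATTERNS: dict[str, list[str]] = {
    "supports_function_calling": ["gpt-4*", "claude-*", "*/gpt-4*", "*/claude-*"],
}


def supports(feature: str, canonical_model: str) -> bool:
    """Return True if any pattern for `feature` matches the model string."""
    return any(
        fnmatch.fnmatch(canonical_model, pattern)
        for pattern in FEATURE_PATTERNS.get(feature, [])
    )
```

Keeping the raw model string available alongside the parsed provider/model pair means rules like these can match on whichever form is more precise.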

Co-authored-by: openhands <openhands@all-hands.dev>
github-actions bot commented Mar 9, 2026

API breakage checks (Griffe)

Result: Passed

Action log

github-actions bot commented Mar 9, 2026

Agent server REST API breakage checks (OpenAPI)

Result: Passed

Action log

github-actions bot commented Mar 9, 2026

Coverage

Coverage Report

File                      Stmts  Miss  Cover  Missing
openhands-sdk/openhands/sdk/llm
  llm.py                  476    77    83%    432, 485, 706, 812, 814–815, 843, 890, 901–903, 907–911, 919–921, 931–933, 936–937, 941, 943–944, 946, 1139–1140, 1337–1338, 1347, 1360, 1362–1367, 1369–1386, 1389–1393, 1395–1396, 1402–1411, 1462, 1464
openhands-sdk/openhands/sdk/llm/utils
  litellm_provider.py     88     11    87%    77–78, 89–90, 127–128, 133–136, 140
  model_features.py       40     1     97%    32
  telemetry.py            175    13    92%    133, 158, 164–165, 175, 187–188, 227, 347, 349, 363, 369, 374
TOTAL                     19809  5771  70%

Rework the new LLMProvider helper so it no longer keeps both raw and
parsed model strings internally.

Accept full model strings at the SDK boundary, then immediately normalize to
LiteLLM's parsed provider/model pair for transport, telemetry, and feature
checks. Unknown models still remain opaque strings, but the provider
abstraction itself now avoids duplicate raw-vs-parsed state.

Also route chat/responses calls through parsed provider kwargs and tighten the
associated tests.
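The "no duplicate raw-vs-parsed state" shape described in this commit can be illustrated with a minimal sketch (names hypothetical): only the parsed pair is stored, and the canonical full string is derived on demand rather than kept as a second field.

```python
class LLMProviderState:
    """Stores only the parsed provider/model pair.

    The canonical "provider/model" string is reconstructed when needed,
    so there is no raw copy that could drift out of sync.
    """

    def __init__(self, full_model: str) -> None:
        if "/" in full_model:
            self.provider, self.model = full_model.split("/", 1)
        else:
            # Unknown/bare models stay opaque: empty provider, name kept as-is.
            self.provider, self.model = "", full_model

    @property
    def canonical(self) -> str:
        """Derive the full model string from the parsed state."""
        if self.provider:
            return f"{self.provider}/{self.model}"
        return self.model
```

Transport, telemetry, and feature checks can then all read the same parsed fields, while anything that needs the full string uses `canonical`.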

Co-authored-by: openhands <openhands@all-hands.dev>
Keep LLMProvider for LiteLLM-facing transport logic only and use the canonical model string for capability/feature lookups.

Co-authored-by: openhands <openhands@all-hands.dev>


Development

Successfully merging this pull request may close these issues.

Refactoring Proposal: Large Classes Identified for Decomposition