
feat(llm): add LLM profiles #1843

Draft
enyst wants to merge 109 commits into main from agent-sdk-18-profile-manager

Conversation

@enyst enyst commented Jan 27, 2026

HUMAN:
LLM Profiles behavior

  • integrated with LLMRegistry
    • at any point, the registry knows all profiles and which is used for which usage_id
    • profile_id is set by the user, and corresponds to llm_profiles/profile_id.json in persistence_dir (if set)
    • defines a small API for profiles:
      • load_profile()
      • save_profile()
      • validate_profile()
      • list_profiles()
      • get_profile_path()
  • uses LLM_PROFILES_DIR

Summary

  • Integrate LLM profile persistence into LLMRegistry, exposing list/load/save/register/validate helpers with configurable profile directories
  • Fix docs example checker to detect nested example paths (avoids false CI failures)

Testing

uv run pytest tests/sdk/llm/test_llm_registry_profiles.py
uv run pytest tests/sdk/conversation/local/test_state_serialization.py

Related


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
| --- | --- | --- | --- |
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:b4c376a-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-b4c376a-python \
  ghcr.io/openhands/agent-server:b4c376a-python

All tags pushed for this build

ghcr.io/openhands/agent-server:b4c376a-golang-amd64
ghcr.io/openhands/agent-server:b4c376a-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:b4c376a-golang-arm64
ghcr.io/openhands/agent-server:b4c376a-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:b4c376a-java-amd64
ghcr.io/openhands/agent-server:b4c376a-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:b4c376a-java-arm64
ghcr.io/openhands/agent-server:b4c376a-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:b4c376a-python-amd64
ghcr.io/openhands/agent-server:b4c376a-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-amd64
ghcr.io/openhands/agent-server:b4c376a-python-arm64
ghcr.io/openhands/agent-server:b4c376a-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-arm64
ghcr.io/openhands/agent-server:b4c376a-golang
ghcr.io/openhands/agent-server:b4c376a-java
ghcr.io/openhands/agent-server:b4c376a-python

About Multi-Architecture Support

  • Each variant tag (e.g., b4c376a-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., b4c376a-python-amd64) are also available if needed

openhands-agent and others added 30 commits October 18, 2025 16:18
…authored-by: openhands <openhands@all-hands.dev>
…sation startup

- ProfileManager manages ~/.openhands/llm-profiles/*.json (load/save/list/register)
- LocalConversation now calls ProfileManager.register_all to eagerly populate LLMRegistry

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
- embed profile lifecycle APIs into the registry
- update persistence helpers, docs, and examples to use registry
- replace profile manager tests with registry profile coverage

Co-authored-by: openhands <openhands@all-hands.dev>
- note that LLMRegistry is the unified entry point for disk and runtime profiles
- mention how to override the profile directory when embedding the SDK

Co-authored-by: openhands <openhands@all-hands.dev>
- rename payload helpers to resolve_llm_profiles/compact_llm_profiles
- update conversation state to use clearer helper names
- drop the optional agent_settings convenience module and its tests

Co-authored-by: openhands <openhands@all-hands.dev>
- replace the _transform flag with dedicated _compact/_resolve helpers
- make compact_llm_profiles/resolve_llm_profiles easier to follow by delegating to the new helpers

Co-authored-by: openhands <openhands@all-hands.dev>
Bring in new package layout and port LLM profile switching support.
Revert the in-progress switch_llm helpers and tests; agent-sdk-18 branch now only contains LLM profile persistence.
Example 25 now performs a read/write/delete workflow and verifies the persisted profile reference.
- move inline/profile compaction into LLM serializer/validator
- use model_dump_json context in ConversationState persistence
- add persistence settings module and cover profile reference tests
- document persistence comparison and recommendations
@OpenHands OpenHands deleted a comment from openhands-ai bot Jan 28, 2026
@enyst enyst marked this pull request as ready for review February 2, 2026 23:53
enyst commented Feb 2, 2026

@OpenHands CI is failing. Please check what we can do for those two jobs, and fix it.

openhands-ai bot commented Feb 2, 2026

I'm on it! enyst can track my progress at all-hands.dev

all-hands-bot (Collaborator) left a comment

Comprehensive implementation of LLM profiles with good test coverage. Found several security and maintainability concerns that should be addressed.

Comment on lines 262 to +322

    except FileNotFoundError:
        base_text = None

    context: dict[str, object] = {}
    registry = llm_registry
    if registry is None:
        from openhands.sdk.llm.llm_registry import LLMRegistry

        registry = LLMRegistry()
    context["llm_registry"] = registry

    # Ensure we have a registry available during both dump and validate.
    #
    # We do NOT implicitly write profile files here. Instead, persistence will
    # store a profile reference only when the runtime LLM already has an
    # explicit ``profile_id``.

    # ---- Resume path ----
    if base_text:
        base_payload = json.loads(base_text)
        # Add cipher context for decrypting secrets if provided
        if cipher:
            context["cipher"] = cipher

        # Restore the conversation with the same id
        persisted_id = ConversationID(base_payload.get("id"))
        if persisted_id != id:
            raise ValueError(
                f"Conversation ID mismatch: provided {id}, "
                f"but persisted state has {persisted_id}"
            )

        persisted_agent_payload = base_payload.get("agent")
        if persisted_agent_payload is None:
            raise ValueError("Persisted conversation is missing agent state")

        # Attach event log early so we can read history for tool verification
        event_log = EventLog(file_store, dir_path=EVENTS_DIR)

        persisted_agent = AgentBase.model_validate(
            persisted_agent_payload,
            context={"llm_registry": registry},
        )
        agent.verify(persisted_agent, events=event_log)

        # Use runtime-provided Agent directly (PR #1542 / issue #1451)
        #
        # Persist LLMs as profile references only when an explicit profile_id is
        # set on the runtime LLM.
        agent_payload = agent.model_dump(
            mode="json",
            exclude_none=True,
            context={"expose_secrets": True},
        )
        llm_payload = agent_payload.get("llm")
        if isinstance(llm_payload, dict) and llm_payload.get("profile_id"):
            llm = agent.llm
            agent_payload["llm"] = llm.to_profile_ref()

        base_payload["agent"] = agent_payload
        base_payload["workspace"] = workspace.model_dump(mode="json")
        base_payload["max_iterations"] = max_iterations

🟠 Important: The create() method has become quite complex with the profile reference logic. Consider extracting the resume logic into a separate _resume_from_persistence() method to improve readability.

The multiple payload mutations (expanding profile refs, injecting runtime agent, converting back to profile refs) make this hard to follow and maintain.

Comment on lines +10 to +14
Set ``LLM_PROFILE_NAME`` to choose which profile file to load.

Notes on credentials:
- New profiles include API keys by default when saved
- To omit secrets on disk, pass include_secrets=False to LLMRegistry.save_profile

🔴 Critical: This documentation is misleading and dangerous. The current default behavior (include_secrets=True) could lead users to accidentally commit API keys.

Suggested change

Before:
Set ``LLM_PROFILE_NAME`` to choose which profile file to load.
Notes on credentials:
- New profiles include API keys by default when saved
- To omit secrets on disk, pass include_secrets=False to LLMRegistry.save_profile

After:
Set ``LLM_PROFILE_NAME`` to choose which profile file to load.
Security Best Practice:
- Profiles should be saved WITHOUT secrets (include_secrets=False)
- Provide API keys via environment variables (LLM_API_KEY, AWS_ACCESS_KEY_ID, etc.)
- Never commit profile files containing secrets to version control
- Add *.json to .gitignore if storing profiles in your project directory

enyst commented Feb 3, 2026

Same as below

_PROFILE_ID_PATTERN = re.compile(r"^[A-Za-z0-9._-]+$")


class _LLMProfileStore:
enyst (Collaborator, Author) commented:

This is because of an attempt to separate two responsibilities:

LLMRegistry remains primarily a runtime registry; persistence behavior is explicitly contained.

Personally, I'm not sure it's necessary. They're close enough, and separating them means that below (line 220 and below) we are going to need to delegate to the _LLMProfileStore methods, because the public API is still in the LLMRegistry.

context={"expose_secrets": True},
)
llm_payload = agent_payload.get("llm")
if isinstance(llm_payload, dict) and llm_payload.get("profile_id"):
enyst (Collaborator, Author) commented:

@OpenHands Why are we using profile_id, didn't we have a kind profile_ref? Please investigate the entire diff, and understand the code patterns and fix.

openhands-ai bot commented:

I'm on it! enyst can track my progress at all-hands.dev

openhands-ai bot commented:

Final update:

Summary (new since last update):

  • No additional changes since the previous summary; branch is up to date and already pushed.

Checklist:

  • Confirmed the request is fully addressed.
  • Verified there are no extra or unrelated changes (diff vs. remote is empty).
  • Changes are concise and already pushed to the PR branch.

If you want any further adjustments or additional tests run, just let me know.


enyst and others added 4 commits February 3, 2026 00:21
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Break the main function into script-level code to save a level of
indentation. Also document this preference in AGENTS.md.

Co-authored-by: openhands <openhands@all-hands.dev>
openhands-ai bot commented Feb 4, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • [Optional] Docs example
    • Agent Server

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1843 at branch `agent-sdk-18-profile-manager`

Feel free to include any additional details that might help me get this PR into a better state.


all-hands-bot (Collaborator) commented:
[Automatic Post]: It has been a while since there was any activity on this PR. @enyst, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.


@enyst enyst marked this pull request as draft February 10, 2026 15:25
enyst commented Feb 10, 2026

This PR has been the longest-running experiment with LLM profiles, I think. Meanwhile, I've integrated profiles in a fun TypeScript version of the SDK:
LLM profiles spec

That is, in a sense, a full-featured version (switch on the fly, define on the fly, use multiple LLMs for small user-friendly functionality during the run), but also a somewhat simplified design (e.g. there is always a default profile, usage_ids are hardcoded, and some are unused).

Maybe we should simplify here too, though maybe not as much; we'll see as we move forward. @VascoSch92 has added the first version of profiles, and we'll look at the next steps. Maybe this comment is something we'll come back to. We'll see!

@enyst enyst added behavior-initiative This is related to the system prompt sections and LLM steering. and removed behavior-initiative This is related to the system prompt sections and LLM steering. labels Feb 14, 2026
enyst commented Feb 21, 2026

@OpenHands Look at the llm profiles on main branch and understand the feature as it is today. (you can look for 'profile')

Look also for the PR adding switch llm. Read the PRs.

Make a plan of adding the behavior in those two PRs to main. Take your time to investigate and split into chunks that work all the time. Post your plan as a comment to this PR. Note that it will be rendered as markdown.

openhands-ai bot commented Feb 21, 2026

I'm on it! enyst can track my progress at all-hands.dev


enyst commented Feb 21, 2026

Investigation notes (current main)

What “LLM profiles” means on main today

On main, “profiles” exist as a standalone on-disk store:

  • openhands.sdk.llm.llm_profile_store.LLMProfileStore
    • default dir: ~/.openhands/profiles
    • concurrency: FileLock + atomic write via temp file + Path.replace()
    • API: list(), save(name, llm, include_secrets=False), load(name), delete(name)
    • stored payload: full LLM JSON (secrets optionally omitted)
  • The SDK does not currently:
    • store a profile_id on the LLM model
    • persist “profile references” in ConversationState (base_state.json)
    • integrate profiles into LLMRegistry

So: profiles are usable as a utility (and by fallback), but they’re not part of conversation persistence or runtime switching.
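The concurrency behavior noted above (temp file + Path.replace()) can be illustrated in stdlib terms. This is a stand-in for the atomic-write part only; the real LLMProfileStore additionally wraps writes in a FileLock (from the third-party filelock package, omitted here), and the function name is an assumption.

```python
import json
import tempfile
from pathlib import Path


def atomic_save(path: Path, payload: dict) -> None:
    """Write JSON so readers never observe a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory so replace() stays on
    # one filesystem; Path.replace() is then an atomic rename on POSIX.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    tmp = Path(tmp_name)
    try:
        with open(fd, "w") as f:
            json.dump(payload, f, indent=2)
        tmp.replace(path)  # atomically replace the destination
    except BaseException:
        tmp.unlink(missing_ok=True)
        raise
```

A concurrent reader sees either the old complete file or the new complete file, never a partial write.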


PRs investigated

PR #1843 (this PR): “feat(llm): add LLM profiles”

This PR’s intended behavior (from code + tests) is:

  • Add LLM.profile_id and a profile reference persistence format (e.g. {kind: profile_ref, profile_id: ...})
  • Add LLMRegistry.{list/load/save/validate}_profile* helpers and a configurable profile dir (via LLM_PROFILES_DIR)
  • Update conversation persistence so base_state.json can store a profile ref rather than an inline LLM payload

“Switch LLM” PR(s)

There is a stacked “switch LLM” line of PRs:

  • Switch LLM profiles #1049 “Switch LLM profiles” (base: agent-sdk-18-profile-manager)

    • SDK-local API: LocalConversation.switch_llm(profile_id)
    • enforces safety: only allow when conversation is idle
    • (in that PR) gating based on an inline/non-inline persistence mode
  • [agent-sdk-j2s] agent-server runtime LLM switching #1544 “[agent-sdk-j2s] agent-server runtime LLM switching” (base: Switch LLM profiles #1049)

    • extends to agent-server REST:
      • POST /api/conversations/{id}/llm (inline payload or profile_id)
      • POST /api/conversations/{id}/llm/switch
    • changes persistence/restore semantics so a switched LLM can persist across restart

Proposed landing plan (split into always-green chunks)

The key to making this land safely is to port the behavior onto current main without regressing main’s existing LLMProfileStore, FallbackStrategy, LLMRegistry metrics-independence behavior, or the current conversation persistence lifecycle.

I propose landing this in a small stack of PRs (each rebased onto latest main, each CI-green):

PR 1 — “LLM: add profile_id + profile-ref encoding/decoding (no persistence wiring yet)”

Goal: introduce the model-level primitives with minimal surface area.

  • Add LLM.profile_id: str | None (default None)
  • Add LLM.to_profile_ref() and a pydantic serializer gated by context={"persist_profile_ref": True}
  • Add a validator that can expand {kind: "profile_ref", profile_id: "..."} when context contains an llm_registry
  • Backward compatibility:
    • old inline LLM payloads continue to validate unchanged
    • existing LLMProfileStore files (full LLM JSON) continue to load unchanged

CI gates: unit tests only (new tests for serialization + “expand ref requires context”).


PR 2 — “LLMRegistry: profile helpers that wrap existing LLMProfileStore (no new store class)”

Goal: avoid two competing profile stores.

  • Keep openhands.sdk.llm.llm_profile_store.LLMProfileStore as the canonical on-disk implementation (locking, atomic write, default dir, etc.)
  • Add thin convenience methods on LLMRegistry:
    • list_profiles(), load_profile(), save_profile(), get_profile_path()
    • directory selection rules:
      • explicit profile_dir arg wins
      • else LLM_PROFILES_DIR
      • else default to the current LLMProfileStore default (~/.openhands/profiles) to preserve behavior
  • Preserve current LLMRegistry invariants on main:
    • usage_to_llm read-only mapping
    • metrics independence logic (don’t regress it while adding profile helpers)

CI gates: add/port tests similar to tests/sdk/llm/test_llm_registry_profiles.py but adapted to the existing store + directory rules.


PR 3 — “Conversation persistence: optionally persist profile refs + expand on restore”

Goal: teach ConversationState to round-trip profile refs safely.

  • When saving base_state.json:
    • if agent.llm.profile_id is set, dump it as a profile ref (no inline secrets)
    • else keep existing inline behavior (still protected by cipher/redaction as today)
  • When restoring:
    • pass llm_registry into model_validate(..., context=...) so {kind: profile_ref,...} can expand
    • make the registry available from LocalConversation (and agent-server) before state restore

Key design constraint to make behavior stable:

  • Restoring should be deterministic even if the caller supplies a runtime Agent.
    • Option A (minimal): keep “runtime Agent is source of truth” and only use persisted agent for tool verification; switching will require the caller to re-supply the desired profile on restart.
    • Option B (what [agent-sdk-j2s] agent-server runtime LLM switching #1544 aims for): persisted LLM selection wins by default so a runtime switch persists across restart; runtime agent may override only when explicitly requested.

I recommend Option B if the goal is “switch LLM and it sticks”, especially for agent-server.

CI gates: state serialization tests + at least one end-to-end persistence lifecycle test.


PR 4 — “SDK runtime switching: LocalConversation.switch_llm() (+ optionally set_llm())”

Goal: expose a safe API to mutate the active LLM at runtime.

  • Add AgentBase._clone_with_llm(llm) helper so swapping doesn’t drop tool wiring
  • Add ConversationState.switch_agent_llm(profile_id, registry=...)
    • enforce: conversation must not be RUNNING (idle/paused only)
    • validate profile exists / is loadable
    • persist base_state immediately after switch
  • Add LocalConversation.switch_llm(profile_id) wrapper
  • (If we want parity with [agent-sdk-j2s] agent-server runtime LLM switching #1544 remote flows) also add LocalConversation.set_llm(llm) for “client sends inline config”

CI gates: unit tests for:

  • requires idle
  • missing profile rejected
  • persistence reflects switched profile ref
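The PR 4 safety checks might reduce to a guard like this. All names are stand-ins; the real method would clone the agent with the new LLM and persist base_state, rather than return the profile payload.

```python
from enum import Enum


class ConversationStatus(Enum):
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"


def switch_llm(status: ConversationStatus, profile_id: str, profiles: dict[str, dict]) -> dict:
    """Enforce: idle/paused only, and the target profile must be loadable."""
    if status is ConversationStatus.RUNNING:
        raise RuntimeError("cannot switch LLM while the conversation is running")
    if profile_id not in profiles:
        raise KeyError(f"unknown profile: {profile_id!r}")
    # Real implementation: agent = agent._clone_with_llm(llm); persist base_state.
    return profiles[profile_id]
```

The three CI gates map one-to-one onto the guard: the RUNNING check, the missing-profile check, and (in the real code) the persist-after-switch step.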

PR 5 — “agent-server switching API (from #1544)”

Goal: land the REST endpoints + persistence-across-restart behavior.

  • Add server routes + service methods:
    • POST /api/conversations/{id}/llm/switch (profile_id)
    • POST /api/conversations/{id}/llm (either profile_id or inline llm payload)
  • Ensure restart restores the switched LLM (requires PR 3 Option B or equivalent)
  • Integration tests (wsproto) to validate:
    • switch endpoint updates active LLM
    • state persists across restart

CI gates: agent-server integration tests + docs/examples checks.
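The two endpoint payload shapes above suggest a small dispatch step inside the combined POST /api/conversations/{id}/llm handler; this is a framework-free sketch, and the function name and return shape are assumptions.

```python
def dispatch_llm_update(payload: dict) -> tuple[str, object]:
    """Route the request body: a profile_id reference or an inline llm config."""
    if "profile_id" in payload:
        # Same path as POST .../llm/switch
        return ("switch_profile", payload["profile_id"])
    if "llm" in payload:
        return ("set_inline", payload["llm"])
    raise ValueError("payload must contain either 'profile_id' or an inline 'llm' config")
```

The server layer would then call the PR 4 switch/set APIs and rely on PR 3 semantics so the choice survives restart.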


Why this chunking should stay green

  • PR 1–2 are purely additive to model + registry, and should not touch conversation lifecycle.
  • PR 3 introduces persistence wiring but can be done in a fully backward-compatible way (inline payloads still supported).
  • PR 4–5 depend on PR 3 and are “feature surface” PRs that can be reviewed/rolled back independently.

Open questions / decisions to lock early

  1. Profile directory + env var: should LLM_PROFILES_DIR default to ~/.openhands/profiles (current) or a new dir? I recommend preserving the current default for compatibility.
  2. Profile ref shape: keep {kind: "profile_ref", profile_id: ...} (explicit) vs a minimal {profile_id: ...}. Explicit kind is safer for schema evolution.
  3. Restore precedence: do we want “persisted LLM wins” (needed for agent-server “switch sticks”), or keep today’s “runtime agent wins” behavior?

If we align on these three, the implementation becomes mostly mechanical.


openhands-ai bot commented Feb 21, 2026

I investigated the current “LLM profiles” feature on main and the two “switch LLM” PRs, then produced (and posted) a markdown plan for landing their combined behavior onto main in reliable, CI-green chunks.

What I did:

  • Reviewed origin/main for existing profile support:
    • Found that main currently has openhands.sdk.llm.llm_profile_store.LLMProfileStore (default ~/.openhands/profiles, file locking + atomic writes) and FallbackStrategy using it.
    • Confirmed main does not have LLM.profile_id, conversation persistence of profile references, nor profile integration in LLMRegistry.
  • Identified and read the relevant “switch LLM” PRs:
  • Posted a detailed, markdown-rendered landing plan as a comment on this PR (feat(llm): add LLM profiles #1843), split into 5 always-green steps:
    1. Add LLM.profile_id + profile-ref serialization/expansion (no persistence wiring).
    2. Add LLMRegistry profile helpers wrapping the existing LLMProfileStore (avoid duplicate stores; preserve metrics-independence behavior on main).
    3. Update ConversationState persistence to optionally store profile refs and expand them on restore (decide restore precedence).
    4. Add SDK runtime switching (LocalConversation.switch_llm and optionally set_llm), with idle/paused-only safety checks.
    5. Add agent-server switching endpoints and restart-persistence integration tests.

Where the plan is posted: as a comment on this PR (feat(llm): add LLM profiles #1843).

No code changes were made in the repository for this request; I only performed investigation and posted the plan comment.

all-hands-bot (Collaborator) commented:
[Automatic Post]: It has been a while since there was any activity on this PR. @enyst, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.
