Link to the pending `/ai-transport` overview page.
Add intro describing the pattern, its properties, and use cases.
Includes continuous token streams, correlating tokens for distinct responses, and explicit start/end events.
Splits each token streaming approach into distinct patterns and shows publish-side and subscribe-side behaviour alongside one another.
Includes hydration with rewind and hydration with persisted history + untilAttach. Describes the pattern for handling in-progress live responses with complete responses loaded from the database.
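For orientation, a minimal sketch of the two hydration approaches described above, using the Ably JavaScript SDK (the channel name, rewind window, and rendering callbacks are illustrative, and in practice an app would pick one approach or the other):

```typescript
import * as Ably from 'ably';

const realtime = new Ably.Realtime({ key: 'API_KEY' });

// Option 1 - hydration with rewind: attaching with a rewind channel param
// replays recent messages through the normal subscription callback
const rewindChannel = realtime.channels.get('RANDOM_CHANNEL_NAME', {
  params: { rewind: '2m' }, // replay up to the last two minutes of messages
});
await rewindChannel.subscribe((message) => {
  // replayed and live tokens arrive through the same code path
});

// Option 2 - hydration with persisted history + untilAttach: attach first,
// then page through history up to the attachment point, so there is no gap
// between the history pages and the live subscription
const channel = realtime.channels.get('RANDOM_CHANNEL_NAME');
await channel.attach();
const page = await channel.history({ untilAttach: true });
for (const message of page.items.reverse()) {
  // history is returned newest-first; reverse for chronological rendering
}
```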
Add doc explaining streaming tokens with appendMessage and update compaction allowing message-per-response history.
Unifies the nav for token streaming after rebase.
Refines the intro copy in message-per-response to have structural similarity with the message-per-token page.
Refine the Publishing section of the message-per-response docs. - Include anchor tags on title - Describe the `serial` identifier - Align with stream pattern used in message-per-token docs - Remove duplicate example
Refine the Subscribing section of the message-per-response docs. - Add anchor tag to heading - Describes each action upfront - Uses RANDOM_CHANNEL_NAME
Refine the rewind section of the message-per-response docs. - Include description of allowed rewind parameters - Tweak copy
Refines the history section for the message-per-response docs. - Adds anchor to heading - Uses RANDOM_CHANNEL_NAME - Use message serial in code snippet instead of ID - Tweaks copy
Fix the hydration of in progress responses via rewind by using the responseId in the extras to correlate messages with completed responses loaded from the database.
Fix the hydration of in progress responses using history by obtaining the timestamp of the last completed response loaded from the database and paginating history forwards from that point.
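A sketch of that forward-pagination step with the Ably JavaScript SDK (`channel` is assumed to be obtained as in the hydration sketch above; `lastCompletedResponse` and its timestamp field are illustrative stand-ins for the row loaded from the database):

```typescript
// Page forwards through channel history from the timestamp of the last
// completed response persisted in the database, so only the newer,
// in-progress messages are replayed on top of the persisted responses
let page = await channel.history({
  start: lastCompletedResponse.timestamp, // illustrative: from the database
  direction: 'forwards',
});
while (true) {
  for (const message of page.items) {
    // apply in-progress tokens on top of the completed responses
  }
  if (!page.hasNext()) break;
  page = (await page.next())!;
}
```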
Removes the headers/metadata section, as this covers the specific semantics of extras.headers handling with appends, which is better addressed by the (upcoming) message append pub/sub docs. Instead, a callout is used to describe header mixin semantics in the appropriate place insofar as it relates to the discussion at hand.
Update the token streaming with message per token docs to include a callout describing resume behaviour in case of transient disconnection.
Fix the message per token docs headers to include anchors and align with naming in the message per response page.
Adds an overview page for a Sessions & Identity section which describes the channel-oriented session model and its benefits over the traditional connection-oriented model. Describes how identity relates to session management and how this works in the context of channel-oriented sessions. Shows how to use identified clients to assign a trusted identity to users and obtain this identity from the agent side. Shows how to use Ably capabilities to control which operations authenticated users can perform on which channels. Shows how to use authenticated user claims to associate a role or other attribute with a user. Updates the docs to describe how to handle authentication, capabilities, identity and roles/attributes for agents separately from end users. Describes how to use presence to mark users and agents as online/offline. Includes description of synthetic leaves in the event of abrupt disconnection. Describes how to subscribe to presence to see who is online, and take action when a user is offline across all devices. Adds docs for resuming user and agent sessions, linking to hydration patterns for different token streaming approaches for user resumes and describing agent resume behaviour with message catch up.
Adds a guide for using the OpenAI SDK to consume streaming events from the Responses API and publish them over Ably using the message per token pattern.
- Uses a further-reading callout instead of note - Removes repeated code initialising Ably client (OpenAI client already instantiated)
Adds an anchor tag to the "Client hydration" heading
Similar to the OpenAI message-per-token guide, but using the message-per-response pattern with appends.
Documents patterns for exposing reasoning output from models along with final output.
Overview page for token streaming in AI Transport. (Co-authored-by: matt423 <matthew.a423@gmail.com>, Fiona Corden <fiona.corden@ably.com>, Paddy Byers <paddy.byers@gmail.com>)
Document how to implement human oversight of AI agent actions using Ably channels and capabilities for authorization workflows.
Document how users send prompts to AI agents over Ably channels, including identified clients, message correlation, and handling concurrent prompts. (Co-authored-by: Paddy Byers <paddy.byers@gmail.com>)
Details the message-per-response pattern using Ably `appendMessage` for Anthropic SDK.
Adds a page to the Messaging section that describes sending tool calls and results to users over channels. Indicates ability to build generative user interfaces or implement human in the loop workflows.
paddybyers requested changes on Jan 16, 2026
### Handling append failures <a id="append-failures"/>

When appending without awaiting, it is possible for an intermediate append to fail while subsequent appends succeed. This creates a gap in the streamed response. For example, if a rate limit is exceeded, a single append may be rejected while the following tokens continue to be accepted.
Suggested change
- When appending without awaiting, it is possible for an intermediate append to fail while subsequent appends succeed. This creates a gap in the streamed response. For example, if a rate limit is exceeded, a single append may be rejected while the following tokens continue to be accepted.
+ The examples above append successive tokens to a response message by pipelining the append operations - that is, the agent publishes an append operation without waiting for prior operations to complete. This is necessary to avoid the append rate being capped by the round-trip time from the agent to the Ably endpoint. However, it means the agent does not await the outcome of each append operation, which can result in the agent continuing to submit append operations after an earlier operation has failed. For example, if a rate limit is exceeded, a single append may be rejected while the following tokens continue to be accepted.
+ The agent needs to obtain the outcome of each append operation and take corrective action in the event that any operation fails. A simple but effective approach is to ensure that, if streaming of a response fails for any reason, the message is updated with the final complete response text once it is available. Although the streaming experience is disrupted in the case of failure, there is no consistency problem with the final result once the response completes.
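As a rough illustration of that corrective pattern (a sketch only: the `appendMessage` and `updateMessage` call shapes are assumptions modelled on the examples in these docs, and `serial`, `tokenStream`, and `finalResponseText` are illustrative):

```typescript
const pending: Promise<unknown>[] = [];
let failed = false;

for await (const token of tokenStream) {
  // Pipeline appends: publish without awaiting, but record each outcome
  pending.push(
    channel.appendMessage(serial, { data: token }).catch(() => {
      failed = true;
    })
  );
}

await Promise.all(pending);

if (failed) {
  // Corrective action: replace the (possibly gappy) streamed content with
  // the final complete response text once it is available. updateMessage
  // here stands in for however the docs perform the final update.
  await channel.updateMessage(serial, { data: finalResponseText });
}
```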
To detect failures, keep a reference to each append operation and check for rejections after the stream completes:
Suggested change
- To detect failures, keep a reference to each append operation and check for rejections after the stream completes:
+ To detect append failures, keep a reference to each append operation and check for rejections after the stream completes:
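A detection step along these lines could accompany that text (same assumed `appendMessage` shape as in the sketch above):

```typescript
// Keep a reference to each append operation as it is issued
const appends: Promise<unknown>[] = [];
for await (const token of tokenStream) {
  appends.push(channel.appendMessage(serial, { data: token }));
}

// After the stream completes, check whether any append was rejected
const outcomes = await Promise.allSettled(appends);
if (outcomes.some((o) => o.status === 'rejected')) {
  // take corrective action, e.g. update the message with the complete text
}
```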
### Handling publish failures <a id="publish-failures"/>

When publishing without awaiting, it is possible for an intermediate publish to fail while subsequent publishes succeed. This creates a gap in the streamed response. For example, if a rate limit is exceeded, a single token may be rejected while the following tokens continue to be accepted.
I suggest updating this text in a similar way to that proposed above.
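By analogy with the append case, a publish-side sketch might look like this (the event names and the `responseId` correlation field are illustrative):

```typescript
// Pipeline token publishes without awaiting each one
const publishes: Promise<void>[] = [];
for await (const token of tokenStream) {
  publishes.push(channel.publish('token', { responseId, token }));
}

// Once the stream completes, check for rejected publishes and, if any
// failed, publish the complete response so subscribers can recover
const outcomes = await Promise.allSettled(publishes);
if (outcomes.some((o) => o.status === 'rejected')) {
  await channel.publish('response.complete', { responseId, text: finalResponseText });
}
```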
Force-pushed from 20df5cb to 01ab0f8.
Description
Context: https://ably.atlassian.net/browse/AIT-238
Overall, the questions I think need to be answered in these docs sections are: