Document zero-copy semantics #2
base: low-level-api
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -16,10 +16,11 @@ At its core, `ezmsg` is a publish/subscribe messaging system. | |||||
| - A **publisher** can send messages to **many subscribers**. | ||||||
| - A **subscriber** can listen to **many publishers**. | ||||||
| - **Channels** route messages between publishers and subscribers; they are **created and managed automatically** when you connect endpoints. | ||||||
| - Messages can be any Python object. `AxisArray` is optional, not required. | ||||||
| - Messages can be any Python object. `AxisArray` is optional, not required (but strongly encouraged when it is a good fit). | ||||||
|
|
||||||
| The low-level API gives you direct access to these primitives. You decide **when** to publish, **how** to receive, and **how** to schedule your own control flow. This makes the low-level API a good fit when you want to integrate messaging into an existing application structure instead of adopting the full `ezmsg` pipeline runtime. | ||||||
|
|
||||||
| For a detailed breakdown of runtime components, transport selection, backpressure, and zero-copy semantics, see :doc:`transport-messaging-internals`. | ||||||
|
|
||||||
| |ezmsg_logo_small| Relationship to the High-level API | ||||||
| ****************************************************** | ||||||
|
|
@@ -46,6 +47,8 @@ The **high-level API** is a good fit when: | |||||
| - You benefit from the pipeline tooling (graph visualization, CLI integration, etc.). | ||||||
| - You want a structured way to scale across threads/processes without managing it yourself. | ||||||
|
|
||||||
| .. important:: The low-level API is not any more performant than the high-level API. There is no meaningful performance hit when using the high-level API. | ||||||
|
||||||
| .. important:: The low-level API is not any more performant than the high-level API. There is no meaningful performance hit when using the high-level API. | |
| .. important:: The low-level API is not more performant than the high-level API. There is no meaningful performance hit when using the high-level API. |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,85 @@ | ||||||
| Transport and Messaging Internals | ||||||
| ################################# | ||||||
|
|
||||||
| This page documents the runtime components and data-transport behavior used by `ezmsg`. These internals apply equally to the low-level and high-level APIs. | ||||||
|
|
||||||
| GraphServer, Publisher, Subscriber, Channel | ||||||
| =========================================== | ||||||
|
|
||||||
| - **GraphServer**: a lightweight TCP service that tracks the topic DAG, keeps a registry of publishers/subscribers, and notifies subscribers when their upstream publishers change. It also brokers shared-memory segment creation and attachment. | ||||||
|
A few of the terms here like TCP and DAG might need explanation. My solution: link to a glossary for terms like this (otherwise it would only clutter the text). No need to do this here - for a future PR. |
||||||
| - **Publisher**: a client that registers a topic with the GraphServer, opens a channel server (TCP listener), and broadcasts messages. It owns shared-memory buffers and enforces backpressure so buffers are not reused until all subscribers have released a message. | ||||||
| - **Subscriber**: a client that registers a topic with the GraphServer, receives updates that list the upstream publisher IDs it should listen to, and maintains local channels for those publishers. | ||||||
| - **Channel**: a process-local "middle-man" for a single publisher. It receives message telemetry from the publisher (or direct local messages), caches the most recent messages, and notifies all local subscribers in that process. | ||||||
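The backpressure rule in the Publisher bullet (a buffer is not reused until every subscriber has released its message) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not `ezmsg`'s implementation: `BufferRing`, `acquire`, and `release` are hypothetical names, and real shared-memory buffers are replaced by plain lease counters.

```python
import threading
from typing import Optional


class BufferRing:
    """Toy model of publisher-side backpressure (hypothetical, not ezmsg's API)."""

    def __init__(self, num_buffers: int) -> None:
        # Outstanding subscriber leases per buffer slot.
        self._pending = [0] * num_buffers
        self._cond = threading.Condition()
        self._next = 0

    def acquire(self, num_subscribers: int, timeout: Optional[float] = None) -> Optional[int]:
        """Claim the next slot for writing; wait while subscribers still hold it.

        Returns the slot index, or None if the wait timed out.
        """
        with self._cond:
            idx = self._next
            if not self._cond.wait_for(lambda: self._pending[idx] == 0, timeout):
                return None  # backpressure: slot is still leased
            self._pending[idx] = num_subscribers
            self._next = (idx + 1) % len(self._pending)
            return idx

    def release(self, idx: int) -> None:
        """Called once per subscriber when it is done with the message in slot idx."""
        with self._cond:
            self._pending[idx] -= 1
            if self._pending[idx] == 0:
                self._cond.notify_all()


ring = BufferRing(num_buffers=2)
first = ring.acquire(num_subscribers=2)
second = ring.acquire(num_subscribers=1)
# The ring has wrapped around: slot 0 is still held by two subscribers.
blocked = ring.acquire(num_subscribers=1, timeout=0.01)
ring.release(first)
ring.release(first)
reused = ring.acquire(num_subscribers=1, timeout=0.01)
```

In this sketch the publisher, not the subscriber, pays for a slow consumer: `acquire` stalls until the oldest slot is fully released, which is the behaviour the bullet describes.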
|
|
||||||
| Connection Lifecycle (Publisher -> GraphServer -> Subscriber -> Channel) | ||||||
| ======================================================================== | ||||||
|
|
||||||
| 1. A **Publisher** connects to the **GraphServer**, allocates shared-memory buffers, and registers its topic. It starts a small TCP server so Channels can connect back to it, then reports that server's address to the GraphServer. | ||||||
| 2. A **Subscriber** connects to the **GraphServer** and registers its topic. The GraphServer computes upstream publishers from the topic DAG and sends the Subscriber an `UPDATE` list of publisher IDs. | ||||||
| 3. For each publisher ID, the Subscriber asks the local **ChannelManager** to register a **Channel**. If a Channel does not yet exist in this process, it is created by: | ||||||
| 4. The **Channel** requesting a channel allocation from the GraphServer for the given publisher ID. The GraphServer returns a new channel ID plus the publisher's server address. | ||||||
|
||||||
| 4. The **Channel** requesting a channel allocation from the GraphServer for the given publisher ID. The GraphServer returns a new channel ID plus the publisher's server address. | |
| 4. The **Channel** requests a channel allocation from the GraphServer for the given publisher ID. The GraphServer returns a new channel ID plus the publisher's server address. |
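Steps 1 through 4 above can be sketched as an in-memory toy model. Everything here is an assumption made for illustration: `ToyGraphServer`, `ToyChannelManager`, `ToyChannel`, and their methods are hypothetical names, there is no networking or shared memory, and the upstream lookup is a plain graph walk that treats a subscriber's own topic as upstream of itself.

```python
from itertools import count


class ToyGraphServer:
    """Hypothetical stand-in for the GraphServer registry (no TCP, no SHM)."""

    def __init__(self):
        self.publishers = {}  # pub_id -> (topic, server address)
        self.edges = {}       # downstream topic -> set of upstream topics
        self._pub_ids = count(1)
        self._channel_ids = count(1)

    def register_publisher(self, topic, address):
        # Step 1: the publisher registers its topic and reports its server address.
        pub_id = next(self._pub_ids)
        self.publishers[pub_id] = (topic, address)
        return pub_id

    def connect(self, from_topic, to_topic):
        self.edges.setdefault(to_topic, set()).add(from_topic)

    def update_for(self, sub_topic):
        # Step 2: walk the topic graph upstream and list matching publisher IDs.
        seen, stack = set(), [sub_topic]
        while stack:
            topic = stack.pop()
            if topic not in seen:
                seen.add(topic)
                stack.extend(self.edges.get(topic, ()))
        return sorted(p for p, (t, _) in self.publishers.items() if t in seen)

    def allocate_channel(self, pub_id):
        # Step 4: hand back a new channel ID plus the publisher's server address.
        return next(self._channel_ids), self.publishers[pub_id][1]


class ToyChannel:
    def __init__(self, channel_id, pub_id, pub_address):
        self.channel_id = channel_id
        self.pub_id = pub_id
        self.pub_address = pub_address


class ToyChannelManager:
    """Hypothetical process-local registry: at most one channel per publisher."""

    def __init__(self, server):
        self._server = server
        self._channels = {}

    def register(self, pub_id):
        # Step 3: reuse an existing channel, or create one via the server.
        if pub_id not in self._channels:
            channel_id, address = self._server.allocate_channel(pub_id)
            self._channels[pub_id] = ToyChannel(channel_id, pub_id, address)
        return self._channels[pub_id]


server = ToyGraphServer()
pub_id = server.register_publisher("A", ("127.0.0.1", 5000))
server.connect("A", "B")
update = server.update_for("B")
manager = ToyChannelManager(server)
channels = [manager.register(p) for p in update]
```

A second subscriber in the same process calling `manager.register(pub_id)` gets the existing channel back, which mirrors the "if a Channel does not yet exist in this process" condition in step 3.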
Copilot
AI
Feb 18, 2026
This section uses single-backtick interpreted text for protocol constants and code identifiers (e.g., put_local, MessageCache, MessageMarshal, TX_SHM, TX_TCP). In reStructuredText, inline literals should use double backticks so these render consistently as code and don't get treated as title references.
Copilot
AI
Feb 18, 2026
There’s an internal inconsistency about MessageCache scope: earlier it says "Every Channel maintains a fixed-size MessageCache", but later the doc describes a "module-scoped MessageCache". Please reconcile/clarify whether the cache is per-Channel or shared at module/process scope to avoid confusing readers about ownership and isolation.
| Publishers in `ezmsg` serialize messages to shared memory, and eventually to a module-scoped **MessageCache** which is shared by several Subscribers. Subscribers receive a "zero-copy" view of this message that is: | |
| Publishers in `ezmsg` serialize messages to shared memory, and eventually into the process-local `MessageCache` owned by each Channel. That Channel-level cache is shared by all Subscribers attached to that Channel in the same process. Subscribers receive a "zero-copy" view of this message that is: |
I'm not sure this is the right place for this (important) note. At some point in the near future, I think I will expand on the high-level design and move that there. I expect that I will additionally refactor that document to have a separate high-level API explanation (to mirror the low-level API document).