Conversation
| transport-messaging-internals | ||
| axisarray | ||
|
|
||
| .. important:: `ezmsg` delivers subscriber messages with zero-copy semantics in all cases. Treat incoming messages as immutable, and copy data before mutating or republishing. See :doc:`transport-messaging-internals` for details and examples. |
There was a problem hiding this comment.
I'm not sure this is the right place for this (important) note. At some point in the near future, I think I will expand on the high-level design and move that there. I expect that I will additionally refactor that document to have a separate high-level API explanation (to mirror the low-level API document).
| GraphServer, Publisher, Subscriber, Channel | ||
| =========================================== | ||
|
|
||
| - **GraphServer**: a lightweight TCP service that tracks the topic DAG, keeps a registry of publishers/subscribers, and notifies subscribers when their upstream publishers change. It also brokers shared-memory segment creation and attachment. |
There was a problem hiding this comment.
A few of the terms here like TCP and DAG might need explanation.
My solution: link to a glossary for terms like this (otherwise it would only clutter the text). No need to do this here - for a future PR.
There was a problem hiding this comment.
Pull request overview
Adds new documentation explaining ezmsg transport internals and clarifies that subscriber delivery is always effectively zero-copy, emphasizing immutability requirements for received messages.
Changes:
- Adds a new “Transport and Messaging Internals” explainer page describing runtime components, transport selection, and backpressure.
- Updates pipeline Unit guidance and overview/explainer pages to highlight always-zero-copy subscriber semantics and link to the new explainer.
- Tweaks low-level API explainer to add context links and clarify performance expectations.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| docs/source/how-tos/pipeline/unit.rst | Replaces the old zero_copy bullet with an “always zero-copy” immutability warning and links to internals. |
| docs/source/explanations/transport-messaging-internals.rst | New detailed explainer for GraphServer/Publisher/Subscriber/Channel, transport paths, backpressure, and zero-copy implications. |
| docs/source/explanations/low-level-api.rst | Adds link to internals page and a performance clarification note. |
| docs/source/explanations/content-explanations.rst | Adds the new internals page to the toctree and an overview warning about zero-copy semantics. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| 1. A **Publisher** connects to the **GraphServer**, allocates shared-memory buffers, and registers its topic. It starts a small TCP server so Channels can connect back to it, then reports that server's address to the GraphServer. | ||
| 2. A **Subscriber** connects to the **GraphServer** and registers its topic. The GraphServer computes upstream publishers from the topic DAG and sends the Subscriber an `UPDATE` list of publisher IDs. | ||
| 3. For each publisher ID, the Subscriber asks the local **ChannelManager** to register a **Channel**. If a Channel does not yet exist in this process, it is created by: | ||
| 4. The **Channel** requesting a channel allocation from the GraphServer for the given publisher ID. The GraphServer returns a new channel ID plus the publisher's server address. |
There was a problem hiding this comment.
Step 4 is missing a verb ("The Channel requesting...") and reads ungrammatically. Consider changing it to something like "The Channel requests a channel allocation..." for clarity.
| 4. The **Channel** requesting a channel allocation from the GraphServer for the given publisher ID. The GraphServer returns a new channel ID plus the publisher's server address. | |
| 4. The **Channel** requests a channel allocation from the GraphServer for the given publisher ID. The GraphServer returns a new channel ID plus the publisher's server address. |
| `ezmsg` uses the fastest transport available per Publisher/Channel pair: | ||
|
|
||
| - **Local transport (same process)**: the Publisher pushes the object directly into the Channel (`put_local`), and the Channel stores it in the `MessageCache` without serialization. This is the lowest-overhead path. | ||
| - **Shared memory (different process, SHM OK)**: the Publisher serializes the object using `MessageMarshal` (pickle protocol 5 with buffer support), writes it into a ring of shared-memory buffers, and notifies the Channel with a `TX_SHM` message. The Channel reads from shared memory using the message ID and caches the deserialized object. | ||
| - **TCP (fallback or forced)**: if SHM is unavailable (attach failed, remote host) or `force_tcp=True`, the Publisher sends a `TX_TCP` payload (header + serialized buffers) directly over the channel socket. The Channel deserializes it and caches the result. |
There was a problem hiding this comment.
This section uses single-backtick interpreted text for protocol constants and code identifiers (e.g., put_local, MessageCache, MessageMarshal, TX_SHM, TX_TCP). In reStructuredText, inline literals should use double backticks so these render consistently as code and don't get treated as title references.
| Zero-copy Semantics and Message Ownership | ||
| ========================================= | ||
|
|
||
| Publishers in `ezmsg` serialize messages to shared memory, and eventually to a module-scoped **MessageCache** which is shared by several Subscribers. Subscribers receive a "zero-copy" view of this message that is: |
There was a problem hiding this comment.
There’s an internal inconsistency about MessageCache scope: earlier it says "Every Channel maintains a fixed-size MessageCache", but later the doc describes a "module-scoped MessageCache". Please reconcile/clarify whether the cache is per-Channel or shared at module/process scope to avoid confusing readers about ownership and isolation.
| Publishers in `ezmsg` serialize messages to shared memory, and eventually to a module-scoped **MessageCache** which is shared by several Subscribers. Subscribers receive a "zero-copy" view of this message that is: | |
| Publishers in `ezmsg` serialize messages to shared memory, and eventually into the process-local `MessageCache` owned by each Channel. That Channel-level cache is shared by all Subscribers attached to that Channel in the same process. Subscribers receive a "zero-copy" view of this message that is: |
| - You benefit from the pipeline tooling (graph visualization, CLI integration, etc.). | ||
| - You want a structured way to scale across threads/processes without managing it yourself. | ||
|
|
||
| .. important:: The low-level API is not any more performant than the high-level API. There is no meaningful performance hit when using the high-level API. |
There was a problem hiding this comment.
Minor grammar/formatting: "not any more performant" is typically written "not more performant", and there’s a double space between sentences here. Tightening the phrasing will read more professionally.
| .. important:: The low-level API is not any more performant than the high-level API. There is no meaningful performance hit when using the high-level API. | |
| .. important:: The low-level API is not more performant than the high-level API. There is no meaningful performance hit when using the high-level API. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Adds descriptions of backend transport and notes the zero-copy behavior of messaging (see ezmsg-org/ezmsg#209), highlighting the importance of treating incoming messages as immutable.