Refactor MIR graph rendering to use a unified traversal via GraphBuilder by 0xh4ty · Pull Request #132 · runtimeverification/stable-mir-json

0xh4ty · 2026-03-02T09:57:24Z

Summary

This PR introduces a format-agnostic MIR graph traversal based on a new GraphBuilder abstraction. Traversal semantics are centralized in a single implementation, while individual renderers are responsible only for presentation.

As an initial consumer, the D2 renderer has been ported to this new traversal behind a parallel code path. The legacy D2 implementation is retained for now to allow safe comparison and validation.

What changed

Introduced a GraphBuilder trait describing semantic graph events
Added a single generic traversal (render_graph) that owns MIR graph structure and traversal order
Refactored the D2 renderer to implement GraphBuilder
Moved all semantic stringification to GraphContext
Traversal now passes full MIR structures (statements, terminators, call arguments) to renderers
Wired the new D2 path in parallel to the legacy one for validation
Verified that legacy and new D2 outputs are byte-for-byte identical across the test suite
Code passes both cargo fmt and cargo clippy
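
For readers skimming the thread, the shape described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: ToyFunction, EventLog, and the exact method set are assumptions made up for the example, and the real trait takes MIR structures rather than plain indices.

```rust
/// Hypothetical stand-in for a MIR function: a name plus, for each basic
/// block, the indices of its successor blocks.
pub struct ToyFunction {
    pub name: String,
    pub successors: Vec<Vec<usize>>,
}

/// Semantic graph events; a renderer implements these and nothing else.
pub trait GraphBuilder {
    type Output;

    fn begin_graph(&mut self, name: &str);
    fn begin_function(&mut self, fn_id: &str);
    fn block(&mut self, fn_id: &str, idx: usize);
    fn block_edge(&mut self, fn_id: &str, from: usize, to: usize);
    fn end_function(&mut self, fn_id: &str);
    fn finish(self) -> Self::Output;
}

/// The single traversal: it alone decides what is visited and in what order.
pub fn render_graph<B: GraphBuilder>(
    name: &str,
    funcs: &[ToyFunction],
    mut builder: B,
) -> B::Output {
    builder.begin_graph(name);
    for f in funcs {
        builder.begin_function(&f.name);
        for idx in 0..f.successors.len() {
            builder.block(&f.name, idx);
        }
        for (from, targets) in f.successors.iter().enumerate() {
            for &to in targets {
                builder.block_edge(&f.name, from, to);
            }
        }
        builder.end_function(&f.name);
    }
    builder.finish()
}

/// A trivial builder that records events in order (illustrative only).
#[derive(Default)]
pub struct EventLog {
    events: Vec<String>,
}

impl GraphBuilder for EventLog {
    type Output = Vec<String>;

    fn begin_graph(&mut self, name: &str) {
        self.events.push(format!("graph {name}"));
    }
    fn begin_function(&mut self, fn_id: &str) {
        self.events.push(format!("fn {fn_id}"));
    }
    fn block(&mut self, fn_id: &str, idx: usize) {
        self.events.push(format!("block {fn_id}.bb{idx}"));
    }
    fn block_edge(&mut self, fn_id: &str, from: usize, to: usize) {
        self.events.push(format!("edge {fn_id} bb{from}->bb{to}"));
    }
    fn end_function(&mut self, fn_id: &str) {
        self.events.push(format!("end {fn_id}"));
    }
    fn finish(self) -> Self::Output {
        self.events
    }
}
```

Because the traversal order lives in one function, any second builder sees exactly the same event sequence as the first.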

Why this change

Previously, each renderer duplicated traversal logic, which made the code harder to reason about and maintain. Centralizing traversal ensures:

a single source of truth for graph semantics
easier maintenance and future refactors
cleaner renderer implementations
safer extension to additional output formats

Validation and testing

The new traversal can be exercised by enabling the experimental D2 path:

SMIR_D2_NEW=1 cargo run -- --d2 tests/integration/programs/fibonacci.rs

To compare outputs against the legacy renderer:

git diff --no-index output-d2/fibonacci.smir.d2 fibonacci.smir.d2

No differences were observed across the existing integration test suite.

Status and follow-ups

The legacy D2 renderer is intentionally kept for now.
Once this approach is validated, the D2 renderer can be switched to use the new traversal by default in follow-up commits.
Refactoring the DOT renderer to the same model will follow in a separate PR.

Notes for reviewers

This PR is a structural refactor only. Traversal semantics and output behavior are unchanged, and the parallel wiring is intended to make review and validation straightforward.

Note on function IDs
The legacy D2 renderer uses short_name-based function IDs, which are prone to collisions in cases such as multiple monomorphizations of the same generic.

This behavior is intentionally preserved for now so that the new traversal can be validated via byte-for-byte output comparison. Collision avoidance will be addressed in follow-up commits by incorporating body hashes into function IDs.

Stevengre · 2026-03-03T09:19:49Z

LGTM! Would you remove the previous implementation before merging? Or just in another PR?

@dkcumming what do you think?

0xh4ty · 2026-03-03T09:24:30Z

Thanks! I’ll remove the legacy D2 implementation in this PR itself.

Stevengre · 2026-03-03T09:33:39Z

src/mk_graph/traverse.rs

+/// Format-agnostic MIR graph traversal.
+/// Owns traversal order and graph semantics, delegates rendering to `GraphBuilder`.
+pub fn render_graph<B: GraphBuilder>(smir: &SmirJson, mut builder: B) -> B::Output {
+    let ctx = GraphContext::from_smir(smir);


pub fn to_d2_file_new(&self) -> String {
    let ctx = GraphContext::from_smir(self);
    render_graph(self, D2Builder::new(&ctx))
}

This seems inefficient: GraphContext is built twice, once here and once inside render_graph.

Good catch. I’ll move GraphContext construction out of render_graph so it is built only once and passed through.
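
One way the "build once, pass through" fix could look is sketched below. SmirJson and GraphContext here are heavily simplified stand-ins, and the item method and the to_d2_file name are assumptions for illustration; only the idea of borrowing a single context in both places comes from the thread.

```rust
/// Hypothetical, heavily simplified stand-ins for the real types.
pub struct SmirJson {
    pub name: String,
    pub items: Vec<String>,
}

pub struct GraphContext {
    pub crate_name: String,
}

impl GraphContext {
    pub fn from_smir(smir: &SmirJson) -> Self {
        GraphContext { crate_name: smir.name.clone() }
    }
}

pub trait GraphBuilder {
    type Output;
    fn begin_graph(&mut self, name: &str);
    fn item(&mut self, name: &str);
    fn finish(self) -> Self::Output;
}

/// The context is borrowed from the caller rather than rebuilt here.
pub fn render_graph<B: GraphBuilder>(
    smir: &SmirJson,
    ctx: &GraphContext,
    mut builder: B,
) -> B::Output {
    builder.begin_graph(&ctx.crate_name);
    for item in &smir.items {
        builder.item(item);
    }
    builder.finish()
}

pub struct D2Builder<'a> {
    ctx: &'a GraphContext,
    buf: String,
}

impl<'a> D2Builder<'a> {
    pub fn new(ctx: &'a GraphContext) -> Self {
        D2Builder { ctx, buf: String::new() }
    }
}

impl GraphBuilder for D2Builder<'_> {
    type Output = String;

    fn begin_graph(&mut self, name: &str) {
        self.buf.push_str(&format!("# {name}\n"));
    }
    fn item(&mut self, name: &str) {
        // The builder reads from the same shared context.
        self.buf.push_str(&format!("{}.{}\n", self.ctx.crate_name, name));
    }
    fn finish(self) -> Self::Output {
        self.buf
    }
}

pub fn to_d2_file(smir: &SmirJson) -> String {
    // Built exactly once, then shared by the traversal and the builder.
    let ctx = GraphContext::from_smir(smir);
    render_graph(smir, &ctx, D2Builder::new(&ctx))
}
```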

Stevengre · 2026-03-03T09:58:13Z

src/mk_graph/traverse.rs

+) {
+    let fn_id = short_name(name);
+    let label = name_lines(name);
+    let is_local = true;


Should this be body.is_some()?

Yes, that should be body.is_some(). I’ll fix that.

Stevengre · 2026-03-03T10:07:28Z

src/mk_graph/traverse.rs

+pub trait GraphBuilder {
+    type Output;
+
+    fn begin_graph(&mut self, name: &str);


I'd like to have some description here.

Got it. I’ll expand the doc comment.

dkcumming

@0xh4ty this is great work! I love what you have done so far! I think it's time to go all in and get the old stuff out and hook up the full implementation of d2 and dot (okay if that is another PR if necessary). I think isolating the shared logic out in the traverse module is a great improvement. Do you think you are fine to do the full conversion? Also, I think Jianhong had some great feedback too.

cds-amal

Good work @0xh4ty . @Stevengre gave some excellent feedback, and I added a few notes on the trait boundary design.

The traversal extraction is the right idea, and the hard part (generic traversal order, clean separation of items/blocks/edges) is already solid. The main thing to tighten up is where the format-agnostic boundary actually sits: a couple of the trait methods (block, block_edge) still carry D2-specific assumptions, which means the next renderer would inherit those assumptions rather than being free to do its own thing.

The comments above walk through the specifics. Once those methods pass structured data instead of raw MIR types and format-specific conventions, this will be a clean foundation for DOT, Mermaid, and whatever else comes next.

I'd also like to see more doc comments, since I want this project to be browsable via cargo doc --open; that is one more sign of a mature Cargo library.

Finally, my vote is to rip out the old, and use this as the way forward. Though, you will have to make sure the operational parts remain consistent so we can generate graphs :)

cds-amal · 2026-03-03T23:34:26Z

src/mk_graph/traverse.rs

+) {
+    let fn_id = short_name(name);
+    let label = name_lines(name);
+    let is_local = true;


This true assignment is unconditional; is it needed?

This will be removed.

cds-amal · 2026-03-03T23:55:33Z

src/mk_graph/output/d2.rs

+        self.buf.push_str(&format!("  bb{}: \"{}\"\n", idx, label));
+    }
+
+    fn block_edge(&mut self, _fn_id: &str, from: usize, to: usize, _label: Option<&str>) {


Same issue here: block_edge bakes D2's arrow syntax into the implementation, and the Option<&str> label parameter (which it looks like D2Builder ignores?) is a tell that the signature was shaped around what D2 needs today, with a speculative parameter tacked on for a future renderer.

This is the second method where format-specific concerns leak through the trait boundary, so it's worth stepping back and asking: what's the trait actually buying us?

A well-drawn trait boundary gives you two things: the code above it can change without touching the code below (new traversal logic doesn't require touching every renderer), and the code below can change without touching the code above (new output format, same traversal). But that only works if the boundary is at the right level of abstraction. When trait methods receive raw MIR types or carry format-specific assumptions, every new renderer has to understand the same internals, and the trait becomes ceremony rather than insulation.

The fix is the same as for block: have the traversal own the "what" (which blocks connect, what the edge represents) and pass structured, pre-rendered data. Let each renderer own the "how" (syntax, escaping, layout). That way the trait boundary actually earns its keep.

Thanks, that makes sense. The label parameter was speculative and is not used by the current renderers, so I will remove it to avoid leaking format assumptions into the trait. I will keep traversal responsible for edge semantics and pass structured, pre-rendered data to builders, while the renderers handle syntax and layout.
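
As a rough illustration of the "what" versus "how" split discussed above, the same structured edge can be handed to two renderers that differ only in syntax. BlockEdge, EdgeRenderer, D2Style, and MermaidStyle are names invented for this sketch; they are not part of the PR.

```rust
/// Structured edge data produced by the traversal (hypothetical shape).
pub struct BlockEdge {
    pub fn_id: String,
    pub from: usize,
    pub to: usize,
}

/// Each renderer decides only how an edge is spelled.
pub trait EdgeRenderer {
    fn block_edge(&self, edge: &BlockEdge) -> String;
}

pub struct D2Style;
pub struct MermaidStyle;

impl EdgeRenderer for D2Style {
    fn block_edge(&self, e: &BlockEdge) -> String {
        // D2 addresses nested shapes with dots.
        format!("{}.bb{} -> {}.bb{}", e.fn_id, e.from, e.fn_id, e.to)
    }
}

impl EdgeRenderer for MermaidStyle {
    fn block_edge(&self, e: &BlockEdge) -> String {
        // Mermaid flowcharts use flat node IDs and a longer arrow.
        format!("{}_bb{} --> {}_bb{}", e.fn_id, e.from, e.fn_id, e.to)
    }
}
```

Neither renderer needs to know where the edge came from, which is the insulation the review comment asks for.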

cds-amal · 2026-03-04T07:34:28Z

Hey @0xh4ty , I had the opportunity to think this through some more and, well, here you go :)

First: the GraphBuilder trait is the right abstraction. Separating "walk the graph" from "emit format-specific syntax" is exactly where we want to end up, and the incremental rollout (old path kept, new path behind an env var) is a clean way to get there without breaking anything.

The question I want to think through is: what happens when we go to add DOT, Mermaid, Markdown, or other formats on top of this? Right now, each builder holds a &GraphContext, calls ctx.render_stmt() / ctx.render_terminator() / ctx.render_operand() itself, and carries a lifetime parameter. That means every new format re-implements the same rendering logic. And the trait's surface area will grow as formats discover they need additional hooks (locals_table, empty_body, external_function, etc.); each of those becomes a new method that every builder has to implement, even if it's a no-op.

The thing is, the current GraphBuilder is quietly conflating two jobs: (1) "what is the graph structure?" (nodes, edges, containment) and (2) "how do I turn a Statement or Terminator into a human-readable string?" Every builder currently does both. If we pull those apart, adding a new format becomes "just format pre-rendered strings into your syntax" rather than "re-implement MIR rendering and format-specific syntax."

So here's a concrete idea: instead of the driver pushing raw stable_mir types to the builder via a sequence of granular calls (begin_function / block / block_edge / call_edge / end_function), the driver produces a RenderedFunction struct that the builder receives as one cohesive unit:

/// A single basic block with pre-rendered content and structural edges.
struct RenderedBlock<'a> {
    idx: usize,
    stmts: Vec<String>,              // pre-rendered via GraphContext
    terminator: String,               // pre-rendered
    raw_terminator: &'a Terminator,   // escape hatch for structural inspection
    cfg_edges: Vec<(usize, Option<String>)>,  // (target_block, optional label)
}

/// A fully analyzed function, ready for format-specific rendering.
struct RenderedFunction<'a> {
    id: String,                       // stable ID (short_name + body hash)
    display_name: String,             // name_lines() output for labels
    is_local: bool,
    locals: Vec<(usize, String)>,     // (index, type_with_layout string)
    blocks: Vec<RenderedBlock<'a>>,
    call_edges: Vec<CallEdge>,        // resolved callee IDs and pre-rendered args
}

struct CallEdge {
    block_idx: usize,
    callee_id: String,
    callee_name: String,
    rendered_args: String,
}

The traversal driver does all the GraphContext work (rendering statements, resolving callees, computing function IDs) and hands RenderedFunction values to the builder. The trait itself simplifies:

trait GraphBuilder {
    type Output;

    fn begin_graph(&mut self, name: &str);
    fn alloc_legend(&mut self, lines: &[String]);
    fn type_legend(&mut self, lines: &[String]);
    fn external_function(&mut self, id: &str, name: &str);
    fn render_function(&mut self, func: &RenderedFunction);
    fn static_item(&mut self, id: &str, name: &str);
    fn asm_item(&mut self, id: &str, content: &str);
    fn finish(self) -> Self::Output;
}

What does this actually buy us? A few things:

No lifetime on builders. RenderedFunction owns its strings, so the builder doesn't need &GraphContext or a lifetime parameter. D2Builder becomes a plain struct, not D2Builder<'a>.

Rendering logic written once. The driver calls ctx.render_stmt(), ctx.render_terminator(), ctx.render_operand(). No builder ever touches GraphContext. You write the rendering code once; every format gets it for free.

Still flexible where it matters. The raw_terminator field (and potentially raw_stmts if we add it later) is an escape hatch for builders that need structural information; say, coloring blocks differently based on terminator kind, or visually distinguishing Call terminators from Assert. Builders that don't need this just ignore it.

Fewer trait methods. The granular push-based sequence (begin_function / block / block_edge / end_function) collapses into one render_function(&RenderedFunction) call. Builders that want the push style can still iterate func.blocks internally; the trait just doesn't force a particular traversal order.

New hooks emerge as data, not methods. Instead of adding locals_table(), empty_body(), label_to_entry_edge() as separate trait methods that every builder must implement (even as no-ops), these become fields on RenderedFunction that builders can inspect or ignore. func.locals is empty if there are no locals; func.blocks is empty if there's no body. The builder decides how to handle each case in its own render_function impl. No trait churn.
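
To make these points concrete, here is a hedged sketch of a second-format builder written against owned, pre-rendered data. It trims the proposal down for self-containment: raw_terminator, is_local, and call_edges are left out (so no lifetime is needed), the trait is reduced to two methods, and MermaidBuilder is a hypothetical name. Label escaping is ignored.

```rust
pub struct RenderedBlock {
    pub idx: usize,
    pub stmts: Vec<String>,
    pub terminator: String,
    pub cfg_edges: Vec<(usize, Option<String>)>,
}

pub struct RenderedFunction {
    pub id: String,
    pub display_name: String,
    pub locals: Vec<(usize, String)>,
    pub blocks: Vec<RenderedBlock>,
}

pub trait GraphBuilder {
    type Output;
    fn render_function(&mut self, func: &RenderedFunction);
    fn finish(self) -> Self::Output;
}

/// A plain struct: no GraphContext reference, no lifetime parameter.
#[derive(Default)]
pub struct MermaidBuilder {
    buf: String,
}

impl GraphBuilder for MermaidBuilder {
    type Output = String;

    fn render_function(&mut self, func: &RenderedFunction) {
        if func.blocks.is_empty() {
            // "No body" is just an empty Vec, not a dedicated trait method.
            self.buf
                .push_str(&format!("  {}[\"{}\"]\n", func.id, func.display_name));
            return;
        }
        self.buf
            .push_str(&format!("  subgraph {}[\"{}\"]\n", func.id, func.display_name));
        for b in &func.blocks {
            let mut lines = b.stmts.clone();
            lines.push(b.terminator.clone());
            self.buf.push_str(&format!(
                "    {}_bb{}[\"{}\"]\n",
                func.id,
                b.idx,
                lines.join("; ")
            ));
        }
        self.buf.push_str("  end\n");
        for b in &func.blocks {
            for (target, label) in &b.cfg_edges {
                let arrow = match label {
                    Some(l) => format!("-->|{l}|"),
                    None => "-->".to_string(),
                };
                self.buf.push_str(&format!(
                    "  {}_bb{} {} {}_bb{}\n",
                    func.id, b.idx, arrow, func.id, target
                ));
            }
        }
    }

    fn finish(self) -> Self::Output {
        format!("flowchart TD\n{}", self.buf)
    }
}
```

This builder never looks at func.locals, which is the "inspect or ignore" behavior the last point describes.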

One more thing worth noting: function IDs. The current branch uses short_name(name) for graph node IDs, which works for simple cases but will collide when two monomorphizations of the same generic (say, foo::<i32> and foo::<u64>) produce the same short name. A make_fn_id helper that incorporates a body hash as a tiebreaker fixes this; with the RenderedFunction approach, that helper lives in the driver and produces func.id, so no format has to worry about it:

/// Generate a stable, unique ID from a symbol name and body.
/// Handles the case where multiple monomorphizations of the same
/// generic produce the same short_name by incorporating a body hash.
fn make_fn_id(name: &str, body: &Body) -> String { ... }
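
The elided body could be filled in along these lines. This is only a sketch under assumptions: the body is taken as any Hash value rather than the real Body type, the caller passes an already shortened name, and the ID layout (name, underscore, 16 hex digits) is invented. Note also that the standard library does not specify DefaultHasher's algorithm across Rust releases, so IDs produced this way would only be stable for a given toolchain.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Combine a short name with a hash of the body as a tiebreaker.
/// `B` is a stand-in for the real body type (assumption for this sketch).
pub fn make_fn_id<B: Hash>(short_name: &str, body: &B) -> String {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    // Zero-padded so every ID has the same suffix width.
    format!("{}_{:016x}", short_name, hasher.finish())
}
```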

In terms of concrete next steps, I'd suggest:

Introduce RenderedBlock, RenderedFunction, and CallEdge structs (could live in traverse.rs alongside the driver).
Have render_graph produce RenderedFunction values and pass them to the builder.
Simplify D2Builder: drop the &'a GraphContext field and lifetime parameter, implement the narrower render_function(&RenderedFunction) method.
Verify the D2 output is identical (should be a pure refactor; no output change).
As a stretch goal: try implementing a second format (DOT or Mermaid) on top of the new trait to validate that it's genuinely easy to add.

Of course, feel free to push back, if you have a different roadmap in mind.

0xh4ty · 2026-03-04T15:21:06Z

Thanks for the detailed explanation. That separation between graph structure and MIR string rendering makes sense. I will refactor the driver to produce RenderedFunction, RenderedBlock and CallEdge data with pre-rendered strings and simplify the GraphBuilder trait to operate on that representation. This should also remove the need for GraphContext inside builders and eliminate the lifetime parameter. I’ll update the PR accordingly.

Add detailed Rustdoc comments for GraphBuilder, RenderedFunction, RenderedBlock, CallEdge, and traversal entry points. The documentation explains the separation between MIR traversal and format-specific rendering, and clarifies the responsibilities of each structure. This improves discoverability via `cargo doc` and makes the graph rendering architecture easier to understand for future contributors.

Resolve compilation issues introduced after rebasing onto upstream master. Remove an unused import, update the SmirJson impl to match the new upstream definition without lifetime parameters, and implement Default for D2Builder to satisfy Clippy's new_without_default lint. Run cargo fmt to normalize formatting.

Remove the unused is_local field from RenderedFunction since the local/external distinction is already represented structurally in the GraphBuilder API. Add raw_stmts to RenderedBlock as an escape hatch for renderers that need access to the underlying MIR statements in addition to the pre-rendered strings.

0xh4ty · 2026-03-05T07:48:08Z

I pushed an update addressing the review feedback.

Main changes:

• Introduced RenderedFunction, RenderedBlock, and CallEdge as the traversal output structures.
• The traversal layer now performs MIR rendering using GraphContext and constructs these structures.
• GraphBuilder implementations now receive pre-rendered data and are responsible only for format-specific formatting.
• Builders no longer depend on GraphContext, removing the lifetime parameter and avoiding duplicated MIR rendering logic across renderers.
• Function IDs now incorporate a body hash to avoid collisions between monomorphizations.
• The legacy D2 implementation has been removed and the renderer now uses the new traversal by default.

The D2 output remains the same except for the expected function ID change caused by the body hash used for collision avoidance.

cds-amal

Thanks @0xh4ty, this has come a long way! The RenderedFunction / RenderedBlock / CallEdge structs are exactly what I had in mind, and the D2 builder is clean and easy to follow. The hard part (extracting traversal from rendering, getting the data shapes right) is done.

A few things to get us across the finish line. I left inline comments on cfg_edges and on the hash_body collision risk; the rest I'll cover here.

raw_stmts and raw_terminator: on second thought, let's not! I want to own this one, because my earlier comment sent you in the wrong direction. I proposed raw_terminator as an escape hatch "for builders that need structural information; say, coloring blocks differently based on terminator kind." Having now seen how the implementation shakes out, I think we're better off without it.

Here's the tell: raw_stmts and raw_terminator are the only reason RenderedBlock and RenderedFunction need a lifetime parameter. Remove them, and the structs become fully owned, which is cleaner for everyone. D2Builder already ignores both fields. And the use case I imagined (coloring based on terminator kind) doesn't need the full MIR type; a simple TerminatorTag enum (just the discriminant, no payload) would do the job if we ever actually need it.

The deeper issue: these fields quietly re-introduce the coupling we're trying to eliminate. The whole point of the RenderedFunction design is that builders see pre-rendered data and don't need to understand MIR internals. An escape hatch that hands them raw &[Statement] and &Terminator undermines that; it's the same boundary leak as the old block(...) method, just made optional instead of required. Let's take them out; if a future builder genuinely needs something the rendered data doesn't provide, that's a signal to enrich the rendered data (like the cfg_edges labels in my inline comment), not to punch through the abstraction.
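
For illustration, a payload-free tag like the one suggested above might look as follows. The variant list, the terminator_tag field, and block_fill are all assumptions; the thread only names the TerminatorTag idea.

```rust
/// Discriminant only, no MIR payload (hypothetical variant list).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TerminatorTag {
    Goto,
    SwitchInt,
    Call,
    Assert,
    Return,
    Other,
}

/// Fully owned: no borrowed MIR, so no lifetime parameter.
pub struct RenderedBlock {
    pub idx: usize,
    pub terminator: String,
    pub terminator_tag: TerminatorTag,
}

/// A builder can style blocks from the tag without touching MIR types.
pub fn block_fill(block: &RenderedBlock) -> &'static str {
    match block.terminator_tag {
        TerminatorTag::Call => "lightblue",
        TerminatorTag::Assert => "lightyellow",
        _ => "white",
    }
}
```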

(My apologies for the whiplash; sometimes we have to see the implementation to realize a design idea doesn't carry its weight.)

external_function is declared but never called: render_graph never calls it. The DOT renderer needs this: it creates red-colored nodes for functions that appear as call targets but aren't in the items list. The traversal has everything it needs to compute this. Either wire it up, or remove it from the trait until it's needed. Dead code in a trait is worse than dead code in a function, because every new impl has to carry it.
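
A sketch of how the traversal could derive that set from data it already holds is shown below, assuming trimmed-down versions of RenderedFunction and CallEdge. The external_callees helper is hypothetical; the driver would call external_function(id, name) once per returned pair.

```rust
use std::collections::BTreeSet;

pub struct CallEdge {
    pub block_idx: usize,
    pub callee_id: String,
    pub callee_name: String,
}

pub struct RenderedFunction {
    pub id: String,
    pub call_edges: Vec<CallEdge>,
}

/// Returns (id, name) pairs for callees that are call targets but are not
/// defined among `funcs`, deduplicated and sorted.
pub fn external_callees(funcs: &[RenderedFunction]) -> Vec<(String, String)> {
    let defined: BTreeSet<&str> = funcs.iter().map(|f| f.id.as_str()).collect();
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for f in funcs {
        for e in &f.call_edges {
            let id = e.callee_id.as_str();
            // `insert` returns false for an ID that was already recorded.
            if !defined.contains(id) && seen.insert(id) {
                out.push((e.callee_id.clone(), e.callee_name.clone()));
            }
        }
    }
    out.sort();
    out
}
```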

I would suggest porting DOT to the new traversal; this is the real proof that the abstraction works: if DOT ports cleanly, we know the trait boundary is in the right place. If it doesn't, we'll learn something useful about what's missing. Either way, we come out ahead.

cds-amal · 2026-03-11T23:38:50Z

src/mk_graph/util.rs

+/// Used to avoid fn_id collisions between monomorphizations.
+pub fn hash_body(body: &Body) -> u64 {


It seems strange to hash a function and not use its mangled name, which is guaranteed to be unique. The hash seems to be structurally based and I wonder if multiple generic functions could collide. @dkcumming , what do you think?
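
The concern can be shown with a toy model. ToyBody below deliberately hashes only an operation sequence, which is an assumption made to force the effect; whether the real hash_body sees enough of Body to tell two instances apart is exactly the open question in this comment.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy body: only the operation sequence, deliberately without any
/// information that would distinguish two instances (assumption).
#[derive(Hash)]
pub struct ToyBody {
    pub ops: Vec<&'static str>,
}

pub fn hash_body(body: &ToyBody) -> u64 {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    hasher.finish()
}

/// ID built from a short name plus a structural hash.
pub fn hashed_id(short_name: &str, body: &ToyBody) -> String {
    format!("{short_name}_{:x}", hash_body(body))
}
```

Two functions with the same short name and structurally identical toy bodies get the same ID here, whereas their distinct symbol names would keep them apart.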

cds-amal · 2026-03-11T23:56:07Z

src/mk_graph/traverse.rs

+
+            let cfg_edges = terminator_targets(&block.terminator)
+                .into_iter()
+                .map(|t| (t, None))


This is where information gets lost. The traversal has the full TerminatorKind right here; it knows whether an edge is a SwitchInt branch (with a discriminant value), a cleanup unwind, or a normal successor. But terminator_targets flattens all of that into a plain Vec<usize>, and then this code wraps each target in (t, None), discarding the labels entirely.

The Option<String> in cfg_edges exists precisely for this data. Look at what the DOT renderer needs: SwitchInt edges carry discriminant values like "0", "1", cleanup edges get "Cleanup", the otherwise branch gets "other". That information is available right now, in block.terminator.kind, at this exact point in the traversal. One scope exit later, it's gone; a builder that needs it would have to reach into raw_terminator to re-derive what the traversal already knew.

That's the pattern we're trying to break: the traversal owns the "what" (which blocks connect and what each edge means), and the builder owns the "how" (syntax, escaping, visual style). When the traversal discards edge semantics, it forces builders to do the traversal's job, and the trait boundary stops earning its keep.

I would explore matching on the terminator kind here instead of delegating to terminator_targets.

Got it. I’ll add matching logic on terminator.kind in the traversal and populate cfg_edges with the appropriate labels.
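
A minimal sketch of that matching logic is given below. ToyTerminator is a made-up stand-in for the real terminator kind with only four variants; the label strings ("other", "Cleanup", and discriminant values) follow the DOT conventions mentioned in the comment above.

```rust
/// Hypothetical, reduced stand-in for the real terminator kind.
pub enum ToyTerminator {
    Goto { target: usize },
    SwitchInt { branches: Vec<(u128, usize)>, otherwise: usize },
    Call { target: Option<usize>, cleanup: Option<usize> },
    Return,
}

/// Produce (target_block, optional label) pairs while the edge meaning
/// is still known, instead of flattening to bare targets.
pub fn labeled_cfg_edges(term: &ToyTerminator) -> Vec<(usize, Option<String>)> {
    match term {
        ToyTerminator::Goto { target } => vec![(*target, None)],
        ToyTerminator::SwitchInt { branches, otherwise } => {
            let mut edges: Vec<_> = branches
                .iter()
                .map(|(value, target)| (*target, Some(value.to_string())))
                .collect();
            edges.push((*otherwise, Some("other".to_string())));
            edges
        }
        ToyTerminator::Call { target, cleanup } => {
            let mut edges = Vec::new();
            if let Some(t) = target {
                edges.push((*t, None));
            }
            if let Some(c) = cleanup {
                edges.push((*c, Some("Cleanup".to_string())));
            }
            edges
        }
        ToyTerminator::Return => Vec::new(),
    }
}
```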

0xh4ty · 2026-03-12T07:36:20Z

Thanks for the clarification, that makes sense. I’ll remove raw_stmts and raw_terminator so the rendered structs become fully owned. I’ll also wire up external_function in the traversal to fix the dead code.

0xh4ty requested a review from a team March 2, 2026 09:57

Stevengre reviewed Mar 3, 2026

dkcumming reviewed Mar 3, 2026

cds-amal self-requested a review March 3, 2026 23:25

cds-amal requested changes Mar 4, 2026

0xh4ty added 16 commits March 5, 2026 12:28

mk_graph: introduce GraphBuilder trait

1f0a4aa

d2: extract item traversal helper

bb42c14

d2: introduce D2Builder struct

6abfb4a

d2: implement GraphBuilder for D2Builder

5282380

mk_graph: add generic graph traversal

543a515

d2: wire GraphBuilder traversal behind flag

7c36303

mk_graph: make traversal independent of D2 formatting

52e550e

mk_graph: add comments clarifying traversal responsibilities

245d129

mk_graph: extend GraphBuilder to pass statements, terminators, and call args

286bb97

d2: elide explicit lifetime in GraphBuilder impl

cc7b44c

mk_graph: introduce RenderedFunction, RenderedBlock and CallEdge structures

b737cd5

mk_graph: refactor traversal and builders to use RenderedFunction model

36eb8b6

mk_graph: add body hash to function IDs to avoid collisions

2c3d19a

d2: remove SMIR_D2_NEW flag and use new traversal by default

9bfbd90

0xh4ty force-pushed the refactor/unified-graph-traversal branch from dbb514a to 9552849 Compare March 5, 2026 07:26

0xh4ty requested review from Stevengre, cds-amal and dkcumming March 5, 2026 07:44

cds-amal requested changes Mar 12, 2026

4 participants