Conversation
Instead of cloning the entire accumulator collection on every yield iteration, use take_register + Rc::make_mut to get exclusive ownership and mutate in-place. This reduces comprehension yield from O(n²) to O(n) for both run-to-completion and suspendable execution modes. - Add RegoVM::take_register() helper that swaps register with Undefined - Comprehension yield now takes the accumulator, mutates via Rc::make_mut, and writes back — avoiding deep clones when refcount == 1 Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
These instructions were cloning the container register (bumping Rc to 2), then calling as_object_mut/as_array_mut/as_set_mut which invokes Rc::make_mut — deep-cloning the entire collection since refcount > 1. Use take_register instead so the Rc refcount stays at 1, making Rc::make_mut a no-op and allowing in-place mutation. Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
- execute_call_rule_common: move final_value into cache instead of cloning, since it is not used afterwards - finalize_rule_frame_data: add comment clarifying the clone is needed because the value is both cached and returned - Remove unnecessary .clone() on result_from_rule when setting register Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
Replace RuleInfo.clone() (which heap-allocates name, destructuring_blocks, and potentially function_info) with a cheap Arc<Program> clone (atomic refcount bump) followed by borrowing &RuleInfo from the local Arc. This eliminates per-rule-call heap allocations. Sites changed: - execute_call_rule_common: Arc clone + borrow - execute_call_rule_suspendable: Arc clone + borrow - finalize_rule_frame_data: Arc clone + borrow - handle_rule_break_event: inline Arc clone + borrow (was get_rule_info) - handle_rule_error_event: inline Arc clone + borrow (was get_rule_info) - Removed now-unused get_rule_info method Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
Remove unlinked bincode dependency. Use postcard (already a dep for rvm feature) for all binary serialization/deserialization in program serialization and tests. Also adds rvm_benchmark benchmark. Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
Every builtin call was allocating a Source (via from_contents), a Span, and N Ref<Expr> wrappers just to satisfy the builtin function signature. These dummy values are only used for error reporting context. Cache the dummy Span and Vec<Ref<Expr>> on the RegoVM struct. The Source and Span are created once on first builtin call; dummy Expr entries grow as needed and are reused across calls via mem::take/put-back pattern. This eliminates per-builtin-call heap allocations for Source (Rc + String + Vec<lines>), Span clones, and Rc<Expr> wrappers. Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
…rtual data - Cache builtin args Vec on RegoVM (mem::take/clear/put-back pattern) - Restructure builtins_cache as two-level map for clone-free lookup - Use IndexMap::get_index() in execute_entry_point_by_index - Use mutable Vec path stack in traverse_rule_tree_subobject (push/pop) - Walk data tree and rule-result paths by reference, clone only leaf - Use mem::replace in resume() instead of cloning ExecutionState Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
e4d1b10 to
fd44af7
Compare
There was a problem hiding this comment.
Pull request overview
This PR focuses on runtime performance optimizations in the Rego Virtual Machine (RVM), primarily by reducing cloning/allocation in hot paths and adding a comprehensive benchmark suite to measure VM performance across workloads and configurations.
Changes:
- Reduce intermediate
Valuecloning by walking trees/register paths by reference and cloning only leaf values. - Optimize mutation-heavy instructions by introducing
take_registerto keepRcrefcounts low and enableRc::make_mutin-place updates. - Add a new
rvm_benchmarkCriterion benchmark covering cold/hot execution, compilation, serialization, startup, and end-to-end scenarios.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/rvm/vm/virtual_data.rs | Avoid repeated clones during virtual data lookup and cache path construction. |
| src/rvm/vm/rules.rs | Reduce RuleInfo cloning by borrowing from a cloned Arc<Program>. |
| src/rvm/vm/machine.rs | Add take_register plus cached dummy span/expr/args buffers for builtin calls. |
| src/rvm/vm/functions.rs | Rework builtin-call execution to reuse buffers and reduce per-call allocations. |
| src/rvm/vm/execution.rs | Avoid allocating an entry-point list; optimize resume state handling. |
| src/rvm/vm/dispatch.rs | Use take_register for in-place collection mutations without deep clones. |
| src/rvm/vm/comprehension.rs | Use take_register + Rc::make_mut to mutate comprehension results in place. |
| benches/rvm_benchmark.rs | Add a comprehensive benchmark suite for RVM performance measurement. |
| Cargo.toml | Register the new rvm_benchmark bench behind the rvm feature. |
Comments suppressed due to low confidence (1)
src/rvm/vm/execution.rs:216
- In the invalid-resume-state branch,
current_stateis already owned (moved out ofold_state), but the code clones it to restoreself.execution_state. You can avoid the potentially expensive clone by formatting the debug string first (borrowingcurrent_state) and then movingcurrent_stateback intoself.execution_state.
current_state => {
self.execution_state = current_state.clone();
return Err(VmError::InvalidResumeState {
state: alloc::format!("{:?}", current_state),
pc: self.pc,
});
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
d45d96c to
0c633ca
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Restore cached_builtin_args on all error/early-return paths in execute_builtin_call to preserve allocation reuse - Use 1-based line/col and \"<builtin>\" filename in dummy span for clearer diagnostics - Restore result register before returning errors in comprehension mode-mismatch branches (both run-to-completion and suspendable) - Avoid clone in resume() invalid-state error path by formatting debug string before moving state back
0c633ca to
bbe4cf2
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
No description provided.