Add unified benchmarking harness (iris.bench) #368
+1,278
−0
Benchmarking code across `examples/` and `benchmark/` reimplements identical warmup loops, timing, statistics, and printing, roughly 100 lines of boilerplate per file. This PR adds `iris.bench`, a shared benchmarking infrastructure that eliminates the duplication and standardizes measurements.

### Changes
#### Core module (`iris/bench.py`)

- `BenchmarkResult`: dataclass storing mean/median/p50/p99/min/max with JSON export
- `BenchmarkRunner`: context manager for parameter sweeps with barrier support
- `@benchmark`: decorator for simple function benchmarking
- Helpers: `torch_dtype_from_str()`, `compute_bandwidth_gbps()`
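A minimal sketch of the pieces listed above, to make the shape of the module concrete. The class and helper names come from this PR's summary, but the constructor, field types, JSON layout, and percentile handling are assumptions, not the actual contents of `iris/bench.py`.

```python
from dataclasses import dataclass, asdict
import json
import statistics


@dataclass
class BenchmarkResult:
    """Summary statistics for one benchmarked configuration (times in ms)."""
    name: str
    mean: float
    median: float
    p50: float
    p99: float
    min: float
    max: float

    @classmethod
    def from_samples(cls, name: str, samples_ms: list[float]) -> "BenchmarkResult":
        # Derive the summary statistics from raw per-iteration timings.
        ordered = sorted(samples_ms)
        p99_idx = min(len(ordered) - 1, int(len(ordered) * 0.99))
        return cls(
            name=name,
            mean=statistics.fmean(ordered),
            median=statistics.median(ordered),
            p50=statistics.median(ordered),
            p99=ordered[p99_idx],
            min=ordered[0],
            max=ordered[-1],
        )

    def to_json(self) -> str:
        # JSON export so CI jobs can archive and diff results.
        return json.dumps(asdict(self))


def compute_bandwidth_gbps(num_bytes: int, time_ms: float) -> float:
    """Convert a byte count and an elapsed time in milliseconds into GB/s."""
    return num_bytes / (time_ms * 1e-3) / 1e9
```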
#### Integration

- `iris.bench` is exported from `__init__.py`
- Timing is delegated to `iris.do_bench`
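A sketch of what the export could look like; the exact import style in `iris/__init__.py` is an assumption. Reusing `iris.do_bench` for the timing loop keeps measurement semantics consistent with the rest of the repo, but its signature is not shown here because it is not described in this PR.

```python
# iris/__init__.py (sketch): make the new module reachable as iris.bench
from . import bench  # noqa: F401
```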
#### Testing & Documentation

- `test_bench.py`: full test suite (GPU required)
- `test_bench_basic.py`: unit tests (no GPU)
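For illustration, a sketch of the kind of GPU-free check `test_bench_basic.py` could contain. It relies on the hypothetical `from_samples` constructor from the sketch above, which may not match the real API.

```python
from iris.bench import BenchmarkResult


def test_result_statistics_ordering():
    # Statistics computed from a fixed sample set should be internally consistent.
    result = BenchmarkResult.from_samples("dummy", [1.0, 2.0, 3.0, 4.0, 5.0])
    assert result.min <= result.p50 <= result.p99 <= result.max
    assert result.mean == 3.0
```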
### Usage

Before (~100 lines):
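The diff contains the real before/after pair; it is not reproduced here, so the following is only an illustrative sketch of the hand-rolled pattern being replaced (manual warmup, timing, and statistics), not the repository's actual example.

```python
import statistics
import time

import torch


def bench_copy(n_elements: int, warmup: int = 10, iters: int = 100) -> None:
    src = torch.randn(n_elements, device="cuda")
    dst = torch.empty_like(src)

    # Hand-rolled warmup loop, repeated in every example file.
    for _ in range(warmup):
        dst.copy_(src)
    torch.cuda.synchronize()

    # Hand-rolled timing and statistics.
    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        dst.copy_(src)
        torch.cuda.synchronize()
        samples_ms.append((time.perf_counter() - start) * 1e3)

    print(f"mean={statistics.fmean(samples_ms):.3f} ms "
          f"median={statistics.median(samples_ms):.3f} ms")
```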
After (~50 lines):
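Again a sketch rather than the PR's actual snippet: the same benchmark expressed against the new harness. `BenchmarkRunner` is named in this PR, but its constructor arguments, `run()` method, and `to_json()` call are assumptions about the API.

```python
import torch

from iris.bench import BenchmarkRunner  # name from this PR; exact API assumed


def main() -> None:
    src = torch.randn(1 << 20, device="cuda")
    dst = torch.empty_like(src)

    # Warmup, timing, statistics, and printing are handled by the shared
    # harness instead of per-file boilerplate.
    with BenchmarkRunner(name="copy", warmup=10, iters=100) as runner:
        result = runner.run(lambda: dst.copy_(src))
    print(result.to_json())


if __name__ == "__main__":
    main()
```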
Enables consistent CI performance tracking and reduces maintenance burden by centralizing benchmark infrastructure.