Harden benchmark workflow: retry builds, proactive clean, robust monitoring by sbryngelson · Pull Request #1170 · MFlowCode/MFC

sbryngelson · 2026-02-19T20:01:39Z

User description

Summary

- Wrap bench builds in nick-fields/retry with 3 attempts and automatic ./mfc.sh clean between retries
- Add proactive ./mfc.sh clean at start of all build scripts to prevent cross-compiler contamination from stale artifacts on persistent runners
- Improve monitor_slurm_job.sh with better state detection and heartbeats
- Add concurrency group to prevent duplicate bench runs per branch
- Reduce timeout from 1400 to 480 minutes
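
For readers unfamiliar with the retry-then-clean pattern, here is a minimal shell sketch of the same idea. It is not the nick-fields/retry action itself; `retry_with_clean` is a hypothetical helper, and the invocation in the trailing comment is an assumption rather than the workflow's actual command line.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry_with_clean MAX_ATTEMPTS CLEAN_CMD BUILD_CMD [ARGS...]
# Runs BUILD_CMD up to MAX_ATTEMPTS times and runs CLEAN_CMD after each
# failed attempt that will be followed by another one.
retry_with_clean() {
    local max_attempts=$1 clean_cmd=$2
    shift 2
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        echo "Build attempt ${attempt}/${max_attempts}"
        if "$@"; then
            return 0
        fi
        # Only clean when another attempt will follow.
        if ((attempt < max_attempts)); then
            echo "Attempt ${attempt} failed; cleaning before retry"
            # clean_cmd is deliberately unquoted so "./mfc.sh clean" splits into words.
            $clean_cmd || echo "warning: clean failed; retrying anyway" >&2
        fi
    done
    echo "Build failed after ${max_attempts} attempts" >&2
    return 1
}

# Assumed invocation, for illustration only:
#   retry_with_clean 3 "./mfc.sh clean" bash path/to/build.sh
```

Cleaning only between attempts (rather than after the last one) leaves the failed build tree in place for inspection when all attempts are exhausted.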

Test plan

- Bench workflow triggers correctly on push from authorized users
- Parallel PR vs master builds complete successfully
- monitor_slurm_job.sh correctly detects job completion/failure
- Proactive clean prevents stale artifact linker errors

🤖 Generated with Claude Code

Summary by CodeRabbit

Chores
- Strengthened SLURM job monitoring with robust state tracking, enhanced error handling, and automatic cleanup on abnormal exits.
- Updated benchmark workflow triggers to activate on pull requests and code reviews instead of test completion.
- Implemented automatic job requeuing on preemption to improve CI/CD pipeline resilience and reduce job failures.

CodeAnt-AI Description

Harden benchmark CI: cancel orphaned cluster jobs, robust job monitoring, and retry builds with proactive clean

What Changed

- The SLURM monitor now detects job state via squeue and sacct, recognizes terminal states (including PREEMPTED/REVOKED), prints periodic heartbeats, waits for the job output file to stabilize, and cancels the cluster job if the monitor exits abnormally so orphaned jobs are stopped.
- Benchmark workflow runs only for relevant PRs/reviews on the same branch, uses a concurrency group keyed by branch to cancel duplicate runs, and lowers job timeout to 480 minutes.
- Build steps are wrapped in a 3-attempt retry that runs a clean between attempts and uses a shorter per-retry timeout so retries fit within the job timeout; SBATCH submissions are set to auto-requeue on preemption.
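
The "cancel the cluster job if the monitor exits abnormally" behavior can be sketched with an EXIT trap. This is an illustrative outline, not the script from the PR: only `scancel` and the `monitor_success` variable name come from the discussion, and everything else here is assumed.

```bash
#!/usr/bin/env bash
# Sketch: cancel the SLURM job if the monitor exits before marking success.
job_id="${1:-}"
monitor_success=0

cleanup() {
    # Cancel only when the monitor did not finish normally and a job id is known.
    if [ "$monitor_success" -ne 1 ] && [ -n "$job_id" ]; then
        echo "Monitor exiting abnormally; cancelling SLURM job $job_id" >&2
        scancel "$job_id" 2>/dev/null || true
    fi
}
trap cleanup EXIT
# Route common termination signals through the EXIT trap.
trap 'exit 130' INT
trap 'exit 143' TERM

# ... state polling and log streaming would go here ...

# Reached only on the normal path, so the EXIT trap becomes a no-op.
monitor_success=1
```

Setting the success flag as the last step means any earlier failure, including a cancelled CI run, leaves the flag at 0 and triggers the cancel.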

Impact

✅ Fewer orphaned SLURM jobs
✅ Fewer CI failures from stale build artifacts
✅ Fewer duplicate benchmark runs

…toring

- Wrap bench builds in nick-fields/retry with 3 attempts and automatic ./mfc.sh clean between retries
- Add proactive ./mfc.sh clean at start of all build scripts to prevent cross-compiler contamination from stale artifacts on persistent runners
- Improve monitor_slurm_job.sh with better state detection and heartbeats
- Add concurrency group to prevent duplicate bench runs per branch
- Reduce timeout from 1400 to 480 minutes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

codeant-ai · 2026-02-19T20:01:43Z

CodeAnt AI is reviewing your PR.

coderabbitai · 2026-02-19T20:02:08Z

📝 Walkthrough

Walkthrough

The pull request enhances SLURM job monitoring with state-driven polling and cleanup logic, refactors the benchmark workflow to use pull request triggers with retry-based build orchestration, and adds automatic job requeue support on preemption in the submission script.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| SLURM Job Monitoring Enhancement `.github/scripts/monitor_slurm_job.sh` | Introduced state-driven monitoring via new `get_job_state()` and `is_terminal_state()` helper functions; replaced simplistic file-waiting logic with state-aware polling for PENDING, RUNNING, and terminal states; added cleanup path to cancel jobs on abnormal exits; strengthened output handling with tail read timeout and drain loop; signals successful completion via monitor_success variable. |
| Benchmark Workflow Refactoring `.github/workflows/bench.yml` | Replaced workflow_run trigger with pull_request and pull_request_review triggers; removed PR info collection step and associated outputs; consolidated job gating logic; replaced inline shell parallelization with nick-fields/retry@v3 for build orchestration with configurable retry policy; removed explicit Node version environment variables. |
| Job Submission Configuration `.github/workflows/phoenix/submit-bench.sh` | Added SBATCH `--requeue` option to enable automatic job requeuing on preemption across both outer and embedded SBATCH directives. |
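
To make the two helper names in the table concrete, here is a hedged sketch of what such functions could look like. The function names come from the walkthrough; the bodies, the `UNKNOWN` fallback, and the exact `squeue`/`sacct` format flags are assumptions, not the PR's code. The terminal-state list follows the states quoted in the review thread, plus PREEMPTED and REVOKED, which a later commit in this PR adds.

```bash
#!/usr/bin/env bash
# Print the job's state: ask squeue first, then fall back to sacct once the
# job has left the queue. Prints UNKNOWN if neither command reports a state.
get_job_state() {
    local job_id=$1 state
    # -h: no header, -o '%T': state in extended form (e.g. RUNNING).
    state=$(squeue -j "$job_id" -h -o '%T' 2>/dev/null | head -n1 | awk '{print $1}')
    if [ -z "$state" ]; then
        # -n: no header, -X: allocation only, -P: parsable output.
        # awk keeps the first word, so "CANCELLED by 1000" becomes "CANCELLED".
        state=$(sacct -j "$job_id" -n -X -P -o State 2>/dev/null | head -n1 | awk '{print $1}')
    fi
    echo "${state:-UNKNOWN}"
}

# Return 0 if the given state means the job will not run any further.
is_terminal_state() {
    case "$1" in
        COMPLETED|FAILED|CANCELLED*|TIMEOUT|OUT_OF_MEMORY|NODE_FAIL|BOOT_FAIL|DEADLINE|PREEMPTED|REVOKED)
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

A polling loop would then call `get_job_state` on an interval and stop once `is_terminal_state` returns 0.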

Sequence Diagram(s)

sequenceDiagram
    participant Script as monitor_slurm_job.sh
    participant SLURM as SLURM Scheduler
    participant FileSystem as Output File
    
    Script->>SLURM: get_job_state(job_id)
    SLURM-->>Script: squeue query
    alt squeue succeeds
        SLURM-->>Script: job state
    else squeue fails
        SLURM-->>Script: sacct fallback
    end
    
    loop Until Terminal State
        Script->>SLURM: query job state
        SLURM-->>Script: PENDING/RUNNING/CONFIGURING
        Script->>FileSystem: tail output file (with timeout)
        FileSystem-->>Script: latest output lines
        Script->>Script: check is_terminal_state()
    end
    
    alt Terminal State Reached
        Script->>FileSystem: wait for output quiescence
        FileSystem-->>Script: output stabilized
        Script->>Script: set monitor_success = 1
    else Output File Never Created
        Script->>SLURM: cancel job (cleanup)
        SLURM-->>Script: job cancelled
    end
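
The "wait for output quiescence" step in the diagram above can be approximated by polling the file size until it stops changing. The helper below is a hypothetical sketch; its name, parameters, and defaults are assumptions rather than the PR's implementation.

```bash
#!/usr/bin/env bash
# Hypothetical helper:
#   wait_for_quiescence FILE [STABLE_CHECKS] [INTERVAL_SECONDS] [MAX_CHECKS]
# Returns 0 once FILE's size is unchanged for STABLE_CHECKS consecutive polls,
# or 1 if that does not happen within MAX_CHECKS polls.
wait_for_quiescence() {
    local file=$1 stable_needed=${2:-3} interval=${3:-1} max_checks=${4:-60}
    local last_size=-1 size stable=0 i
    for ((i = 0; i < max_checks; i++)); do
        # wc -c avoids GNU-only stat flags; a missing file counts as size 0.
        if [ -f "$file" ]; then
            size=$(wc -c < "$file" | tr -d '[:space:]')
        else
            size=0
        fi
        if [ "$size" = "$last_size" ]; then
            stable=$((stable + 1))
            if [ "$stable" -ge "$stable_needed" ]; then
                return 0
            fi
        else
            stable=0
            last_size=$size
        fi
        sleep "$interval"
    done
    return 1
}
```

The cap on the number of polls keeps the monitor from hanging if something keeps appending to the file after the job has ended.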

sequenceDiagram
    participant GitHub as GitHub Actions
    participant Workflow as Benchmark Workflow
    participant RetryMechanism as nick-fields/retry
    participant BuildSystem as Build Executor
    
    GitHub->>Workflow: trigger on pull_request event
    Workflow->>Workflow: evaluate consolidated gating
    alt Gate conditions met
        Workflow->>RetryMechanism: invoke retry wrapper
        loop Retry Attempts
            RetryMechanism->>BuildSystem: execute parallel builds
            BuildSystem-->>RetryMechanism: build result
            alt Build fails and retries remain
                RetryMechanism->>BuildSystem: cleanup worktrees
                RetryMechanism->>BuildSystem: retry build
            else Build succeeds
                RetryMechanism-->>Workflow: success
            end
        end
    else Gate conditions not met
        Workflow->>Workflow: skip job execution
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Shell completion auto-install and pre-commit hook improvements #1124: Directly enhances monitor_slurm_job.sh with the state-driven monitoring, cleanup logic, and helper functions that replace the simplified polling approach in job submission workflows.

Suggested labels

Review effort 3/5

Poem

🐰 Whiskers twitching with glee ✨

SLURM jobs now monitored with care,
State machines dance through the air,
Workflows retry when storms appear,
Preempted jobs return without fear! 🎯

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | The PR description includes a summary of changes, motivation, and test plan, but is missing required structured sections from the template (Type of change, Testing details, and Checklist items). | Add the standard template sections: check the appropriate 'Type of change' box, provide specific testing methodology, and complete the checklist items with concrete evidence of testing. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly and concisely summarizes the three main changes: hardening the benchmark workflow through retry builds, proactive cleanup, and robust SLURM job monitoring. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

codeant-ai · 2026-02-19T20:04:01Z

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

- Temporary directory cleanup: The temporary build directory (`$currentdir`) is removed near the end of the happy-path, but failures or early exits (e.g. other errors) may skip that cleanup. Add an exit/trap handler so the temporary directory is reliably cleaned and TMPDIR is unset on any exit.
- GNU stat portability: The script uses `stat -c%s` to get file size, which is a GNU-specific option and may fail on systems with BSD stat (different flag semantics). This can cause the stabilization check to behave unexpectedly on non-GNU systems. Consider a portable fallback or checking for GNU stat availability.
- Tail PID reliability: The script captures the PID of the tail process via process-substitution and `$!`. That PID capture can be unreliable across shells / kernel implementations because process substitution may spawn an extra subshell; killing the captured PID may not terminate the tail process and cleanup logic may not be effective. Consider explicitly starting `tail` in the background and capturing its PID (or using a FIFO / coproc) so cleanup reliably kills the right process.
- Proactive clean risk: The script now runs `./mfc.sh clean` immediately at startup. If `./mfc.sh` is missing or `clean` exits non-zero this can cause unexpected failures. Also calling clean unconditionally may hide transient failures if it aborts the job early. Consider guarding or tolerating the clean command and surfacing helpful diagnostics.
- Empty directory globbing: The `for dir in benchmarks/*/; do` loop assumes the pattern expands to directories. If there are no matching directories and `nullglob` is not enabled, the literal pattern will be passed to `./mfc.sh` (causing a build failure). Either enable `shopt -s nullglob` or guard the loop with an existence check.
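
Two of the points above (stat portability and the empty-glob case) lend themselves to short sketches. Both helpers below are hypothetical illustrations of the suggested mitigations, not code from the PR; the glob guard uses the existence-check variant rather than `shopt -s nullglob`.

```bash
#!/usr/bin/env bash
# Portable file size: try GNU stat, then BSD stat, then fall back to wc -c.
file_size() {
    stat -c%s "$1" 2>/dev/null \
        || stat -f%z "$1" 2>/dev/null \
        || wc -c < "$1" | tr -d '[:space:]'
}

# Print the subdirectories of ROOT, one per line, and print nothing when
# there are none (instead of passing on the literal, unexpanded pattern).
list_benchmark_dirs() {
    local root=$1 dir
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue   # skips the literal pattern when nothing matches
        echo "$dir"
    done
}
```

The existence check has the small advantage of not changing shell options for the rest of the script.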

.github/scripts/monitor_slurm_job.sh

codeant-ai · 2026-02-19T20:05:26Z

CodeAnt AI finished reviewing your PR.

Copilot

Pull request overview

Hardens the CI benchmark workflow and cluster-side scripts to be more resilient on persistent/self-hosted runners and SLURM systems, reducing flaky benchmark runs and improving observability.

Changes:

Add proactive ./mfc.sh clean and simplify build scripts for Frontier/Frontier AMD and Phoenix bench runs.
Update bench.yml triggers/authorization logic, add a concurrency group, wrap builds in nick-fields/retry, and reduce workflow timeout.
Improve .github/scripts/monitor_slurm_job.sh with more robust SLURM state polling, heartbeats, and cleanup behavior.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `.github/workflows/phoenix/submit-bench.sh` | Enables SLURM `--requeue` for Phoenix bench submissions. |
| `.github/workflows/phoenix/bench.sh` | Adds an upfront clean to avoid stale artifacts on persistent runners. |
| `.github/workflows/frontier/build.sh` | Adds `set -e`, proactive clean, and removes inline retry loop (now handled by workflow). |
| `.github/workflows/frontier_amd/build.sh` | Same as Frontier CCE script adjustments (clean + `set -e` + simplified build). |
| `.github/workflows/bench.yml` | Changes workflow triggers/conditions, adds concurrency grouping, wraps builds in retry, and adjusts timeouts. |
| `.github/scripts/monitor_slurm_job.sh` | Adds job-state helpers, better waiting logic, and improved streaming/heartbeat/cleanup. |

.github/workflows/bench.yml

.github/scripts/monitor_slurm_job.sh

cubic-dev-ai

1 issue found across 6 files

Confidence score: 4/5

This PR looks safe to merge; the only concern is a low-severity workflow inefficiency rather than a functional bug.
In /.github/workflows/bench.yml, using ${{ github.event_name }} in the concurrency group can allow duplicate benchmark runs for the same ref, which may waste CI resources but shouldn’t affect product behavior.
Pay close attention to /.github/workflows/bench.yml - concurrency grouping may not cancel overlapping benchmark runs.

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".github/workflows/bench.yml">

<violation number="1" location=".github/workflows/bench.yml:10">
P2: Including `${{ github.event_name }}` in the concurrency group prevents `pull_request` and `pull_request_review` runs from canceling each other for the same ref, so duplicate benchmark runs can still happen. Use a single group key per ref to ensure only one run per branch is active.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

.github/workflows/bench.yml

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

.github/workflows/frontier_amd/build.sh (1)
26-32: Use = instead of == inside [ ] for POSIX portability.

== inside [ ] is a bash extension; the portable and idiomatic form is =. While this script uses bash, it's a trivial correctness improvement.
♻️ Proposed fix
-if [ "$run_bench" == "bench" ]; then
+if [ "$run_bench" = "bench" ]; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/frontier_amd/build.sh around lines 26 - 32, The shell
conditional in build.sh uses the non-portable test operator `==` to compare the
variable run_bench; update the condition in the `if [ "$run_bench" == "bench" ]`
check to use the POSIX-compatible `=` operator instead so the `if` branch (the
loop invoking ./mfc.sh run ...) becomes portable across /bin/sh implementations.
.github/workflows/frontier/build.sh (1)
26-32: Same == in [ ] as in frontier_amd/build.sh — use = for portability.
♻️ Proposed fix
-if [ "$run_bench" == "bench" ]; then
+if [ "$run_bench" = "bench" ]; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/frontier/build.sh around lines 26 - 32, The shell
conditional uses the non-portable operator `==` in the test expression; change
the conditional in the `if [ "$run_bench" == "bench" ];` line to use the
portable `=` operator (i.e., `if [ "$run_bench" = "bench" ];`), keeping the
variable `run_bench` quoted and leaving the rest of the block (the `for dir in
benchmarks/*/; ...` and `else` branch invoking `./mfc.sh`) unchanged.
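
As a small illustration of the portable form suggested in the two nitpicks above, the sketch below wraps the comparison in a function. `run_bench` is the variable named in the review; `is_bench_mode` and the echoed messages are hypothetical.

```bash
#!/usr/bin/env bash
# POSIX test ([ ... ]) uses a single '=' for string equality; this form also
# works unchanged under /bin/sh implementations such as dash.
is_bench_mode() {
    [ "$1" = "bench" ]
}

run_bench="${run_bench:-}"
if is_bench_mode "$run_bench"; then
    echo "benchmark mode"
else
    echo "build-only mode"
fi
```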
.github/workflows/phoenix/submit-bench.sh (1)
47-47: Verify --requeue interaction with monitor_slurm_job.sh output-file tracking.

The state transitions PREEMPTED → REQUEUED → PENDING → RUNNING are handled correctly — get_job_state() treats all three as non-terminal, so the monitor stays alive. However, there is a real vulnerability: tail -f follows by inode (line 110), and if SLURM truncates the output file on requeue (the default behavior without --open-mode=append), the in-flight tail process will lose its inode reference and miss all new content from the requeued run. Additionally, if the CI runner is killed while the job is in REQUEUED or PENDING state, monitor_slurm_job.sh's cleanup() (lines 14–16) will call scancel, cancelling the requeued job.

Document the expected SLURM output-file behavior for requeued jobs on Phoenix (whether --open-mode=append is needed or already configured), and confirm whether the scancel-on-abnormal-exit behavior is intentional for requeued jobs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/phoenix/submit-bench.sh at line 47, Add documentation and
a confirmation check about SLURM output-file behavior and scancel-on-exit in the
submit-bench/monitor workflow: update the submit-bench.sh or repository CI docs
to state whether Phoenix config (or our job submission flags) uses
--open-mode=append for requeued runs so tail -f (referenced in
monitor_slurm_job.sh around the tail -f at line ~110) won't lose the file inode
on REQUEUED → RUNNING transitions; if not, state that we must add
--open-mode=append to sbatch invocation in submit-bench.sh. Also clarify and
confirm whether monitor_slurm_job.sh's cleanup() (lines ~14–16) intentionally
calls scancel on abnormal CI exits while a job is REQUEUED/PENDING, and if that
behavior is undesired, document that we should avoid scancel on cleanup for
non-terminal states or add a guard that checks get_job_state() before
cancelling. Ensure references to get_job_state(), tail -f, and cleanup() are
included so reviewers can locate the affected code.
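
For reference, this is roughly what a batch script header would look like if `--open-mode=append` turned out to be needed. Whether Phoenix requires it is exactly the open question raised above, so treat this as a sketch under that assumption. `restart_banner` is a hypothetical helper; `SLURM_RESTART_COUNT` is the environment variable SLURM sets on restarted or requeued jobs.

```bash
#!/bin/bash
#SBATCH --requeue            # allow SLURM to requeue the job after preemption
#SBATCH --open-mode=append   # append to the output file instead of truncating it on restart

# Print a marker so each (re)started run is easy to find in an appended log.
# SLURM_RESTART_COUNT is unset on the first run, so default it to 0.
restart_banner() {
    echo "=== run attempt $(( ${SLURM_RESTART_COUNT:-0} + 1 )) ==="
}
restart_banner
```

With append mode, output from a requeued run follows the earlier run's output in the same file, and the banner separates the two.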

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/monitor_slurm_job.sh:
- Around line 59-66: The is_terminal_state() function currently omits the
PREEMPTED SLURM state, causing preempted jobs to be treated as non-terminal and
hang; update the case in is_terminal_state to include PREEMPTED alongside
COMPLETED|FAILED|CANCELLED|CANCELLED+|TIMEOUT|OUT_OF_MEMORY|NODE_FAIL|BOOT_FAIL|DEADLINE
so it returns 0 for PREEMPTED (ensuring the initial wait loop and main
monitoring loop detect it as terminal and exit appropriately with an error).

In @.github/workflows/bench.yml:
- Around line 110-118: The current parallel-run uses "wait $pid1 && wait $pid2",
which short-circuits and can leave the master build orphaned; change the logic
that launches the two background builds (the lines that set pid1 and pid2 and
call matrix.build_script) to always wait for both PIDs unconditionally by
calling wait for each PID separately, capture each exit status (e.g., rc1 from
pid1 and rc2 from pid2), and then exit with failure if either rc1 or rc2 is
non-zero; ensure the on_retry_command cleaning step (./mfc.sh clean in the
master directory) only runs after both waits complete so it cannot race with a
still-running master build.
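
The second inline comment describes waiting on both background builds unconditionally. A minimal sketch of that pattern follows; `run_both` is a hypothetical helper that takes two command strings, whereas the real workflow launches the PR and master builds via its own script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run two commands in parallel and wait for both.
# Each argument is a command string, run with bash -c for illustration.
run_both() {
    local rc1=0 rc2=0 pid1 pid2
    bash -c "$1" & pid1=$!
    bash -c "$2" & pid2=$!
    # Separate waits: the second wait runs even if the first build failed,
    # unlike "wait $pid1 && wait $pid2", which short-circuits.
    wait "$pid1" || rc1=$?
    wait "$pid2" || rc2=$?
    if [ "$rc1" -ne 0 ] || [ "$rc2" -ne 0 ]; then
        echo "build failed (first=$rc1, second=$rc2)" >&2
        return 1
    fi
    return 0
}
```

Because the function returns only after both processes have exited, a cleanup step that runs afterwards cannot race with a build that is still in progress.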

.github/scripts/monitor_slurm_job.sh

.github/workflows/bench.yml

…ry timeout

- Add PREEMPTED and REVOKED to monitor_slurm_job.sh terminal states so preempted jobs don't hang the monitor loop indefinitely
- Wait for both build PIDs unconditionally to prevent orphaned processes racing with on_retry_command clean
- Drop event_name from concurrency group so PR and review events for the same branch properly cancel each other
- Reduce retry timeout to 150min so retries have room within the 480min job timeout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
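
The timeout numbers in this commit can be checked with simple arithmetic. The sketch assumes the 150-minute limit applies per attempt (as the PR description's "per-retry timeout" suggests) and ignores time spent in other steps or between retries; `retry_budget_minutes` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Worst-case minutes consumed by the retry wrapper: attempts x per-attempt timeout.
retry_budget_minutes() {
    local attempts=$1 per_attempt=$2
    echo $(( attempts * per_attempt ))
}

job_timeout=480
worst_case=$(retry_budget_minutes 3 150)
echo "worst case: ${worst_case} min, headroom: $(( job_timeout - worst_case )) min"
# prints "worst case: 450 min, headroom: 30 min"
```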

codecov · 2026-02-19T23:08:47Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.05%. Comparing base (356b61f) to head (d5569b0).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1170   +/-   ##
=======================================
  Coverage   44.05%   44.05%           
=======================================
  Files          70       70           
  Lines       20498    20498           
  Branches     1990     1990           
=======================================
  Hits         9030     9030           
  Misses      10329    10329           
  Partials     1139     1139

Resolve conflicts with MFlowCode#1148 (build caching):

- frontier/build.sh, frontier_amd/build.sh: take upstream's cache + retry logic (proactive clean would defeat caching)
- bench.yml: keep our pull_request trigger model (upstream's workflow_run Get PR Info step doesn't apply)
- phoenix/bench.sh: remove proactive clean (unnecessary overhead for fresh checkouts, and would break caching)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

codeant-ai · 2026-02-20T19:35:23Z

CodeAnt AI is running Incremental review

.github/scripts/monitor_slurm_job.sh

codeant-ai · 2026-02-20T19:38:03Z

CodeAnt AI Incremental review completed.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/scripts/monitor_slurm_job.sh (1)
195-198: ⚠️ Potential issue | 🔴 Critical

Bug: ExitCode regex also matches DerivedExitCode, causing false failures.

scontrol show job output contains both ExitCode=X:Y and DerivedExitCode=X:Y. The pattern ExitCode=[0-9]+:[0-9]+ is a substring of DerivedExitCode=…, so grep -oE emits two matches. After cut, exit_code becomes a two-line string ("0:0\n0:0"), which never equals "0:0" on line 217, making every successful job report as failed.
🐛 Proposed fix — take only the first match
 scontrol_output=$(scontrol show job "$job_id" 2>/dev/null || echo "")
 if [ -n "$scontrol_output" ]; then
-  exit_code=$(echo "$scontrol_output" | grep -oE 'ExitCode=[0-9]+:[0-9]+' | cut -d= -f2 || echo "")
+  exit_code=$(echo "$scontrol_output" | grep -oE 'ExitCode=[0-9]+:[0-9]+' | head -n1 | cut -d= -f2 || echo "")
 fi
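
The substring problem described above can be reproduced without a cluster. The sketch below is an alternative to the proposed `head -n1` fix, not the PR's code: it also anchors the pattern on a preceding space or line start so `DerivedExitCode=` is never matched, regardless of field order. `extract_exit_code` is a hypothetical helper and the sample text is made up for illustration.

```bash
#!/usr/bin/env bash
# Read scontrol-style text on stdin and print the job's ExitCode as "X:Y".
# The leading (^|space) keeps the pattern from matching inside DerivedExitCode=.
extract_exit_code() {
    grep -oE '(^|[[:space:]])ExitCode=[0-9]+:[0-9]+' | head -n1 | cut -d= -f2
}

# Made-up sample text for illustration, not captured from a real cluster.
sample='   Requeue=1 Restarts=0 ExitCode=0:0 DerivedExitCode=0:0'
exit_code=$(printf '%s\n' "$sample" | extract_exit_code)
echo "exit_code=${exit_code}"
# prints "exit_code=0:0"
```

Keeping `head -n1` in the pipeline still guards against multi-line results if the field were ever repeated.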
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/monitor_slurm_job.sh around lines 195 - 198, The ExitCode
extraction from scontrol_output is matching both ExitCode and DerivedExitCode,
producing multiple lines in exit_code; update the extraction pipeline that sets
exit_code (the grep/cut sequence that reads scontrol_output) to only take the
first match (for example use grep -m 1 or pipe through head -n 1 after grep) so
exit_code becomes a single "X:Y" string; keep the variable names scontrol_output
and exit_code and ensure subsequent comparison logic still expects a single-line
"0:0" value.

🧹 Nitpick comments (1)

.github/workflows/bench.yml (1)
31-32: Gating condition is comprehensive but very dense — consider a trailing comment.

The multi-clause if correctly restricts self-hosted runner execution to (a) approved reviews, (b) PRs by trusted authors, or (c) manual dispatch. It reads correctly, but a brief inline YAML comment summarizing the intent would help future maintainers parse it faster.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/bench.yml around lines 31 - 32, Add a short trailing YAML
comment to the long `if:` expression that summarizes its intent (e.g., "run on
MFlowCode/MFC when changes detected and either PR approved, PR by trusted
authors, or manual dispatch") so future maintainers can quickly understand the
gating; locate the multi-clause `if: ${{ github.repository=='MFlowCode/MFC' &&
needs.file-changes.outputs.checkall=='true' &&
((github.event_name=='pull_request_review' &&
github.event.review.state=='approved') || (github.event_name=='pull_request' &&
(github.event.pull_request.user.login=='sbryngelson' ||
github.event.pull_request.user.login=='wilfonba')) ||
github.event_name=='workflow_dispatch') }}` and append a concise comment to that
line.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In @.github/scripts/monitor_slurm_job.sh:
- Around line 58-66: The review note is a duplicate—there is no code change
required because is_terminal_state() already includes PREEMPTED and REVOKED;
resolve this by removing the duplicate review comment or marking it resolved in
the PR so no further action is expected on the is_terminal_state function.

Copilot AI review requested due to automatic review settings February 19, 2026 20:01

sbryngelson mentioned this pull request Feb 19, 2026

Add test sharding, proactive clean, and retry logic for self-hosted CI #1171

Open

4 tasks

codeant-ai bot added the size:L This PR changes 100-499 lines, ignoring generated files label Feb 19, 2026

sbryngelson merged commit 3781b98 into MFlowCode:master Feb 21, 2026
53 of 73 checks passed
