BREAKING: Rename --max-attempts to --n-critic-runs by juanmichelini · Pull Request #325 · OpenHands/benchmarks

juanmichelini · 2026-01-16T01:59:01Z

Summary

This PR renames the --max-attempts parameter to --n-critic-runs across the benchmarks codebase to better reflect its purpose: controlling the number of critic evaluation runs in iterative mode.

Changes

CLI argument: --max-attempts → --n-critic-runs
Model field: max_attempts → n_critic_runs (EvalMetadata)
Updated files: 17 files total
- 7 run_infer.py files (all benchmarks)
- Core logic files (evaluation.py, iterative.py)
- 4 test files
- 2 README documentation files

Breaking Changes

This is a breaking change for users. Existing scripts and workflows must be updated.

Migration Required

CLI usage: --max-attempts 3 → --n-critic-runs 3
Python API: EvalMetadata(max_attempts=3) → EvalMetadata(n_critic_runs=3)

Related PRs

This PR is part of a coordinated change. A corresponding PR will be created for the evaluation repo.

Testing

All test files updated
Changes verified across all benchmarks

This breaking change renames the parameter across the codebase to better reflect its purpose: controlling the number of critic evaluation runs in iterative mode, not general retry attempts. Changes: - CLI argument: --max-attempts → --n-critic-runs - Model field: max_attempts → n_critic_runs (EvalMetadata) - Updated all run_infer.py files (7 benchmarks) - Updated core logic (evaluation.py, iterative.py) - Updated test files (4 files) - Updated documentation (2 README files) Migration guide: - Update CLI usage: --max-attempts 3 → --n-critic-runs 3 - Update EvalMetadata construction: max_attempts= → n_critic_runs= Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-01-16T02:00:08Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- Pre-commit checks

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #325 at branch `rename-max-attempts-to-n-critic-runs`

Feel free to include any additional details that might help me get this PR into a better state.

_{^{You can manage your notification settings}}

juanmichelini · 2026-03-12T23:00:27Z

testing here https://github.com/OpenHands/evaluation/actions/runs/23027808880

juanmichelini · 2026-03-12T23:00:31Z

testing here https://github.com/OpenHands/evaluation/actions/runs/23027808880

juanmichelini · 2026-03-13T16:12:19Z

retesting here https://github.com/OpenHands/evaluation/actions/runs/23059734672

juanmichelini · 2026-03-14T17:12:54Z

@OpenHands bring all changes from main to this branch. resolve any conflicts like this:
main gets priority, then we make sure that we are respecting the "Rename --max-attempts to --n-critic-runs" even if we have to re do it.

openhands-ai · 2026-03-14T17:13:16Z

I'm on it! juanmichelini can track my progress at all-hands.dev

…o-n-critic-runs

Merged latest main into branch. Re-applied the rename for: - New benchmarks from main (swebenchmultilingual, swefficiency) - Tests that were reverted by auto-merge (test_iterative_resume, test_workspace_cleanup) - commit0/config.py default key - swtbench README documentation - evaluation.py log message - models.py formatting fix Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-03-14T17:22:25Z

All changes have already been pushed to the remote branch rename-max-attempts-to-n-critic-runs. Here's the summary:

Summary

I merged all changes from main into the PR branch and re-applied the max_attempts → n_critic_runs rename, following the conflict resolution strategy: main gets priority, then re-apply the rename.

What was done

Merged origin/main (31 new commits) — no conflicts, auto-merged cleanly.
Re-applied the rename in 8 files that either came from main or were reverted by the merge:

File	Change
`benchmarks/swebenchmultilingual/run_infer.py`	New benchmark from main — renamed `args.max_attempts` → `args.n_critic_runs`, `max_attempts=` → `n_critic_runs=`
`benchmarks/swefficiency/run_infer.py`	New benchmark from main — same rename
`tests/test_iterative_resume.py`	`max_attempts=3` → `n_critic_runs=3` (2 occurrences reverted by merge)
`tests/test_workspace_cleanup.py`	`max_attempts=1` → `n_critic_runs=1` (3 occurrences reverted by merge)
`benchmarks/commit0/config.py`	Config dict key `"max_attempts"` → `"n_critic_runs"` + comment update
`benchmarks/swtbench/README.md`	`--max-attempts` → `--n-critic-runs` in CLI examples
`benchmarks/utils/evaluation.py`	Log string "max attempts" → "critic runs"
`benchmarks/utils/models.py`	Ruff auto-formatting fix

Checklist

✅ Main merged into branch (main gets priority)
✅ Rename re-applied to all new/reverted files
✅ No remaining max_attempts references (except unrelated retry logic in modal_patches.py)
✅ All pre-commit checks pass (ruff format, ruff lint, pycodestyle, pyright)
✅ All 157 tests pass
✅ Changes are concise — only the necessary renames, no extraneous modifications
✅ Pushed to origin/rename-max-attempts-to-n-critic-runs

juanmichelini mentioned this pull request Jan 16, 2026

clarify retry semantics (max_attempts vs max_retries) in evaluation #295

Open

Merge branch 'main' into rename-max-attempts-to-n-critic-runs

bf2f07c

juanmichelini mentioned this pull request Mar 6, 2026

Rename max_retries to max_attempts for clarity #482

Open

openhands-agent added 2 commits March 14, 2026 17:13

Merge remote-tracking branch 'origin/main' into rename-max-attempts-t…

0756724

…o-n-critic-runs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BREAKING: Rename --max-attempts to --n-critic-runs#325

BREAKING: Rename --max-attempts to --n-critic-runs#325
juanmichelini wants to merge 4 commits intomainfrom
rename-max-attempts-to-n-critic-runs

juanmichelini commented Jan 16, 2026

Uh oh!

openhands-ai bot commented Jan 16, 2026

Uh oh!

juanmichelini commented Mar 12, 2026

Uh oh!

juanmichelini commented Mar 12, 2026

Uh oh!

juanmichelini commented Mar 13, 2026

Uh oh!

juanmichelini commented Mar 14, 2026

Uh oh!

openhands-ai bot commented Mar 14, 2026

Uh oh!

openhands-ai bot commented Mar 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

juanmichelini commented Jan 16, 2026

Summary

Changes

Breaking Changes

Migration Required

Related PRs

Testing

Uh oh!

openhands-ai bot commented Jan 16, 2026

Uh oh!

juanmichelini commented Mar 12, 2026

Uh oh!

juanmichelini commented Mar 12, 2026

Uh oh!

juanmichelini commented Mar 13, 2026

Uh oh!

juanmichelini commented Mar 14, 2026

Uh oh!

openhands-ai bot commented Mar 14, 2026

Uh oh!

openhands-ai bot commented Mar 14, 2026

Summary

What was done

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants