Skip to content

fix: update remediation prompt to search ALL lockfiles across repository#108

Open
aivong-openhands wants to merge 2 commits intomainfrom
fix/search-all-lockfiles-across-repository
Open

fix: update remediation prompt to search ALL lockfiles across repository#108
aivong-openhands wants to merge 2 commits intomainfrom
fix/search-all-lockfiles-across-repository

Conversation

@aivong-openhands
Copy link
Contributor

@aivong-openhands aivong-openhands commented Mar 14, 2026

Summary

Port changes from OpenHands/vulnerability-fixer#44 to the vulnerability remediation plugin.

Problem

The remediation agent was only updating lockfiles in the root directory, missing subdirectory lockfiles (e.g., enterprise/poetry.lock in the OpenHands repository).

This was reported in OpenHands/OpenHands#13371 where the PR fixed CVE vulnerabilities but missed updating the enterprise poetry lock.

Root Cause

The original prompt told the agent to "Locate the dependency file" (singular), causing it to only look at root-level files.

Solution

Updated create_remediation_prompt() to explicitly instruct the agent to:

  1. Search the entire repository recursively for ALL dependency files using find commands
  2. Update every matching file that contains the vulnerable package
  3. Find ALL lockfiles in the repository, not just root-level ones
  4. Navigate to each lockfile's directory and regenerate it with the correct tool version

Performance Optimization

The find commands now exclude common dependency and build directories to avoid wasting time:

  • node_modules/ - npm/yarn/pnpm dependencies
  • .venv/, venv/ - Python virtual environments
  • vendor/ - Go/PHP vendor directories
  • .git/ - Git internal directory
  • __pycache__/ - Python bytecode cache
  • dist/, build/ - Build output directories

Changes

  • plugins/vulnerability-remediation/scripts/scan_and_remediate.py: Updated create_remediation_prompt() with recursive search and directory exclusions

Port changes from OpenHands/vulnerability-fixer#44

The remediation agent was only updating lockfiles in the root directory,
missing subdirectory lockfiles (e.g., enterprise/poetry.lock).

Changes:
- Add step to find ALL dependency files across the repository recursively
- Update lockfile regeneration to find ALL lockfiles, not just root-level
- Add find commands with exclusions for common dependency/build directories
- Update package manager instructions to cd to each lockfile's directory

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable - Core fix addresses a real problem, but the PR description is corrupted and needs evidence.

CRITICAL ISSUES:
Your PR description is garbled (see line starting with "4. **Navigate to each lockfile..."). Fix before merge.

KEY INSIGHT: The fix correctly shifts from singular "the dependency file" to recursive search of all dependency files. This is the right engineering approach for a real bug, but execution needs polish.

```bash
# Find ALL dependency files containing the package (excludes dependency/build directories)
find . \\( -name node_modules -o -name .venv -o -name venv -o -name vendor -o -name .git -o -name __pycache__ -o -name dist -o -name build \\) -prune -o -name "pyproject.toml" -print | xargs grep -l "{vuln.package_name}" 2>/dev/null
find . \\( -name node_modules -o -name .venv -o -name venv -o -name vendor -o -name .git \\) -prune -o -name "requirements*.txt" -print | xargs grep -l "{vuln.package_name}" 2>/dev/null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important - Robustness: These find + xargs + grep chains have an edge case. If find returns no results, xargs grep will hang waiting for stdin or fail depending on xargs implementation.

Better pattern:

find . \( -name node_modules ... \) -prune -o -name "pyproject.toml" -exec grep -l "{vuln.package_name}" {} + 2>/dev/null || true

The -exec ... + is more robust than piping to xargs, and || true prevents pipeline failures when no matches exist.

3. **CRITICAL: Sync/regenerate ALL lockfiles in the repository**
After updating the version in ALL manifest files, you MUST regenerate ALL corresponding lockfiles. Do NOT manually edit lockfiles.

**CRITICAL**: First, find ALL lockfiles (excluding dependency/build directories):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion - Pragmatism: This find command is cleaner since it doesn't need grep, but you're still building a long exclusion list. Consider if there's a simpler approach:

Could you leverage existing .gitignore patterns? Many repos already define what should be ignored. Alternatively, document why each exclusion is necessary so maintainers understand the tradeoffs.

Not blocking, but worth considering for the next iteration.

Comment on lines 207 to +212

- **Poetry (pyproject.toml + poetry.lock)**:
1. Extract version from lockfile header: `grep -m1 "^# This file is automatically @generated by Poetry" poetry.lock | sed 's/.*Poetry \\([0-9.]*\\).*/\\1/'`
2. If a version is found, install that specific version: `pipx install poetry==$POETRY_VERSION --force` or `pip install poetry==$POETRY_VERSION`
3. Verify installation: `poetry --version | grep "$POETRY_VERSION"` (proceed only if successful)
4. If version extraction fails or returns empty, proceed with the currently installed version and note this in your output
5. Run: `poetry lock --no-update` or `poetry update {vuln.package_name}`
1. `cd` to the directory containing the lockfile
2. Extract version: `grep -m1 "^# This file is automatically @generated by Poetry" poetry.lock | sed 's/.*Poetry \\([0-9.]*\\).*/\\1/'`
3. If a version is found, install it: `pipx install poetry==$POETRY_VERSION --force`
4. Verify installation: `poetry --version | grep "$POETRY_VERSION"` (proceed only if successful)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste: The explicit cd to directory approach eliminates the complexity of trying to run package managers from arbitrary locations. Simple and correct.

@@ -176,36 +176,57 @@ def create_remediation_prompt(vuln: Vulnerability, repo_name: str) -> str:
## Your Task

1. **Analyze** the vulnerability and understand what needs to be fixed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL - Missing Evidence: Your PR description claims to fix the subdirectory lockfile problem, but provides zero proof that this was tested.

Required before merge:

  1. Add an Evidence section to the PR description
  2. Show the actual output from running the remediation agent on a repo with subdirectory lockfiles (e.g., a test run showing both root and enterprise/poetry.lock being updated)
  3. Include the agent conversation URL if this was developed/tested using OpenHands

Test output alone doesn't count - I need to see the real agent execution with this updated prompt actually finding and updating subdirectory files.

Also: Your PR description is corrupted starting at item 4. Fix that garbled text.

Copy link
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable

The core insight is correct: shifting from singular "the dependency file" to recursive search across ALL dependency files fixes a real bug. However, there are blocking issues that must be addressed before merge.

VERDICT: ❌ Needs rework - Fix the robustness issues and add evidence before merge.

KEY INSIGHT: The fix correctly identifies that monorepos have lockfiles in subdirectories, but the shell command patterns used to implement the search are fragile and will fail in edge cases.

After updating the version in the manifest file, you MUST run the appropriate package manager sync command to update lockfiles with correct versions and hashes. Do NOT manually edit lockfiles.

**IMPORTANT**: To avoid unnecessary diff noise, you MUST detect and use the same tool version that originally generated the lockfile. Each lockfile contains a version header that indicates which tool version was used.
2. **Find ALL dependency files across the repository**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL - Missing Evidence: Your PR description claims this fixes subdirectory lockfiles (e.g., enterprise/poetry.lock in OpenHands), but you provide ZERO proof that you actually tested this.

Required before merge:

Add an Evidence section to the PR description showing:

  1. The exact command you ran to test the remediation agent
  2. Terminal output demonstrating that it found and updated a subdirectory lockfile (not just root-level)
  3. A link to the agent conversation if this work was done via OpenHands Cloud

Without evidence, we have no confidence this actually works. Test output alone does not count - show the real runtime behavior end-to-end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vulnerability fixer correctly regenerated the enterprise poetry lock in OpenHands/OpenHands#13405

3. Verify installation: `poetry --version | grep "$POETRY_VERSION"` (proceed only if successful)
4. If version extraction fails or returns empty, proceed with the currently installed version and note this in your output
5. Run: `poetry lock --no-update` or `poetry update {vuln.package_name}`
1. `cd` to the directory containing the lockfile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste: The explicit cd to each lockfile's directory is simple and correct. No need to deal with relative paths or figure out how to invoke package managers from arbitrary locations. This is the right engineering choice.


**CRITICAL**: First, find ALL lockfiles (excluding dependency/build directories):
```bash
find . \\( -name node_modules -o -name .venv -o -name venv -o -name vendor -o -name .git \\) -prune -o \\( -name "poetry.lock" -o -name "uv.lock" -o -name "package-lock.json" -o -name "yarn.lock" -o -name "pnpm-lock.yaml" -o -name "Cargo.lock" -o -name "go.sum" \\) -print
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion - Pragmatism: This exclusion list works, but it's incomplete and will grow over time. Consider:

  1. Could you leverage .gitignore patterns that repos already maintain?
  2. Is there a standard "ignore build/dependency directories" pattern you can reuse?

That said, the current approach is pragmatic and covers 95% of real cases. Don't over-engineer this unless you have evidence of repos where it fails. Perfect is the enemy of good enough.

Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants