
@robotlearning123 robotlearning123 commented Jan 23, 2026

Summary

Add agent-readiness CLI tool for scanning repository maturity against the Factory Agent Readiness model (10 pillars, 5 levels, 80% gating rule).

Security Improvements

  • Path traversal protection in profile loading and config files
  • ReDoS-safe regex checker using linear-time string scanning
  • LRU cache with size limits to prevent unbounded memory growth
  • Workspace pattern injection prevention
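A size-bounded LRU cache like the one listed above can be sketched in TypeScript using a `Map`, whose iteration order is insertion order. This is an illustrative sketch, not the PR's actual `lru-cache.ts` implementation; the class and method names are assumptions.

```typescript
// Minimal LRU sketch: a Map iterates in insertion order, so the first key
// is always the least recently used. Names are illustrative.
class LRUCacheSketch<K, V> {
  private map = new Map<K, V>();
  constructor(private maxSize: number) {}

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key) as V;
    // Re-insert to mark this key as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // Evict the least recently used entry (first in iteration order).
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
  }

  get size(): number {
    return this.map.size;
  }
}
```

The size cap is what prevents the unbounded memory growth the bullet refers to: every insert beyond `maxSize` evicts the coldest entry.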

Features

  • 33 checks across 10 pillars (docs, style, build, test, security, observability, env, ci, monorepo, task_discovery)
  • YAML-based extensible profile system
  • JSON and Markdown output formats
  • Monorepo detection and app-scope scanning
  • Init command with templates for missing files

Testing

  • 43 tests covering checks, engine, security, and LRU cache
  • Test fixtures for minimal, standard, and monorepo scenarios

Test plan

  • npm run check passes (typecheck, lint, format)
  • npm test passes (43/43 tests)
  • Security tests validate path traversal protection
  • LRU cache tests validate size limits and eviction

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added agent-readiness CLI tool for evaluating repository maturity across 10 pillars and 5 levels.
    • scan command analyzes projects and generates readiness reports in JSON and Markdown formats.
    • init command scaffolds readiness templates and configuration files.
    • Monorepo support with per-application evaluation.
  • Documentation

    • Comprehensive README, contributing guidelines, and agent instructions.
    • GitHub issue and pull request templates.
  • Tests

    • Full test suite for checks, engine, security, and fixtures.
  • Chores

    • Project configuration, ESLint, Prettier, TypeScript, Docker, and CI workflow setup.


coderabbitai bot commented Jan 23, 2026

Caution: Review failed. The pull request is closed.

📝 Walkthrough

A comprehensive new agent-readiness CLI tool is introduced within the agent-readiness directory, featuring a 10-pillar/5-level framework for assessing repository maturity. It includes a scanning engine with multiple check types, profile-based configuration, output formatting, extensive documentation, configuration templates, and test fixtures. Minimal refactoring updates are made to existing source files for path validation and streaming improvements.

Changes

Cohort / File(s) Summary
Agent Readiness Core Framework
agent-readiness/src/types.ts, agent-readiness/src/engine/level-gate.ts, agent-readiness/src/engine/context.ts, agent-readiness/src/engine/index.ts
Defines comprehensive type system (Level, Pillar, CheckConfig, ScanResult, ScanContext) and implements 80% passing-threshold level gating logic with per-pillar and overall score calculations; builds repository context with Git metadata and monorepo detection.
Check Implementations
agent-readiness/src/checks/index.ts, agent-readiness/src/checks/file-exists.ts, agent-readiness/src/checks/path-glob.ts, agent-readiness/src/checks/any-of.ts, agent-readiness/src/checks/github-*.ts, agent-readiness/src/checks/build-command.ts, agent-readiness/src/checks/log-framework.ts, agent-readiness/src/checks/dependency-detect.ts
Implements 8 check type dispatchers and handlers for file validation, glob patterns, composite checks, GitHub workflows/actions, build commands, logging frameworks, and dependency detection with result aggregation.
Scanning & Output
agent-readiness/src/scanner.ts, agent-readiness/src/output/json.ts, agent-readiness/src/output/markdown.ts
Orchestrates full scan pipeline with profile loading, check execution, and summary calculation; formats results as JSON (structured output) or Markdown (terminal with colors and badges).
CLI & Commands
agent-readiness/src/index.ts, agent-readiness/src/commands/scan.ts, agent-readiness/src/commands/init.ts
Defines CLI entry point using Commander with two commands: scan (repository assessment) and init (template generation with variable substitution).
Profiles & Templates
agent-readiness/src/profiles/index.ts, agent-readiness/src/templates/index.ts, agent-readiness/profiles/factory_compat.yaml, agent-readiness/templates/*
Loads YAML-based profiles (factory_compat with 10 pillars and 80+ checks) and provides template system for init command (dotenv, CODEOWNERS, GitHub workflows, issue templates, devcontainer).
Utilities
agent-readiness/src/utils/fs.ts, agent-readiness/src/utils/git.ts, agent-readiness/src/utils/yaml.ts, agent-readiness/src/utils/exec.ts, agent-readiness/src/utils/regex.ts, agent-readiness/src/utils/lru-cache.ts, agent-readiness/src/utils/index.ts
Provides filesystem operations (with caching), Git integration, YAML parsing/serialization, safe command execution, regex safety (ReDoS mitigation), and LRU caching infrastructure.
Configuration & Build
agent-readiness/.devcontainer/devcontainer.json, agent-readiness/.editorconfig, agent-readiness/.env.example, agent-readiness/.prettierrc, agent-readiness/eslint.config.js, agent-readiness/tsconfig.json, agent-readiness/docker-compose.yml, agent-readiness/package.json
Defines development environment (Docker, TypeScript, ESLint, Prettier) with Node 20 target, build configuration, and npm scripts (build, test, lint).
GitHub Integration
agent-readiness/.github/CODEOWNERS, agent-readiness/.github/workflows/ci.yml, agent-readiness/.github/ISSUE_TEMPLATE/*, agent-readiness/.github/PULL_REQUEST_TEMPLATE.md, agent-readiness/.gitignore, agent-readiness/.husky/pre-commit, agent-readiness/.prettierignore
Establishes code ownership, CI pipeline (lint, typecheck, test, build on Node 20.x and 22.x), issue/PR templates, and pre-commit hooks for linting.
Documentation
agent-readiness/README.md, agent-readiness/CONTRIBUTING.md, agent-readiness/AGENTS.md, agent-readiness/FACTORY_COMPARISON.md, agent-readiness/LICENSE
Provides comprehensive project documentation covering CLI usage, 10 pillars/5 levels model, contribution guidelines, AI agent instructions, Factory model comparison, and MIT license.
Tests & Fixtures
agent-readiness/test/checks.test.ts, agent-readiness/test/engine.test.ts, agent-readiness/test/security.test.ts, agent-readiness/test/fixtures/minimal-repo/*, agent-readiness/test/fixtures/standard-repo/*
Includes unit tests for checks, level-gate calculations, and security utilities; provides two test fixture repositories (minimal-repo for L1, standard-repo for L2+).
Source Code Refinements
src/agents/subagent.ts, src/providers/ollama.ts, src/sandbox/index.ts, src/tools/spawn.ts
Tracks natural completion in sub-agents; refactors Ollama streaming parser to process JSON lines via buffer; adds path validation for sandbox execution; imports sub-agent types externally instead of locally defining them.
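The 80% passing-threshold level gating described in the Core Framework cohort can be sketched as follows. The semantics here are assumed from this summary (a level is achieved when at least 80% of its checks pass and every lower level is also achieved); the actual `level-gate.ts` may differ in shape and names.

```typescript
// Sketch of sequential 80% level gating. Assumption: a level is achieved
// when >= 80% of its checks pass AND all lower levels were achieved.
type Level = 'L1' | 'L2' | 'L3' | 'L4' | 'L5';
const LEVELS: Level[] = ['L1', 'L2', 'L3', 'L4', 'L5'];

interface CheckResult {
  level: Level;
  passed: boolean;
}

function determineAchievedLevel(
  results: CheckResult[],
  threshold = 0.8
): Level | null {
  let achieved: Level | null = null;
  for (const level of LEVELS) {
    const atLevel = results.filter((r) => r.level === level);
    if (atLevel.length === 0) break; // no checks defined at this level
    const passRate = atLevel.filter((r) => r.passed).length / atLevel.length;
    if (passRate < threshold) break; // gate closed: stop climbing
    achieved = level;
  }
  return achieved;
}
```

The sequential `break` is the "gating" part: a repository cannot be L3 if it fails the L2 threshold, no matter how many L3 checks pass.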

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI / scan command
    participant Scanner as Scanner
    participant Profile as Profile Loader
    participant Context as Context Builder
    participant Checks as Check Dispatcher
    participant Output as Output Formatter

    CLI->>Scanner: scan(options)
    Scanner->>Profile: loadProfile(profileName)
    Profile-->>Scanner: Profile with checks
    Scanner->>Context: buildScanContext(rootPath)
    Context-->>Scanner: ScanContext (package.json, monorepo info)
    Scanner->>Checks: executeChecks(profile.checks, context)
    loop For each check by type
        Checks->>Checks: executeCheck(fileExists | pathGlob | etc.)
        Checks-->>Checks: CheckResult[]
    end
    Checks-->>Scanner: CheckResult[]
    Scanner->>Scanner: calculateLevelSummaries(results)
    Scanner->>Scanner: determineAchievedLevel(levelSummaries)
    Scanner->>Scanner: calculatePillarSummaries(results)
    Scanner-->>Scanner: ScanResult (level, score, pillars, action_items)
    Scanner->>Output: outputJson / outputMarkdown(result)
    Output-->>CLI: Formatted output (file or stdout)
sequenceDiagram
    participant CLI as CLI / init command
    participant InitCmd as Init Command
    participant TemplateLoader as Template Loader
    participant FileWriter as File Writer

    CLI->>InitCmd: initCommand(options: path, level?, check?, dryRun, force)
    InitCmd->>TemplateLoader: getTemplates()
    TemplateLoader-->>InitCmd: Template[] (content, targetPath)
    InitCmd->>InitCmd: filterTemplates(by level or check)
    alt dryRun mode
        InitCmd-->>CLI: List of files to create (no write)
    else Normal mode
        InitCmd->>InitCmd: substituteVariables(template.content, PROJECT_NAME, YEAR)
        InitCmd->>FileWriter: writeFile(targetPath, substituted content)
        FileWriter-->>CLI: Created files log
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A readiness checker hops into view,
With pillars and levels, both tried and true,
Ten facets of code, from docs to CI,
We scan every repo and help projects fly!
Templates and checks, all bundled with care,

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 72.45%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title clearly describes the main change (addition of agent-readiness v0.1.0 with security hardening), which aligns with the primary focus of the changeset.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 17

🤖 Fix all issues with AI agents
In `@agent-readiness/.github/CODEOWNERS`:
- Around line 5-18: The CODEOWNERS file contains the invalid placeholder string
"@{{GITHUB_USERNAME}}" which prevents GitHub from recognizing owners; replace
every occurrence of "@{{GITHUB_USERNAME}}" in the file with one or more real
GitHub usernames or team handles (for example "@robotlearning123" or
"@org/team-name") so each path rule (the wildcard, /docs/, *.md, /.github/,
/src/, /test/) has valid owner entries; ensure usernames include the leading "@"
and update all instances consistently.

In `@agent-readiness/AGENTS.md`:
- Around line 7-8: Replace the placeholder project description and directory
placeholders in AGENTS.md (the "agent-readiness" header and any other
occurrences at the noted locations) with a concise, final project summary and
clear directory explanations; specifically, update the "agent-readiness"
paragraph to a one- or two-sentence description of the project's purpose and
intended audience, replace the generic directory placeholders with actual
directory names and short purpose statements, and remove any remaining
"placeholder" wording so the file contains no unfinished text (check the other
reported occurrence around the same file and ensure consistency).
- Around line 42-46: The fenced code block in AGENTS.md (the directory tree
snippet around the lines showing "src/", "index.ts", and "[describe key
directories]") lacks a language tag; update the opening triple-backticks to
include a language (e.g., ```text) so the block becomes a language-specified
fenced code block and satisfies MD040/markdownlint.

In `@agent-readiness/CONTRIBUTING.md`:
- Line 54: Replace the placeholder-looking "[ESLint/Prettier]" text in
CONTRIBUTING.md with an explicit, unbracketed statement of the actual tools used
(e.g., "We use ESLint and Prettier for linting and formatting") or, if only one
is used, name that tool alone; update the line containing "[ESLint/Prettier]" so
it clearly lists the specific tooling without brackets.

In `@agent-readiness/profiles/factory_compat.yaml`:
- Around line 1-9: Update the profile header/description to state "10 Pillars"
instead of "9 Pillars" to match the rest of the file: modify the description
block for the profile with name "factory_compat" (version "1.0.0") so it
references the 10 Pillars model with 5 maturity levels (L1-L5) rather than the 9
Pillars model.
- Around line 178-186: The L1 rule build.package_json currently enforces only
package.json which makes the build gate Node-only; update the rule (id:
build.package_json and name: Package manifest) to accept multiple manifest types
by replacing the single file_exists path with an any_of that contains
file_exists checks for package.json, pyproject.toml, Cargo.toml (and other
supported manifests), update the description to reflect "one of these manifests
must exist", and keep required: true so the gate remains L1 but
language-agnostic.

In `@agent-readiness/README.md`:
- Around line 66-90: The fenced code blocks in README.md (the "Agent Readiness
Report" block starting with "Agent Readiness Report" and the repository tree
block starting with "agent-readiness/") must include a language specifier to
satisfy markdownlint; change their opening fences from ``` to ```text for each
occurrence (including the second block around the file tree), and update any
other fenced blocks mentioned (around lines ~66 and ~202) similarly so all
examples use ```text.

In `@agent-readiness/src/checks/file-exists.ts`:
- Around line 11-115: The check treats empty file contents as unreadable because
executeFileExists uses if (!content) after calling readFileCached; change this
to an explicit null/undefined check (e.g., if (content == null) or content ===
null) so empty string content ('') is allowed and still passed to safeRegexTest
for regex validation; update the null check around the content variable returned
by readFileCached in executeFileExists accordingly.

In `@agent-readiness/src/checks/github-workflow.ts`:
- Around line 126-163: checkForEvent currently rejects matches when
requiredBranches is provided for triggers that imply "all branches" (string,
array, or object with null/undefined event config). Update logic in
checkForEvent so that: when triggers is a string or Array and it
includes/matches the event, return true regardless of requiredBranches; when
eventConfig is null or eventConfig.branches is undefined (i.e., no branches
key), treat that as "all branches" and return true; only perform the
requiredBranches.every(...) check when eventConfig.branches is an explicit
array. Use the existing function name checkForEvent and the local symbols
triggers, eventConfig, and configBranches to locate and change the three
branches of logic (string case, Array.isArray case, and object/null case).
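The three-branch trigger semantics described in that comment can be sketched as below. The workflow-trigger shapes are simplified, and this is an interpretation of the suggested fix, not the PR's actual `github-workflow.ts` code: shorthand string or array triggers, and event configs without a `branches` key, all mean "all branches".

```typescript
// Sketch of the suggested checkForEvent semantics. Triggers mirror the
// three GitHub Actions `on:` forms: string, array, or object map.
type Triggers =
  | string
  | string[]
  | Record<string, { branches?: string[] } | null | undefined>;

function checkForEvent(
  triggers: Triggers,
  event: string,
  requiredBranches?: string[]
): boolean {
  if (typeof triggers === 'string') {
    return triggers === event; // shorthand form implies all branches
  }
  if (Array.isArray(triggers)) {
    return triggers.includes(event); // list form implies all branches
  }
  if (!(event in triggers)) return false;
  const eventConfig = triggers[event];
  const configBranches = eventConfig?.branches;
  if (!Array.isArray(configBranches)) {
    return true; // no explicit branches key => all branches match
  }
  if (!requiredBranches || requiredBranches.length === 0) return true;
  return requiredBranches.every((b) => configBranches.includes(b));
}
```

Only the last branch, where `branches` is an explicit array, actually filters against `requiredBranches`.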

In `@agent-readiness/src/output/markdown.ts`:
- Around line 70-94: The ASCII badge in printLevelBadge misaligns because
levelName and overall_score vary in length and chalk adds ANSI codes; fix by
measuring and padding the raw text (not chalk-wrapped) to the badge content
width before applying colorFn/chalk: implement a small helper (e.g.,
getVisibleLength or use strip-ansi) to compute visible length, pad the raw
strings with padEnd/padStart to the target inner width, then pass the padded raw
strings into colorFn and chalk.dim for consistent alignment; update references
in printLevelBadge (levelName, result.overall_score, colorFn) and ensure the
same inner width constant is used for the badge borders so all lines align.
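The pad-before-colorize idea in that comment can be sketched without pulling in strip-ansi: measure the visible length by stripping SGR escape sequences, pad the raw string, and only then apply the color function. The regex below covers common color codes and is an assumption, not the PR's helper.

```typescript
// Strip common ANSI SGR sequences (e.g. "\u001b[32m") to measure the
// width a terminal will actually render.
const ANSI_PATTERN = /\u001b\[[0-9;]*m/g;

function getVisibleLength(s: string): number {
  return s.replace(ANSI_PATTERN, '').length;
}

// Pad the raw text to the badge's inner width, then colorize, so ANSI
// codes never affect alignment of the badge borders.
function padThenColor(
  raw: string,
  width: number,
  colorFn: (s: string) => string
): string {
  return colorFn(raw.padEnd(width));
}
```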

In `@agent-readiness/src/utils/regex.ts`:
- Around line 18-54: isUnsafeRegex currently uses a single hasQuantifierInGroup
flag and a naive prevChar === '\\' escape check which misses nested-group
quantifiers (e.g., (a(b+))+) and misdetects escaped parens when there are
multiple backslashes; change it to track quantifiers per-group using a stack
(e.g., push a boolean on '(' and set the top when seeing +, *, or {…} inside
that group) and on ')' inspect the top-of-stack to decide if a following
quantifier makes it unsafe, then pop; also replace the single-char escape test
with a robust check that counts consecutive preceding backslashes to determine
if the current char is escaped (odd count = escaped), applying this when
encountering '(' , ')' and quantifier chars inside isUnsafeRegex.
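The backslash-counting escape test that comment asks for (to replace the naive `prevChar === '\\'` check) is small enough to sketch on its own; the group-quantifier stack would then call it when classifying `(`, `)`, and quantifier characters.

```typescript
// A character is escaped iff it is preceded by an ODD number of
// consecutive backslashes: "\(" is escaped, "\\(" is a literal backslash
// followed by an unescaped paren.
function isEscaped(pattern: string, index: number): boolean {
  let backslashes = 0;
  for (let i = index - 1; i >= 0 && pattern[i] === '\\'; i--) {
    backslashes++;
  }
  return backslashes % 2 === 1;
}
```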

In `@agent-readiness/templates/AGENTS.md`:
- Around line 42-46: Add a language tag to the fenced code block that shows the
directory tree (the triple-backtick block containing "src/" and "├── index.ts")
by changing the opening delimiter from ``` to ```text so the block is explicitly
marked as plain text for markdownlint and improved readability.

In `@agent-readiness/test/fixtures/standard-repo/README.md`:
- Around line 11-22: The README's Usage section references a non-existent npm
script `npm run dev`; update the README (README.md) to reference an existing
script (e.g., `npm run test`) or add a `dev` script to package.json. Locate the
Usage block in README.md and replace `npm run dev` with `npm run test` and
change the step "Run the development server" to "Run the test suite", or
alternatively add a `"dev": "<command>"` entry to the scripts object in
package.json so `npm run dev` is valid (ensure the script name matches existing
CI/dev workflow).

In `@src/agents/subagent.ts`:
- Around line 148-153: The success boolean is currently computed as turn <
maxTurns which marks a completion on exactly maxTurns as failure; change it to
detect whether the loop actually terminated because the agent hit the turn limit
versus because it finished without invoking a tool on the final turn. Track
whether a tool was invoked on the last iteration (e.g., a local flag like
usedToolThisTurn or compare toolsUsed size before/after the final loop) and
compute success as !(turn >= maxTurns && usedToolThisTurn) or equivalently
success = turn < maxTurns || (turn === maxTurns && !usedToolThisTurn), then
return success along with lastOutput, turn, and Array.from(toolsUsed).
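The suggested success computation reduces to a small boolean expression; this sketch uses illustrative names, and assumes `turn` never exceeds `maxTurns`.

```typescript
// Hitting the turn limit is only a failure if the agent still wanted to
// act (invoked a tool) on the final turn; finishing naturally on exactly
// the last turn counts as success.
function computeSuccess(
  turn: number,
  maxTurns: number,
  usedToolOnLastTurn: boolean
): boolean {
  return turn < maxTurns || !usedToolOnLastTurn;
}
```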

In `@src/providers/ollama.ts`:
- Around line 105-156: The stream parser currently splits each decoded read by
'\n' and discards partial JSON lines when a chunk boundary splits a JSON object;
fix by introducing a persistent buffer string (e.g., leftoverBuffer) outside the
read loop and for each read append decoder.decode(value, {stream:true}) to it,
then split leftoverBuffer by '\n', keep the last entry as the new leftoverBuffer
if it does not end with a newline (only parse complete lines), parse each
complete line into OllamaStreamChunk and proceed to update fullText, toolCalls,
toolCallIndex and call onChunk as before, and after the reader loop finishes
parse any remaining leftoverBuffer if non-empty (or handle as incomplete JSON
with a safe try/catch) so no valid JSON lines are lost.
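The buffered newline-delimited-JSON parsing that comment describes can be sketched independently of the Ollama stream types: keep a persistent leftover buffer across chunks, parse only complete lines, and flush the remainder at end of stream. The factory shape below is illustrative.

```typescript
// Sketch of buffered NDJSON parsing: a persistent leftover buffer means
// a JSON object split across chunk boundaries is never lost.
function createLineParser(onLine: (obj: unknown) => void) {
  let leftoverBuffer = '';
  return {
    push(chunk: string): void {
      leftoverBuffer += chunk;
      const lines = leftoverBuffer.split('\n');
      // The last entry is incomplete unless the chunk ended with '\n';
      // keep it for the next push.
      leftoverBuffer = lines.pop() ?? '';
      for (const line of lines) {
        if (line.trim()) onLine(JSON.parse(line));
      }
    },
    flush(): void {
      // After the reader loop, parse any remaining complete JSON.
      if (leftoverBuffer.trim()) onLine(JSON.parse(leftoverBuffer));
      leftoverBuffer = '';
    },
  };
}
```

In the real provider, `push` would receive `decoder.decode(value, { stream: true })` inside the read loop, and `flush` would run (with a try/catch around the final parse) after the reader finishes.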

In `@src/sandbox/index.ts`:
- Around line 165-218: The runWithTimeout function can call resolve twice if
both proc 'error' and 'close' fire; add a local boolean (e.g., settled)
initialized false and in both the 'close' and 'error' handlers check if settled
is true then return, otherwise set settled = true before clearing the timeout
and calling resolve; also clear/remove other listeners as needed to avoid any
further callbacks (or use once semantics) so resolve is guaranteed to run
exactly once.
- Around line 144-162: runDirect currently forwards cwd to runWithTimeout
without validation; call the existing validateCwd helper before using cwd (e.g.,
const safeCwd = validateCwd(cwd)) and pass safeCwd to runWithTimeout (or let
validateCwd throw on invalid input), and ensure validateCwd is
imported/available in this file; update runDirect to use safeCwd so direct
execution enforces the same cwd checks as runInDocker/runInFirejail.
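The once-only resolution guard suggested for runWithTimeout can be sketched as below; the event wiring is abstracted behind an `attach` callback since this is an illustration of the `settled` pattern, not the PR's actual sandbox code.

```typescript
// Both the 'close' and 'error' paths funnel through settle(), which is
// guaranteed to run at most once and clears the timeout on the way out.
function runWithTimeoutSketch(
  attach: (
    onClose: (code: number) => void,
    onError: (err: Error) => void
  ) => void,
  timeoutMs: number
): Promise<{ code: number | null; error?: Error }> {
  return new Promise((resolve) => {
    let settled = false;
    const timer = setTimeout(() => settle({ code: null }), timeoutMs);
    function settle(result: { code: number | null; error?: Error }): void {
      if (settled) return; // later events are ignored
      settled = true;
      clearTimeout(timer);
      resolve(result);
    }
    attach(
      (code) => settle({ code }),
      (error) => settle({ code: null, error })
    );
  });
}
```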
🧹 Nitpick comments (18)
agent-readiness/src/utils/exec.ts (1)

27-29: Consider capturing stderr for debugging purposes.

The error is silently swallowed, which may make debugging difficult when commands fail unexpectedly. Consider optionally capturing and returning stderr or the error message.

♻️ Optional enhancement to capture error details
 export interface ExecResult {
   stdout: string;
   success: boolean;
+  stderr?: string;
 }

 export function execSafe(
   command: string,
   args: string[],
   options?: ExecFileSyncOptions
 ): ExecResult {
   try {
     const stdout = execFileSync(command, args, {
       ...options,
       encoding: 'utf-8',
       stdio: ['pipe', 'pipe', 'pipe'],
     }) as string;
     return { stdout: stdout.trim(), success: true };
-  } catch {
-    return { stdout: '', success: false };
+  } catch (error) {
+    const stderr = error instanceof Error && 'stderr' in error 
+      ? String((error as { stderr: unknown }).stderr).trim() 
+      : '';
+    return { stdout: '', success: false, stderr };
   }
 }
agent-readiness/.github/workflows/ci.yml (1)

76-98: Consider adding artifact upload for the build output.

The build job verifies the output exists but doesn't upload it as an artifact. This could be useful for release workflows or debugging.

♻️ Optional: Upload build artifact
       - name: Verify build output
         run: |
           test -f dist/index.js
           echo "Build successful!"
+
+      - name: Upload build artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: dist
+          path: dist/
+          retention-days: 7
agent-readiness/src/utils/lru-cache.ts (1)

77-100: Potential issue with undefined values in cache.

Both getOrCompute and getOrComputeSync treat undefined as a cache miss. If V could legitimately be undefined, these methods will recompute on every access. For the current use cases (file contents, glob results), this is likely fine, but worth noting for future use.

♻️ Alternative using `has()` check
   async getOrCompute(key: K, compute: () => Promise<V>): Promise<V> {
-    const existing = this.get(key);
-    if (existing !== undefined) {
-      return existing;
+    if (this.has(key)) {
+      return this.get(key) as V;
     }

     const value = await compute();
     this.set(key, value);
     return value;
   }

   getOrComputeSync(key: K, compute: () => V): V {
-    const existing = this.get(key);
-    if (existing !== undefined) {
-      return existing;
+    if (this.has(key)) {
+      return this.get(key) as V;
     }

     const value = compute();
     this.set(key, value);
     return value;
   }
src/agents/subagent.ts (1)

110-146: Consider adding error handling for provider.chat failures.

If provider.chat throws (network error, rate limit, etc.), the entire runSubAgent call will reject with an unhandled exception. Consider wrapping the loop in try-catch to return a graceful failure result.

Suggested improvement
+  try {
     while (turn < maxTurns) {
       turn++;
       const response = await provider.chat(messages, tools, () => {}, definition.systemPrompt);
       // ... rest of loop
     }
+  } catch (error) {
+    return {
+      success: false,
+      output: `Sub-agent error: ${error instanceof Error ? error.message : String(error)}`,
+      turns: turn,
+      toolsUsed: Array.from(toolsUsed),
+    };
+  }
src/tools/spawn.ts (1)

49-63: Validation order: type cast precedes validation.

The code casts args.type to SubAgentType on line 50 before validating it on lines 59-63. While this works functionally (validation happens before use), it's cleaner to validate first. The current approach is acceptable but consider using as string initially and casting only after validation.

src/providers/index.ts (1)

36-40: Semantic confusion: apiKey repurposed as base URL for Ollama.

Using apiKey as the base URL for Ollama is semantically confusing. The codebase already defines ProviderConfig.baseURL specifically for custom API endpoints. Refactor to use the proper configuration field:

Alternative using existing ProviderConfig.baseURL
     case 'ollama': {
       const { OllamaProvider } = await import('./ollama.js');
-      // apiKey is used as base URL for Ollama (optional, defaults to localhost:11434)
-      return new OllamaProvider(model, apiKey || 'http://localhost:11434');
+      const baseUrl = config.providers?.ollama?.baseURL || 'http://localhost:11434';
+      return new OllamaProvider(model, baseUrl);
     }
agent-readiness/src/commands/scan.ts (1)

64-66: Consider reusing LEVELS constant from types.

The valid levels array is duplicated here. Importing LEVELS from types.ts would ensure consistency if the level definitions change.

♻️ Suggested refactor
-import type { ScanOptions, Level } from '../types.js';
+import type { ScanOptions, Level } from '../types.js';
+import { LEVELS } from '../types.js';
 ...
 function isValidLevel(level: string): level is Level {
-  return ['L1', 'L2', 'L3', 'L4', 'L5'].includes(level);
+  return LEVELS.includes(level as Level);
 }
agent-readiness/test/security.test.ts (1)

198-214: Test doesn't exercise concurrent access; consider adding a true concurrency test.

The current test calls getValue() sequentially with await, so both calls complete one at a time. This verifies basic caching but doesn't test what happens when two calls race before the first one populates the cache.

Looking at LRUCache.getOrCompute in agent-readiness/src/utils/lru-cache.ts (lines 74-82), concurrent calls can both miss the cache check and execute compute() multiple times since there's no deduplication mechanism (e.g., pending promises map).

If this is acceptable behavior, consider documenting it. Otherwise, consider adding a test that exercises true concurrency:

🧪 Suggested concurrent test
it('should handle concurrent getOrCompute calls', async () => {
  const cache = new LRUCache<string, string>(3);
  let computeCount = 0;

  const compute = async () => {
    computeCount++;
    await new Promise((r) => setTimeout(r, 10)); // Simulate async work
    return 'computed';
  };

  // Start both calls concurrently (no await between)
  const [value1, value2] = await Promise.all([
    cache.getOrCompute('key', compute),
    cache.getOrCompute('key', compute),
  ]);

  assert.strictEqual(value1, 'computed');
  assert.strictEqual(value2, 'computed');
  // Note: computeCount may be 2 without deduplication
});
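If deduplication were wanted rather than merely documented, the "pending promises map" the comment alludes to looks roughly like this; the class name and shape are illustrative, not a proposed patch to the PR's LRUCache.

```typescript
// Sketch of in-flight deduplication: concurrent misses on the same key
// share a single compute() call via a pending-promises map.
class DedupCache<K, V> {
  private values = new Map<K, V>();
  private pending = new Map<K, Promise<V>>();

  async getOrCompute(key: K, compute: () => Promise<V>): Promise<V> {
    if (this.values.has(key)) return this.values.get(key) as V;
    const inFlight = this.pending.get(key);
    if (inFlight) return inFlight; // later callers join the first call
    const p = compute().then((v) => {
      this.values.set(key, v);
      this.pending.delete(key);
      return v;
    });
    this.pending.set(key, p);
    return p;
  }
}
```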
agent-readiness/src/commands/init.ts (1)

55-62: Consider validating template.targetPath against path traversal.

While templates are loaded from bundled definitions, applying safePath validation (from ../utils/fs.js) to template.targetPath would provide defense-in-depth, especially if custom/external profiles are ever supported.

🛡️ Suggested defensive validation
+import { directoryExists, fileExists, writeFile, readFile, safePath } from '../utils/fs.js';

 for (const template of templatesNeeded) {
-  const targetPath = path.join(options.path, template.targetPath);
+  const targetPath = safePath(template.targetPath, options.path);
+  if (!targetPath) {
+    console.warn(chalk.yellow(`Skipping template with invalid path: ${template.targetPath}`));
+    continue;
+  }
   const exists = await fileExists(targetPath);
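The `safePath` helper referenced in that suggestion is not shown in this review; a minimal sketch under the assumed `(relative, root) => resolved | null` signature would resolve the candidate against the root and reject anything that escapes it.

```typescript
import * as path from 'node:path';

// Sketch of a safePath(relative, root) helper: returns the resolved
// absolute path if it stays inside root, or null on traversal/absolute
// escapes. The PR's real implementation may differ.
function safePath(relative: string, root: string): string | null {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, relative);
  const rel = path.relative(resolvedRoot, resolved);
  // Outside the root iff the relative path climbs out or is absolute.
  if (rel.startsWith('..') || path.isAbsolute(rel)) return null;
  return resolved;
}
```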
agent-readiness/test/engine.test.ts (1)

16-28: Consider removing unused makeCheck helper.

The makeCheck helper is defined but never actually used - the checks arrays are created but immediately voided with void checks. The tests correctly use only makeResult since calculateLevelSummaries derives all needed info from results.

Removing makeCheck and the unused checks arrays would reduce noise.

♻️ Suggested cleanup
-// Helper to create check configs
-function makeCheck(id: string, level: Level, required: boolean): CheckConfig {
-  return {
-    id,
-    name: id,
-    description: 'Test',
-    type: 'file_exists',
-    pillar: 'docs',
-    level,
-    required,
-    path: 'test.md',
-  } as CheckConfig;
-}
-
 // Helper to create check results

And remove the checks arrays and void checks statements in each test.

agent-readiness/src/checks/build-command.ts (1)

107-114: Reduce false positives in pyproject.toml detection.

cmd = anywhere in the file can match non-script keys (e.g., tool config). Consider restricting to [project.scripts] / [tool.poetry.scripts] sections.

♻️ Possible refinement
 } else if (filename === 'pyproject.toml') {
-  // Look for scripts section or tool.poetry.scripts (escape command)
-  for (const cmd of commandsToFind) {
-    const scriptRegex = new RegExp(`${escapeRegex(cmd)}\\s*=`, 'm');
-    if (scriptRegex.test(content)) {
-      found.push(cmd);
-    }
-  }
+  // Look only inside known scripts sections
+  const lines = content.split(/\r?\n/);
+  let inScripts = false;
+  for (const line of lines) {
+    const section = line.match(/^\s*\[([^\]]+)\]\s*$/)?.[1]?.trim();
+    if (section) {
+      inScripts = section === 'project.scripts' || section === 'tool.poetry.scripts';
+      continue;
+    }
+    if (!inScripts) continue;
+    for (const cmd of commandsToFind) {
+      const scriptRegex = new RegExp(`^\\s*${escapeRegex(cmd)}\\s*=`);
+      if (scriptRegex.test(line)) {
+        found.push(cmd);
+      }
+    }
+  }
 }
agent-readiness/src/utils/yaml.ts (1)

102-111: Validate scalar types before building the base check.

required, weight, and tags are accepted without type checks, so invalid YAML (e.g., "required": "yes") can slip through and break downstream logic.

♻️ Suggested guardrails
   // Validate level is a known value
   if (!LEVELS.includes(raw.level as Level)) {
     throw new Error(
       `Check '${raw.id}' has invalid level '${raw.level}'. Valid levels: ${LEVELS.join(', ')}`
     );
   }
+
+  if (raw.required !== undefined && typeof raw.required !== 'boolean') {
+    throw new Error(`Check '${raw.id}' 'required' must be a boolean`);
+  }
+  if (raw.weight !== undefined && typeof raw.weight !== 'number') {
+    throw new Error(`Check '${raw.id}' 'weight' must be a number`);
+  }
+  if (raw.tags !== undefined) {
+    if (!Array.isArray(raw.tags) || !raw.tags.every((t) => typeof t === 'string')) {
+      throw new Error(`Check '${raw.id}' 'tags' must be an array of strings`);
+    }
+  }
agent-readiness/src/checks/log-framework.ts (1)

75-112: The 100‑file scan cap can hide frameworks in large repos.

Consider surfacing the cap in results (or making it configurable) so users understand potential false negatives.

♻️ Example transparency tweak
-  // Search source files for framework imports
-  const sourcePatterns = ['**/*.ts', '**/*.js', '**/*.py', '**/*.go', '**/*.java', '**/*.rs'];
+  // Search source files for framework imports
+  const sourcePatterns = ['**/*.ts', '**/*.js', '**/*.py', '**/*.go', '**/*.java', '**/*.rs'];
+  const MAX_FILES_PER_PATTERN = 100;
@@
-    const filesToCheck = files.slice(0, 100);
+    const filesToCheck = files.slice(0, MAX_FILES_PER_PATTERN);
@@
     details: {
       searched_for: check.frameworks,
+      files_scanned_limit: MAX_FILES_PER_PATTERN,
     },

Also applies to: 131-147

agent-readiness/src/utils/regex.ts (1)

99-125: The “timeout protection” claim isn’t implemented.

This function only caps input length; there’s no actual time-based cutoff. Either implement a real timeout (worker/RE2) or adjust the docstring so callers aren’t misled.

📎 Suggested doc clarification
- * Test content against a regex pattern with timeout protection
- * Returns false if pattern is invalid/unsafe or times out
+ * Test content against a regex pattern with size-limited input
+ * Returns false if pattern is invalid/unsafe
agent-readiness/src/checks/index.ts (1)

20-88: Consider a single source of truth for supported check types.

The `switch` and `getSupportedCheckTypes()` can drift. A map of `{ type: executor }` lets you derive the supported list from `Object.keys()`.

agent-readiness/src/templates/index.ts (1)

133-135: Return a defensive copy from `listTemplates()`.

Exposing the internal array lets callers mutate template definitions at runtime. Returning a shallow copy avoids accidental mutations.

🧩 Suggested tweak
 export function listTemplates(): Array<Omit<Template, 'content'>> {
-  return TEMPLATE_DEFS;
+  return TEMPLATE_DEFS.map((t) => ({ ...t }));
 }
agent-readiness/src/engine/level-gate.ts (1)

132-158: Minor inconsistency in empty score handling.

Empty pillars return a score of 100 (line 141), while empty levels return a score of 0 (line 32). This inconsistency may be intentional (pillars without checks shouldn't penalize the score), but consider documenting the rationale or aligning the behavior.

src/sandbox/index.ts (1)

106-107: Consider pinning the Docker image version.

Using `alpine:latest` may pull different versions over time, potentially affecting reproducibility or introducing unexpected behavior. Consider pinning to a specific version like `alpine:3.19`.

🔧 Suggested fix
-  dockerArgs.push('alpine:latest', 'sh', '-c', command);
+  dockerArgs.push('alpine:3.19', 'sh', '-c', command);

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@agent-readiness/package.json`:
- Around line 18-19: The test scripts "test" and "test:watch" use a glob
"test/*.test.ts" that only matches top-level files; update both scripts in
package.json ("test" and "test:watch") to use the recursive glob
"test/**/*.test.ts" so nested tests (e.g.,
test/fixtures/standard-repo/src/index.test.ts) are discovered by the test
runner.

In `@agent-readiness/README.md`:
- Around line 144-155: Add a new row for the missing check type
`dependency_detect` to the "Check Types" table under the "Check Types" heading:
describe it concisely (e.g., "Detects presence and optional version constraints
of declared dependencies in package manifests or lockfiles") and place it
alongside the other entries (`file_exists`, `path_glob`, `any_of`, etc.) so the
README explicitly documents that the profile uses `dependency_detect`.

In `@agent-readiness/test/fixtures/standard-repo/README.md`:
- Around line 23-25: The README.md in the standard-repo test fixture references
a non-existent CONTRIBUTING.md; fix by either adding a minimal CONTRIBUTING.md
to the fixture with contribution guidelines or removing/updating the reference
in README.md (search for the "Contributing" heading in README.md and either
replace the link with existing documentation or create CONTRIBUTING.md alongside
it so the reference resolves).

In `@src/providers/ollama.ts`:
- Around line 115-127: The current loop stores tool calls into the toolCalls Map
keyed by sequential toolCallIndex and overwrites arguments per chunk; change it
to key by tc.id (use tc.id || generate a stable id from toolCallIndex but do not
use toolCallIndex as the primary key) and, when an entry already exists for that
id in toolCalls, append/merge the incoming tc.function.arguments (concatenate
fragments or merge JSON strings) into the existing entry's arguments instead of
replacing them; update references to toolCallIndex only for generating fallback
ids and ensure toolCalls.set uses the id as the Map key and
preserves/accumulates prior data for that id.

In `@src/sandbox/index.ts`:
- Around line 76-107: runInDocker currently ignores SandboxOptions.env; update
it to forward environment variables into the container by iterating over
options.env (or treat it as a Record/string[] per type) and appending '-e',
'KEY=VALUE' entries to dockerArgs before adding the image/command. Preserve
existing behavior when env is undefined and ensure values are stringified; keep
validateCwd logic and then call runWithTimeout('docker', dockerArgs, timeout,
'docker') as before so the container receives the intended environment.
- Around line 109-131: The Firejail backend in runInFirejail ignores the env
field from SandboxOptions; update runInFirejail to capture options.env and pass
it into the call to runWithTimeout so the child inherits the intended
environment (e.g., const { timeout = 30000, network = false, cwd, env } =
options; and then call runWithTimeout('firejail', firejailArgs, timeout,
'firejail', { env }) ). Also persist the validated cwd as safeCwd (or
restructure) so you keep existing whitelist behavior while forwarding env to
runWithTimeout.
🧹 Nitpick comments (5)
agent-readiness/.devcontainer/devcontainer.json (1)

7-7: Prefer npm ci for deterministic installs in devcontainers.

If a lockfile exists, `npm ci` is faster and reproducible, reducing "works on my machine" drift.

♻️ Proposed change
-  "postCreateCommand": "npm install",
+  "postCreateCommand": "npm ci",
agent-readiness/src/utils/git.ts (1)

40-46: Normalize git output to avoid trailing newlines.

git output typically includes a trailing newline; trimming makes JSON/Markdown outputs cleaner and prevents subtle formatting issues. Consider applying the same normalization to other `gitExec` consumers (branch/remote) as well.

♻️ Proposed fix
 export function getCommitSha(repoPath: string): string {
   if (!hasGitDir(repoPath)) {
     return 'unknown';
   }
   const result = gitExec(['rev-parse', 'HEAD'], repoPath);
-  return result.success ? result.stdout : 'unknown';
+  return result.success ? (result.stdout.trim() || 'unknown') : 'unknown';
 }
src/sandbox/index.ts (1)

222-238: Broad catch may mask execution failures and cause unexpected backend fallback.

The generic catch around each backend will catch any error, including actual command execution failures. If Docker is available but the command fails due to timeout or resource limits, it silently falls through to Firejail, potentially re-running the command.

Consider catching only backend-unavailability errors (e.g., Docker daemon not running) rather than all errors.

♻️ Suggested approach
   if (_hasDocker) {
     try {
       return await runInDocker(command, options);
-    } catch {
-      // Fall through to next option
+    } catch (e) {
+      // Only fall through if Docker itself failed to start
+      // Don't retry if the command ran but failed
+      if (!(e instanceof Error && e.message.includes('Cannot connect to the Docker daemon'))) {
+        throw e;
+      }
     }
   }

Alternatively, check Docker availability more robustly (e.g., docker info) before attempting to run.
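A sketch of that pre-check, using standard Node `child_process` APIs; the surrounding backend-selection code is assumed, not the file's actual structure:

```typescript
// Sketch: probe the Docker daemon before selecting the backend.
// `docker info` exits non-zero when the daemon is unreachable,
// unlike `docker --version`, which only proves the CLI exists.
import { spawnSync } from 'node:child_process';

function dockerDaemonAvailable(): boolean {
  const result = spawnSync('docker', ['info'], { timeout: 5000 });
  // A missing binary sets result.error and leaves status null,
  // which also reports the backend as unavailable.
  return result.status === 0;
}
```

Gating on this check means a subsequent Docker failure reflects the command itself and need not trigger a Firejail retry.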

src/providers/ollama.ts (1)

75-87: Add a request timeout/abort to avoid hanging CLI calls.

Network stalls will currently block indefinitely. Consider an `AbortController` with a reasonable timeout, and clear it in a `finally` block to avoid leaks.

♻️ Proposed refactor (timeout + abort)
-    const response = await fetch(`${this.baseUrl}/api/chat`, {
+    const controller = new AbortController();
+    const timeout = setTimeout(() => controller.abort(), 120_000);
+    let response: Response;
+    try {
+      response = await fetch(`${this.baseUrl}/api/chat`, {
         method: 'POST',
         headers: { 'Content-Type': 'application/json' },
         body: JSON.stringify({
           model: this.model,
           messages: ollamaMessages,
           tools: ollamaTools.length > 0 ? ollamaTools : undefined,
           stream: true,
           options: {
             num_predict: this.maxTokens,
           },
         }),
+        signal: controller.signal,
       });
+    } finally {
+      clearTimeout(timeout);
+    }
src/agents/subagent.ts (1)

130-140: Consider using a more robust error detection mechanism.

The error detection at line 138 relies on string prefix matching (`result.startsWith('Error:')`), which couples tightly to `executeTool`'s error message format. If that format changes, errors may go undetected.

💡 Suggested improvement

Consider having executeTool return a structured result with an explicit error flag, or catch exceptions during execution:

     for (const toolCall of response.toolCalls) {
       toolsUsed.add(toolCall.name);
-      const result = await executeTool(toolCall.name, toolCall.arguments);
-      results.push({
-        toolCallId: toolCall.id,
-        content: result,
-        isError: result.startsWith('Error:'),
-      });
+      try {
+        const result = await executeTool(toolCall.name, toolCall.arguments);
+        results.push({
+          toolCallId: toolCall.id,
+          content: result,
+          isError: result.startsWith('Error:'),
+        });
+      } catch (err) {
+        results.push({
+          toolCallId: toolCall.id,
+          content: `Error: ${err instanceof Error ? err.message : String(err)}`,
+          isError: true,
+        });
+      }
     }

Comment on lines +18 to +19
"test": "tsx --test test/*.test.ts",
"test:watch": "tsx --test --watch test/*.test.ts",

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find all test files in agent-readiness to see their actual locations
echo "=== All *.test.ts files in agent-readiness ==="
fd -e test.ts . agent-readiness 2>/dev/null || find agent-readiness -name '*.test.ts' -type f

echo -e "\n=== Directory structure of agent-readiness ==="
tree -d -L 3 agent-readiness 2>/dev/null || find agent-readiness -type d | head -20

echo -e "\n=== Content of agent-readiness package.json (lines 15-25) ==="
if [ -f agent-readiness/package.json ]; then
  sed -n '15,25p' agent-readiness/package.json
fi

Repository: robotlearning123/nano-opencode

Length of output: 1122


Fix test glob to include nested test files.

The test script glob `test/*.test.ts` only matches top-level test files and will miss `test/fixtures/standard-repo/src/index.test.ts`. Update to `test/**/*.test.ts` to discover all tests:

-    "test": "tsx --test test/*.test.ts",
-    "test:watch": "tsx --test --watch test/*.test.ts",
+    "test": "tsx --test test/**/*.test.ts",
+    "test:watch": "tsx --test --watch test/**/*.test.ts",

Comment on lines +144 to +155
## Check Types

| Type | Description |
|------|-------------|
| `file_exists` | File presence + optional content regex |
| `path_glob` | Glob pattern with min/max matches |
| `any_of` | Composite OR check (pass if any child passes) |
| `github_workflow_event` | CI triggers on specific events |
| `github_action_present` | Specific GitHub Action used |
| `build_command_detect` | Build/test commands in package.json/Makefile |
| `log_framework_detect` | Logging library detection |


⚠️ Potential issue | 🟡 Minor

Document `dependency_detect` in the Check Types table.
The profile uses this check type, but it's missing from the list of supported types.

✏️ Suggested doc update
 | `build_command_detect` | Build/test commands in package.json/Makefile |
 | `log_framework_detect` | Logging library detection |
+| `dependency_detect` | Dependency presence in manifests/config files |

Comment on lines +23 to +25
## Contributing

See CONTRIBUTING.md for guidelines.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if CONTRIBUTING.md exists in the standard-repo fixture directory
fd -t f '^CONTRIBUTING\.md$' agent-readiness/test/fixtures/standard-repo/

Repository: robotlearning123/nano-opencode

Length of output: 56


🏁 Script executed:

#!/bin/bash
# List contents of the standard-repo fixture
ls -la agent-readiness/test/fixtures/standard-repo/

# Also check if there are any test files that describe this fixture's purpose
find agent-readiness/test/fixtures -name "*.md" -o -name "*.txt" | head -20

Repository: robotlearning123/nano-opencode

Length of output: 701


🏁 Script executed:

#!/bin/bash
# Read the README.md from standard-repo fixture
cat -n agent-readiness/test/fixtures/standard-repo/README.md

# Search for test files that use this fixture
find agent-readiness/test -name "*.test.*" -o -name "*.spec.*" | head -10

Repository: robotlearning123/nano-opencode

Length of output: 722


🏁 Script executed:

#!/bin/bash
# Check test files for references to CONTRIBUTING.md or broken link detection
rg -i "contributing|broken.*link|documentation.*reference" agent-readiness/test/ -A 2 -B 2

# Also search for what the checks actually verify
rg -i "standard-repo\|L2\+" agent-readiness/test/checks.test.ts -A 3 -B 3

Repository: robotlearning123/nano-opencode

Length of output: 419


🏁 Script executed:

#!/bin/bash
# Read the checks.test.ts file to understand what it tests
head -100 agent-readiness/test/checks.test.ts

# Search for any setup or fixture definition
rg -i "standard-repo|describe|it.*should|expect" agent-readiness/test/checks.test.ts | head -30

Repository: robotlearning123/nano-opencode

Length of output: 3563


🏁 Script executed:

#!/bin/bash
# Search all test files for any assertions about CONTRIBUTING.md
rg "CONTRIBUTING" agent-readiness/test/ 

# Check if there's any documentation about the fixture
find agent-readiness/test/fixtures -name "README*" -o -name "SETUP*" -o -name "*.md"

Repository: robotlearning123/nano-opencode

Length of output: 270


Remove or fix the broken `CONTRIBUTING.md` reference in the fixture.

The file `CONTRIBUTING.md` does not exist in the standard-repo test fixture, but line 25 of `README.md` references it. Since no tests verify this broken link and the fixture is meant to simulate a realistic repository structure, either create the file or remove the reference.


Comment on lines +115 to +127
if (chunk.message?.tool_calls) {
for (const tc of chunk.message.tool_calls) {
if (tc.function?.name) {
const id = tc.id || `call_${toolCallIndex}`;
toolCalls.set(toolCallIndex, {
id,
name: tc.function.name,
arguments: tc.function.arguments || '{}',
});
toolCallIndex++;
}
}
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Ollama streaming tool_calls arguments partial chunks delta API documentation

💡 Result:

Ollama’s native REST streaming (/api/chat, /api/generate) is newline-delimited JSON (NDJSON), where each line is a “chunk” you must accumulate/merge to reconstruct the final assistant message (including tool calls). [1] (docs.ollama.com)

Native /api/chat streaming: tool_calls + partial chunks

  • Ollama’s docs explicitly say you must accumulate partial fields across streamed chunks, and that this is “particularly important for tool calling”. [2] (docs.ollama.com)
  • In the tool-calling docs, the “Tool calling with streaming” section shows an agent loop that collects thinking, content, and tool_calls from every chunk, then appends a single combined assistant message back into messages before sending tool results. [3] (docs.ollama.com)
  • The Ollama blog post announcing streaming tool calls shows streamed chunks where a later chunk’s message.tool_calls contains the function name and arguments. [4] (ollama.com)
  • The same blog post describes an “incremental parser” that can handle partial prefixes while streaming and separate tool calls from normal content—i.e., you should be prepared to see tool-call information emerge over time and merge what you receive rather than expecting it all at once. [4] (ollama.com)

Practical merge rule (native API): keep running buffers for thinking + content, and keep a list for tool_calls; on each chunk, append/extend what’s present; when the stream ends, send back one assistant message containing the fully-accumulated tool_calls plus your tool results (as {role:"tool", tool_name, content}). [2][3] (docs.ollama.com)

OpenAI-compatible streaming (/v1/chat/completions): “delta” tool_calls

  • Ollama documents an OpenAI-compatible API under /v1/.... [5] (docs.ollama.com)
  • For “delta”-style streaming semantics, the reference behavior is OpenAI’s streaming spec: streamed events contain choices[].delta (including delta.content and delta.tool_calls). [6] (platform.openai.com)
  • Compatibility note: Ollama’s OpenAI-compatible streaming has had differences/bugs—for example, an issue where streamed tool calls did not populate choices[].delta.tool_calls[].index, breaking some OpenAI client stream helpers. [7] (github.com)

Sources

[1] Ollama API “Streaming” (NDJSON format) (docs.ollama.com)
[2] Ollama “Streaming” capability (accumulate partial fields; tool calling) (docs.ollama.com)
[3] Ollama “Tool calling” capability (“Tool calling with streaming” accumulation loop) (docs.ollama.com)
[4] Ollama blog: “Streaming responses with tool calling” (May 28, 2025; incremental parser; example chunks) (ollama.com)
[5] Ollama docs: “OpenAI compatibility” (docs.ollama.com)
[6] OpenAI docs: streaming chunk object uses choices[].delta (platform.openai.com)
[7] GitHub issue: missing delta.tool_calls[].index in Ollama’s OpenAI-compatible stream (github.com)


Fix tool call accumulation to merge by id and append arguments across chunks.

Ollama's streaming API explicitly requires accumulating partial tool call data across chunks, particularly for tool calling. The current code indexes `toolCalls` by arrival order (`toolCallIndex`) instead of by `tc.id`, and replaces arguments rather than appending them. When Ollama streams arguments in fragments across multiple chunks, this approach drops argument fragments.

Instead, use `tc.id` as the key in the `toolCalls` map and append/merge arguments for each call ID across all chunks until the stream completes.
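A sketch of id-keyed accumulation; the `ToolCallChunk` shape below is a simplified assumption modeled on Ollama's NDJSON fragments, not the provider's actual types:

```typescript
// Sketch: merge streamed tool-call fragments by call id, appending
// argument fragments instead of overwriting them.
interface ToolCallChunk {
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface ToolCall {
  id: string;
  name: string;
  arguments: string;
}

function mergeToolCallChunk(
  calls: Map<string, ToolCall>,
  tc: ToolCallChunk,
  fallbackIndex: number
): void {
  // Use tc.id as the primary key; fall back to a stable synthetic id
  // only when the chunk carries none.
  const id = tc.id ?? `call_${fallbackIndex}`;
  const existing = calls.get(id);
  if (existing) {
    // Later chunks may carry only argument fragments: append them.
    existing.arguments += tc.function?.arguments ?? '';
  } else if (tc.function?.name) {
    calls.set(id, {
      id,
      name: tc.function.name,
      arguments: tc.function.arguments ?? '',
    });
  }
}
```

On each streamed chunk, call `mergeToolCallChunk` for every entry in `message.tool_calls`; once the stream ends, `calls.values()` holds the fully accumulated tool calls.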


Comment on lines 76 to 127
async function runInDocker(command: string, options: SandboxOptions): Promise<SandboxResult> {
const { timeout = 30000, maxMemory = '512m', network = false, cwd } = options;

const dockerArgs = [
'run',
'--rm',
'-i',
`--memory=${maxMemory}`,
'--memory-swap=' + maxMemory, // Prevent swap
'--cpus=1', // Limit CPU
'--pids-limit=100', // Limit processes
];

if (!network) {
dockerArgs.push('--network=none');
}

// Validate cwd to prevent path injection
if (cwd) {
const safeCwd = validateCwd(cwd);
if (safeCwd) {
dockerArgs.push('-v', `${safeCwd}:${safeCwd}`);
dockerArgs.push('-w', safeCwd);
}
// If validation fails, run without volume mount (safer)
}

// Use a minimal image with shell
dockerArgs.push('alpine:latest', 'sh', '-c', command);

return runWithTimeout('docker', dockerArgs, timeout, 'docker');
}

⚠️ Potential issue | 🟠 Major

`env` option is accepted but not forwarded to Docker.

`SandboxOptions.env` is defined but `runInDocker` doesn't pass environment variables to the container. Users may expect env vars to be available inside Docker.

🔧 Suggested fix
 async function runInDocker(command: string, options: SandboxOptions): Promise<SandboxResult> {
-  const { timeout = 30000, maxMemory = '512m', network = false, cwd } = options;
+  const { timeout = 30000, maxMemory = '512m', network = false, cwd, env } = options;

   const dockerArgs = [
     'run',
     '--rm',
     '-i',
     `--memory=${maxMemory}`,
     '--memory-swap=' + maxMemory, // Prevent swap
     '--cpus=1', // Limit CPU
     '--pids-limit=100', // Limit processes
   ];

   if (!network) {
     dockerArgs.push('--network=none');
   }

+  // Forward environment variables to container
+  if (env) {
+    for (const [key, value] of Object.entries(env)) {
+      dockerArgs.push('-e', `${key}=${value}`);
+    }
+  }
+
   // Validate cwd to prevent path injection
   if (cwd) {

Comment on lines 109 to 160
// Run command in Firejail sandbox
async function runInFirejail(command: string, options: SandboxOptions): Promise<SandboxResult> {
const { timeout = 30000, network = false, cwd } = options;

const firejailArgs = ['--quiet', '--private-tmp', '--noroot', '--caps.drop=all'];

if (!network) {
firejailArgs.push('--net=none');
}

// Validate cwd to prevent path injection
if (cwd) {
const safeCwd = validateCwd(cwd);
if (safeCwd) {
firejailArgs.push(`--whitelist=${safeCwd}`);
}
// If validation fails, run without whitelist (more restricted)
}

firejailArgs.push('--', 'sh', '-c', command);

return runWithTimeout('firejail', firejailArgs, timeout, 'firejail');
}

⚠️ Potential issue | 🟠 Major

`env` option not forwarded in Firejail backend.

Same issue as Docker: the `env` option from `SandboxOptions` is ignored. For Firejail, you could pass `env` to `runWithTimeout`, since the child process would inherit it.

🔧 Suggested fix
 async function runInFirejail(command: string, options: SandboxOptions): Promise<SandboxResult> {
-  const { timeout = 30000, network = false, cwd } = options;
+  const { timeout = 30000, network = false, cwd, env } = options;

   const firejailArgs = ['--quiet', '--private-tmp', '--noroot', '--caps.drop=all'];

   // ... rest of args building ...

   firejailArgs.push('--', 'sh', '-c', command);

-  return runWithTimeout('firejail', firejailArgs, timeout, 'firejail');
+  return runWithTimeout('firejail', firejailArgs, timeout, 'firejail', safeCwd, env);
 }

Note: You'll also need to track safeCwd from the validation to pass it here, or restructure the cwd handling.


robotlearning123 and others added 4 commits January 24, 2026 10:55
- Ollama provider for local LLM support (localhost:11434)
- Enhanced sandbox with Docker/Firejail/ulimit backends
- Sub-agent system for specialized tasks (research, code-review, test-writer, refactor, documentation)

New files:
- src/providers/ollama.ts - Ollama streaming provider
- src/sandbox/index.ts - Multi-backend sandbox implementation
- src/tools/sandbox.ts - sandbox and sandbox_info tools
- src/agents/subagent.ts - Sub-agent types and runner
- src/tools/spawn.ts - spawn_agent and list_agents tools

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add agent-readiness CLI tool for scanning repository maturity against
Factory Agent Readiness model (10 pillars, 5 levels, 80% gating).

Security improvements:
- Path traversal protection in profile loading and config files
- ReDoS-safe regex checker using linear-time string scanning
- LRU cache with size limits to prevent unbounded memory growth
- Workspace pattern injection prevention

Features:
- 33 checks across 10 pillars (docs, style, build, test, security,
  observability, env, ci, monorepo, task_discovery)
- YAML-based extensible profile system
- JSON and Markdown output formats
- Monorepo detection and app-scope scanning
- Init command with templates for missing files

Testing:
- 43 tests covering checks, engine, security, and LRU cache
- Test fixtures for minimal, standard, and monorepo scenarios

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add findGitRoot() to walk up directory tree for .git
- Update getRepoName() to show "repo/subdir" format for subdirs
- Generate missing template files via init command
- Add .editorconfig for code style compliance

Score improved from 47% (L1) to 75% (L2):
- Code Style: 100% (L5)
- Environment: 100% (L5)
- Task Discovery: 100% (L5)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Only use git info if .git exists directly in scanned path
- Do NOT walk up to parent directories for git info
- Return folder name and "unknown" commit for non-git directories
- Add hasGitDir() helper for sync git existence check

This ensures agent-readiness is treated as independent repo,
not as subdirectory of nano-opencode.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@robotlearning123 robotlearning123 force-pushed the feat/agent-readiness-v0.1.0 branch from 148e3a9 to 806b31f Compare January 24, 2026 15:57
Code review fixes:
- Replace CODEOWNERS placeholder with @robotlearning123
- Fix AGENTS.md placeholder text and add code block language
- Remove brackets from ESLint/Prettier in CONTRIBUTING.md
- Update factory_compat.yaml: 9→10 pillars, add multi-language build manifest
- Add language tags to README.md fenced code blocks
- Fix file-exists.ts: use null check for empty file handling
- Fix github-workflow.ts: correct branch matching for all-branches triggers
- Fix markdown.ts: dynamic padding for badge alignment
- Harden regex.ts ReDoS detection with per-group stack tracking
- Fix templates/AGENTS.md code block language
- Fix test fixture README.md: npm run dev → npm run test
- Fix subagent.ts: track natural completion vs hitting turn limit
- Fix ollama.ts: buffer partial JSON lines across stream chunks
- Fix sandbox/index.ts: add cwd validation to runDirect, prevent double-resolve

Rename agent-readiness → agent-ready:
- Update package.json name and repository URLs
- Update all documentation references
- Update source code comments and config files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@robotlearning123 robotlearning123 force-pushed the feat/agent-readiness-v0.1.0 branch from 806b31f to 656525c Compare January 24, 2026 15:58
@robotlearning123 robotlearning123 merged commit cd56ec9 into main Jan 24, 2026
8 of 9 checks passed