docs(prd): add PRD for DAG-based concurrent execution #2194
Implement a ready-queue scheduler for component-type-agnostic concurrent execution with bounded concurrency, proper output isolation, and safe defaults. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Dependency Review: ✅ No vulnerabilities or license issues found. Scanned files: None.
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Adds a new PRD describing a DAG-aware concurrent executor: a ready-queue scheduler with a bounded worker pool, a phased rollout, integration points for replacing sequential execution, stream-injectable per-node outputs, and configuration defaults (max_concurrency defaults to 1).
Sequence Diagram(s)
sequenceDiagram
participant CLI
participant Scheduler
participant DependencyGraph as DepGraph
participant WorkerPool
participant Subprocess
CLI->>DepGraph: load graph
CLI->>Scheduler: start execution (max_concurrency)
Scheduler->>DepGraph: request ready nodes
DepGraph-->>Scheduler: ready node(s)
Scheduler->>WorkerPool: dispatch node job
WorkerPool->>Subprocess: run node (stream-injected stdout/stderr)
Subprocess-->>WorkerPool: complete (exit + outputs)
WorkerPool-->>Scheduler: node finished (result, logs)
Scheduler->>DepGraph: mark node complete, request new ready nodes
DepGraph-->>Scheduler: new ready node(s) or done
Scheduler->>CLI: emit JSON summary / per-node logs when complete
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/prd/dag-concurrent-execution.md`:
- Around lines 273-319: Fix the Run signature and loop. Change
Scheduler.Run(ctx context.Context) to return error (or a *Result that wraps
the g.Wait() error) so that returning g.Wait() type-checks. Also avoid deadlock
by making every channel operation respect context cancellation: when receiving
the next node from ready, use a select with case <-ctx.Done() that returns
ctx.Err()/g.Wait(); when enqueuing dependents inside the g.Go closure (the loop
that updates inDegree and does ready <- s.graph.GetNode(dep)), use a
context-aware send (select with case ready <- node and case <-ctx.Done()) so
that failFast cancellation cannot leave goroutines blocked. Apply these changes
in Scheduler.Run, around the ready channel receive and the dependents enqueue
logic inside the g.Go closure, and ensure the completed/total bookkeeping is
still guarded by mu.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7f195206-7caf-465b-9592-b65508e033f5
📒 Files selected for processing (1)
docs/prd/dag-concurrent-execution.md
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2194      +/-   ##
==========================================
- Coverage   77.29%   77.29%   -0.01%
==========================================
  Files         960      960
  Lines       91088    91088
==========================================
- Hits        70410    70404       -6
- Misses      16593    16602       +9
+ Partials     4085     4082       -3
Address CodeRabbit review: fix Run() to return *Result (not error from g.Wait()), add ctx.Done() select to prevent deadlock on fail-fast cancellation, and use context-aware channel sends in dependent enqueuing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
what
Adds a comprehensive Product Requirements Document (PRD) for DAG-based concurrent execution in Atmos. The PRD proposes a ready-queue scheduler that runs components of all types (Terraform, Packer, Ansible, custom registry) concurrently while respecting the dependency graph and keeping safe defaults: execution stays sequential unless parallelism is enabled with `--max-concurrency`.
why
Atmos currently executes components sequentially, even when they have no dependencies on each other and could safely run in parallel. For large deployments with dozens or hundreds of components, this serialization is the dominant bottleneck. The PRD establishes architectural principles, justifies ready-queue scheduling through industry research (Terragrunt, Make, Ninja, Bazel, Buck2, and 10+ other tools all use this pattern), and provides a phased rollout plan. It also addresses three critical concerns: output isolation under concurrency via stream injection, integration with legacy built-in component types without requiring migration, and configuration of concurrency defaults through `atmos.yaml`.
references