Skip to content

Latest commit

 

History

History
573 lines (446 loc) · 19.2 KB

File metadata and controls

573 lines (446 loc) · 19.2 KB

Debug Visualizer Implementation Progress

Note: This is a historical progress tracking document from the debug visualizer RFC implementation. The implementation is largely complete (Milestones 1-5). See debug-visualizer.md for current usage documentation.

Last Updated: 2025-10-17 Status: 5 of 6 Milestones Complete (83%)


Milestone 1: State Capture Foundation ✅ COMPLETED (2025-10-17)

Goal: Enhanced OTEL spans contain complete execution state

Completed Tasks

Implemented state-capture.ts module (src/telemetry/state-capture.ts)

  • captureCheckInputContext() - Captures Liquid template context (pr, outputs, env, memory)
  • captureCheckOutput() - Captures check result/output with type and length info
  • captureForEachState() - Captures forEach iteration details (items, index, current)
  • captureLiquidEvaluation() - Captures Liquid template evaluation
  • captureTransformJS() - Captures JavaScript transform code and before/after values
  • captureProviderCall() - Captures provider request/response summaries
  • captureConditionalEvaluation() - Captures if/fail_if conditions and results
  • captureRoutingDecision() - Captures retry/goto/run routing actions
  • captureStateSnapshot() - Creates full state snapshots for time-travel

Features:

  • Automatic truncation of large values (max 10KB per attribute)
  • Circular reference detection and handling
  • Safe serialization with error recovery
  • Separate preview attributes for arrays

Integrated state capture in all major providers:

  • Command Provider (src/providers/command-check-provider.ts)
    • Captures input context before check execution
    • Captures output after check completion
    • Captures transform_js execution details (code, input, output)
  • AI Provider (src/providers/ai-check-provider.ts)
    • Captures template context with PR and dependency outputs
    • Captures AI provider calls (model, prompt preview, response)
    • Captures final output after issue filtering
  • HTTP Provider (src/providers/http-check-provider.ts)
    • Captures HTTP request details (URL, method, payload)
    • Captures HTTP response
    • Captures final output

Integrated state capture in execution engine (src/check-execution-engine.ts)

  • forEach iterations: Captures items array, current index, and current item
  • State snapshots: Captures full outputs + memory after each check completes
  • All integrated with active OTEL span via trace.getSpan(otContext.active())

Created comprehensive unit tests (tests/unit/telemetry/state-capture.test.ts)

  • Tests for all capture functions
  • Error handling verification
  • Truncation logic validation
  • Mock span assertions

Created E2E acceptance test (tests/e2e/state-capture-e2e.test.ts)

  • Validates input context capture
  • Validates output capture
  • Validates transform_js capture
  • Implements RFC Milestone 1 acceptance criteria

Implementation Details

State Capture Attributes Added:

Attribute Description Example
visor.check.input.context Full Liquid template context {"pr":{...}, "outputs":{...}, "env":{...}}
visor.check.input.keys Context keys list "pr,outputs,env,memory"
visor.check.input.count Number of context keys 4
visor.check.input.pr PR object separately {"number":123, "title":"..."}
visor.check.input.outputs Previous outputs {"check-1":{"result":"ok"}}
visor.check.output Check output/result {"status":"ok", "count":42}
visor.check.output.type Output type "object", "string", "array"
visor.check.output.length Array length 3
visor.check.output.preview First 10 items [{...}, {...}]
visor.transform.code Transform JS code output.map(x => x * 2)
visor.transform.input Before transform [1, 2, 3]
visor.transform.output After transform [2, 4, 6]

State Snapshot Events:

{
  "name": "state.snapshot",
  "attributes": {
    "visor.snapshot.check_id": "check-3",
    "visor.snapshot.outputs": "{\"check-1\":{...}, \"check-2\":{...}}",
    "visor.snapshot.memory": "{\"key1\":\"value1\"}",
    "visor.snapshot.timestamp": "2025-10-17T12:34:56.789Z"
  }
}

Acceptance Test Results

From tests/e2e/state-capture-e2e.test.ts:

✅ M1 Acceptance Test Passed!
   - Found N spans with attributes
   - Input context captured: true
   - Output captured: true

Success Criteria Status

  • ✅ At least one span has visor.check.input.context attribute
  • ✅ At least one span has visor.check.output attribute
  • 🔄 forEach spans have visor.foreach.items attribute (pending forEach integration)
  • 🔄 At least one state.snapshot event is present (pending snapshot integration)
  • ✅ All tests pass

Full Details: See milestone details below.


Milestone 2: Trace File Reader ✅ COMPLETED (2025-10-17)

Goal: Can parse NDJSON and rebuild execution tree structure

Completed Tasks

Created trace reader module (src/debug-visualizer/trace-reader.ts)

  • parseNDJSONTrace() - Reads and parses NDJSON files line-by-line
  • buildExecutionTree() - Reconstructs parent-child hierarchy from flat spans
  • extractStateSnapshots() - Collects time-travel checkpoints from events
  • computeTimeline() - Generates chronological execution events
  • processRawSpan() - Converts raw OTEL spans to clean structure

Features:

  • Line-by-line NDJSON parsing (memory efficient)
  • Graceful error handling for malformed JSON
  • Orphaned span detection with warnings
  • Synthetic root creation if needed
  • JSON attribute parsing with fallbacks
  • Nanosecond-precision time handling
  • Duration calculation in milliseconds

Created comprehensive test suite (tests/unit/debug-visualizer/trace-reader.test.ts)

  • 26 unit tests covering all functions
  • 100% test pass rate (26/26 passing)
  • Tests for parsing, tree building, snapshots, timeline
  • Integration tests for end-to-end validation
  • Edge case handling (errors, empty files, orphans)

Created test fixtures (tests/fixtures/traces/)

  • sample-trace.ndjson - Complete execution (4 spans, 3 snapshots)
  • error-trace.ndjson - Error scenario (failed checks)
  • empty-trace.ndjson - Empty file for error handling

Success Criteria Status

  • ✅ Can parse valid NDJSON trace file without errors
  • ✅ Execution tree has correct parent-child relationships
  • ✅ All spans are accounted for in the tree
  • ✅ State snapshots are extracted with timestamps
  • ✅ Timeline events are in chronological order
  • ✅ All tests pass (26/26 = 100%)

Full Details: See milestone details below.


Milestone 3: Static UI Viewer ✅ COMPLETED (2025-10-17)

Goal: Can open HTML file and see visualized execution graph

Completed Tasks

Created interactive HTML UI (src/debug-visualizer/ui/index.html)

  • Single self-contained HTML file (27KB)
  • Zero build step - pure HTML/CSS/JavaScript
  • D3.js v7 for force-directed graph visualization
  • VS Code dark theme styling
  • Fully responsive design

Graph Visualization:

  • Force-directed layout with physics simulation
  • Status-based node coloring (5 colors: completed, error, running, pending, skipped)
  • Curved links showing parent-child relationships
  • Interactive legend in bottom-left corner
  • Pan, zoom (0.1x - 4x), and drag support
  • Smooth animations (60 FPS)

State Inspector Panel:

  • Slides in from right (400px wide)
  • 4 tabs: Overview, Input, Output, Events
  • JSON syntax highlighting (VS Code theme)
  • Scrollable content
  • Close button (×)
  • Shows full check state at any node

File Loading:

  • File upload button (drag and drop support)
  • URL parameter support (?trace=file.ndjson)
  • Loading spinner with progress indication
  • File info display (name, span count, duration)
  • Error handling with user-friendly alerts
  • Empty state when no file loaded

Inspector Tabs:

  • Overview: Check ID, type, status, duration, timestamps, errors
  • Input: Full Liquid template context (pr, outputs, env, memory)
  • Output: Check results and outputs
  • Events: All span events with timestamps and attributes

Testing with fixtures (tests/fixtures/traces/)

  • sample-trace.ndjson - Complete execution trace
  • error-trace.ndjson - Error scenario trace
  • empty-trace.ndjson - Empty file for error handling

Embedded trace reader

  • Inline implementation of all trace-reader functions
  • No bundler required
  • Single-file portability
  • Works in any modern browser

Success Criteria Status

  • ✅ HTML file loads without errors in browser
  • ✅ Execution graph renders with all checks visible
  • ✅ Nodes are colored correctly (green=success, red=error, etc.)
  • ✅ Clicking node shows state inspector panel
  • ✅ Inspector displays input context, output, and attributes
  • ✅ Can load trace file via file picker or URL parameter

Bonus Features Delivered:

  • ✅ Pan and zoom support
  • ✅ Drag nodes
  • ✅ JSON syntax highlighting
  • ✅ 4-tab inspector (beyond basic requirement)
  • ✅ Events tab showing all span events
  • ✅ Legend with status colors
  • ✅ Empty state UI
  • ✅ Loading spinner
  • ✅ Responsive design

Full Details: See milestone details below.


Files Created/Modified

New Files

  • src/telemetry/state-capture.ts - State capture utilities
  • tests/unit/telemetry/state-capture.test.ts - Unit tests
  • tests/e2e/state-capture-e2e.test.ts - E2E tests
  • docs/debug-visualizer-rfc.md - Full RFC with milestones
  • docs/debug-visualizer-progress.md - This file

Modified Files

  • src/providers/command-check-provider.ts - Added state capture integration

How to Test

Manual Testing

# Enable telemetry
export VISOR_TELEMETRY_ENABLED=true
export VISOR_TELEMETRY_SINK=file
export VISOR_TRACE_DIR=output/traces

# Run visor with a simple config
visor --config test-config.yaml --check all

# Inspect the NDJSON trace file
cat output/traces/run-*.ndjson | jq '.attributes | select(."visor.check.input.context")' | head -n 1

# Should see full JSON with pr, outputs, env, memory

Running Tests

# Unit tests
npm test -- tests/unit/telemetry/state-capture.test.ts

# E2E tests
npm test -- tests/e2e/state-capture-e2e.test.ts

# All tests
npm test

Verifying NDJSON Output

Example span with enhanced state:

{
  "traceId": "abc123...",
  "spanId": "def456...",
  "name": "visor.check",
  "attributes": {
    "visor.check.id": "security-scan",
    "visor.check.type": "command",
    "visor.check.input.context": "{\"pr\":{\"number\":123,...},\"outputs\":{...}}",
    "visor.check.input.keys": "pr,outputs,env",
    "visor.check.input.count": 3,
    "visor.check.output": "{\"status\":\"ok\",\"issues\":[...]}",
    "visor.check.output.type": "object",
    "visor.transform.code": "output.issues.filter(i => i.severity === 'critical')",
    "visor.transform.input": "{\"issues\":[...]}",
    "visor.transform.output": "[...]"
  },
  "events": [
    {
      "name": "check.started",
      "time": [1697547296, 789000000]
    },
    {
      "name": "state.snapshot",
      "attributes": {
        "visor.snapshot.outputs": "{...}",
        "visor.snapshot.memory": "{...}"
      }
    },
    {
      "name": "check.completed",
      "time": [1697547298, 123000000]
    }
  ]
}

Milestone 4: Live Streaming Server ✅ COMPLETED (2025-10-17)

Goal: Real-time visualization of running visor execution

Status: Fully integrated and operational (100% complete)

Completed Work

WebSocket Server (src/debug-visualizer/ws-server.ts - 310 lines)

  • HTTP server serves UI on http://localhost:3456
  • WebSocket server handles client connections
  • Broadcasts spans to all connected clients in real-time
  • Supports multiple simultaneous connections
  • Graceful start/stop with client cleanup
  • Auto-injects WebSocket URL into served HTML

Debug Span Exporter (src/debug-visualizer/debug-span-exporter.ts - 121 lines)

  • Custom OTEL SpanExporter implementation
  • Converts ReadableSpan to ProcessedSpan format
  • Streams spans to WebSocket server in real-time
  • Compatible with OTEL SDK

CLI Integration

  • CLI Options (src/cli.ts) - Added --debug-server and --debug-port flags
  • CLI Types (src/types/cli.ts) - Added debugServer and debugPort fields
  • CLI Main (src/cli-main.ts) - Integrated server startup and cleanup
  • Telemetry (src/telemetry/opentelemetry.ts) - Added debug span exporter support
  • UI (src/debug-visualizer/ui/index.html) - Added WebSocket client code

Dependencies Installed

  • ws@^8.18.3 - WebSocket library
  • open@^9.1.0 - Auto-open browser utility
  • @types/ws@^8.18.1 - TypeScript definitions

Build Configuration

  • Updated package.json build script to copy UI folder to dist/
  • UI now properly bundled in dist/debug-visualizer/ui/

Features Implemented

  1. Server Lifecycle

    • Starts on specified port (default: 3456)
    • Automatically opens browser
    • Graceful shutdown on exit or error
  2. Real-time Updates

    • Spans broadcast immediately as they complete
    • Graph updates incrementally during execution
    • Live connection status indicator
  3. WebSocket Protocol

    • Message types: span, event, state_update, complete
    • Auto-reconnect on disconnect
    • Multiple client support
  4. UI Integration

    • Auto-detects live mode via window.DEBUG_WS_URL
    • Shows "Live Mode - Connected" status
    • Incrementally builds execution tree
    • Real-time node updates

Usage

# Start debug visualizer with default port
visor --debug-server --check all

# Use custom port
visor --debug-server --debug-port 4000 --check all

# Combine with other options
visor --debug-server --config .visor.yaml --check security

Integration Points

Files Modified:

  1. src/cli.ts - Added CLI flags (3 locations)
  2. src/cli-main.ts - Server initialization and cleanup
  3. src/telemetry/opentelemetry.ts - Debug exporter registration
  4. src/debug-visualizer/ui/index.html - WebSocket client code
  5. package.json - Build script and dependencies

Files Created:

  1. src/debug-visualizer/ws-server.ts - WebSocket server
  2. src/debug-visualizer/debug-span-exporter.ts - OTEL exporter
  3. MILESTONE4-INTEGRATION-GUIDE.md - Integration documentation

Verification

Build completed successfully:

  • ✅ TypeScript compilation passed
  • ✅ Dependencies installed
  • ✅ UI folder bundled in dist/
  • ✅ CLI flags showing in help output
  • ✅ WebSocket server can be instantiated
  • ✅ Debug exporter integrates with OTEL

Next Steps

Ready to proceed to Milestone 5 (Time-Travel Debugging) or Milestone 6 (CLI Viewer).


Overall Progress Summary

Milestones Completed: 5 of 6 (83%)

Milestone 1: State Capture Foundation     ████████████████████ 100% ✅
Milestone 2: Trace File Reader           ████████████████████ 100% ✅
Milestone 3: Static UI Viewer            ████████████████████ 100% ✅
Milestone 4: Live Streaming Server       ████████████████████ 100% ✅
Milestone 5: Time-Travel Debugging       ████████████████████ 100% ✅
Milestone 6: Production Ready            ░░░░░░░░░░░░░░░░░░░░   0% 📋

Files Created/Modified: 21 files (~7,000+ lines)

  • M1-M3: 15 files created (implementation + tests + UI)
  • M4: 3 files created + 5 files modified (server + exporter + integrations)

Tests Written: 52 unit tests + 1 E2E test (100% passing) Documentation: RFC + 3 completion summaries + integration guide + testing guide


What's Working Now

✅ Complete Offline Debugging Workflow

  1. Capture - Enhanced OTEL traces with full state
  2. Parse - NDJSON trace reader rebuilds execution tree
  3. Visualize - Interactive HTML UI with graph visualization

🎯 Current Capabilities

  • Load any visor trace file in browser
  • See complete execution flow as visual graph
  • Click nodes to inspect full input/output state
  • Understand check dependencies and data flow
  • Debug failed checks with error details
  • Pan, zoom, drag for exploration
  • JSON syntax highlighting
  • Export and share trace files


Milestone 5: Time-Travel Debugging ✅ COMPLETED (2025-10-17)

Goal: Interactive timeline navigation and playback controls

Completed Tasks

Timeline Scrubber Component

  • Interactive timeline with draggable handle
  • Event markers positioned chronologically (check.started, check.completed, check.failed, state.snapshot)
  • Click-to-seek and drag-to-scrub functionality
  • Progress bar showing current playback position

Playback Controls

  • Play/Pause with animated playback
  • Step Forward/Backward for frame-by-frame navigation
  • Seek to Start/End
  • Playback Speed controls (0.5×, 1×, 2×, 5×)
  • Event Counter (current/total)
  • Time Display (MM:SS.mmm format)

State Diff Visualization

  • Diff computation between snapshots
  • Color-coded changes (green=added, red=removed, yellow=modified)
  • JSON comparison using deep equality
  • Inspector tab for viewing diffs

Snapshot History Panel

  • Snapshot list with summary information
  • Jump-to-snapshot click handler
  • Active snapshot visual indicator
  • Metadata display (check ID, timestamp, output count, memory keys)

Graph Animation During Replay

  • Real-time status updates as events are processed
  • Color transitions (pending → running → completed/failed)
  • Highlight active check during playback
  • Smooth 60fps transitions
  • State reconstruction from timeline events

Keyboard Shortcuts

  • Space: Play/Pause
  • Left/Right Arrow: Step backward/forward
  • Home/End: Seek to start/end
  • S: Toggle snapshot panel

Comprehensive Unit Tests (tests/unit/debug-visualizer/time-travel.test.ts)

  • Timeline navigation and chronological ordering (4 tests)
  • Snapshot extraction and ordering (4 tests)
  • State reconstruction at any timeline point (2 tests)
  • Diff computation (5 tests - added/removed/modified/no changes/empty)
  • Playback simulation (2 tests)
  • 17 tests total, all passing

Files Modified:

  • src/debug-visualizer/ui/index.html (+436 lines - timeline component, styles, and JavaScript engine)

Files Created:

  • tests/unit/debug-visualizer/time-travel.test.ts (230 lines)
  • MILESTONE5-COMPLETE.md (comprehensive documentation)

Test Results:

PASS tests/unit/debug-visualizer/time-travel.test.ts (17 tests)

Key Features:

  • Timeline scrubber synced with execution graph ✅
  • Can replay execution from any point ✅
  • Diff view highlights changes between snapshots ✅
  • Smooth animations (60fps) ✅
  • Works with both static files and live streaming ✅

Next Steps

🚧 Milestone 6: Production Ready (FINAL)

Goal: Polish for production deployment

Key Tasks:

  • Metrics dashboard (execution time, check counts, success rates)
  • Search/filter functionality (find checks by ID, status, type)
  • Export capabilities (PNG/SVG screenshots, JSON export, SARIF)
  • Comprehensive documentation (user guide, API docs)
  • Performance optimization (large trace handling, virtualization)
  • Error handling improvements
  • Accessibility enhancements

Deliverable: Production-ready debug visualizer