forked from Eslamanwar/red-cell

Penetration-testing agent example, kept as a reference for expanding diagnostics that may be included as agents or tools, including DNS and autoconfig.

augml/red-cell


Red-Cell AI Pentester Agent

An autonomous AI-powered penetration testing agent that discovers, analyzes, and validates security vulnerabilities with human oversight.

🎯 Overview

Red-Cell is an advanced security testing agent that combines traditional penetration testing tools with AI-powered reasoning to:

  • Automatically discover and map attack surfaces (domains, subdomains, APIs, services)
  • Identify potential vulnerabilities using AI reasoning and traditional scanning
  • Prioritize findings based on exploitability, impact, and business context
  • Generate proof-of-concept exploits to validate vulnerabilities
  • Provide actionable remediation guidance in natural language
  • Track attack surface changes over time and alert on new exposures
  • Generate comprehensive reports (executive summaries and technical reports)

🏗️ Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                        Red-Cell AI Pentester                         │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐              │
│  │   Discovery  │   │   Analysis   │   │ Exploitation │              │
│  │   Workflow   │──▶│   Workflow   │──▶│   Workflow   │              │
│  └──────────────┘   └──────────────┘   └──────────────┘              │
│         │                  │                  │                      │
│         ▼                  ▼                  ▼                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                         State Machine                          │  │
│  │   WAITING → DISCOVERING → REASONING → EXPLOITING → REPORTING   │  │
│  └────────────────────────────────────────────────────────────────┘  │
│         │                  │                  │                      │
│         ▼                  ▼                  ▼                      │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐              │
│  │   Alerting   │   │   History    │   │   Reporting  │              │
│  │   System     │   │  Persistence │   │   Engine     │              │
│  └──────────────┘   └──────────────┘   └──────────────┘              │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

🚀 Features

1. Attack Surface Discovery

  • Domain Enumeration: Discovers root domains and related assets
  • Subdomain Discovery: Uses subfinder, DNS brute-forcing, and certificate transparency
  • API Discovery: Detects OpenAPI/Swagger, GraphQL, and REST endpoints
  • Service Detection: Port scanning with nmap, service fingerprinting
  • Technology Detection: Identifies frameworks, libraries, and versions

2. Vulnerability Analysis

  • AI-Powered Reasoning: Uses LLMs to analyze potential vulnerabilities
  • Traditional Scanning: Nuclei templates, custom checks
  • Attack Chain Analysis: Identifies chained vulnerabilities
  • CVSS Scoring: Automatic severity assessment
  • Business Context: Considers asset criticality in prioritization
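
One way to picture how severity and business context could combine into a priority order is the sketch below. It is illustrative only and not taken from the Red-Cell source: the `Finding` fields, the criticality weights, and the `priority_score` formula are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical weights: how much asset criticality scales the CVSS base score.
CRITICALITY_WEIGHT = {"low": 0.5, "medium": 1.0, "high": 1.5}


@dataclass(frozen=True)
class Finding:
    """Illustrative finding record; the field names are assumptions."""
    finding_id: str
    cvss: float             # CVSS base score, 0.0 to 10.0
    asset_criticality: str  # "low", "medium", or "high"


def priority_score(finding: Finding) -> float:
    """Scale the CVSS score by asset criticality, capped at 10.0."""
    weight = CRITICALITY_WEIGHT[finding.asset_criticality]
    return min(10.0, finding.cvss * weight)


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from highest to lowest priority."""
    return sorted(findings, key=priority_score, reverse=True)


findings = [
    Finding("finding-1", cvss=9.8, asset_criticality="low"),
    Finding("finding-2", cvss=6.5, asset_criticality="high"),
    Finding("finding-3", cvss=7.0, asset_criticality="medium"),
]
ranked = [f.finding_id for f in prioritize(findings)]
```

With these assumed weights, a medium-severity issue on a high-criticality asset outranks a critical issue on a low-criticality one, which is the kind of trade-off business context is meant to capture.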

3. Exploitation & Validation

  • PoC Generation: Creates proof-of-concept exploits
  • Safe Exploitation: Validates vulnerabilities without causing damage
  • Payload Mutation: Generates creative bypass payloads
  • Evidence Collection: Screenshots, logs, and reproduction steps

4. Continuous Monitoring

  • Attack Surface Tracking: Monitors for new assets and changes
  • Change Detection: Alerts on new subdomains, services, APIs
  • Scheduled Scans: Configurable continuous testing
  • Historical Analysis: Trend analysis over time
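
Change detection between two scans can be sketched as a set difference over stored snapshots. The `Snapshot` structure and the `diff_snapshots` helper below are hypothetical names chosen for illustration; how Red-Cell actually persists and compares history is not shown in this README.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Illustrative attack surface snapshot; the structure is an assumption."""
    subdomains: frozenset[str] = field(default_factory=frozenset)
    services: frozenset[str] = field(default_factory=frozenset)  # "host:port"


def diff_snapshots(previous: Snapshot, current: Snapshot) -> dict[str, list[str]]:
    """Report assets added or removed between two scans, sorted for stable output."""
    return {
        "new_subdomains": sorted(current.subdomains - previous.subdomains),
        "removed_subdomains": sorted(previous.subdomains - current.subdomains),
        "new_services": sorted(current.services - previous.services),
        "removed_services": sorted(previous.services - current.services),
    }


previous = Snapshot(
    subdomains=frozenset({"www.example.com", "api.example.com"}),
    services=frozenset({"www.example.com:443"}),
)
current = Snapshot(
    subdomains=frozenset({"www.example.com", "staging.example.com"}),
    services=frozenset({"www.example.com:443", "staging.example.com:8080"}),
)
changes = diff_snapshots(previous, current)
```

Any non-empty `new_*` list would be a natural trigger for a change alert.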

5. Alerting & Notifications

  • Multi-Channel Alerts: Slack, PagerDuty, Microsoft Teams, Email
  • Severity-Based Routing: Critical findings trigger incidents
  • Change Notifications: Real-time alerts on attack surface changes
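
Severity-based routing can be thought of as a threshold per channel. The sketch below is a guess at the shape of such a policy, not Red-Cell's actual routing table: the `Severity` levels, the `CHANNEL_THRESHOLDS` mapping, and `route_alert` are assumptions.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical policy: minimum severity at which each channel is notified.
CHANNEL_THRESHOLDS = {
    "email": Severity.LOW,
    "slack": Severity.MEDIUM,
    "teams": Severity.MEDIUM,
    "pagerduty": Severity.CRITICAL,  # critical findings trigger incidents
}


def route_alert(severity: Severity, configured: set[str]) -> list[str]:
    """Return the configured channels that should receive an alert of this severity."""
    return sorted(
        channel
        for channel, threshold in CHANNEL_THRESHOLDS.items()
        if channel in configured and severity >= threshold
    )


# Only channels whose webhook or routing key is set count as configured.
configured_channels = {"slack", "pagerduty"}
critical_targets = route_alert(Severity.CRITICAL, configured_channels)
low_targets = route_alert(Severity.LOW, configured_channels)
```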

6. Reporting

  • Executive Summaries: High-level risk overview for leadership
  • Technical Reports: Detailed findings with reproduction steps
  • Remediation Guidance: Actionable fix recommendations
  • Trend Reports: Historical vulnerability trends
  • Dashboard Data: JSON data for visualization

📦 Components

Activities

| Activity Module | Description |
| --- | --- |
| discovery_activities.py | Subdomain and asset discovery |
| scanning_activities.py | Port scanning and service detection |
| api_discovery.py | OpenAPI, GraphQL, REST API discovery |
| threat_intel_activities.py | AI-powered threat intelligence |
| exploitation_activities.py | Exploit generation and execution |
| exploitation_verification.py | Vulnerability validation |
| comprehensive_reporting.py | Report generation |
| attack_surface_history.py | Historical tracking |
| alerting.py | Multi-channel alerting |
| continuous_discovery.py | Continuous monitoring |
| zero_day_discovery.py | Novel vulnerability discovery |
| pentest_memory.py | Learning from past findings |

Workflows

| Workflow | Description |
| --- | --- |
| RedCellWorkflow | Main orchestration workflow |
| ContinuousPentestWorkflow | Continuous testing workflow |

State Machine

The agent uses a state machine to coordinate the penetration testing process:

WAITING_FOR_TARGET
       │
       ▼
DISCOVERING_ASSETS ───────────────────┐
       │                              │
       ▼                              │
REASONING_VULNERABILITIES             │
       │                              │
       ▼                              │
AWAITING_APPROVAL ◄───────────────────┤
       │                              │
       ▼                              │
EXPLOITING_VULNERABILITIES            │
       │                              │
       ▼                              │
VERIFYING_FINDINGS                    │
       │                              │
       ▼                              │
GENERATING_REPORT                     │
       │                              │
       ▼                              │
COMPLETED ────────────────────────────┘
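
The main path through the diagram can be modeled as an enum plus a transition table. This is a minimal sketch under stated assumptions: the state names come from the diagram, but the `TRANSITIONS` table, the `advance` helper, and the `InvalidTransition` exception are hypothetical, and the side branch on the right of the diagram is not modeled.

```python
from enum import Enum


class State(Enum):
    WAITING_FOR_TARGET = "waiting_for_target"
    DISCOVERING_ASSETS = "discovering_assets"
    REASONING_VULNERABILITIES = "reasoning_vulnerabilities"
    AWAITING_APPROVAL = "awaiting_approval"
    EXPLOITING_VULNERABILITIES = "exploiting_vulnerabilities"
    VERIFYING_FINDINGS = "verifying_findings"
    GENERATING_REPORT = "generating_report"
    COMPLETED = "completed"


# Assumed transitions: only the top-to-bottom path from the diagram.
TRANSITIONS: dict[State, set[State]] = {
    State.WAITING_FOR_TARGET: {State.DISCOVERING_ASSETS},
    State.DISCOVERING_ASSETS: {State.REASONING_VULNERABILITIES},
    State.REASONING_VULNERABILITIES: {State.AWAITING_APPROVAL},
    State.AWAITING_APPROVAL: {State.EXPLOITING_VULNERABILITIES},
    State.EXPLOITING_VULNERABILITIES: {State.VERIFYING_FINDINGS},
    State.VERIFYING_FINDINGS: {State.GENERATING_REPORT},
    State.GENERATING_REPORT: {State.COMPLETED},
    State.COMPLETED: set(),
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the transition table."""


def advance(current: State, target: State) -> State:
    """Return the target state if the transition is allowed, else raise."""
    if target not in TRANSITIONS[current]:
        raise InvalidTransition(f"{current.name} -> {target.name} is not allowed")
    return target


state = advance(State.WAITING_FOR_TARGET, State.DISCOVERING_ASSETS)
```

Because `EXPLOITING_VULNERABILITIES` is reachable only from `AWAITING_APPROVAL` in this table, an attempt to jump straight from reasoning to exploitation raises `InvalidTransition`.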

🛠️ Installation

Prerequisites

  • Python 3.12+
  • Temporal server
  • MongoDB (for history persistence)
  • Redis (for streaming)
  • Security tools: nmap, subfinder, nuclei, httpx, katana

Local Development

# From a clone of the repository, enter the agent directory
cd agents/red-cell

# Install dependencies
pip install -e .

# Set environment variables
export OPENAI_API_KEY="your-api-key"
export TEMPORAL_ADDRESS="localhost:7233"
export MONGODB_URI="mongodb://localhost:27017"

# Run the worker
python -m project.run_worker

Kubernetes Deployment

# Deploy using Helm
helm install red-cell ./chart/red-cell \
  --set temporal-worker.env_vars.OPENAI_API_KEY="your-api-key" \
  --set temporal-worker.env_vars.SLACK_WEBHOOK_URL="your-webhook"

⚙️ Configuration

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| OPENAI_API_KEY | OpenAI/OpenRouter API key | Required |
| OPENAI_BASE_URL | LLM API base URL | https://api.openai.com/v1 |
| OPENAI_MODEL | LLM model to use | gpt-4 |
| TEMPORAL_ADDRESS | Temporal server address | localhost:7233 |
| MONGODB_URI | MongoDB connection string | mongodb://localhost:27017 |
| MONGODB_DATABASE | Database name | red_cell |
| ALLOWED_EMAILS | Comma-separated allowed user emails | Required |
| SLACK_WEBHOOK_URL | Slack webhook for alerts | Optional |
| PAGERDUTY_ROUTING_KEY | PagerDuty routing key | Optional |
| TEAMS_WEBHOOK_URL | Microsoft Teams webhook | Optional |
| CONTINUOUS_DISCOVERY_ENABLED | Enable continuous monitoring | true |
| CONTINUOUS_DISCOVERY_INTERVAL_HOURS | Scan interval (hours) | 24 |
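
For reference, the variables and defaults listed above could be read into a typed settings object roughly as follows. This is a sketch, not the project's actual loader: `Settings`, `load_settings`, the accepted spellings of "true", and the lowercasing of emails are all assumptions, and the optional alerting variables are left out for brevity.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_TRUE_VALUES = {"1", "true", "yes", "on"}  # assumption: accepted spellings of "true"


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the documented variables."""
    openai_api_key: str
    allowed_emails: tuple[str, ...]
    openai_base_url: str
    openai_model: str
    temporal_address: str
    mongodb_uri: str
    mongodb_database: str
    continuous_discovery_enabled: bool
    continuous_discovery_interval_hours: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, applying the documented defaults."""
    missing = [name for name in ("OPENAI_API_KEY", "ALLOWED_EMAILS") if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    emails = tuple(e.strip().lower() for e in env["ALLOWED_EMAILS"].split(",") if e.strip())
    enabled = env.get("CONTINUOUS_DISCOVERY_ENABLED", "true").strip().lower() in _TRUE_VALUES
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        allowed_emails=emails,
        openai_base_url=env.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
        openai_model=env.get("OPENAI_MODEL", "gpt-4"),
        temporal_address=env.get("TEMPORAL_ADDRESS", "localhost:7233"),
        mongodb_uri=env.get("MONGODB_URI", "mongodb://localhost:27017"),
        mongodb_database=env.get("MONGODB_DATABASE", "red_cell"),
        continuous_discovery_enabled=enabled,
        continuous_discovery_interval_hours=int(env.get("CONTINUOUS_DISCOVERY_INTERVAL_HOURS", "24")),
    )


settings = load_settings({
    "OPENAI_API_KEY": "your-api-key",
    "ALLOWED_EMAILS": "security-team@example.com, Analyst@example.com",
})
```

Failing fast on the two required variables keeps a misconfigured worker from starting with no access control in place.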

Alerting Configuration

Configure alerting channels in the Helm values:

temporal-worker:
  env_vars:
    SLACK_WEBHOOK_URL: "https://hooks.slack.com/services/..."
    PAGERDUTY_ROUTING_KEY: "your-routing-key"
    TEAMS_WEBHOOK_URL: "https://outlook.office.com/webhook/..."

🔒 Security Considerations

Human Oversight

Red-Cell is designed with human oversight at critical points:

  1. Approval Required: Exploitation requires explicit user approval
  2. Scope Limits: Testing is limited to approved targets
  3. Safe Mode: Non-destructive testing by default
  4. Audit Trail: All actions are logged
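
A scope limit of this kind can be enforced with a simple host check before any tool runs. The sketch below assumes the scope is expressed as a list of approved domains plus a wildcard pattern such as `*.example.com`; the `in_scope` helper is hypothetical and uses the standard library's `fnmatch` rather than whatever matching Red-Cell implements.

```python
from fnmatch import fnmatchcase


def in_scope(host: str, domains: list[str], scope_pattern: str) -> bool:
    """Return True if host is an approved domain or matches the scope wildcard."""
    # Normalize case and drop a trailing dot from fully qualified names.
    host = host.strip().lower().rstrip(".")
    if host in {d.lower() for d in domains}:
        return True
    return fnmatchcase(host, scope_pattern.lower())


approved_domains = ["example.com"]
scope = "*.example.com"

api_ok = in_scope("api.example.com", approved_domains, scope)
apex_ok = in_scope("example.com", approved_domains, scope)
lookalike_ok = in_scope("evilexample.com", approved_domains, scope)
other_ok = in_scope("example.org", approved_domains, scope)
```

Note that the wildcard alone does not cover the apex domain, which is why the approved domain list is checked first, and that a lookalike such as `evilexample.com` is rejected because the pattern requires a literal dot before `example.com`.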

Access Control

  • Email-based access control via ALLOWED_EMAILS
  • API key authentication for agent communication
  • Kubernetes RBAC for deployment security

Responsible Use

⚠️ Important: Only use Red-Cell against systems you own or have explicit authorization to test. Unauthorized penetration testing is illegal.

📊 Usage Examples

Starting a Pentest

# Via Temporal workflow
import asyncio

from temporalio.client import Client


async def main() -> None:
    # Connect to the Temporal server (see TEMPORAL_ADDRESS)
    client = await Client.connect("localhost:7233")

    # Start the workflow
    handle = await client.start_workflow(
        "RedCellWorkflow",
        id="pentest-example-com",
        task_queue="red-cell-queue",
    )

    # Send target scope
    await handle.signal("user_input", {
        "type": "target_scope",
        "domains": ["example.com"],
        "scope": "*.example.com",
    })


asyncio.run(main())

Approving Exploitation

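# Approve exploitation of findings. Run this inside an async function, using
# the handle returned by start_workflow or one re-acquired with
# client.get_workflow_handle("pentest-example-com").
await handle.signal("approval", {
    "approved": True,
    "findings": ["finding-1", "finding-2"],
    "approver": "security-team@example.com",
})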
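
On the receiving side, an approval payload like the one above would typically be checked before anything is exploited. The following is a sketch of such a check, assuming the payload keys shown in the example (`approved`, `findings`, `approver`); the `approved_finding_ids` helper and its rules are hypothetical and not taken from the workflow code.

```python
def approved_finding_ids(payload: dict, pending: set[str]) -> set[str]:
    """Return the pending finding IDs that the payload explicitly approves."""
    # Anything other than an explicit True counts as a rejection.
    if payload.get("approved") is not True:
        return set()
    # Require a named approver so the decision can be recorded in the audit trail.
    if not payload.get("approver"):
        raise ValueError("approval payload must name an approver")
    # Ignore IDs that are not awaiting approval.
    return pending & set(payload.get("findings", []))


pending_findings = {"finding-1", "finding-2", "finding-3"}
payload = {
    "approved": True,
    "findings": ["finding-1", "finding-2", "finding-9"],
    "approver": "security-team@example.com",
}
to_exploit = approved_finding_ids(payload, pending_findings)
```

Intersecting with the pending set means an unknown ID such as `finding-9` is silently dropped, and `finding-3` stays untouched because it was never approved.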

📈 Monitoring

Metrics

Red-Cell exposes metrics for monitoring:

  • red_cell_discoveries_total: Total assets discovered
  • red_cell_vulnerabilities_found: Vulnerabilities by severity
  • red_cell_exploits_executed: Exploitation attempts
  • red_cell_scan_duration_seconds: Scan duration

Logging

Red-Cell uses structured logging with standard log levels:

import logging

logger = logging.getLogger(__name__)

logger.info("Starting discovery", extra={
    "target": "example.com",
    "scan_type": "full",
})
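
Fields passed through `extra` only show up in the output if the formatter includes them. A minimal sketch of a JSON formatter that does so, using only the standard library, is shown below; `JsonFormatter` and the logger name `red_cell.example` are illustrative assumptions, and Red-Cell's actual logging setup may differ.

```python
import io
import json
import logging

# Attributes every LogRecord carries; anything else was passed via `extra`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including `extra` fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        extras = {k: v for k, v in vars(record).items() if k not in _STANDARD_ATTRS}
        payload.update(extras)
        # default=str keeps non-serializable values from raising.
        return json.dumps(payload, default=str)


stream = io.StringIO()  # stands in for stdout or a log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("red_cell.example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("Starting discovery", extra={"target": "example.com", "scan_type": "full"})
logged = json.loads(stream.getvalue())
```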

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

📄 License

This project is part of the AgentEx platform. See the main repository for license information.

🆘 Support

For issues and questions:

  1. Check the documentation
  2. Open a GitHub issue
  3. Contact the security team

Built with ❤️ on the https://hub.rilo.dev/ AI platform
