An autonomous, multi-agent content-generation pipeline that researches, verifies, and writes complete YouTube video scripts using local LLMs, vector storage, and structured databases. Built like a small newsroom of cooperating AI agents coordinated by an orchestrator.
What it does: Takes a creative brief like "Make a 10-minute YouTube script about AI in healthcare for business executives" and autonomously generates a polished, ready-to-record script with full research backing.
```
┌────────────────────────────────────────────────────────────────┐
│                         USER INTERFACE                         │
│ (CLI / Streamlit / LangServe endpoint for input + monitoring)  │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
                    ┌──────────────────────┐
                    │  ORCHESTRATOR AGENT  │
                    │ (LangGraph / async)  │
                    └──────────────────────┘
                                │
                                ▼
                  ┌───────────────────────────┐
                  │       SEARCH AGENT        │
                  │ - Web Search (DuckDuckGo) │
                  │ - URL Following (GET)     │
                  │ - Text Extraction         │
                  │ - Content Summarization   │
                  │ - Qdrant + MongoDB        │
                  └───────────────────────────┘
                                │
              (triggers after first content batch)
                                ▼
                  ┌───────────────────────────┐
                  │   SUPPORTING DOC AGENT    │
                  │ - Fact-checking           │
                  │ - Source Validation       │
                  │ - Content Expansion       │
                  └───────────────────────────┘
                                │
                                ▼
                  ┌───────────────────────────┐
                  │    QDRANT VECTOR STORE    │
                  │ - Semantic Embeddings     │
                  │ - Content Retrieval       │
                  └───────────────────────────┘
                                │
                                ▼
                  ┌───────────────────────────┐
                  │  AUDIENCE MODELING AGENT  │
                  │ - Narrative Structure     │
                  │ - Content Planning        │
                  │ - Tone & Length Control   │
                  └───────────────────────────┘
                                │
                                ▼
                  ┌───────────────────────────┐
                  │   SCRIPT WRITING AGENT    │
                  │ - Llama 3 Generation      │
                  │ - Script Polishing        │
                  │ - Length Optimization     │
                  └───────────────────────────┘
                                │
                                ▼
                  ┌───────────────────────────┐
                  │    MONGODB PERSISTENCE    │
                  │ - Metadata Storage        │
                  │ - Run Logs                │
                  │ - Output Archiving        │
                  └───────────────────────────┘
```
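Conceptually, the orchestrator drives these agents as a linear pipeline over a shared state object. A simplified stdlib sketch of that flow (the real project wires the stages together with LangGraph; the stub agents and state keys here are illustrative, not the actual implementation):

```python
# Simplified sketch of the orchestrator's agent pipeline. Each "agent"
# is a function that reads the shared state dict and adds its output.

def search_agent(state):
    # Stand-in for web search + extraction; stores raw results.
    state["search_results"] = [{"url": "https://example.com", "text": "..."}]
    return state

def validation_agent(state):
    # Stand-in for fact-checking the search results.
    state["validated_content"] = list(state["search_results"])
    return state

def audience_agent(state):
    # Stand-in for audience modeling / content planning.
    state["content_plan"] = {"sections": ["intro", "body", "outro"]}
    return state

def script_writer_agent(state):
    # Stand-in for LLM-based script generation.
    state["final_script"] = "INTRO ... BODY ... OUTRO"
    return state

PIPELINE = [search_agent, validation_agent, audience_agent, script_writer_agent]

def run_workflow(user_input):
    state = {"user_input": user_input}
    for agent in PIPELINE:
        state = agent(state)
    return state
```

The same shape scales to the real system: each agent only depends on keys written by earlier stages, which is what lets the orchestrator checkpoint state to MongoDB between steps.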
- 🤖 Multi-Agent Coordination: Specialized agents for research, validation, structuring, and writing
- 🔍 Autonomous Research: Web search, content extraction, and semantic storage
- ✅ Fact-Checking: Validates claims against authoritative sources
- 🎯 Flexible Audience Targeting: Adapts tone, complexity, and structure for ANY audience (e.g., "healthcare professionals", "marketing managers", "high school teachers")
- 📝 Script Generation: Uses local Llama 3 for polished, voice-ready scripts with precise speaking time control
- 📚 APA Citations: Automatically generates APA 7th edition citations for all sources
- 💾 Persistent Storage: MongoDB for metadata, Qdrant for vectors
- 📊 Full Audit Trail: Complete traceability of all operations and sources
- ⚡ Local-First: LLM inference runs entirely on your machine with Ollama (no cloud LLM APIs; only web search leaves your network)
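For illustration, the APA-citation feature above amounts to formatting source metadata into the APA 7th-edition web-page pattern. A hedged sketch (the helper and field names are assumptions, not the project's actual API):

```python
# Illustrative APA 7th-edition formatter for a web source.
# Pattern: Author. (Year). Title. Site Name. URL

def apa_web_citation(author: str, year: int, title: str, site: str, url: str) -> str:
    return f"{author} ({year}). {title}. {site}. {url}"
```

Real-world citations need more care (missing authors, retrieval dates, italics), but the stored source metadata maps onto this template.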
- LangChain/LangGraph: Multi-agent workflow orchestration
- Ollama (Llama 3): Local LLM for script generation
- Qdrant: Vector database for semantic search
- MongoDB: NoSQL database for metadata and logs
- DuckDuckGo: Web search capabilities (free)
- Trafilatura: Clean web content extraction
- Python 3.11+: Core implementation
- AsyncIO: Asynchronous processing
- Pydantic: Data validation
- pip: Dependency management
- Docker: Containerization
- Docker & Docker Compose (for infrastructure services)
- Python 3.11+ (for the pipeline)
- Ollama (for local LLM)
- NVIDIA GPU (recommended for faster LLM inference) with appropriate drivers
# Clone the repository
git clone https://github.com/cskillzmartin/script-bot.git
cd script-bot
# Start MongoDB, Qdrant, and Ollama (with GPU acceleration if available)
docker-compose up -d
# Wait for services to start (check logs if needed)
docker-compose logs -f ollama
# Pull the Llama 3 model (required for script generation)
docker exec -it ollama ollama pull llama3
# Install dependencies
pip install ddgs trafilatura httpx pydantic pydantic-settings sentence-transformers langchain langchain-community ollama qdrant-client pymongo structlog typer rich

Create a .env file for optional API keys:
# .env
SERPAPI_API_KEY=your_serpapi_api_key_here # Optional: for enhanced search
MONGODB_URL=mongodb://admin:adminpass@localhost:27017
QDRANT_URL=http://localhost:6333
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3

# Using the CLI
python -m cli generate \
--subject "AI in healthcare" \
--scope "for business executives" \
--audience "healthcare executives" \
--length 10 # Length in minutes of SPOKEN content (at 150 words/min)
# Interactive mode (prompts for input)
python -m cli generate

Note: The --length parameter specifies the duration of spoken content only (pure reading time at ~150 words/minute). It does not include b-roll, music, transitions, or other production elements.
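At the assumed rate of 150 words per minute, the script's word budget is simple arithmetic:

```python
# Convert a target speaking time into a word budget.
WORDS_PER_MINUTE = 150  # default speaking rate used for timing

def target_word_count(length_minutes: int) -> int:
    # Spoken content only; b-roll and transitions are not counted.
    return length_minutes * WORDS_PER_MINUTE
```

So `--length 10` asks the writer agent for roughly 1,500 spoken words.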
# View recent workflow runs
python -m cli status
# List generated scripts
python -m cli scripts
# Find your script in outputs/scripts/
ls outputs/scripts/

The pipeline automatically saves all data to MongoDB and Qdrant during workflow execution:
MongoDB Collections (accessible via MongoDB Compass or CLI):
- `workflow_runs`: Session metadata and execution status
- `search_results`: All web search results with URLs and content
- `content_plans`: Structured narrative plans for each script
- `validated_content`: Fact-checked information with confidence scores
- `scripts`: Complete generated scripts with metadata
Qdrant Vector Store:
- All search results are embedded and stored for semantic retrieval
- Enables context-aware script generation
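Semantic retrieval over those embeddings boils down to nearest-neighbour search by cosine similarity. A minimal stdlib sketch of the idea (toy 3-dimensional vectors stand in for real sentence-transformer embeddings; Qdrant does the same ranking at scale with proper indexing):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, store, top_k=2):
    # store: list of (text, embedding) pairs, as the vector DB would hold them.
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

During script writing, the query vector is built from the section being drafted, so each section pulls back the most relevant research snippets.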
python -m cli generate \
--subject "AI in healthcare" \
--scope "transforming patient care" \
--audience "healthcare administrators" \
--length 8 \
--instructions "Focus on ROI and implementation strategies"

python -m cli generate \
--subject "Machine learning algorithms" \
--scope "in computer vision" \
--audience "computer vision engineers" \
--length 12 \
--instructions "Include code examples and technical implementation details"

python -m cli generate \
--subject "Blockchain technology" \
--scope "explained simply" \
--audience "curious non-technical adults" \
--length 6 \
--instructions "Use everyday analogies and avoid technical jargon"

The content generation pipeline provides several commands for managing your scripts and workflow data.
Create a complete YouTube script with autonomous research and writing.
# Interactive mode (prompts for all inputs)
python -m cli generate
# With all parameters specified
python -m cli generate \
--subject "Your topic here" \
--scope "Specific angle or focus" \
--audience "Target audience description" \
--length 10 \
--instructions "Optional: specific requirements"
# Example
python -m cli generate \
--subject "Cybersecurity trends" \
--scope "for small businesses" \
--audience "small business owners" \
--length 8 \
--instructions "Focus on practical, affordable solutions"

Parameters:
- `--subject`: Main topic/subject (required)
- `--scope`: Specific scope or angle (required)
- `--audience`: Target audience; can be ANY free-form description (required). Examples: "CTOs & CIOs", "marketing managers", "high school teachers", "healthcare professionals"
- `--length`: Target length in minutes of spoken content (default: 10, range: 1-30)
- `--instructions`: Additional specific instructions (optional)
- `--verbose`: Enable verbose logging (optional)
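The documented 1-30 minute range implies a small validation step before the workflow starts. A hedged sketch (this helper name is an assumption, not the CLI's actual code):

```python
# Guard the --length parameter against out-of-range values.

def validate_length(minutes: int) -> int:
    if not 1 <= minutes <= 30:
        raise ValueError(f"--length must be between 1 and 30 minutes, got {minutes}")
    return minutes
```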
Check the status of recent workflow runs and see execution analytics.
# View recent workflow runs
python -m cli status

Shows:
- Session IDs of recent runs
- Status (completed, in_progress, error)
- Created timestamp
- Duration in seconds
- Analytics summary (count and average duration by status)
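The analytics summary is a group-by over stored run records. A sketch assuming each run is a dict with `status` and `duration_seconds` keys (the real MongoDB field names may differ):

```python
from collections import defaultdict

def analytics_summary(runs):
    # Group workflow runs by status; report count and mean duration per group.
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["status"]].append(run["duration_seconds"])
    return {
        status: {"count": len(durations), "avg_duration": sum(durations) / len(durations)}
        for status, durations in grouped.items()
    }
```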
Display recently generated scripts with metadata.
# List last 5 scripts (default)
python -m cli scripts
# List last 10 scripts
python -m cli scripts --limit 10
# List last 20 scripts
python -m cli scripts --limit 20

Shows:
- Script title
- Subject
- Word count
- Creation timestamp
Parameters:
--limit: Number of recent scripts to show (default: 5)
Remove old workflow runs and associated data from MongoDB.
# Delete runs older than 30 days (default)
python -m cli cleanup
# Delete runs older than 7 days
python -m cli cleanup --days 7
# Delete runs older than 90 days
python -m cli cleanup --days 90

Parameters:
--days: Delete runs older than N days (default: 30)
What gets deleted:
- Workflow run records
- Associated search results
- Content plans
- Validated content
- Generated scripts
Note: This is permanent! Make sure to backup any scripts you want to keep from the outputs/scripts/ directory first.
- NVIDIA GPU with CUDA support (GTX 1060 or better recommended)
- NVIDIA Drivers installed and up to date
- Docker GPU Support configured
Prerequisites:
- NVIDIA GPU with recent drivers (Game Ready or Studio)
- Windows 10/11 with WSL 2 enabled
- Docker Desktop 4.30.0 or later
- WSL 2 backend enabled in Docker Desktop
Step-by-step setup:
# 1. Install/Update NVIDIA drivers on Windows (NOT in WSL)
# Download from: https://www.nvidia.com/Download/index.aspx
# Install the latest Game Ready or Studio driver for your GPU
# 2. Enable WSL 2 (if not already enabled)
# Run in PowerShell as Administrator:
wsl --install
# or
wsl --update
# 3. Verify GPU is accessible in WSL 2
wsl
nvidia-smi # Should show your GPU details
# 4. Enable GPU support in Docker Desktop:
# - Open Docker Desktop Settings
# - Go to "Resources" > "WSL Integration"
# - Enable integration with your WSL 2 distro
# - Go to "Docker Engine" and ensure it's using the WSL 2 backend
# - Restart Docker Desktop
# 5. Verify Docker can access GPU
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

Important Notes for Windows:
- Install NVIDIA drivers on Windows host, not inside WSL
- WSL 2 automatically passes through GPU access (no CUDA toolkit needed in WSL)
- Docker Desktop must be configured to use WSL 2 backend
- If GPU isn't detected, restart Docker Desktop after driver installation
Troubleshooting Windows GPU:
# Check WSL version (should be 2)
wsl --list --verbose
# Verify NVIDIA drivers in WSL
wsl
nvidia-smi
# Test Ollama GPU access
docker exec -it ollama nvidia-smi

# Install NVIDIA drivers and CUDA toolkit
# Ubuntu/Debian:
sudo apt update && sudo apt install nvidia-driver-535 cuda-toolkit-12-2
# Install NVIDIA Container Toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update && sudo apt install -y nvidia-container-toolkit
sudo systemctl restart docker
# Verify installation
nvidia-smi
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

The docker-compose.yml already includes GPU configuration:
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]

Note: GPU acceleration provides 5-10x faster LLM inference compared to CPU-only execution.
CPU Fallback: If no GPU is available, Ollama will automatically fall back to CPU inference (slower but still functional).
| Variable | Default | Description |
|---|---|---|
| `SERPAPI_API_KEY` | None | SerpAPI search key (optional) |
| `MONGODB_URL` | `mongodb://admin:adminpass@localhost:27017` | MongoDB connection |
| `QDRANT_URL` | `http://localhost:6333` | Qdrant vector store |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama service URL |
| `OLLAMA_MODEL` | `llama3` | LLM model to use (GPU-accelerated when available) |
| `MAX_SEARCH_RESULTS` | `10` | Maximum search results |
| `WORDS_PER_MINUTE` | `150` | Speaking rate for timing |
outputs/
└── scripts/ # Generated script files (.txt)
agents/ # Agent implementations
├── __init__.py
├── orchestrator.py # Workflow coordinator
├── search.py # Web research agent
├── validation.py # Fact-checking agent
├── audience.py # Content structuring agent
└── script_writer.py # Script generation agent
database/ # Database clients
├── __init__.py
├── mongodb_client.py # MongoDB operations
└── qdrant_client.py # Vector store operations
models.py # Pydantic data models
config.py # Configuration and settings
cli.py # Command-line interface
# Check if services are running
docker-compose ps
# View service logs
docker-compose logs ollama
docker-compose logs qdrant
docker-compose logs mongo
# View recent workflow runs with status and duration
python -m cli status
# List generated scripts
python -m cli scripts --limit 10
# Clean up old data
python -m cli cleanup --days 30

Common Issues:
- Ollama model not found: Run `ollama pull llama3` in the Ollama container
- Database connection issues: Verify `docker-compose.yml` services are running
- Script generation failures: Check Ollama logs for model loading issues
- Search failures: DuckDuckGo search is used by default (no API key needed)
- GPU acceleration not working: Ensure NVIDIA drivers are installed and Docker has GPU access
# In your code or configuration
from config import settings
# Customize search behavior
settings.max_search_results = 15
settings.search_timeout_seconds = 45
settings.include_domains = ["nih.gov", "cdc.gov"]
settings.exclude_domains = ["spam.example.com"]

from agents.orchestrator import OrchestratorAgent
from models import UserInput
# Create orchestrator
orchestrator = OrchestratorAgent()
# Define input
user_input = UserInput(
subject="AI in healthcare",
scope="for business executives",
target_audience="healthcare executives", # Free-form audience description
target_length_minutes=10
)
# Execute workflow
workflow_state = await orchestrator.execute_workflow(user_input)
# Access results
script = workflow_state["final_script"]
print(f"Generated: {script.title}")

# Process multiple subjects (future feature)
python -m cli batch \
--input-file subjects.json \
--output-dir batch_output/

- Concurrent Processing: Up to 5 parallel searches
- Content Processing: Handles 10-15 search results per workflow
- Script Length: Optimized for 1-30 minute videos
- Storage: Scales with MongoDB and Qdrant clustering
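The cap of five parallel searches is the classic asyncio semaphore pattern. A minimal sketch (the search function is a stand-in for the real web-search call):

```python
import asyncio

MAX_CONCURRENT_SEARCHES = 5  # matches the documented concurrency cap

async def bounded_search(query, semaphore):
    # The semaphore ensures at most MAX_CONCURRENT_SEARCHES run at once.
    async with semaphore:
        await asyncio.sleep(0)  # stand-in for the real network call
        return f"results for {query!r}"

async def run_searches(queries):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_SEARCHES)
    return await asyncio.gather(*(bounded_search(q, semaphore) for q in queries))
```

`gather` preserves input order, so results line up with their queries even though the calls complete concurrently.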
- Massive Performance Boost: 5-10x faster LLM inference with GPU
- Lower CPU Usage: Reduces system load significantly
- Better Responsiveness: Faster script generation and research
- Use authoritative domains in search filters for better quality
- Adjust `WORDS_PER_MINUTE` based on speaking style
- Monitor vector store growth for performance tuning
- Consider content caching for frequently requested topics
- Enable GPU acceleration for optimal performance (see the GPU setup section above)
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
# Clone and setup
git clone https://github.com/cskillzmartin/script-bot.git
cd script-bot
# Install dependencies
pip install ddgs trafilatura httpx pydantic pydantic-settings sentence-transformers langchain langchain-community ollama qdrant-client pymongo structlog typer rich
## 📝 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- **LangChain** for the multi-agent framework
- **Ollama** for local LLM capabilities with GPU acceleration
- **Qdrant** for vector storage
- **MongoDB** for metadata persistence
- **DuckDuckGo** for free web search
- **NVIDIA** for GPU acceleration support
- The open-source community for amazing tools and libraries
**Made with ❤️ by cskillzmartin**