Core Rate Limiter

A compact, high-performance rate limiting service and library supporting Token Bucket and Leaky Bucket algorithms with per-IP/per-user/per-key limiting, comprehensive observability, and production-ready features.

Features

  • 🪣 Dual Algorithms: Token Bucket (burstiness) and Leaky Bucket (pacing)
  • 🔑 Flexible Key Derivation: IP, JWT subject, or custom headers
  • 📊 Full Observability: Prometheus metrics, Grafana dashboards
  • 🌙 Shadow Mode: Log-only testing for safe rollouts
  • 🏃 Multiple Backends: In-memory or Redis with atomic Lua scripts
  • 📝 Standard Headers: X-RateLimit-* and IETF RateLimit-* styles
  • 🚫 Penalty Box: Temporary bans for repeat violators
  • 🎯 Hot Key Detection: Identify and monitor high-traffic keys
  • 🔧 Admin API: Policy management, key inspection, health checks
  • 🐳 Docker Ready: Complete stack with Docker Compose

Quick Start

# Setup environment
make setup
make config-env

# Start observability stack
make up

# Option 1: Memory backend (fastest setup)
make api

# Option 2: Redis backend (production-like)
make redis-up
make api-redis

# Run load tests
make load-smoke
make load-spike

# View dashboards
make urls

Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Demo API      │    │  Rate Limiter    │    │    Storage      │
│                 │    │                  │    │                 │
│ ┌─────────────┐ │    │ ┌──────────────┐ │    │ ┌─────────────┐ │
│ │ aiohttp     │ │───▶│ │ Middleware   │ │───▶│ │ Memory/     │ │
│ │ middleware  │ │    │ │              │ │    │ │ Redis       │ │
│ └─────────────┘ │    │ └──────────────┘ │    │ └─────────────┘ │
│                 │    │ ┌──────────────┐ │    │                 │
│ ┌─────────────┐ │    │ │ Algorithms   │ │    │                 │
│ │ Endpoints   │ │    │ │ • Token      │ │    │                 │
│ │ • /api/echo │ │    │ │ • Leaky      │ │    │                 │
│ │ • /api/orders│ │    │ └──────────────┘ │    │                 │
│ └─────────────┘ │    └──────────────────┘    └─────────────────┘
└─────────────────┘                                               
                                                                  
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Monitoring    │    │   Load Testing   │    │   Management    │
│                 │    │                  │    │                 │
│ ┌─────────────┐ │    │ ┌──────────────┐ │    │ ┌─────────────┐ │
│ │ Prometheus  │ │    │ │ k6 Scripts   │ │    │ │ Admin API   │ │
│ └─────────────┘ │    │ │ • Smoke      │ │    │ │ • Policies  │ │
│ ┌─────────────┐ │    │ │ • Spike      │ │    │ │ • Hot Keys  │ │
│ │ Grafana     │ │    │ │ • Fairness   │ │    │ │ • Reset     │ │
│ └─────────────┘ │    │ └──────────────┘ │    │ └─────────────┘ │
└─────────────────┘    └──────────────────┘    └─────────────────┘

Algorithm Comparison

Feature           | Token Bucket                      | Leaky Bucket
------------------|-----------------------------------|----------------------
Burst Handling    | ✅ Allows bursts up to capacity   | ❌ Strict pacing
Smoothing         | ❌ Variable rate                  | ✅ Consistent rate
Use Case          | API rate limiting, bursty traffic | Traffic shaping, QoS
Client Experience | Fast when tokens available        | Predictable delays
Implementation    | Simpler                           | More complex

Token Bucket Formula

tokens = min(capacity, tokens + (time_elapsed * rate))
allow = tokens >= cost

Leaky Bucket Formula

queue_depth = max(0, current_depth - (time_elapsed * leak_rate))
allow = queue_depth + cost <= capacity
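The drain-then-enqueue step can be sketched the same way (again with illustrative names, not the library's API):

```python
def leaky_bucket_check(state, capacity, leak_rate, cost=1, now=0.0):
    """Drain the queue for elapsed time, then try to enqueue `cost`."""
    elapsed = max(0.0, now - state["last"])
    # queue_depth = max(0, current_depth - time_elapsed * leak_rate)
    depth = max(0.0, state["depth"] - elapsed * leak_rate)
    allow = depth + cost <= capacity
    if allow:
        depth += cost
    return allow, {"depth": depth, "last": now}

# Capacity 2, leaking 1 unit/sec: two back-to-back requests fill the
# queue, the third is rejected, and one second of draining admits a fourth.
state = {"depth": 0.0, "last": 0.0}
a, state = leaky_bucket_check(state, 2, 1.0, now=0.0)  # depth 0 -> 1
b, state = leaky_bucket_check(state, 2, 1.0, now=0.0)  # depth 1 -> 2
c, state = leaky_bucket_check(state, 2, 1.0, now=0.0)  # full, rejected
d, state = leaky_bucket_check(state, 2, 1.0, now=1.0)  # drained, allowed
```

Unlike the token bucket, admission here is paced by `leak_rate` regardless of how long the bucket has sat idle, which is the smoothing behaviour noted above.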

Configuration

Policy Configuration (config/policies.yaml)

defaults:
  algo: token
  rate_per_sec: 10
  burst: 30
  headers_style: x-ratelimit

routes:
  - id: public-api
    match:
      path_prefix: "/api/public"
    key_derivation: ip
    algo: token
    rate_per_sec: 8
    burst: 16

  - id: user-orders
    match:
      regex: "^/api/orders(/.*)?$"
    key_derivation: jwt.sub
    algo: token
    rate_per_sec: 5
    burst: 10
    method_costs:
      POST: 2
      DELETE: 3
    penalty_box:
      enabled: true
      fail_threshold: 5
      ttl_sec: 60

  - id: heavy-processing
    match:
      path_prefix: "/api/heavy"
    key_derivation: "header:X-Api-Key"
    algo: leaky
    leak_per_sec: 2
    capacity: 10
    shadow: false
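A hypothetical sketch of how a request path might be resolved against route entries like these, checking `path_prefix` and `regex` clauses in order and falling back to the defaults (function and variable names are assumptions, not the library's API):

```python
import re

POLICIES = [
    {"id": "public-api", "path_prefix": "/api/public", "rate_per_sec": 8},
    {"id": "user-orders", "regex": r"^/api/orders(/.*)?$", "rate_per_sec": 5},
]
DEFAULTS = {"id": "defaults", "rate_per_sec": 10}

def match_route(path):
    """Return the first policy whose match clause fits the path."""
    for policy in POLICIES:
        if "path_prefix" in policy and path.startswith(policy["path_prefix"]):
            return policy
        if "regex" in policy and re.match(policy["regex"], path):
            return policy
    return DEFAULTS  # the `defaults:` block applies when no route matches
```

With these two entries, `/api/orders/42` resolves to `user-orders` while an unlisted path such as `/healthz` picks up the defaults.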

Environment Variables

# Core Configuration
WITH_REDIS=false              # Use Redis backend
DEFAULT_ALGO=token           # Default algorithm
HEADERS_STYLE=x-ratelimit    # Header style
KEY_DERIVATION=ip            # Default key derivation

# Shadow Mode
WITH_SHADOW=false            # Global shadow mode

# Performance
LOCAL_RPS=400               # Target RPS for load tests
ACTORS=5000                 # Number of simulated users
KEY_CARDINALITY=1000        # Number of unique keys

# Ports
API_PORT=8080               # Demo API
RL_PORT=8085                # Rate limiter service
PROM_PORT=9090              # Prometheus
GRAFANA_PORT=3000           # Grafana
REDIS_PORT=6379             # Redis

# Clock
CLOCK_SKEW_MS=0             # Simulated clock skew

Key Derivation Strategies

1. IP-Based (ip)

key_derivation: ip
  • Uses client IP address
  • Supports X-Forwarded-For, X-Real-IP headers
  • Good for: Public APIs, DDoS protection

2. JWT Subject (jwt.sub)

key_derivation: jwt.sub
  • Extracts sub claim from JWT token
  • No signature verification (for performance)
  • Good for: User-specific limits

3. Header-Based (header:X-Header-Name)

key_derivation: "header:X-Api-Key"
  • Uses value from specified header
  • Good for: API key-based limiting
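The three strategies above can be sketched as one dispatch function (a hypothetical illustration, not the library's internals; note the JWT branch decodes the payload without verifying the signature, as stated above):

```python
import base64
import json

def derive_key(strategy, route_id, headers, peer_ip):
    """Map a request to a rate-limit key per the configured strategy."""
    if strategy == "ip":
        # Honour proxy headers before falling back to the socket peer.
        ip = headers.get("X-Forwarded-For", headers.get("X-Real-IP", peer_ip))
        return f"{route_id}:ip:{ip.split(',')[0].strip()}"
    if strategy == "jwt.sub":
        # Decode the payload segment only; no signature verification.
        token = headers.get("Authorization", "").removeprefix("Bearer ").strip()
        payload = token.split(".")[1]
        payload += "=" * (-len(payload) % 4)  # restore base64 padding
        sub = json.loads(base64.urlsafe_b64decode(payload))["sub"]
        return f"{route_id}:user:{sub}"
    if strategy.startswith("header:"):
        name = strategy.split(":", 1)[1]
        return f"{route_id}:hdr:{headers.get(name, 'anonymous')}"
    raise ValueError(f"unknown strategy: {strategy}")
```

Prefixing the key with the route id keeps counters for the same client isolated per route.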

Rate Limit Headers

X-RateLimit Style (Default)

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 75
X-RateLimit-Reset: 1640995800
X-RateLimit-Cost: 2
Retry-After: 30

IETF Style

RateLimit-Limit: 100;w=3600
RateLimit-Remaining: 75
RateLimit-Reset: 1825
Retry-After: 30
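A sketch of rendering one decision in either style (hypothetical helper, not the library's API). The key difference: the X-RateLimit style reports Reset as an absolute Unix timestamp, while the IETF draft style reports it as seconds from now:

```python
def build_headers(style, limit, remaining, reset_epoch, now, window=3600):
    """Render a rate-limit decision in the requested header style."""
    if style == "x-ratelimit":
        return {
            "X-RateLimit-Limit": str(limit),
            "X-RateLimit-Remaining": str(remaining),
            "X-RateLimit-Reset": str(reset_epoch),  # absolute epoch seconds
        }
    # IETF draft style: Limit carries the window, Reset is a delta.
    return {
        "RateLimit-Limit": f"{limit};w={window}",
        "RateLimit-Remaining": str(remaining),
        "RateLimit-Reset": str(max(0, reset_epoch - now)),
    }
```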

Shadow Mode

Shadow mode allows testing rate limiting policies without blocking traffic:

routes:
  - id: test-route
    shadow: true  # Log blocks but don't enforce

Features:

  • ✅ All requests allowed
  • 📝 Logs would-be blocks
  • 📊 Metrics recorded
  • 🏷️ Special headers added
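The behaviour reduces to a small wrapper around the enforcement decision (an illustrative sketch, not the actual middleware code):

```python
def decide(allowed, shadow, log):
    """Shadow mode records would-be blocks but never enforces them."""
    if allowed:
        return True
    if shadow:
        # In the real service this also bumps rl_shadow_would_block_total.
        log.append("would_block")
        return True   # request proceeds anyway
    return False      # enforced block
```

Comparing the would-block log against live traffic is what makes rollouts safe: a policy can be tuned until its shadow block rate looks right, then flipped to enforcing.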

Observability

Metrics

# Request metrics
rl_requests_total{route,decision,algo,backend}
rl_tokens_remaining{route,algo}
rl_retry_after_seconds_bucket{route,algo}

# Shadow mode
rl_shadow_would_block_total{route,algo}

# Penalty box
rl_penalty_box_total{route,action}

# Storage performance
rl_storage_latency_seconds_bucket{backend,operation}
rl_active_keys{route}

# Health
rl_backend_health{backend}

Dashboards

Access Grafana at http://localhost:3000 (admin/admin) for:

  • Request Rate & Decisions: Allow/block rates by route
  • Response Times: P50/P95/P99 latencies
  • Token Levels: Remaining capacity by route
  • Hot Keys: Most active rate limiting keys
  • Storage Health: Redis/memory performance
  • Penalty Box Activity: Temporary bans
  • Shadow Mode Analysis: Would-block events

Load Testing

Smoke Test

make load-smoke

Basic functionality test with light load.

Spike Test

make load-spike LOCAL_RPS=800 ACTORS=10000

High-load test with configurable parameters.

Fairness Test

make load-fairness KEY_CARDINALITY=5000

Multi-user fairness and hot-key detection.

API Endpoints

Demo API

  • GET /healthz - Health check
  • GET/POST /api/echo - Echo endpoint (IP-based limiting)
  • GET /api/heavy?delay=N - Heavy processing (leaky bucket)
  • GET /api/burst - Burst testing
  • GET/POST/DELETE /api/orders - Orders (JWT-based limiting)
  • GET /api/data - Data endpoint (API key-based)

Admin API

  • GET /admin/policies - View policies
  • POST /admin/reload - Reload configuration
  • GET /admin/keys/hot?limit=10 - Hot keys
  • POST /admin/keys/reset - Reset specific key
  • POST /admin/shadow - Toggle shadow mode

Observability Endpoints

  • GET /metrics - Prometheus metrics
  • GET /healthz - Health check

Storage Backends

Memory Backend

  • ✅ Zero dependencies
  • ✅ Low latency
  • ✅ Lock-based concurrency
  • ❌ Single instance only
  • ❌ No persistence
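A minimal sketch of the lock-based, single-process approach described above (illustrative; the real backend's interface may differ):

```python
import threading

class MemoryStore:
    """In-process store guarding bucket state with a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def update(self, key, fn, default):
        """Atomically apply `fn` to the state under `key`.

        `fn` receives the current state (or `default`) and returns
        (result, new_state); holding the lock across the read-modify-write
        is what makes the check-and-consume step race-free.
        """
        with self._lock:
            state = self._data.get(key, default)
            result, new_state = fn(state)
            self._data[key] = new_state
            return result
```

The Redis backend achieves the same atomicity differently, by running the read-modify-write inside a Lua script on the server.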

Redis Backend

  • ✅ Distributed
  • ✅ Atomic Lua scripts
  • ✅ Persistent
  • ✅ Clock skew tolerance
  • ⚠️ Network latency
  • ⚠️ Additional dependency

Production Deployment

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rate-limiter
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rate-limiter
  template:
    metadata:
      labels:
        app: rate-limiter
    spec:
      containers:
      - name: rate-limiter
        image: rate-limiter-service:latest
        ports:
        - containerPort: 8085
        env:
        - name: WITH_REDIS
          value: "true"
        - name: REDIS_URL
          value: "redis://redis-service:6379"

Performance Tuning

Memory Backend

  • Adjust cleanup_interval for GC frequency
  • Monitor rl_active_keys metric
  • Consider TTL values vs memory usage

Redis Backend

  • Use Redis cluster for scale
  • Monitor Lua script performance
  • Tune clock_skew_tolerance_ms
  • Consider Redis pipeline for bulk operations

General

  • Profile hot paths with rl_request_latency_seconds
  • Monitor rl_storage_latency_seconds
  • Scale horizontally behind load balancer

Troubleshooting

Common Issues

High Latency

make debug-metrics | grep latency

Check storage backend performance.

Missing Rate Limit Headers

make debug-policies

Verify route configuration matches request paths.

Redis Connection Issues

make debug-redis

Check Redis connectivity and health.

Unexpected Blocking

curl http://localhost:8085/admin/keys/hot

Look for hot keys or penalty box activations.

Debug Commands

make status           # Service health
make debug-metrics    # Current metrics  
make debug-policies   # Active policies
make debug-hot-keys   # Hot keys
make debug-redis      # Redis status

Development

Setup

make setup-dev        # Install dev dependencies
make check            # Run linting and type checks
make test             # Run test suite
make test-cov         # Test coverage report

Code Quality

make lint             # Flake8 linting
make format           # Black formatting  
make type-check       # MyPy type checking

Testing Strategy

Unit Tests (test/unit/)

  • Algorithm math correctness
  • Key derivation logic
  • Header formatting
  • Shadow mode behavior

Integration Tests (test/integration/)

  • Storage backend atomicity
  • Policy loading and matching
  • Concurrency under load

E2E Tests (test/e2e/)

  • Full request flow
  • k6 load test validation
  • Multi-service interaction

Extending

Custom Storage Backend

from typing import Optional

class CustomStorage:
    async def get(self, key: str) -> Optional[BucketState]:
        # Return the stored state, or None if the key is unknown
        ...

    async def set(self, key: str, state: BucketState, ttl: int) -> None:
        # Persist the state with a time-to-live in seconds
        ...

Custom Key Derivation

def custom_key_deriver(strategy: str, route_id: str, request_data: dict) -> str:
    if strategy == "custom:device_id":
        device_id = request_data["headers"].get("X-Device-ID")
        return f"{route_id}:device:{device_id}"
    raise KeyDerivationError(f"Unknown strategy: {strategy}")

Custom Algorithm

from typing import Tuple

class CustomBucket:
    def check_and_consume(self, state: BucketState, cost: int) -> Tuple[RateLimitResult, BucketState]:
        # Return the allow/deny decision plus the updated bucket state
        ...
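As a concrete, hypothetical example following that shape, here is a fixed-window algorithm with plain dicts standing in for the library's BucketState and RateLimitResult types:

```python
class FixedWindowBucket:
    """Example custom algorithm: at most `limit` units per `window_sec`."""

    def __init__(self, limit, window_sec):
        self.limit = limit
        self.window_sec = window_sec

    def check_and_consume(self, state, cost, now):
        window = int(now // self.window_sec)
        if state.get("window") != window:
            state = {"window": window, "used": 0}  # new window: reset count
        allowed = state["used"] + cost <= self.limit
        if allowed:
            state = {**state, "used": state["used"] + cost}
        result = {"allowed": allowed, "remaining": self.limit - state["used"]}
        return result, state
```

Returning the new state rather than mutating in place keeps the algorithm independent of which storage backend persists it.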

Performance Benchmarks

Memory Backend

  • Throughput: 50,000+ RPS (single instance)
  • Latency: P95 < 1ms, P99 < 5ms
  • Memory: ~100 bytes per active key

Redis Backend

  • Throughput: 10,000+ RPS (single Redis)
  • Latency: P95 < 10ms, P99 < 50ms
  • Overhead: ~200 bytes per key + network

Scaling Characteristics

  • Memory: Linear scaling per instance
  • Redis: Horizontal scaling with cluster
  • Hot Keys: Automatic detection and monitoring

License

MIT License - see LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass (make test)
  5. Run code quality checks (make check)
  6. Submit pull request

Support

  • 📖 Full documentation in /docs
  • 🐛 Issues: GitHub Issues
  • 💬 Discussions: GitHub Discussions
  • 📧 Security: security@example.com
