diff --git a/README.md b/README.md
index f9dd2d4..58385c4 100644
--- a/README.md
+++ b/README.md
@@ -1,207 +1,220 @@
 # RustKernels
 
-**High-performance GPU kernel library for financial services, compliance, and enterprise analytics.**
+**GPU-accelerated kernel library for financial services, compliance, and enterprise analytics.**
 
 [![License](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE)
 [![Rust](https://img.shields.io/badge/rust-1.85%2B-orange.svg)](https://www.rust-lang.org)
 [![Documentation](https://img.shields.io/badge/docs-online-green.svg)](https://mivertowski.github.io/RustKernels/)
-[![Version](https://img.shields.io/badge/version-0.2.0-blue.svg)](CHANGELOG.md)
-
----
-
-## Why RustKernels?
-
-Financial institutions face a common challenge: implementing high-performance analytics that scale from batch processing to real-time streaming while maintaining regulatory compliance. RustKernels solves this by providing:
-
-- **Battle-tested algorithms** ported from production C# Orleans grains
-- **Nanosecond-scale latency** for real-time fraud detection and order matching
-- **Unified API** across 14 specialized domains
-- **GPU acceleration** with automatic CPU fallback
-
-```rust
-// Detect AML patterns in transaction graphs - sub-millisecond response
-let detector = CircularFlowRatio::new();
-let risk_score = detector.compute(&transaction_graph)?;
-if risk_score > 0.8 {
-    alert_compliance_team(&transaction);
-}
-```
+[![Version](https://img.shields.io/badge/version-0.4.0-blue.svg)](CHANGELOG.md)
+[![RingKernel](https://img.shields.io/badge/ringkernel-0.4.2-purple.svg)](https://crates.io/crates/ringkernel-core)
 
 ---
 
 ## Overview
 
-RustKernels provides **106 GPU-accelerated algorithms** across **14 domain-specific crates**, purpose-built for financial institutions, compliance teams, and enterprise analytics platforms.
+RustKernels delivers **106 production-ready GPU kernels** across **14 domain-specific crates**, purpose-built for financial institutions, compliance operations, and enterprise analytics platforms. It is the Rust port of the DotCompute GPU kernel library, built on the [RingKernel 0.4.2](https://crates.io/crates/ringkernel-core) persistent actor runtime.
+
+Version 0.4.0 provides full end-to-end kernel execution through REST, gRPC, Tower middleware, and Actix actor interfaces — no stubs, no mocks.
 
-### What Makes RustKernels Different
+### Key Capabilities
 
-| Challenge | RustKernels Solution |
-|-----------|---------------------|
-| Latency requirements vary widely | Dual execution modes: Batch (10-50μs) or Ring (100-500ns) |
-| Complex multi-kernel workflows | Built-in K2K coordination patterns |
-| Production reliability concerns | Ported from battle-tested C# implementations |
-| GPU availability uncertainty | Automatic CPU fallback when CUDA unavailable |
-| Regulatory explainability | SHAP values and feature importance kernels |
-| Enterprise security requirements | Auth, RBAC, multi-tenancy, secrets management |
-| Production observability | Metrics, tracing, logging, alerting |
-| Fault tolerance | Circuit breakers, retry, health checks |
-| Service deployment | REST, gRPC, Actix actor integrations |
+| Requirement | Solution |
+|---|---|
+| Diverse latency profiles | **Batch mode** (10–50 μs launch) and **Ring mode** (100–500 ns message latency) |
+| Multi-kernel orchestration | Built-in K2K coordination: scatter-gather, fan-out, pipeline |
+| Production deployment | REST (Axum), gRPC (Tonic), Tower middleware, Actix actors |
+| Enterprise security | JWT/API key auth, RBAC, multi-tenancy, secrets management |
+| Observability | Prometheus metrics, OTLP tracing, structured logging, SLO alerting |
+| Fault tolerance | Circuit breakers, exponential retry, timeout propagation, health probes |
+| GPU availability | Automatic CPU fallback when CUDA is unavailable |
+| Regulatory explainability | SHAP values, feature importance, audit trail kernels |
 
 ---
 
-## Performance Characteristics
+## Performance
 
 | Metric | Batch Mode | Ring Mode |
-|--------|------------|-----------|
-| Launch overhead | 10-50μs | N/A (persistent) |
-| Message latency | N/A | 100-500ns |
-| State location | CPU → GPU transfer | GPU-resident |
-| Throughput (PageRank) | ~100K nodes/sec | ~500K updates/sec |
-| Memory efficiency | Standard | Optimized (persistent) |
-
-### When to Use Each Mode
-
-**Batch Mode** - Best for scheduled, heavy computation:
-- End-of-day risk aggregation
-- Batch AML screening (millions of transactions)
-- Monthly compliance reporting
-- Model training and backtesting
-
-**Ring Mode** - Best for real-time, high-frequency operations:
-- Order book matching (sub-millisecond)
-- Real-time fraud scoring
-- Streaming anomaly detection
-- Live transaction monitoring
+|---|---|---|
+| Launch overhead | 10–50 μs | N/A (persistent) |
+| Message latency | N/A | 100–500 ns |
+| State location | CPU memory, transferred per invocation | GPU-resident, zero-copy ring buffers |
+| Throughput (PageRank) | ~100K nodes/s | ~500K updates/s |
+
+**Batch mode** suits scheduled, compute-heavy workloads such as end-of-day risk aggregation, batch AML screening, model training, and compliance reporting.
+
+**Ring mode** targets high-frequency, latency-sensitive operations: order book matching, real-time fraud scoring, streaming anomaly detection, live transaction monitoring.
 
 ---
 
 ## Domain Coverage
 
 | Domain | Crate | Kernels | Key Algorithms |
-|--------|-------|---------|----------------|
-| **Graph Analytics** | `rustkernel-graph` | 28 | PageRank, Louvain, GNN inference, graph attention, cycle detection |
-| **Statistical ML** | `rustkernel-ml` | 17 | K-Means, DBSCAN, isolation forest, federated learning, SHAP |
-| **Compliance** | `rustkernel-compliance` | 11 | Circular flow detection, rapid movement, sanctions screening |
-| **Temporal Analysis** | `rustkernel-temporal` | 7 | ARIMA, Prophet decomposition, change point detection |
-| **Risk Analytics** | `rustkernel-risk` | 5 | Monte Carlo VaR, credit scoring, stress testing |
-| **Process Intelligence** | `rustkernel-procint` | 7 | DFG construction, conformance checking, digital twin |
-| **Behavioral Analytics** | `rustkernel-behavioral` | 6 | Profiling, forensic queries, causal analysis |
-| **Treasury** | `rustkernel-treasury` | 5 | Liquidity optimization, FX hedging, NSFR calculation |
-| **Clearing** | `rustkernel-clearing` | 5 | Multilateral netting, DVP matching, settlement |
-| **Accounting** | `rustkernel-accounting` | 9 | Network generation, GL reconciliation, GAAP detection |
-| **Banking** | `rustkernel-banking` | 1 | Fraud pattern matching (Aho-Corasick) |
-| **Order Matching** | `rustkernel-orderbook` | 1 | High-frequency order book engine |
-| **Payments** | `rustkernel-payments` | 2 | Payment processing, flow analysis |
-| **Audit** | `rustkernel-audit` | 2 | Feature extraction, hypergraph construction |
+|---|---|---|---|
+| Graph Analytics | `rustkernel-graph` | 28 | PageRank, Louvain, GNN inference, graph attention, cycle detection |
+| Statistical ML | `rustkernel-ml` | 17 | K-Means, DBSCAN, isolation forest, federated learning, SHAP |
+| Compliance | `rustkernel-compliance` | 11 | Circular flow detection, rapid movement, sanctions screening |
+| Temporal Analysis | `rustkernel-temporal` | 7 | ARIMA, Prophet decomposition, change point detection |
+| Risk Analytics | `rustkernel-risk` | 5 | Monte Carlo VaR, credit scoring, stress testing, correlation |
+| Process Intelligence | `rustkernel-procint` | 7 | DFG construction, conformance checking, digital twin |
+| Behavioral Analytics | `rustkernel-behavioral` | 6 | Profiling, forensic queries, causal graph analysis |
+| Treasury | `rustkernel-treasury` | 5 | Liquidity optimization, FX hedging, NSFR calculation |
+| Clearing | `rustkernel-clearing` | 5 | Multilateral netting, DVP matching, settlement |
+| Accounting | `rustkernel-accounting` | 9 | Network generation, GL reconciliation, GAAP detection |
+| Banking | `rustkernel-banking` | 1 | Fraud pattern matching (Aho-Corasick) |
+| Order Matching | `rustkernel-orderbook` | 1 | Price-time priority order book engine |
+| Payments | `rustkernel-payments` | 2 | Payment processing, flow analysis |
+| Audit | `rustkernel-audit` | 2 | Feature extraction, hypergraph construction |
 
 ---
 
-## Use Cases
+## Installation
 
-### Anti-Money Laundering (AML)
+```toml
+[dependencies]
+rustkernels = "0.4.0"
+```
 
-Detect layering, structuring, and circular transaction patterns:
+### Feature Flags
 
-```rust
-use rustkernel::graph::cycles::ShortCycleParticipation;
-use rustkernel::compliance::circular_flow::CircularFlowRatio;
+```toml
+# Default features (graph, ml, compliance, temporal, risk)
+rustkernels = "0.4.0"
 
-// Detect nodes participating in suspicious cycles
-let cycle_detector = ShortCycleParticipation::new();
-let results = cycle_detector.compute_all(&transaction_graph);
+# Selective domain inclusion
+rustkernels = { version = "0.4.0", features = ["graph", "compliance", "procint"] }
 
-for result in results.iter().filter(|r| r.risk_level == CycleRiskLevel::Critical) {
-    // Nodes in 4-cycles are high-priority for investigation
-    flag_for_investigation(result.node_index);
-}
+# All domains
+rustkernels = { version = "0.4.0", features = ["full"] }
 
-// Compute circular flow ratios
-let cfr = CircularFlowRatio::new();
-let scores = cfr.compute_batch(&graph);
+# Enterprise ecosystem (REST/gRPC service)
+rustkernel-ecosystem = { version = "0.4.0", features = ["axum", "grpc"] }
 ```
 
-### Real-Time Fraud Detection
+### Requirements
 
-Score transactions in real-time with streaming anomaly detection:
+| Dependency | Version | Notes |
+|---|---|---|
+| Rust | 1.85+ | Edition 2024 |
+| RingKernel | 0.4.2 | GPU-native persistent actor runtime (crates.io) |
+| CUDA Toolkit | 12.0+ | Optional; CPU fallback when unavailable |
+
+---
+
+## Quick Start
 
 ```rust
-use rustkernel::ml::streaming::StreamingIsolationForest;
+use rustkernel::prelude::*;
+use rustkernel::graph::centrality::BetweennessCentrality;
+use rustkernel::graph::messages::CentralityInput;
 
-let detector = StreamingIsolationForest::new(StreamingConfig {
-    num_trees: 100,
-    sample_size: 256,
-    window_size: 10000,
-});
+#[tokio::main]
+async fn main() -> Result<()> {
+    // Create a registry and register kernels
+    let registry = KernelRegistry::new();
+    rustkernel::graph::register_all(&registry)?;
+
+    // Instantiate a kernel
+    let kernel = BetweennessCentrality::new();
+    println!("{} ({})", kernel.metadata().id, kernel.metadata().domain);
+
+    // Execute via the typed BatchKernel interface
+    let input = CentralityInput {
+        num_nodes: 4,
+        edges: vec![(0, 1), (1, 2), (2, 3), (0, 3)],
+    };
+    let result = kernel.execute(input).await?;
+    println!("Centrality scores: {:?}", result.scores);
 
-// Process incoming transactions
-for transaction in transaction_stream {
-    let score = detector.score(&transaction.features)?;
-    if score > 0.7 {
-        block_transaction(&transaction);
-    }
+    Ok(())
 }
 ```
 
-### Process Mining & Digital Twin
+### Type-Erased Execution via REST
 
-Simulate process changes before deployment:
+Kernels registered with `register_batch_typed()` are automatically available for execution through the ecosystem service layer:
 
 ```rust
-use rustkernel::procint::simulation::{DigitalTwin, SimulationConfig};
+use rustkernel_ecosystem::axum::{KernelRouter, RouterConfig};
+use rustkernel_core::registry::KernelRegistry;
+use std::sync::Arc;
 
-let twin = DigitalTwin::new();
+let registry = Arc::new(KernelRegistry::new());
+rustkernel::graph::register_all(&registry)?;
+rustkernel::ml::register_all(&registry)?;
 
-// Simulate adding 2 more resources to bottleneck activity
-let what_if = twin.simulate(&process_model, &SimulationConfig {
-    num_simulations: 1000,
-    resource_overrides: vec![("Review", 5)], // 3 → 5 reviewers
-    ..Default::default()
-})?;
+let router = KernelRouter::new(registry, RouterConfig::default());
+let app = router.into_router();
 
-println!("Projected improvement: {:.1}% faster",
-    (1.0 - what_if.avg_completion_time / baseline.avg_completion_time) * 100.0);
+// POST /api/v1/kernels/:kernel_id/execute
+// GET  /api/v1/kernels
+// GET  /health
+// GET  /metrics
+let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;
+axum::serve(listener, app).await?;
 ```
 
-### Graph Neural Networks for Entity Resolution
+---
 
-Link prediction and entity matching with GNN inference:
+## Architecture
 
-```rust
-use rustkernel::graph::gnn::{GNNInference, GNNConfig};
+```
+┌──────────────────────────────────────────────────────────────────────┐
+│                           rustkernels                                │
+│                      (facade crate, re-exports)                      │
+├──────────────────────────────────────────────────────────────────────┤
+│  rustkernel-core (0.4.0)       │  rustkernel-ecosystem (0.4.0)      │
+│  ├── traits (Gpu/Batch/Ring)   │  ├── axum (REST API)               │
+│  ├── registry (typed factory)  │  ├── tower (middleware)             │
+│  ├── security (auth, RBAC)     │  ├── grpc (Tonic server)           │
+│  ├── observability (metrics)   │  └── actix (actors)                │
+│  ├── resilience (circuit)      ├─────────────────────────────────────┤
+│  ├── runtime (lifecycle)       │  rustkernel-derive                  │
+│  ├── memory (pooling)          │  ├── #[gpu_kernel] macro            │
+│  ├── config (production)       │  └── #[derive(RingMessage)]         │
+│  └── k2k (coordination)       │                                     │
+├──────────────────────────────────────────────────────────────────────┤
+│                       Domain Crates (14)                             │
+│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐  │
+│  │  graph   │ │    ml    │ │compliance│ │ temporal │ │   risk   │  │
+│  │   (28)   │ │   (17)   │ │   (11)   │ │    (7)   │ │    (5)   │  │
+│  └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘  │
+│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐  │
+│  │ procint  │ │behavioral│ │ treasury │ │ clearing │ │accounting│  │
+│  │    (7)   │ │    (6)   │ │    (5)   │ │    (5)   │ │    (9)   │  │
+│  └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘  │
+│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐               │
+│  │ banking  │ │orderbook │ │ payments │ │  audit   │               │
+│  │    (1)   │ │    (1)   │ │    (2)   │ │    (2)   │               │
+│  └──────────┘ └──────────┘ └──────────┘ └──────────┘               │
+├──────────────────────────────────────────────────────────────────────┤
+│  rustkernel-cli              │  RingKernel 0.4.2 (crates.io)        │
+│  (kernel management CLI)     │  (GPU-native persistent actor runtime)│
+└──────────────────────────────────────────────────────────────────────┘
+```
 
-let gnn = GNNInference::new();
+**19 crates total**: 1 facade, 1 core, 1 derive, 1 ecosystem, 1 CLI, 14 domain crates.
 
-let embeddings = gnn.infer(&entity_graph, &node_features, &GNNConfig {
-    hidden_dim: 64,
-    num_layers: 2,
-    aggregation: AggregationType::Mean,
-})?;
+### Execution Model
 
-// Find similar entities via embedding similarity
-let matches = find_similar_embeddings(&embeddings, threshold: 0.9);
-```
+RustKernels supports two execution paths:
 
----
+1. **Typed execution** — Call `BatchKernel<I, O>::execute(input)` directly with compile-time type safety.
+2. **Type-erased execution** — Kernels registered via `register_batch_typed()` are wrapped in `TypeErasedBatchKernel`, enabling invocation through REST/gRPC with JSON serialization. The ecosystem layer handles `input → JSON bytes → execute_dyn() → JSON bytes → output` automatically.
 
-## Recent Additions
+Ring kernels require the RingKernel persistent actor runtime and are not callable through REST. They communicate through zero-copy ring buffers at sub-microsecond latency.
 
-The latest release introduces innovative kernel categories:
+### K2K Coordination
 
-| Category | Kernels | Description |
-|----------|---------|-------------|
-| **Graph Neural Networks** | GNNInference, GraphAttention | Message-passing GNN and multi-head attention for node classification |
-| **NLP/Embeddings** | EmbeddingGeneration, SemanticSimilarity | TF-IDF embeddings and document similarity |
-| **Federated Learning** | SecureAggregation | Privacy-preserving model aggregation with differential privacy |
-| **Healthcare Analytics** | DrugInteractionPrediction, ClinicalPathwayConformance | Clinical decision support kernels |
-| **Process Simulation** | DigitalTwin | Monte Carlo process simulation and what-if analysis |
-| **Streaming ML** | StreamingIsolationForest, AdaptiveThreshold | Online anomaly detection with concept drift handling |
-| **Explainability** | SHAPValues, FeatureImportance | Model interpretability for regulatory compliance |
+Cross-kernel orchestration patterns built on RingKernel 0.4.2 K2K messaging:
 
----
+- **IterativeState** — Track convergence across multi-pass algorithms (PageRank, K-Means)
+- **ScatterGatherState** — Parallel worker fan-out with result aggregation
+- **FanOutTracker** — Broadcast to multiple downstream kernels
+- **PipelineTracker** — Multi-stage sequential processing
 
-## Enterprise Features (0.2.0)
+---
 
-RustKernels 0.2.0 introduces production-ready enterprise capabilities:
+## Enterprise Features
 
 ### Security
 ```rust
@@ -210,9 +223,6 @@ use rustkernel_core::security::{SecurityContext, Role, KernelPermission};
 let ctx = SecurityContext::new(user_id, tenant_id)
     .with_roles(vec![Role::KernelExecutor])
     .with_permissions(vec![KernelPermission::Execute]);
-
-// Execute with security context
-kernel.execute_with_context(&ctx, input).await?;
 ```
 
 ### Resilience
@@ -225,150 +235,23 @@ let cb = CircuitBreaker::new(CircuitBreakerConfig {
     timeout: Duration::from_secs(30),
     ..Default::default()
 });
-
-// Execute with circuit breaker protection
-cb.call(|| kernel.execute(input)).await?;
 ```
 
 ### Production Configuration
 ```rust
 use rustkernel_core::config::ProductionConfig;
 
-// Load from environment
+// Load from environment or TOML
 let config = ProductionConfig::from_env()?;
-
-// Or use presets
-let config = ProductionConfig::production();
-
-// Validate before use
 config.validate()?;
 ```
 
-### Service Deployment
-```rust
-use rustkernel_ecosystem::axum::{KernelRouter, RouterConfig};
-
-let router = KernelRouter::new(registry, RouterConfig::default());
-let app = router.into_router();
-
-// Endpoints: /kernels, /execute, /health, /metrics
-axum::serve(listener, app).await?;
-```
-
----
-
-## Installation
-
-Add RustKernels to your `Cargo.toml`:
-
-```toml
-[dependencies]
-rustkernels = "0.2.0"
-```
-
-### Feature Flags
+### Observability
 
-Control which domains are compiled to optimize binary size:
-
-```toml
-# Default features (graph, ml, compliance, temporal, risk)
-rustkernels = "0.2.0"
-
-# Selective domain inclusion
-rustkernels = { version = "0.2.0", features = ["graph", "compliance", "procint"] }
-
-# All domains
-rustkernels = { version = "0.2.0", features = ["full"] }
-
-# Enterprise ecosystem (REST/gRPC service)
-rustkernel-ecosystem = { version = "0.2.0", features = ["axum", "grpc"] }
-```
-
----
-
-## Quick Start
-
-```rust
-use rustkernel::prelude::*;
-use rustkernel::graph::centrality::PageRank;
-
-#[tokio::main]
-async fn main() -> Result<()> {
-    // Create kernel instance
-    let kernel = PageRank::new();
-
-    // Access metadata
-    let metadata = kernel.metadata();
-    println!("Kernel: {} ({})", metadata.id, metadata.domain);
-
-    // Build input
-    let input = PageRankInput {
-        num_nodes: 1000,
-        edges: load_edges()?,
-        damping_factor: 0.85,
-        max_iterations: 100,
-        tolerance: 1e-6,
-    };
-
-    // Execute
-    let result = kernel.execute(input).await?;
-
-    println!("Converged in {} iterations", result.iterations);
-    println!("Top node: {} (score: {:.4})",
-        result.top_node(),
-        result.scores[result.top_node()]);
-
-    Ok(())
-}
-```
-
----
-
-## Architecture
-
-```
-┌─────────────────────────────────────────────────────────────────┐
-│                         rustkernels                              │
-│                    (facade crate, re-exports)                    │
-├─────────────────────────────────────────────────────────────────┤
-│  rustkernel-core (0.2.0)     │  rustkernel-ecosystem (0.2.0)    │
-│  ├── traits (Gpu/Batch/Ring) │  ├── axum (REST API)             │
-│  ├── security (auth, RBAC)   │  ├── tower (middleware)          │
-│  ├── observability (metrics) │  ├── grpc (Tonic server)         │
-│  ├── resilience (circuit)    │  └── actix (actors)              │
-│  ├── runtime (lifecycle)     ├──────────────────────────────────┤
-│  ├── memory (pooling)        │  rustkernel-derive               │
-│  ├── config (production)     │  - #[gpu_kernel] macro           │
-│  └── k2k (coordination)      │  - #[derive(RingMessage)]        │
-├─────────────────────────────────────────────────────────────────┤
-│                    Domain Crates (14)                            │
-│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐   │
-│  │  graph  │ │   ml    │ │complianc│ │temporal │ │  risk   │   │
-│  │  (28)   │ │  (17)   │ │  (11)   │ │   (7)   │ │   (5)   │   │
-│  └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘   │
-│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐   │
-│  │procint  │ │behavior │ │treasury │ │clearing │ │accounting│   │
-│  │   (7)   │ │   (6)   │ │   (5)   │ │   (5)   │ │   (9)   │   │
-│  └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘   │
-│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐               │
-│  │ banking │ │orderbook│ │payments │ │  audit  │               │
-│  │   (1)   │ │   (1)   │ │   (2)   │ │   (2)   │               │
-│  └─────────┘ └─────────┘ └─────────┘ └─────────┘               │
-├─────────────────────────────────────────────────────────────────┤
-│                    RustCompute / RingKernel 0.3.1                │
-│                    (GPU execution framework)                     │
-└─────────────────────────────────────────────────────────────────┘
-```
-
----
-
-## Requirements
-
-| Requirement | Version | Notes |
-|-------------|---------|-------|
-| Rust | 1.85+ | Edition 2024 features required |
-| RustCompute | Latest | RingKernel framework (path dependency) |
-| CUDA Toolkit | 12.0+ | Optional; falls back to CPU if unavailable |
+- **Prometheus-compatible metrics** — request counts, latency, per-domain kernel counts, error rates
+- **Distributed tracing** — OTLP export with kernel-level span instrumentation
+- **Structured logging** — JSON-formatted, kernel-context-aware
+- **SLO alerting** — configurable alert rules with multi-channel notification
 
 ---
 
@@ -378,36 +261,43 @@ async fn main() -> Result<()> {
 # Build entire workspace
 cargo build --workspace
 
-# Run all tests
+# Run all 895 tests
 cargo test --workspace
 
-# Test specific domain
+# Test a specific domain
 cargo test --package rustkernel-graph
 cargo test --package rustkernel-ml
 
-# Run benchmarks
-cargo bench --package rustkernel
+# Lint (warnings as errors)
+cargo clippy --all-targets --all-features -- -D warnings
+
+# Format
+cargo fmt --all
 
-# Generate documentation
+# Generate API documentation
 cargo doc --workspace --no-deps --open
 
-# Lint
-cargo clippy --all-targets --all-features -- -D warnings
+# Build mdBook documentation
+cd docs && mdbook build
 ```
 
 ---
 
 ## Documentation
 
-- **[Online Documentation](https://mivertowski.github.io/RustKernels/)** - Comprehensive guides and API reference
-- **[Kernel Catalogue](https://mivertowski.github.io/RustKernels/domains/)** - Complete listing of all 106 kernels
-- **[Architecture Guide](https://mivertowski.github.io/RustKernels/architecture/overview.html)** - System design and patterns
+| Resource | Description |
+|---|---|
+| [Online Documentation](https://mivertowski.github.io/RustKernels/) | Guides, architecture, and API reference |
+| [Kernel Catalogue](https://mivertowski.github.io/RustKernels/domains/) | All 106 kernels across 14 domains |
+| [Architecture Guide](https://mivertowski.github.io/RustKernels/architecture/overview.html) | System design, execution modes, K2K patterns |
+| [Enterprise Guide](https://mivertowski.github.io/RustKernels/enterprise/security.html) | Security, observability, resilience, runtime |
+| [API Docs](https://docs.rs/rustkernels) | Auto-generated Rust API documentation |
 
 ---
 
 ## Contributing
 
-Contributions are welcome. Please see [CONTRIBUTING.md](docs/src/appendix/contributing.md) for guidelines.
+Contributions are welcome. See [CONTRIBUTING.md](docs/src/appendix/contributing.md) for development setup, code style guidelines, and the pull request process.
 
 ---
 
@@ -418,6 +308,5 @@ Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for detai
 ---
 
 **Author**: Michael Ivertowski
-**Version**: 0.2.0
-**Kernels**: 106 across 14 domains
-**Crates**: 19 (including rustkernel-ecosystem)
+**Version**: 0.4.0 — Deep integration with RingKernel 0.4.2
+**Scope**: 106 kernels, 14 domains, 19 crates
diff --git a/crates/rustkernel-accounting/src/lib.rs b/crates/rustkernel-accounting/src/lib.rs
index 9b3d3db..462e4e1 100644
--- a/crates/rustkernel-accounting/src/lib.rs
+++ b/crates/rustkernel-accounting/src/lib.rs
@@ -41,50 +41,32 @@ pub use temporal::TemporalCorrelation;
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering accounting kernels");
 
-    // CoA mapping kernel (1)
-    registry.register_metadata(
-        coa_mapping::ChartOfAccountsMapping::new()
-            .metadata()
-            .clone(),
-    )?;
+    // CoA mapping kernel (1) — Batch
+    registry.register_ring_metadata_from(coa_mapping::ChartOfAccountsMapping::new)?;
 
-    // Journal kernel (1)
-    registry.register_metadata(journal::JournalTransformation::new().metadata().clone())?;
+    // Journal kernel (1) — Batch
+    registry.register_ring_metadata_from(journal::JournalTransformation::new)?;
 
-    // Reconciliation kernel (1)
-    registry.register_metadata(reconciliation::GLReconciliation::new().metadata().clone())?;
+    // Reconciliation kernel (1) — Batch
+    registry.register_ring_metadata_from(reconciliation::GLReconciliation::new)?;
 
-    // Network analysis kernel (1)
-    registry.register_metadata(network::NetworkAnalysis::new().metadata().clone())?;
+    // Network analysis kernel (1) — Batch
+    registry.register_ring_metadata_from(network::NetworkAnalysis::new)?;
 
-    // Temporal kernel (1)
-    registry.register_metadata(temporal::TemporalCorrelation::new().metadata().clone())?;
+    // Temporal kernel (1) — Batch
+    registry.register_ring_metadata_from(temporal::TemporalCorrelation::new)?;
 
-    // Network generation batch kernel (1)
-    registry.register_metadata(
-        network_generation::NetworkGeneration::new()
-            .metadata()
-            .clone(),
-    )?;
+    // Network generation batch kernel (1) — Batch
+    registry.register_batch_typed(network_generation::NetworkGeneration::new)?;
 
-    // Network generation ring kernel (1)
-    registry.register_metadata(
-        network_generation::NetworkGenerationRing::new()
-            .metadata()
-            .clone(),
-    )?;
+    // Network generation ring kernel (1) — Ring
+    registry.register_ring_metadata_from(network_generation::NetworkGenerationRing::new)?;
 
-    // Detection kernels (2)
-    registry.register_metadata(
-        detection::SuspenseAccountDetection::new()
-            .metadata()
-            .clone(),
-    )?;
-    registry.register_metadata(detection::GaapViolationDetection::new().metadata().clone())?;
+    // Detection kernels (2) — Batch
+    registry.register_ring_metadata_from(detection::SuspenseAccountDetection::new)?;
+    registry.register_ring_metadata_from(detection::GaapViolationDetection::new)?;
 
     tracing::info!("Registered 9 accounting kernels");
     Ok(())
diff --git a/crates/rustkernel-audit/src/lib.rs b/crates/rustkernel-audit/src/lib.rs
index deecd9b..a25e9bf 100644
--- a/crates/rustkernel-audit/src/lib.rs
+++ b/crates/rustkernel-audit/src/lib.rs
@@ -20,19 +20,13 @@ pub use types::*;
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering financial audit kernels");
 
-    // Feature extraction kernel (1)
-    registry.register_metadata(
-        feature_extraction::FeatureExtraction::new()
-            .metadata()
-            .clone(),
-    )?;
+    // Feature extraction kernel (1) - Batch
+    registry.register_batch_metadata_from(feature_extraction::FeatureExtraction::new)?;
 
-    // Hypergraph kernel (1)
-    registry.register_metadata(hypergraph::HypergraphConstruction::new().metadata().clone())?;
+    // Hypergraph kernel (1) - Batch
+    registry.register_batch_metadata_from(hypergraph::HypergraphConstruction::new)?;
 
     tracing::info!("Registered 2 financial audit kernels");
     Ok(())
diff --git a/crates/rustkernel-banking/src/lib.rs b/crates/rustkernel-banking/src/lib.rs
index 992c901..64a531a 100644
--- a/crates/rustkernel-banking/src/lib.rs
+++ b/crates/rustkernel-banking/src/lib.rs
@@ -35,12 +35,10 @@ pub use types::{
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering banking kernels");
 
-    // Fraud detection kernel (1)
-    registry.register_metadata(fraud::FraudPatternMatch::new().metadata().clone())?;
+    // Fraud detection kernel (1) - Ring
+    registry.register_ring_metadata_from(fraud::FraudPatternMatch::new)?;
 
     tracing::info!("Registered 1 banking kernel");
     Ok(())
diff --git a/crates/rustkernel-behavioral/src/lib.rs b/crates/rustkernel-behavioral/src/lib.rs
index dbc98fc..64917f3 100644
--- a/crates/rustkernel-behavioral/src/lib.rs
+++ b/crates/rustkernel-behavioral/src/lib.rs
@@ -48,33 +48,24 @@ pub use types::{
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering behavioral analytics kernels");
 
-    // Profiling kernels (2)
-    registry.register_metadata(profiling::BehavioralProfiling::new().metadata().clone())?;
-    registry.register_metadata(profiling::AnomalyProfiling::new().metadata().clone())?;
+    // Profiling kernels (2) - Ring
+    registry.register_ring_metadata_from(profiling::BehavioralProfiling::new)?;
+    registry.register_ring_metadata_from(profiling::AnomalyProfiling::new)?;
 
-    // Signature detection kernel (1)
-    registry.register_metadata(
-        signatures::FraudSignatureDetection::new()
-            .metadata()
-            .clone(),
-    )?;
+    // Signature detection kernel (1) - Ring
+    registry.register_ring_metadata_from(signatures::FraudSignatureDetection::new)?;
 
-    // Causal kernel (1)
-    registry.register_metadata(causal::CausalGraphConstruction::new().metadata().clone())?;
+    // Causal kernel (1) - Batch (uses register_ring_metadata_from because it only
+    // implements GpuKernel, not BatchKernel<I, O>; computation is via static methods)
+    registry.register_ring_metadata_from(causal::CausalGraphConstruction::new)?;
 
-    // Forensics kernel (1)
-    registry.register_metadata(forensics::ForensicQueryExecution::new().metadata().clone())?;
+    // Forensics kernel (1) - Batch (metadata-only, same as above)
+    registry.register_batch_metadata_from(forensics::ForensicQueryExecution::new)?;
 
-    // Correlation kernel (1)
-    registry.register_metadata(
-        correlation::EventCorrelationKernel::new()
-            .metadata()
-            .clone(),
-    )?;
+    // Correlation kernel (1) - Ring
+    registry.register_ring_metadata_from(correlation::EventCorrelationKernel::new)?;
 
     tracing::info!("Registered 6 behavioral analytics kernels");
     Ok(())
diff --git a/crates/rustkernel-clearing/src/lib.rs b/crates/rustkernel-clearing/src/lib.rs
index 0c46eb3..23c49ea 100644
--- a/crates/rustkernel-clearing/src/lib.rs
+++ b/crates/rustkernel-clearing/src/lib.rs
@@ -54,24 +54,22 @@ pub use types::{
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering clearing kernels");
 
-    // Validation kernel (1)
-    registry.register_metadata(validation::ClearingValidation::new().metadata().clone())?;
+    // Validation kernel (1) - Batch
+    registry.register_batch_metadata_from(validation::ClearingValidation::new)?;
 
-    // DVP kernel (1)
-    registry.register_metadata(dvp::DVPMatching::new().metadata().clone())?;
+    // DVP kernel (1) - Ring
+    registry.register_ring_metadata_from(dvp::DVPMatching::new)?;
 
-    // Netting kernel (1)
-    registry.register_metadata(netting::NettingCalculation::new().metadata().clone())?;
+    // Netting kernel (1) - Batch
+    registry.register_batch_metadata_from(netting::NettingCalculation::new)?;
 
-    // Settlement kernel (1)
-    registry.register_metadata(settlement::SettlementExecution::new().metadata().clone())?;
+    // Settlement kernel (1) - Ring
+    registry.register_ring_metadata_from(settlement::SettlementExecution::new)?;
 
-    // Efficiency kernel (1)
-    registry.register_metadata(efficiency::ZeroBalanceFrequency::new().metadata().clone())?;
+    // Efficiency kernel (1) - Batch
+    registry.register_batch_metadata_from(efficiency::ZeroBalanceFrequency::new)?;
 
     tracing::info!("Registered 5 clearing kernels");
     Ok(())
diff --git a/crates/rustkernel-compliance/src/lib.rs b/crates/rustkernel-compliance/src/lib.rs
index e56e873..b5c2154 100644
--- a/crates/rustkernel-compliance/src/lib.rs
+++ b/crates/rustkernel-compliance/src/lib.rs
@@ -57,28 +57,26 @@ pub use sanctions::{PEPScreening, SanctionsScreening};
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering compliance kernels");
 
     // AML kernels (6)
-    registry.register_metadata(aml::CircularFlowRatio::new().metadata().clone())?;
-    registry.register_metadata(aml::ReciprocityFlowRatio::new().metadata().clone())?;
-    registry.register_metadata(aml::RapidMovement::new().metadata().clone())?;
-    registry.register_metadata(aml::AMLPatternDetection::new().metadata().clone())?;
-    registry.register_metadata(aml::FlowReversalPattern::new().metadata().clone())?;
-    registry.register_metadata(aml::FlowSplitRatio::new().metadata().clone())?;
+    registry.register_ring_metadata_from(aml::CircularFlowRatio::new)?;
+    registry.register_ring_metadata_from(aml::ReciprocityFlowRatio::new)?;
+    registry.register_ring_metadata_from(aml::RapidMovement::new)?;
+    registry.register_ring_metadata_from(aml::AMLPatternDetection::new)?;
+    registry.register_batch_metadata_from(aml::FlowReversalPattern::new)?;
+    registry.register_batch_metadata_from(aml::FlowSplitRatio::new)?;
 
     // KYC kernels (2)
-    registry.register_metadata(kyc::KYCScoring::new().metadata().clone())?;
-    registry.register_metadata(kyc::EntityResolution::new().metadata().clone())?;
+    registry.register_batch_typed(kyc::KYCScoring::new)?;
+    registry.register_batch_typed(kyc::EntityResolution::new)?;
 
     // Sanctions kernels (2)
-    registry.register_metadata(sanctions::SanctionsScreening::new().metadata().clone())?;
-    registry.register_metadata(sanctions::PEPScreening::new().metadata().clone())?;
+    registry.register_ring_metadata_from(sanctions::SanctionsScreening::new)?;
+    registry.register_ring_metadata_from(sanctions::PEPScreening::new)?;
 
     // Monitoring kernel (1)
-    registry.register_metadata(monitoring::TransactionMonitoring::new().metadata().clone())?;
+    registry.register_ring_metadata_from(monitoring::TransactionMonitoring::new)?;
 
     tracing::info!("Registered 11 compliance kernels");
     Ok(())
diff --git a/crates/rustkernel-core/src/error.rs b/crates/rustkernel-core/src/error.rs
index 1bc3ef8..e6ed440 100644
--- a/crates/rustkernel-core/src/error.rs
+++ b/crates/rustkernel-core/src/error.rs
@@ -169,7 +169,7 @@ impl KernelError {
         KernelError::K2KError(msg.into())
     }
 
-    /// Returns true if this is a recoverable error.
+    /// Returns true if this is a recoverable error (safe to retry).
     #[must_use]
     pub fn is_recoverable(&self) -> bool {
         matches!(
@@ -177,7 +177,21 @@ impl KernelError {
             KernelError::QueueFull { .. }
                 | KernelError::QueueEmpty
                 | KernelError::Timeout(_)
+                | KernelError::ServiceUnavailable(_)
+                | KernelError::ResourceExhausted(_)
+        )
+    }
+
+    /// Returns true if this is a client error (invalid input, not found, etc.).
+    #[must_use]
+    pub fn is_client_error(&self) -> bool {
+        matches!(
+            self,
+            KernelError::KernelNotFound(_)
                 | KernelError::ValidationError(_)
+                | KernelError::DeserializationError(_)
+                | KernelError::Unauthorized(_)
+                | KernelError::DomainNotSupported(_)
         )
     }
 
@@ -186,6 +200,64 @@ impl KernelError {
     pub fn is_license_error(&self) -> bool {
         matches!(self, KernelError::LicenseError(_))
     }
+
+    /// Returns the suggested HTTP status code for this error.
+    ///
+    /// Centralizes HTTP status mapping so that all ecosystem integrations
+    /// (Axum, Tower, gRPC, Actix) use consistent status codes.
+    #[must_use]
+    pub fn http_status_code(&self) -> u16 {
+        match self {
+            KernelError::KernelNotFound(_) => 404,
+            KernelError::KernelAlreadyRegistered(_) => 409,
+            KernelError::ValidationError(_) => 400,
+            KernelError::DeserializationError(_) => 400,
+            KernelError::SerializationError(_) => 500,
+            KernelError::Unauthorized(_) => 401,
+            KernelError::ResourceExhausted(_) => 429,
+            KernelError::ServiceUnavailable(_) => 503,
+            KernelError::Timeout(_) => 504,
+            KernelError::LicenseError(_) => 403,
+            KernelError::DomainNotSupported(_) => 403,
+            KernelError::QueueFull { .. } => 503,
+            KernelError::MessageTooLarge { .. } => 413,
+            _ => 500,
+        }
+    }
+
+    /// Returns a machine-readable error code string.
+    #[must_use]
+    pub fn error_code(&self) -> &'static str {
+        match self {
+            KernelError::KernelNotFound(_) => "KERNEL_NOT_FOUND",
+            KernelError::KernelAlreadyRegistered(_) => "KERNEL_ALREADY_REGISTERED",
+            KernelError::InvalidStateTransition { .. } => "INVALID_STATE_TRANSITION",
+            KernelError::KernelNotActive(_) => "KERNEL_NOT_ACTIVE",
+            KernelError::ValidationError(_) => "VALIDATION_ERROR",
+            KernelError::SerializationError(_) => "SERIALIZATION_ERROR",
+            KernelError::DeserializationError(_) => "DESERIALIZATION_ERROR",
+            KernelError::QueueFull { .. } => "QUEUE_FULL",
+            KernelError::QueueEmpty => "QUEUE_EMPTY",
+            KernelError::MessageTooLarge { .. } => "MESSAGE_TOO_LARGE",
+            KernelError::Timeout(_) => "TIMEOUT",
+            KernelError::LaunchFailed(_) => "LAUNCH_FAILED",
+            KernelError::CompilationError(_) => "COMPILATION_ERROR",
+            KernelError::DeviceError(_) => "DEVICE_ERROR",
+            KernelError::BackendNotAvailable(_) => "BACKEND_NOT_AVAILABLE",
+            KernelError::LicenseError(_) => "LICENSE_ERROR",
+            KernelError::SLOViolation(_) => "SLO_VIOLATION",
+            KernelError::DomainNotSupported(_) => "DOMAIN_NOT_SUPPORTED",
+            KernelError::InternalError(_) => "INTERNAL_ERROR",
+            KernelError::IoError(_) => "IO_ERROR",
+            KernelError::ConfigError(_) => "CONFIG_ERROR",
+            KernelError::ActorError(_) => "ACTOR_ERROR",
+            KernelError::RingKernelError(_) => "RINGKERNEL_ERROR",
+            KernelError::K2KError(_) => "K2K_ERROR",
+            KernelError::Unauthorized(_) => "UNAUTHORIZED",
+            KernelError::ResourceExhausted(_) => "RESOURCE_EXHAUSTED",
+            KernelError::ServiceUnavailable(_) => "SERVICE_UNAVAILABLE",
+        }
+    }
 }
 
 /// Convert from ringkernel-core errors.
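
Reviewer note on `is_recoverable` and `http_status_code`: the sketch below shows one way a caller might combine the two, retrying only transient failures and mapping the final error to a status code. `MiniError` and `retry_recoverable` are hypothetical names introduced for illustration. `MiniError` mirrors only three `KernelError` variants, and its payload types are assumptions rather than the crate's real definitions.

```rust
/// Hypothetical, cut-down stand-in for `KernelError` (three variants only;
/// payload types are assumed, not taken from the crate).
#[derive(Debug, Clone, PartialEq)]
pub enum MiniError {
    Timeout(u64),
    ServiceUnavailable(String),
    ValidationError(String),
}

impl MiniError {
    /// Mirrors the retry classification in the diff: transient failures only.
    pub fn is_recoverable(&self) -> bool {
        matches!(
            self,
            MiniError::Timeout(_) | MiniError::ServiceUnavailable(_)
        )
    }

    /// Mirrors the status mapping in the diff for these three variants.
    pub fn http_status_code(&self) -> u16 {
        match self {
            MiniError::Timeout(_) => 504,
            MiniError::ServiceUnavailable(_) => 503,
            MiniError::ValidationError(_) => 400,
        }
    }
}

/// Calls `op` at least once and at most `max_attempts` times, passing the
/// 1-based attempt number. Only recoverable errors are retried; any other
/// error, or the error from the final attempt, is returned unchanged.
pub fn retry_recoverable<T>(
    max_attempts: u32,
    mut op: impl FnMut(u32) -> Result<T, MiniError>,
) -> Result<T, MiniError> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(e) if e.is_recoverable() && attempt < max_attempts => attempt += 1,
            Err(e) => return Err(e),
        }
    }
}
```

A real caller would probably add backoff between attempts. The sketch omits it to stay dependency-free.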
diff --git a/crates/rustkernel-core/src/lib.rs b/crates/rustkernel-core/src/lib.rs
index 0916699..d9bb1ce 100644
--- a/crates/rustkernel-core/src/lib.rs
+++ b/crates/rustkernel-core/src/lib.rs
@@ -85,8 +85,9 @@ pub mod prelude {
     pub use crate::slo::{SLOResult, SLOValidator};
     pub use crate::test_kernels::{EchoKernel, MatMul, ReduceSum, VectorAdd};
     pub use crate::traits::{
-        BatchKernel, CheckpointableKernel, DegradableKernel, ExecutionContext, GpuKernel,
-        HealthStatus, IterativeKernel, KernelConfig, RingKernelHandler, SecureRingContext,
+        BatchKernel, BatchKernelDyn, CheckpointableKernel, DegradableKernel, ExecutionContext,
+        GpuKernel, HealthStatus, IterativeKernel, KernelConfig, RingKernelDyn, RingKernelHandler,
+        SecureRingContext, TypeErasedBatchKernel, TypeErasedRingKernel,
     };
 
     // Runtime lifecycle
diff --git a/crates/rustkernel-core/src/registry.rs b/crates/rustkernel-core/src/registry.rs
index f508136..d09f78d 100644
--- a/crates/rustkernel-core/src/registry.rs
+++ b/crates/rustkernel-core/src/registry.rs
@@ -7,7 +7,9 @@ use crate::domain::Domain;
 use crate::error::{KernelError, Result};
 use crate::kernel::{KernelMetadata, KernelMode};
 use crate::license::{LicenseError, LicenseValidator, SharedLicenseValidator};
-use crate::traits::{BatchKernelDyn, RingKernelDyn};
+use crate::traits::{
+    BatchKernel, BatchKernelDyn, GpuKernel, RingKernelDyn, TypeErasedBatchKernel,
+};
 use hashbrown::HashMap;
 use std::sync::{Arc, RwLock};
 use tracing::{debug, info, warn};
@@ -395,6 +397,101 @@ impl KernelRegistry {
         }
     }
 
+    /// Register a batch kernel using a typed factory function.
+    ///
+    /// This is the preferred way to register batch kernels. The factory closure
+    /// creates kernel instances on demand, and type erasure is handled automatically
+    /// via [`TypeErasedBatchKernel`].
+    ///
+    /// # Type Inference
+    ///
+    /// Rust infers `I` and `O` from the `BatchKernel<I, O>` implementation on `K`,
+    /// so turbofish syntax is typically not needed:
+    ///
+    /// ```ignore
+    /// registry.register_batch_typed(|| MyKernel::new())?;
+    /// ```
+    ///
+    /// # Errors
+    ///
+    /// Returns an error if the kernel ID is already registered or license validation fails.
+    pub fn register_batch_typed<K, I, O>(
+        &self,
+        factory: impl Fn() -> K + Send + Sync + 'static,
+    ) -> Result<()>
+    where
+        K: BatchKernel<I, O> + 'static,
+        I: serde::de::DeserializeOwned + Send + Sync + 'static,
+        O: serde::Serialize + Send + Sync + 'static,
+    {
+        let sample = factory();
+        let metadata = sample.metadata().clone();
+        drop(sample);
+        let entry = BatchKernelEntry::new(metadata, move || {
+            Arc::new(TypeErasedBatchKernel::new(factory()))
+        });
+        self.register_batch(entry)
+    }
+
+    /// Register a batch kernel's metadata from a factory function.
+    ///
+    /// This is for batch-mode kernels that implement `GpuKernel` but not
+    /// the full `BatchKernel<I, O>` trait. The metadata is stored for
+    /// discovery and health checking.
+    ///
+    /// # Errors
+    ///
+    /// Returns an error if the kernel ID is already registered or license validation fails.
+    pub fn register_batch_metadata_from<K>(&self, factory: impl Fn() -> K) -> Result<()>
+    where
+        K: GpuKernel,
+    {
+        let sample = factory();
+        let metadata = sample.metadata().clone();
+        self.register_metadata(metadata)
+    }
+
+    /// Register a ring kernel's metadata from a factory function.
+    ///
+    /// Ring kernels require the RingKernel runtime for persistent actor execution
+    /// and cannot be invoked directly via REST. This method registers the kernel's
+    /// metadata for discovery and health checking.
+    ///
+    /// For full ring kernel deployment, use the RingKernel runtime directly.
+    ///
+    /// # Errors
+    ///
+    /// Returns an error if the kernel ID is already registered or license validation fails.
+    pub fn register_ring_metadata_from<K>(&self, factory: impl Fn() -> K) -> Result<()>
+    where
+        K: GpuKernel,
+    {
+        let sample = factory();
+        let metadata = sample.metadata().clone();
+        self.register_metadata(metadata)
+    }
+
+    /// Execute a batch kernel by ID with JSON input/output.
+    ///
+    /// Looks up the kernel in the registry, creates an instance, and executes it
+    /// with type-erased JSON serialization.
+    ///
+    /// # Errors
+    ///
+    /// Returns `KernelNotFound` if no batch kernel with this ID exists, or
+    /// propagates any execution error from the kernel.
+    pub async fn execute_batch(
+        &self,
+        kernel_id: &str,
+        input_json: &[u8],
+    ) -> Result<Vec<u8>> {
+        let entry = self
+            .get_batch(kernel_id)
+            .ok_or_else(|| KernelError::KernelNotFound(kernel_id.to_string()))?;
+        let kernel = entry.create();
+        kernel.execute_dyn(input_json).await
+    }
+
     /// Total number of registered kernels.
     #[must_use]
     pub fn total_count(&self) -> usize {
@@ -404,6 +501,61 @@ impl KernelRegistry {
         batch.len() + ring.len() + metadata.len()
     }
 
+    /// Get all kernel metadata across all categories, sorted by ID.
+    ///
+    /// Returns metadata for batch, ring, and metadata-only kernels.
+    #[must_use]
+    pub fn all_metadata(&self) -> Vec<KernelMetadata> {
+        let mut result = Vec::new();
+
+        let batch = self.batch_kernels.read().unwrap();
+        for entry in batch.values() {
+            result.push(entry.metadata.clone());
+        }
+
+        let ring = self.ring_kernels.read().unwrap();
+        for entry in ring.values() {
+            result.push(entry.metadata.clone());
+        }
+
+        let metadata = self.metadata_only.read().unwrap();
+        for entry in metadata.values() {
+            result.push(entry.clone());
+        }
+
+        result.sort_by(|a, b| a.id.cmp(&b.id));
+        result
+    }
+
+    /// Search kernels by pattern (case-insensitive substring match on ID and description).
+    #[must_use]
+    pub fn search(&self, pattern: &str) -> Vec<KernelMetadata> {
+        let pattern_lower = pattern.to_lowercase();
+        self.all_metadata()
+            .into_iter()
+            .filter(|m| {
+                m.id.to_lowercase().contains(&pattern_lower)
+                    || m.description.to_lowercase().contains(&pattern_lower)
+            })
+            .collect()
+    }
+
+    /// Get all executable batch kernel IDs (kernels with factory functions).
+    ///
+    /// These are the kernels that can be invoked via REST/gRPC through the
+    /// type-erased `BatchKernelDyn` interface.
+    #[must_use]
+    pub fn executable_kernel_ids(&self) -> Vec<String> {
+        self.batch_kernel_ids()
+    }
+
+    /// Check if a kernel is executable via REST/gRPC (has a `BatchKernelDyn` factory).
+    #[must_use]
+    pub fn is_executable(&self, id: &str) -> bool {
+        let batch = self.batch_kernels.read().unwrap();
+        batch.contains_key(id)
+    }
+
     /// Clear all registered kernels.
     pub fn clear(&self) {
         let mut batch = self.batch_kernels.write().unwrap();
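
Reviewer note on the registry additions: the distinction between executable batch kernels (stored with a factory) and metadata-only kernels can be sketched with the standard library alone. Everything below (`MiniRegistry`, `DynKernel`, `Upper`) is a hypothetical simplification. It keys on plain string IDs, takes `&mut self` instead of the `RwLock`-guarded maps in the diff, and skips license validation.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

/// Hypothetical stand-in for `BatchKernelDyn`: bytes in, bytes out.
pub trait DynKernel: Send + Sync {
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String>;
}

type Factory = Box<dyn Fn() -> Arc<dyn DynKernel> + Send + Sync>;

/// Simplified registry: executable kernels carry a factory, the rest only an ID.
#[derive(Default)]
pub struct MiniRegistry {
    batch: HashMap<String, Factory>,
    metadata_only: HashSet<String>,
}

impl MiniRegistry {
    fn contains(&self, id: &str) -> bool {
        self.batch.contains_key(id) || self.metadata_only.contains(id)
    }

    /// Store a factory so a fresh kernel instance can be created per call.
    pub fn register_batch(
        &mut self,
        id: &str,
        factory: impl Fn() -> Arc<dyn DynKernel> + Send + Sync + 'static,
    ) -> Result<(), String> {
        if self.contains(id) {
            return Err(format!("already registered: {}", id));
        }
        self.batch.insert(id.to_string(), Box::new(factory));
        Ok(())
    }

    /// Record a kernel for discovery only; it cannot be executed here.
    pub fn register_metadata(&mut self, id: &str) -> Result<(), String> {
        if self.contains(id) {
            return Err(format!("already registered: {}", id));
        }
        self.metadata_only.insert(id.to_string());
        Ok(())
    }

    /// True only for kernels that were registered with a factory.
    pub fn is_executable(&self, id: &str) -> bool {
        self.batch.contains_key(id)
    }

    /// Look up the factory, build an instance, and run it on raw bytes.
    pub fn execute_batch(&self, id: &str, input: &[u8]) -> Result<Vec<u8>, String> {
        let factory = self
            .batch
            .get(id)
            .ok_or_else(|| format!("kernel not found: {}", id))?;
        factory().execute_dyn(input)
    }

    /// Case-insensitive substring search over all IDs, sorted.
    pub fn search(&self, pattern: &str) -> Vec<String> {
        let needle = pattern.to_lowercase();
        let mut ids: Vec<String> = self
            .batch
            .keys()
            .chain(self.metadata_only.iter())
            .filter(|id| id.to_lowercase().contains(&needle))
            .cloned()
            .collect();
        ids.sort();
        ids
    }
}

/// Toy kernel that upper-cases ASCII input.
pub struct Upper;

impl DynKernel for Upper {
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String> {
        Ok(input.to_ascii_uppercase())
    }
}
```

As in the diff, a metadata-only kernel appears in search results but is rejected by `execute_batch`, which is how service layers can tell it apart from an executable one.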
diff --git a/crates/rustkernel-core/src/traits.rs b/crates/rustkernel-core/src/traits.rs
index ffa1668..d3f5355 100644
--- a/crates/rustkernel-core/src/traits.rs
+++ b/crates/rustkernel-core/src/traits.rs
@@ -13,12 +13,13 @@
 //! - Secure message handling with authentication
 //! - Checkpoint/restore for recovery
 
-use crate::error::Result;
+use crate::error::{KernelError, Result};
 use crate::kernel::KernelMetadata;
 use async_trait::async_trait;
 use ringkernel_core::{RingContext, RingMessage};
 use serde::{Deserialize, Serialize};
-use std::fmt::Debug;
+use std::fmt::{self, Debug};
+use std::marker::PhantomData;
 use std::time::Duration;
 use uuid::Uuid;
 
@@ -580,6 +581,170 @@ pub trait RingKernelDyn: GpuKernel {
     async fn handle_dyn(&self, ctx: &mut RingContext, msg: &[u8]) -> Result<Vec<u8>>;
 }
 
+// ============================================================================
+// Type-Erased Kernel Adapters
+// ============================================================================
+
+/// Type-erased wrapper for batch kernels enabling dynamic dispatch.
+///
+/// Wraps any `BatchKernel<I, O>` implementation and provides the
+/// `BatchKernelDyn` interface for type-erased execution through
+/// JSON serialization/deserialization.
+///
+/// This enables batch kernels to be stored in the registry and invoked
+/// via REST, gRPC, and other service interfaces without compile-time
+/// knowledge of the kernel's input/output types.
+///
+/// # Example
+///
+/// ```ignore
+/// use rustkernel_core::traits::TypeErasedBatchKernel;
+///
+/// let kernel = TypeErasedBatchKernel::new(MyKernel::new());
+/// let output = kernel.execute_dyn(b"{\"field\": 42}").await?;
+/// ```
+pub struct TypeErasedBatchKernel<K, I, O> {
+    inner: K,
+    // fn(I) -> O is always Send + Sync regardless of I/O bounds
+    _phantom: PhantomData<fn(I) -> O>,
+}
+
+impl<K: Debug, I, O> Debug for TypeErasedBatchKernel<K, I, O> {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        f.debug_struct("TypeErasedBatchKernel")
+            .field("inner", &self.inner)
+            .finish()
+    }
+}
+
+impl<K, I, O> TypeErasedBatchKernel<K, I, O> {
+    /// Wrap a typed batch kernel for type-erased execution.
+    pub fn new(kernel: K) -> Self {
+        Self {
+            inner: kernel,
+            _phantom: PhantomData,
+        }
+    }
+
+    /// Access the inner kernel.
+    pub fn inner(&self) -> &K {
+        &self.inner
+    }
+}
+
+impl<K, I, O> GpuKernel for TypeErasedBatchKernel<K, I, O>
+where
+    K: GpuKernel,
+    I: Send + Sync + 'static,
+    O: Send + Sync + 'static,
+{
+    fn metadata(&self) -> &KernelMetadata {
+        self.inner.metadata()
+    }
+
+    fn validate(&self) -> Result<()> {
+        self.inner.validate()
+    }
+
+    fn health_check(&self) -> HealthStatus {
+        self.inner.health_check()
+    }
+
+    fn shutdown(&self) -> Result<()> {
+        self.inner.shutdown()
+    }
+
+    fn refresh_config(&mut self, config: &KernelConfig) -> Result<()> {
+        self.inner.refresh_config(config)
+    }
+}
+
+#[async_trait]
+impl<K, I, O> BatchKernelDyn for TypeErasedBatchKernel<K, I, O>
+where
+    K: BatchKernel<I, O> + 'static,
+    I: serde::de::DeserializeOwned + Send + Sync + 'static,
+    O: serde::Serialize + Send + Sync + 'static,
+{
+    async fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>> {
+        let typed_input: I = serde_json::from_slice(input)
+            .map_err(|e| KernelError::DeserializationError(e.to_string()))?;
+        let output = self.inner.execute(typed_input).await?;
+        serde_json::to_vec(&output)
+            .map_err(|e| KernelError::SerializationError(e.to_string()))
+    }
+}
+
+/// Type-erased wrapper for ring kernels enabling dynamic dispatch.
+///
+/// Similar to [`TypeErasedBatchKernel`] but for ring kernels that handle
+/// messages through the RingKernel persistent actor model.
+pub struct TypeErasedRingKernel<K, M, R> {
+    inner: K,
+    _phantom: PhantomData<fn(M) -> R>,
+}
+
+impl<K: Debug, M, R> Debug for TypeErasedRingKernel<K, M, R> {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        f.debug_struct("TypeErasedRingKernel")
+            .field("inner", &self.inner)
+            .finish()
+    }
+}
+
+impl<K, M, R> TypeErasedRingKernel<K, M, R> {
+    /// Wrap a typed ring kernel for type-erased message handling.
+    pub fn new(kernel: K) -> Self {
+        Self {
+            inner: kernel,
+            _phantom: PhantomData,
+        }
+    }
+}
+
+impl<K, M, R> GpuKernel for TypeErasedRingKernel<K, M, R>
+where
+    K: GpuKernel,
+    M: Send + Sync + 'static,
+    R: Send + Sync + 'static,
+{
+    fn metadata(&self) -> &KernelMetadata {
+        self.inner.metadata()
+    }
+
+    fn validate(&self) -> Result<()> {
+        self.inner.validate()
+    }
+
+    fn health_check(&self) -> HealthStatus {
+        self.inner.health_check()
+    }
+
+    fn shutdown(&self) -> Result<()> {
+        self.inner.shutdown()
+    }
+
+    fn refresh_config(&mut self, config: &KernelConfig) -> Result<()> {
+        self.inner.refresh_config(config)
+    }
+}
+
+#[async_trait]
+impl<K, M, R> RingKernelDyn for TypeErasedRingKernel<K, M, R>
+where
+    K: RingKernelHandler<M, R> + 'static,
+    M: RingMessage + serde::de::DeserializeOwned + Send + Sync + 'static,
+    R: RingMessage + serde::Serialize + Send + Sync + 'static,
+{
+    async fn handle_dyn(&self, ctx: &mut RingContext, msg: &[u8]) -> Result<Vec<u8>> {
+        let typed_msg: M = serde_json::from_slice(msg)
+            .map_err(|e| KernelError::DeserializationError(e.to_string()))?;
+        let response = self.inner.handle(ctx, typed_msg).await?;
+        serde_json::to_vec(&response)
+            .map_err(|e| KernelError::SerializationError(e.to_string()))
+    }
+}
+
 // ============================================================================
 // Enterprise Traits (0.3.1)
 // ============================================================================
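
Reviewer note on the type-erased adapters: the `PhantomData<fn(I) -> O>` marker is what keeps the wrapper `Send + Sync` whatever `I` and `O` are, because function pointers are always `Send + Sync`. The following is a minimal synchronous sketch of the same adapter shape. It substitutes `FromStr`/`ToString` for the serde JSON round-trip in the diff, and all names (`TypedKernel`, `DynKernel`, `Erased`, `Doubler`, `assert_send_sync`) are hypothetical.

```rust
use std::marker::PhantomData;
use std::str::FromStr;

/// Hypothetical typed kernel trait (synchronous stand-in for `BatchKernel<I, O>`).
pub trait TypedKernel<I, O> {
    fn execute(&self, input: I) -> Result<O, String>;
}

/// Hypothetical byte-level interface (stand-in for `BatchKernelDyn`).
pub trait DynKernel {
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String>;
}

/// Adapter that records `I` and `O` in its type without storing values of either.
pub struct Erased<K, I, O> {
    inner: K,
    // fn(I) -> O is Send + Sync for any I and O, so the marker adds no bounds.
    _phantom: PhantomData<fn(I) -> O>,
}

impl<K, I, O> Erased<K, I, O> {
    pub fn new(inner: K) -> Self {
        Erased {
            inner,
            _phantom: PhantomData,
        }
    }
}

impl<K, I, O> DynKernel for Erased<K, I, O>
where
    K: TypedKernel<I, O>,
    I: FromStr,
    O: ToString,
{
    /// Decode bytes into `I`, run the typed kernel, encode `O` back to bytes.
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String> {
        let text = std::str::from_utf8(input)
            .map_err(|e| format!("deserialization: {}", e))?;
        let typed: I = text
            .trim()
            .parse()
            .map_err(|_| "deserialization: unparsable input".to_string())?;
        let output = self.inner.execute(typed)?;
        Ok(output.to_string().into_bytes())
    }
}

/// Toy kernel that doubles an integer, reporting overflow as an error.
pub struct Doubler;

impl TypedKernel<i64, i64> for Doubler {
    fn execute(&self, input: i64) -> Result<i64, String> {
        input.checked_mul(2).ok_or_else(|| "overflow".to_string())
    }
}

/// Compile-time probe: instantiating it proves `T` is `Send + Sync`.
pub fn assert_send_sync<T: Send + Sync>() {}
```

Calling `assert_send_sync::<Erased<Doubler, std::rc::Rc<str>, std::rc::Rc<str>>>()` compiles even though `Rc<str>` is neither `Send` nor `Sync`, which illustrates the comment on the `_phantom` field in the diff.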
diff --git a/crates/rustkernel-ecosystem/src/actix_integration.rs b/crates/rustkernel-ecosystem/src/actix_integration.rs
index 33bcf4f..62cee84 100644
--- a/crates/rustkernel-ecosystem/src/actix_integration.rs
+++ b/crates/rustkernel-ecosystem/src/actix_integration.rs
@@ -139,30 +139,58 @@ impl Handler<ExecuteKernel> for KernelActor {
         self.messages_processed += 1;
 
         let request_id = uuid::Uuid::new_v4().to_string();
+        let timeout = Duration::from_millis(
+            msg.metadata.timeout_ms.unwrap_or(self.config.default_timeout.as_millis() as u64),
+        );
 
-        // Validate kernel exists
-        let _kernel_meta = self
-            .registry
-            .get(&msg.kernel_id)
-            .ok_or_else(|| ActorError::KernelNotFound(msg.kernel_id.clone()))?;
-
-        // Execute (placeholder - actual execution will use runtime)
-        let duration_us = start.elapsed().as_micros() as u64;
-
-        Ok(ExecuteResult {
-            request_id,
-            kernel_id: msg.kernel_id,
-            output: serde_json::json!({
-                "status": "executed",
-                "input": msg.input
-            }),
-            metadata: ResponseMetadata {
-                duration_us,
-                backend: "CPU".to_string(),
-                gpu_memory_bytes: None,
-                trace_id: msg.metadata.trace_id,
-            },
-        })
+        // Try batch kernel execution
+        if let Some(entry) = self.registry.get_batch(&msg.kernel_id) {
+            let kernel = entry.create();
+
+            let input_bytes = serde_json::to_vec(&msg.input)
+                .map_err(|e| ActorError::InvalidInput(format!("Invalid input: {}", e)))?;
+
+            // Actix actor handlers are synchronous, so bridge to async via block_in_place.
+            // Note: block_in_place panics on a current_thread Tokio runtime.
+            let result = tokio::task::block_in_place(|| {
+                tokio::runtime::Handle::current().block_on(async {
+                    tokio::time::timeout(timeout, kernel.execute_dyn(&input_bytes)).await
+                })
+            });
+
+            match result {
+                Ok(Ok(output_bytes)) => {
+                    let output: serde_json::Value =
+                        serde_json::from_slice(&output_bytes).map_err(|e| {
+                            ActorError::ExecutionFailed(format!("Output deserialization: {}", e))
+                        })?;
+
+                    let duration_us = start.elapsed().as_micros() as u64;
+
+                    Ok(ExecuteResult {
+                        request_id,
+                        kernel_id: msg.kernel_id,
+                        output,
+                        metadata: ResponseMetadata {
+                            duration_us,
+                            backend: entry.metadata.mode.as_str().to_uppercase(),
+                            gpu_memory_bytes: None,
+                            trace_id: msg.metadata.trace_id,
+                        },
+                    })
+                }
+                Ok(Err(e)) => Err(ActorError::ExecutionFailed(e.to_string())),
+                Err(_) => Err(ActorError::Timeout),
+            }
+        } else if self.registry.get(&msg.kernel_id).is_some() {
+            Err(ActorError::InvalidInput(format!(
+                "Kernel '{}' is not an executable batch kernel (Ring or metadata-only) and cannot \
+                 be run via actor message. Ring kernels require the Ring protocol.",
+                msg.kernel_id
+            )))
+        } else {
+            Err(ActorError::KernelNotFound(msg.kernel_id))
+        }
     }
 }
 
diff --git a/crates/rustkernel-ecosystem/src/axum_integration.rs b/crates/rustkernel-ecosystem/src/axum_integration.rs
index b8f3ca2..bf6a70b 100644
--- a/crates/rustkernel-ecosystem/src/axum_integration.rs
+++ b/crates/rustkernel-ecosystem/src/axum_integration.rs
@@ -19,14 +19,16 @@
 //! ```
 
 use crate::{
-    ErrorResponse, HealthResponse, HealthStatus, KernelResponse, RequestMetadata, ResponseMetadata,
+    ComponentHealth, ErrorResponse, HealthResponse, HealthStatus, KernelResponse, RequestMetadata,
+    ResponseMetadata,
     common::{ServiceConfig, ServiceMetrics, headers, paths},
 };
 use axum::{
     Router,
     extract::{Path, State},
-    http::{HeaderMap, StatusCode},
-    response::Json,
+    http::{HeaderMap, HeaderValue, StatusCode, header},
+    middleware::{self, Next},
+    response::{Json, Response},
     routing::{get, post},
 };
 use rustkernel_core::registry::KernelRegistry;
@@ -126,7 +128,8 @@ impl KernelRouter {
 
     /// Build the router
     pub fn build(self) -> Router {
-        let state = AppState::new(self.registry, self.service_config);
+        let cors_enabled = self.config.cors_enabled;
+        let state = AppState::new(self.registry, self.service_config.clone());
 
         let mut router = Router::new();
 
@@ -151,21 +154,129 @@ impl KernelRouter {
             router = router.route(paths::METRICS, get(metrics_endpoint));
         }
 
+        // Add request-ID middleware (echoes X-Request-ID back in response)
+        router = router.layer(middleware::from_fn(request_id_middleware));
+
+        // Add CORS middleware if enabled
+        if cors_enabled {
+            let cors = build_cors(&self.service_config);
+            router = router.layer(cors);
+        }
+
         router.with_state(state)
     }
 }
 
+// Middleware
+
+/// Build CORS layer from service configuration
+fn build_cors(config: &ServiceConfig) -> tower_http::cors::CorsLayer {
+    use tower_http::cors::{Any, CorsLayer};
+
+    let cors = CorsLayer::new()
+        .allow_methods([
+            axum::http::Method::GET,
+            axum::http::Method::POST,
+            axum::http::Method::OPTIONS,
+        ])
+        .allow_headers([
+            header::CONTENT_TYPE,
+            header::AUTHORIZATION,
+            headers::X_REQUEST_ID.parse().unwrap(),
+            headers::X_TENANT_ID.parse().unwrap(),
+            headers::X_API_KEY.parse().unwrap(),
+        ]);
+
+    if config.cors_origins.iter().any(|o| o == "*") {
+        cors.allow_origin(Any)
+    } else {
+        let origins: Vec<HeaderValue> = config
+            .cors_origins
+            .iter()
+            .filter_map(|o| o.parse().ok())
+            .collect();
+        cors.allow_origin(origins)
+    }
+}
+
+/// Middleware that echoes X-Request-ID on the response, generating a UUID v4 if the request has none
+async fn request_id_middleware(
+    req: axum::extract::Request,
+    next: Next,
+) -> Response {
+    let request_id = req
+        .headers()
+        .get(headers::X_REQUEST_ID)
+        .and_then(|v| v.to_str().ok())
+        .map(|s| s.to_string())
+        .unwrap_or_else(|| uuid::Uuid::new_v4().to_string());
+
+    let mut response = next.run(req).await;
+
+    if let Ok(val) = HeaderValue::from_str(&request_id) {
+        response.headers_mut().insert(headers::X_REQUEST_ID, val);
+    }
+
+    response
+}
+
 // Handler implementations
 
 /// Health check handler
 async fn health_check(State(state): State<AppState>) -> Json<HealthResponse> {
     let uptime = state.start_time.elapsed().as_secs();
+    let stats = state.registry.stats();
+
+    let mut components = Vec::new();
+    let mut overall_status = HealthStatus::Healthy;
+
+    // Registry health
+    let registry_status = if stats.total > 0 {
+        HealthStatus::Healthy
+    } else {
+        overall_status = HealthStatus::Degraded;
+        HealthStatus::Degraded
+    };
+    components.push(ComponentHealth {
+        name: "kernel_registry".to_string(),
+        status: registry_status,
+        message: Some(format!(
+            "{} kernels ({} batch, {} ring)",
+            stats.total, stats.batch_kernels, stats.ring_kernels
+        )),
+    });
+
+    // Error rate health
+    let error_rate = if state.metrics.request_count() > 0 {
+        state.metrics.error_count() as f64 / state.metrics.request_count() as f64
+    } else {
+        0.0
+    };
+    let execution_status = if error_rate < 0.1 {
+        HealthStatus::Healthy
+    } else if error_rate < 0.5 {
+        overall_status = HealthStatus::Degraded;
+        HealthStatus::Degraded
+    } else {
+        overall_status = HealthStatus::Unhealthy;
+        HealthStatus::Unhealthy
+    };
+    components.push(ComponentHealth {
+        name: "execution_engine".to_string(),
+        status: execution_status,
+        message: Some(format!(
+            "{} requests, {:.1}% error rate, {:.0}us avg latency",
+            state.metrics.request_count(),
+            error_rate * 100.0,
+            state.metrics.avg_latency_us()
+        )),
+    });
 
     Json(HealthResponse {
-        status: HealthStatus::Healthy,
+        status: overall_status,
         version: state.config.version.clone(),
         uptime_secs: uptime,
-        components: vec![],
+        components,
     })
 }
 
@@ -188,25 +299,100 @@ async fn readiness_check(State(state): State<AppState>) -> StatusCode {
 async fn metrics_endpoint(State(state): State<AppState>) -> String {
     let metrics = &state.metrics;
     let uptime = state.start_time.elapsed().as_secs();
+    let stats = state.registry.stats();
+
+    let error_rate = if metrics.request_count() > 0 {
+        metrics.error_count() as f64 / metrics.request_count() as f64
+    } else {
+        0.0
+    };
 
-    format!(
+    let mut output = String::with_capacity(2048);
+
+    // Request metrics
+    output += &format!(
         "# HELP rustkernels_requests_total Total number of requests\n\
          # TYPE rustkernels_requests_total counter\n\
-         rustkernels_requests_total {}\n\
-         # HELP rustkernels_errors_total Total number of errors\n\
+         rustkernels_requests_total {}\n",
+        metrics.request_count()
+    );
+
+    output += &format!(
+        "# HELP rustkernels_errors_total Total number of errors\n\
          # TYPE rustkernels_errors_total counter\n\
-         rustkernels_errors_total {}\n\
-         # HELP rustkernels_avg_latency_us Average request latency in microseconds\n\
-         # TYPE rustkernels_avg_latency_us gauge\n\
-         rustkernels_avg_latency_us {:.2}\n\
-         # HELP rustkernels_uptime_seconds Service uptime in seconds\n\
+         rustkernels_errors_total {}\n",
+        metrics.error_count()
+    );
+
+    output += &format!(
+        "# HELP rustkernels_request_duration_us Average request duration in microseconds\n\
+         # TYPE rustkernels_request_duration_us gauge\n\
+         rustkernels_request_duration_us {:.2}\n",
+        metrics.avg_latency_us()
+    );
+
+    output += &format!(
+        "# HELP rustkernels_request_duration_min_us Minimum request duration in microseconds\n\
+         # TYPE rustkernels_request_duration_min_us gauge\n\
+         rustkernels_request_duration_min_us {}\n",
+        metrics.min_latency_us()
+    );
+
+    output += &format!(
+        "# HELP rustkernels_request_duration_max_us Maximum request duration in microseconds\n\
+         # TYPE rustkernels_request_duration_max_us gauge\n\
+         rustkernels_request_duration_max_us {}\n",
+        metrics.max_latency_us()
+    );
+
+    output += &format!(
+        "# HELP rustkernels_error_rate Current error rate (0.0-1.0)\n\
+         # TYPE rustkernels_error_rate gauge\n\
+         rustkernels_error_rate {:.6}\n",
+        error_rate
+    );
+
+    // Uptime
+    output += &format!(
+        "# HELP rustkernels_uptime_seconds Service uptime in seconds\n\
          # TYPE rustkernels_uptime_seconds gauge\n\
          rustkernels_uptime_seconds {}\n",
-        metrics.request_count(),
-        metrics.error_count(),
-        metrics.avg_latency_us(),
         uptime
-    )
+    );
+
+    // Registry metrics
+    output += &format!(
+        "# HELP rustkernels_kernels_registered Total registered kernels\n\
+         # TYPE rustkernels_kernels_registered gauge\n\
+         rustkernels_kernels_registered {}\n",
+        stats.total
+    );
+
+    output += &format!(
+        "# HELP rustkernels_batch_kernels Batch kernels available for execution\n\
+         # TYPE rustkernels_batch_kernels gauge\n\
+         rustkernels_batch_kernels {}\n",
+        stats.batch_kernels
+    );
+
+    output += &format!(
+        "# HELP rustkernels_ring_kernels Ring kernels registered\n\
+         # TYPE rustkernels_ring_kernels gauge\n\
+         rustkernels_ring_kernels {}\n",
+        stats.ring_kernels
+    );
+
+    // Per-domain kernel counts
+    output += "# HELP rustkernels_kernels_by_domain Kernels by domain\n\
+               # TYPE rustkernels_kernels_by_domain gauge\n";
+    for (domain, count) in &stats.by_domain {
+        output += &format!(
+            "rustkernels_kernels_by_domain{{domain=\"{}\"}} {}\n",
+            domain, count
+        );
+    }
+
+    output
 }
 
 /// List available kernels
@@ -269,42 +455,136 @@ async fn execute_kernel(
     let start = Instant::now();
     let request_id = extract_request_id(&headers);
 
-    // Check if kernel exists
-    let _kernel_meta = state.registry.get(&kernel_id).ok_or_else(|| {
+    // Try batch kernel execution first (batch kernels have factories for on-demand instantiation)
+    if let Some(entry) = state.registry.get_batch(&kernel_id) {
+        let kernel = entry.create();
+
+        // Serialize input JSON to bytes for the type-erased kernel interface
+        let input_bytes = serde_json::to_vec(&request.input).map_err(|e| {
+            state
+                .metrics
+                .record_request(start.elapsed().as_micros() as u64, true);
+            (
+                StatusCode::BAD_REQUEST,
+                Json(ErrorResponse {
+                    code: "INVALID_INPUT".to_string(),
+                    message: format!("Failed to serialize input: {}", e),
+                    request_id: Some(request_id.clone()),
+                    details: None,
+                }),
+            )
+        })?;
+
+        // Execute with timeout
+        let timeout_ms = request.metadata.timeout_ms.unwrap_or(
+            state.config.default_timeout.as_millis() as u64,
+        );
+        let timeout = std::time::Duration::from_millis(timeout_ms);
+
+        let result = tokio::time::timeout(timeout, kernel.execute_dyn(&input_bytes)).await;
+
+        match result {
+            Ok(Ok(output_bytes)) => {
+                let output: serde_json::Value =
+                    serde_json::from_slice(&output_bytes).map_err(|e| {
+                        state
+                            .metrics
+                            .record_request(start.elapsed().as_micros() as u64, true);
+                        (
+                            StatusCode::INTERNAL_SERVER_ERROR,
+                            Json(ErrorResponse {
+                                code: "OUTPUT_DESERIALIZATION_ERROR".to_string(),
+                                message: format!(
+                                    "Failed to deserialize kernel output: {}",
+                                    e
+                                ),
+                                request_id: Some(request_id.clone()),
+                                details: None,
+                            }),
+                        )
+                    })?;
+
+                let duration_us = start.elapsed().as_micros() as u64;
+                state.metrics.record_request(duration_us, false);
+
+                Ok(Json(KernelResponse {
+                    request_id,
+                    kernel_id,
+                    output,
+                    metadata: ResponseMetadata {
+                        duration_us,
+                        backend: entry.metadata.mode.as_str().to_uppercase(),
+                        gpu_memory_bytes: None,
+                        trace_id: extract_trace_id(&headers),
+                    },
+                }))
+            }
+            Ok(Err(e)) => {
+                let duration_us = start.elapsed().as_micros() as u64;
+                state.metrics.record_request(duration_us, true);
+                Err((
+                    StatusCode::INTERNAL_SERVER_ERROR,
+                    Json(ErrorResponse {
+                        code: "EXECUTION_FAILED".to_string(),
+                        message: format!("Kernel execution failed: {}", e),
+                        request_id: Some(request_id),
+                        details: None,
+                    }),
+                ))
+            }
+            Err(_) => {
+                state
+                    .metrics
+                    .record_request(start.elapsed().as_micros() as u64, true);
+                Err((
+                    StatusCode::GATEWAY_TIMEOUT,
+                    Json(ErrorResponse {
+                        code: "EXECUTION_TIMEOUT".to_string(),
+                        message: format!(
+                            "Kernel execution timed out after {}ms",
+                            timeout_ms
+                        ),
+                        request_id: Some(request_id),
+                        details: None,
+                    }),
+                ))
+            }
+        }
+    } else if let Some(meta) = state.registry.get(&kernel_id) {
+        // Kernel exists but is not a batch kernel (Ring or metadata-only)
+        state
+            .metrics
+            .record_request(start.elapsed().as_micros() as u64, true);
+        Err((
+            StatusCode::UNPROCESSABLE_ENTITY,
+            Json(ErrorResponse {
+                code: "RING_KERNEL_REST_UNSUPPORTED".to_string(),
+                message: format!(
+                    "Kernel '{}' is a {} mode kernel with no batch factory, so it cannot run \
+                     over REST. Ring kernels need the Ring protocol or gRPC streaming API.",
+                    kernel_id, meta.mode
+                ),
+                request_id: Some(request_id),
+                details: Some(serde_json::json!({
+                    "kernel_mode": meta.mode.as_str(),
+                    "kernel_domain": format!("{:?}", meta.domain),
+                })),
+            }),
+        ))
+    } else {
         state
             .metrics
             .record_request(start.elapsed().as_micros() as u64, true);
-        (
+        Err((
             StatusCode::NOT_FOUND,
             Json(ErrorResponse {
                 code: "KERNEL_NOT_FOUND".to_string(),
                 message: format!("Kernel not found: {}", kernel_id),
-                request_id: Some(request_id.clone()),
+                request_id: Some(request_id),
                 details: None,
             }),
-        )
-    })?;
-
-    // For now, return a mock response
-    // Actual kernel execution will be implemented with the runtime
-    let duration_us = start.elapsed().as_micros() as u64;
-    state.metrics.record_request(duration_us, false);
-
-    Ok(Json(KernelResponse {
-        request_id,
-        kernel_id: kernel_id.clone(),
-        output: serde_json::json!({
-            "status": "executed",
-            "kernel": kernel_id,
-            "input_size": request.input.to_string().len()
-        }),
-        metadata: ResponseMetadata {
-            duration_us,
-            backend: "CPU".to_string(),
-            gpu_memory_bytes: None,
-            trace_id: extract_trace_id(&headers),
-        },
-    }))
+        ))
+    }
 }
 
 // Helper functions
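Review note on the health check above: the error-rate thresholds (below 10% healthy, below 50% degraded, otherwise unhealthy) and the overall status can be kept separate by classifying each component on its own and then taking the worst result. The sketch below is a minimal illustration under that assumption. `Health`, `classify_error_rate`, and `overall` are hypothetical names and are not part of this patch, which assigns `overall_status` directly inside each branch.

```rust
/// Hypothetical stand-in for `HealthStatus`; variants are declared from best
/// to worst so the derived `Ord` ranks `Unhealthy` highest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Health {
    Healthy,
    Degraded,
    Unhealthy,
}

/// Classify an error rate with the same thresholds as the diff:
/// below 10% is healthy, below 50% is degraded, anything else is unhealthy.
pub fn classify_error_rate(errors: u64, requests: u64) -> Health {
    // With no requests there is nothing to divide by, so the rate is 0.
    let rate = if requests > 0 {
        errors as f64 / requests as f64
    } else {
        0.0
    };
    if rate < 0.1 {
        Health::Healthy
    } else if rate < 0.5 {
        Health::Degraded
    } else {
        Health::Unhealthy
    }
}

/// The overall status is the worst component status, so a later component
/// can never downgrade an earlier `Unhealthy` back to `Degraded`.
pub fn overall(components: &[Health]) -> Health {
    components.iter().copied().max().unwrap_or(Health::Healthy)
}
```

Taking the maximum makes the result independent of the order in which components are checked.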
diff --git a/crates/rustkernel-ecosystem/src/common.rs b/crates/rustkernel-ecosystem/src/common.rs
index 034236b..2c136de 100644
--- a/crates/rustkernel-ecosystem/src/common.rs
+++ b/crates/rustkernel-ecosystem/src/common.rs
@@ -164,11 +164,15 @@ impl Default for RateLimitConfig {
 /// Metrics collector for service endpoints
 pub struct ServiceMetrics {
     /// Total requests
-    pub total_requests: std::sync::atomic::AtomicU64,
+    total_requests: std::sync::atomic::AtomicU64,
     /// Total errors
-    pub total_errors: std::sync::atomic::AtomicU64,
+    total_errors: std::sync::atomic::AtomicU64,
     /// Total latency (microseconds)
-    pub total_latency_us: std::sync::atomic::AtomicU64,
+    total_latency_us: std::sync::atomic::AtomicU64,
+    /// Min latency (microseconds)
+    min_latency_us: std::sync::atomic::AtomicU64,
+    /// Max latency (microseconds)
+    max_latency_us: std::sync::atomic::AtomicU64,
 }
 
 impl ServiceMetrics {
@@ -178,6 +182,8 @@ impl ServiceMetrics {
             total_requests: std::sync::atomic::AtomicU64::new(0),
             total_errors: std::sync::atomic::AtomicU64::new(0),
             total_latency_us: std::sync::atomic::AtomicU64::new(0),
+            min_latency_us: std::sync::atomic::AtomicU64::new(u64::MAX),
+            max_latency_us: std::sync::atomic::AtomicU64::new(0),
         })
     }
 
@@ -190,6 +196,10 @@ impl ServiceMetrics {
         if is_error {
             self.total_errors.fetch_add(1, Ordering::Relaxed);
         }
+        // Update min latency
+        self.min_latency_us.fetch_min(latency_us, Ordering::Relaxed);
+        // Update max latency
+        self.max_latency_us.fetch_max(latency_us, Ordering::Relaxed);
     }
 
     /// Get request count
@@ -210,6 +220,17 @@ impl ServiceMetrics {
         let count = self.total_requests.load(Ordering::Relaxed) as f64;
         if count > 0.0 { total / count } else { 0.0 }
     }
+
+    /// Get minimum latency in microseconds (returns 0 if no requests)
+    pub fn min_latency_us(&self) -> u64 {
+        let val = self.min_latency_us.load(std::sync::atomic::Ordering::Relaxed);
+        if val == u64::MAX { 0 } else { val }
+    }
+
+    /// Get maximum latency in microseconds
+    pub fn max_latency_us(&self) -> u64 {
+        self.max_latency_us.load(std::sync::atomic::Ordering::Relaxed)
+    }
 }
 
 impl Default for ServiceMetrics {
@@ -218,6 +239,8 @@ impl Default for ServiceMetrics {
             total_requests: std::sync::atomic::AtomicU64::new(0),
             total_errors: std::sync::atomic::AtomicU64::new(0),
             total_latency_us: std::sync::atomic::AtomicU64::new(0),
+            min_latency_us: std::sync::atomic::AtomicU64::new(u64::MAX),
+            max_latency_us: std::sync::atomic::AtomicU64::new(0),
         }
     }
 }
@@ -277,6 +300,10 @@ mod tests {
     fn test_service_metrics() {
         let metrics = ServiceMetrics::new();
 
+        // No requests yet
+        assert_eq!(metrics.min_latency_us(), 0);
+        assert_eq!(metrics.max_latency_us(), 0);
+
         metrics.record_request(1000, false);
         metrics.record_request(2000, false);
         metrics.record_request(3000, true);
@@ -284,5 +311,7 @@ mod tests {
         assert_eq!(metrics.request_count(), 3);
         assert_eq!(metrics.error_count(), 1);
         assert!((metrics.avg_latency_us() - 2000.0).abs() < 0.1);
+        assert_eq!(metrics.min_latency_us(), 1000);
+        assert_eq!(metrics.max_latency_us(), 3000);
     }
 }
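Review note on the min/max tracking above: the pattern is a lock-free running minimum and maximum, with `u64::MAX` as the "no sample yet" sentinel for the minimum. The sketch below isolates that pattern in a standalone type. `LatencyBounds` is a hypothetical name used only for illustration, and it returns `Option<u64>` for the minimum, whereas the patch maps the sentinel to `0`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the latency fields of `ServiceMetrics`.
pub struct LatencyBounds {
    min_us: AtomicU64,
    max_us: AtomicU64,
}

impl LatencyBounds {
    pub fn new() -> Self {
        Self {
            // u64::MAX means "no sample recorded yet": any real sample is smaller.
            min_us: AtomicU64::new(u64::MAX),
            max_us: AtomicU64::new(0),
        }
    }

    /// Record one sample. Each update is a single atomic read-modify-write,
    /// so concurrent callers need no lock.
    pub fn record(&self, latency_us: u64) {
        self.min_us.fetch_min(latency_us, Ordering::Relaxed);
        self.max_us.fetch_max(latency_us, Ordering::Relaxed);
    }

    /// Smallest recorded sample, or `None` if nothing has been recorded.
    pub fn min(&self) -> Option<u64> {
        let v = self.min_us.load(Ordering::Relaxed);
        if v == u64::MAX { None } else { Some(v) }
    }

    /// Largest recorded sample (0 if nothing has been recorded).
    pub fn max(&self) -> u64 {
        self.max_us.load(Ordering::Relaxed)
    }
}
```

`Relaxed` ordering is enough here because each counter is read independently. A reader may briefly see a new maximum before the matching minimum, which is acceptable for metrics.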
diff --git a/crates/rustkernel-ecosystem/src/grpc_integration.rs b/crates/rustkernel-ecosystem/src/grpc_integration.rs
index 99b9c6f..4f6acaf 100644
--- a/crates/rustkernel-ecosystem/src/grpc_integration.rs
+++ b/crates/rustkernel-ecosystem/src/grpc_integration.rs
@@ -188,39 +188,74 @@ impl KernelGrpcServer {
         self
     }
 
-    /// Execute a kernel
+    /// Execute a kernel.
+    ///
+    /// Looks up the batch kernel in the registry, creates an instance, and executes it
+    /// with the provided JSON input. Ring kernels cannot be executed through this unary RPC.
     pub async fn execute_kernel(
         &self,
         request: GrpcKernelRequest,
     ) -> Result<GrpcKernelResponse, GrpcError> {
         let start = Instant::now();
-        let request_id = uuid::Uuid::new_v4().to_string();
-
-        // Validate kernel exists
-        let _kernel_meta = self.registry.get(&request.kernel_id).ok_or_else(|| {
-            GrpcError::from(EcosystemError::KernelNotFound(request.kernel_id.clone()))
-        })?;
-
-        // Parse input
-        let _input: serde_json::Value = serde_json::from_str(&request.input_json).map_err(|e| {
-            GrpcError::from(EcosystemError::InvalidRequest(format!(
-                "Invalid JSON input: {}",
-                e
+        let request_id = request
+            .trace_id
+            .as_deref()
+            .map(|s| s.to_string())
+            .unwrap_or_else(|| uuid::Uuid::new_v4().to_string());
+
+        // Try batch kernel execution
+        if let Some(entry) = self.registry.get_batch(&request.kernel_id) {
+            let kernel = entry.create();
+
+            let input_bytes = request.input_json.as_bytes();
+
+            // Validate input is valid JSON before passing to kernel
+            if serde_json::from_slice::<serde_json::Value>(input_bytes).is_err() {
+                return Err(GrpcError::from(EcosystemError::InvalidRequest(
+                    "Input must be valid JSON".to_string(),
+                )));
+            }
+
+            // Apply timeout if specified
+            let timeout_ms = request.timeout_ms.unwrap_or(self.config.request_timeout_ms);
+            let timeout = std::time::Duration::from_millis(timeout_ms);
+
+            let result = tokio::time::timeout(timeout, kernel.execute_dyn(input_bytes)).await;
+
+            match result {
+                Ok(Ok(output_bytes)) => {
+                    let duration_us = start.elapsed().as_micros() as u64;
+                    let output_json = String::from_utf8(output_bytes)
+                        .map_err(|e| EcosystemError::InternalError(e.to_string()))?;
+                    Ok(GrpcKernelResponse {
+                        request_id,
+                        kernel_id: request.kernel_id,
+                        output_json,
+                        duration_us,
+                        backend: entry.metadata.mode.as_str().to_uppercase(),
+                        gpu_memory_bytes: None,
+                        trace_id: request.trace_id,
+                    })
+                }
+                Ok(Err(e)) => Err(GrpcError::from(EcosystemError::ExecutionFailed(
+                    e.to_string(),
+                ))),
+                Err(_) => Err(GrpcError {
+                    code: 4, // DEADLINE_EXCEEDED
+                    message: format!("Kernel execution timed out after {}ms", timeout_ms),
+                    details: None,
+                }),
+            }
+        } else if self.registry.get(&request.kernel_id).is_some() {
+            Err(GrpcError::from(EcosystemError::InvalidRequest(format!(
+                "Kernel '{}' has no batch factory (Ring or metadata-only). Ring kernels need the bidirectional streaming RPC.",
+                request.kernel_id
+            ))))
+        } else {
+            Err(GrpcError::from(EcosystemError::KernelNotFound(
+                request.kernel_id,
             )))
-        })?;
-
-        // Execute (placeholder - actual execution will use runtime)
-        let duration_us = start.elapsed().as_micros() as u64;
-
-        Ok(GrpcKernelResponse {
-            request_id,
-            kernel_id: request.kernel_id,
-            output_json: serde_json::json!({ "status": "executed" }).to_string(),
-            duration_us,
-            backend: "CPU".to_string(),
-            gpu_memory_bytes: None,
-            trace_id: request.trace_id,
-        })
+        }
     }
 
     /// Get kernel info
@@ -239,22 +274,21 @@ impl KernelGrpcServer {
         })
     }
 
-    /// List kernels
+    /// List kernels with pagination support.
+    ///
+    /// The `page_token` is the kernel ID to start after (exclusive).
+    /// Results are sorted by kernel ID for deterministic pagination.
     pub async fn list_kernels(
         &self,
         request: ListKernelsRequest,
     ) -> Result<ListKernelsResponse, GrpcError> {
-        let all_kernel_ids = self.registry.all_kernel_ids();
-        let page_size = request.page_size.unwrap_or(100) as usize;
+        let page_size = request.page_size.unwrap_or(100).max(1) as usize;
 
-        // Collect all kernel metadata
-        let all_kernels: Vec<_> = all_kernel_ids
-            .iter()
-            .filter_map(|id| self.registry.get(id))
-            .collect();
+        // Get all metadata sorted by ID for deterministic pagination
+        let all_metadata = self.registry.all_metadata();
 
-        // Apply domain filter
-        let filtered: Vec<_> = all_kernels
+        // Apply domain and mode filters
+        let filtered: Vec<_> = all_metadata
             .iter()
             .filter(|k| {
                 if let Some(ref domain) = request.domain {
@@ -270,6 +304,24 @@ impl KernelGrpcServer {
                     true
                 }
             })
+            .collect();
+
+        let total_count = filtered.len() as i32;
+
+        // Apply page_token: skip past the token ID
+        let start_idx = if let Some(ref token) = request.page_token {
+            filtered
+                .iter()
+                .position(|k| k.id == *token)
+                .map(|pos| pos + 1)
+                .unwrap_or(0)
+        } else {
+            0
+        };
+
+        let page: Vec<KernelInfo> = filtered
+            .iter()
+            .skip(start_idx)
             .take(page_size)
             .map(|k| KernelInfo {
                 id: k.id.clone(),
@@ -281,10 +333,17 @@ impl KernelGrpcServer {
             })
             .collect();
 
+        // Set next_page_token if there are more results
+        let next_page_token = if start_idx + page_size < filtered.len() {
+            page.last().map(|k| k.id.clone())
+        } else {
+            None
+        };
+
         Ok(ListKernelsResponse {
-            total_count: all_kernels.len() as i32,
-            kernels: filtered,
-            next_page_token: None,
+            total_count,
+            kernels: page,
+            next_page_token,
         })
     }
 
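Review note on `list_kernels` above: the page token is the ID of the last item already returned, and the next page starts just past it. The sketch below shows that scheme over a plain sorted slice of IDs. `paginate` is a hypothetical helper, not part of this patch, and it assumes the IDs are sorted and unique. As in the diff, an unknown token restarts from the first item.

```rust
/// Return one page of `sorted_ids` plus the token for the following page.
/// `page_token` is the last ID of the previous page (exclusive start).
pub fn paginate<'a>(
    sorted_ids: &'a [String],
    page_token: Option<&str>,
    page_size: usize,
) -> (&'a [String], Option<&'a str>) {
    // A page size of zero would never make progress, so clamp it to 1.
    let page_size = page_size.max(1);

    // Start just past the token. An unknown token falls back to index 0,
    // mirroring the `unwrap_or(0)` in the diff.
    let start = match page_token {
        Some(token) => sorted_ids
            .iter()
            .position(|id| id.as_str() == token)
            .map(|pos| pos + 1)
            .unwrap_or(0),
        None => 0,
    };

    let end = (start + page_size).min(sorted_ids.len());
    let page = &sorted_ids[start..end];

    // Hand out a token only when items remain after this page.
    let next = if end < sorted_ids.len() {
        page.last().map(|id| id.as_str())
    } else {
        None
    };
    (page, next)
}
```

A caller passes each returned token back in until it receives `None`. Because an unknown token silently restarts the listing, a stricter variant could return an error for it instead.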
diff --git a/crates/rustkernel-ecosystem/src/tower_integration.rs b/crates/rustkernel-ecosystem/src/tower_integration.rs
index 6923ef5..062a583 100644
--- a/crates/rustkernel-ecosystem/src/tower_integration.rs
+++ b/crates/rustkernel-ecosystem/src/tower_integration.rs
@@ -24,42 +24,87 @@ use std::time::Instant;
 /// Kernel service for Tower
 pub struct KernelService {
     registry: Arc<KernelRegistry>,
+    default_timeout: std::time::Duration,
 }
 
 impl KernelService {
     /// Create a new kernel service
     pub fn new(registry: Arc<KernelRegistry>) -> Self {
-        Self { registry }
+        Self {
+            registry,
+            default_timeout: std::time::Duration::from_secs(30),
+        }
     }
 
-    /// Execute a kernel request
+    /// Set the default execution timeout
+    pub fn with_timeout(mut self, timeout: std::time::Duration) -> Self {
+        self.default_timeout = timeout;
+        self
+    }
+
+    /// Execute a kernel request.
+    ///
+    /// Looks up the batch kernel in the registry, creates an instance, and executes it
+    /// with the provided JSON input. Ring kernels cannot be executed through this interface.
     pub async fn execute(&self, request: KernelRequest) -> Result<KernelResponse, EcosystemError> {
         let start = Instant::now();
         let request_id = uuid::Uuid::new_v4().to_string();
 
-        // Validate kernel exists
-        let _kernel_meta = self
-            .registry
-            .get(&request.kernel_id)
-            .ok_or_else(|| EcosystemError::KernelNotFound(request.kernel_id.clone()))?;
-
-        // Execute (placeholder - actual execution will use runtime)
-        let duration_us = start.elapsed().as_micros() as u64;
-
-        Ok(KernelResponse {
-            request_id,
-            kernel_id: request.kernel_id,
-            output: serde_json::json!({
-                "status": "executed",
-                "input": request.input
-            }),
-            metadata: ResponseMetadata {
-                duration_us,
-                backend: "CPU".to_string(),
-                gpu_memory_bytes: None,
-                trace_id: request.metadata.trace_id,
-            },
-        })
+        // Try batch kernel execution
+        if let Some(entry) = self.registry.get_batch(&request.kernel_id) {
+            let kernel = entry.create();
+
+            let input_bytes = serde_json::to_vec(&request.input)
+                .map_err(|e| EcosystemError::InvalidRequest(format!("Invalid input: {}", e)))?;
+
+            // Apply timeout from request metadata or default
+            let timeout_ms = request
+                .metadata
+                .timeout_ms
+                .unwrap_or(self.default_timeout.as_millis() as u64);
+            let timeout = std::time::Duration::from_millis(timeout_ms);
+
+            let result = tokio::time::timeout(timeout, kernel.execute_dyn(&input_bytes)).await;
+
+            match result {
+                Ok(Ok(output_bytes)) => {
+                    let output: serde_json::Value = serde_json::from_slice(&output_bytes)
+                        .map_err(|e| {
+                            EcosystemError::InternalError(format!(
+                                "Output deserialization: {}",
+                                e
+                            ))
+                        })?;
+
+                    let duration_us = start.elapsed().as_micros() as u64;
+
+                    Ok(KernelResponse {
+                        request_id,
+                        kernel_id: request.kernel_id,
+                        output,
+                        metadata: ResponseMetadata {
+                            duration_us,
+                            backend: entry.metadata.mode.as_str().to_uppercase(),
+                            gpu_memory_bytes: None,
+                            trace_id: request.metadata.trace_id,
+                        },
+                    })
+                }
+                Ok(Err(e)) => Err(EcosystemError::ExecutionFailed(e.to_string())),
+                Err(_) => Err(EcosystemError::ServiceUnavailable(format!(
+                    "Kernel execution timed out after {}ms",
+                    timeout_ms
+                ))),
+            }
+        } else if self.registry.get(&request.kernel_id).is_some() {
+            Err(EcosystemError::InvalidRequest(format!(
+                "Kernel '{}' has no batch factory (Ring or metadata-only) and cannot be \
+                 executed via this interface. Ring kernels need the Ring protocol or gRPC streaming API.",
+                request.kernel_id
+            )))
+        } else {
+            Err(EcosystemError::KernelNotFound(request.kernel_id))
+        }
     }
 }
 
@@ -67,6 +112,7 @@ impl Clone for KernelService {
     fn clone(&self) -> Self {
         Self {
             registry: self.registry.clone(),
+            default_timeout: self.default_timeout,
         }
     }
 }
@@ -208,10 +254,12 @@ pub enum TimeoutError<E> {
     Inner(E),
 }
 
-/// Rate limiter layer
+/// Rate limiter layer using a token bucket algorithm.
+///
+/// Allows up to `burst_size` requests immediately, then refills at
+/// `requests_per_second` rate.
 pub struct RateLimiterLayer {
     requests_per_second: u32,
-    #[allow(dead_code)]
     burst_size: u32,
 }
 
@@ -231,25 +279,37 @@ impl<S> tower::Layer<S> for RateLimiterLayer {
     fn layer(&self, inner: S) -> Self::Service {
         RateLimiterService {
             inner,
-            interval: std::time::Duration::from_secs(1) / self.requests_per_second,
-            last_request: Arc::new(tokio::sync::Mutex::new(std::time::Instant::now())),
+            requests_per_second: self.requests_per_second as f64,
+            burst_size: self.burst_size as f64,
+            state: Arc::new(tokio::sync::Mutex::new(TokenBucketState {
+                tokens: self.burst_size as f64,
+                last_refill: std::time::Instant::now(),
+            })),
         }
     }
 }
 
-/// Rate limiter service
+/// Internal state for the token bucket
+struct TokenBucketState {
+    tokens: f64,
+    last_refill: std::time::Instant,
+}
+
+/// Rate limiter service using a token bucket algorithm
 pub struct RateLimiterService<S> {
     inner: S,
-    interval: std::time::Duration,
-    last_request: Arc<tokio::sync::Mutex<std::time::Instant>>,
+    requests_per_second: f64,
+    burst_size: f64,
+    state: Arc<tokio::sync::Mutex<TokenBucketState>>,
 }
 
 impl<S: Clone> Clone for RateLimiterService<S> {
     fn clone(&self) -> Self {
         Self {
             inner: self.inner.clone(),
-            interval: self.interval,
-            last_request: self.last_request.clone(),
+            requests_per_second: self.requests_per_second,
+            burst_size: self.burst_size,
+            state: self.state.clone(),
         }
     }
 }
@@ -270,20 +330,25 @@ where
 
     fn call(&mut self, req: Request) -> Self::Future {
         let mut inner = self.inner.clone();
-        let interval = self.interval;
-        let last_request = self.last_request.clone();
+        let rps = self.requests_per_second;
+        let burst = self.burst_size;
+        let state = self.state.clone();
 
         Box::pin(async move {
-            // Check rate limit
-            let mut last = last_request.lock().await;
-            let elapsed = last.elapsed();
+            let mut bucket = state.lock().await;
+
+            // Refill tokens based on elapsed time
+            let elapsed = bucket.last_refill.elapsed().as_secs_f64();
+            bucket.tokens = (bucket.tokens + elapsed * rps).min(burst);
+            bucket.last_refill = std::time::Instant::now();
 
-            if elapsed < interval {
+            // Check if we have a token available
+            if bucket.tokens < 1.0 {
                 return Err(RateLimitError::RateLimitExceeded);
             }
 
-            *last = std::time::Instant::now();
-            drop(last);
+            bucket.tokens -= 1.0;
+            drop(bucket);
 
             inner.call(req).await.map_err(RateLimitError::Inner)
         })
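Review note on the rate limiter above: the token bucket logic can be read in isolation from Tower and the async mutex. The sketch below is a synchronous version of the same refill-then-take step. `TokenBucket` and `try_acquire` are hypothetical names, and the current time is passed in as a parameter (an assumption made here so the behavior is deterministic to test), whereas the patch calls `Instant::now()` internally.

```rust
use std::time::Instant;

/// Hypothetical synchronous token bucket mirroring `TokenBucketState`.
pub struct TokenBucket {
    rate_per_sec: f64,
    capacity: f64,
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Start with a full bucket so up to `capacity` requests pass immediately.
    pub fn new(rate_per_sec: f64, capacity: f64, now: Instant) -> Self {
        Self {
            rate_per_sec,
            capacity,
            tokens: capacity,
            last_refill: now,
        }
    }

    /// Refill for the time elapsed since the last call, then try to take one token.
    /// Returns `false` when the caller should be rate limited.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        // Saturating subtraction guards against `now` being earlier than `last_refill`.
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        // Refill proportionally to elapsed time, never beyond the burst capacity.
        self.tokens = (self.tokens + elapsed * self.rate_per_sec).min(self.capacity);
        self.last_refill = now;

        if self.tokens < 1.0 {
            return false;
        }
        self.tokens -= 1.0;
        true
    }
}
```

One consequence of capping at `capacity`: a burst size of zero rejects every request, so callers may want to validate that the burst size is at least 1.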
diff --git a/crates/rustkernel-graph/src/lib.rs b/crates/rustkernel-graph/src/lib.rs
index 5520394..09eff26 100644
--- a/crates/rustkernel-graph/src/lib.rs
+++ b/crates/rustkernel-graph/src/lib.rs
@@ -86,58 +86,59 @@ pub mod prelude {
 }
 
 /// Register all graph kernels with a registry.
+///
+/// Batch kernels are registered with factories for direct execution via REST/gRPC.
+/// Ring kernels are registered as metadata for discovery (require Ring runtime for execution).
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering graph analytics kernels");
 
-    // Centrality kernels (6)
-    registry.register_metadata(centrality::PageRank::new().metadata().clone())?;
-    registry.register_metadata(centrality::DegreeCentrality::new().metadata().clone())?;
-    registry.register_metadata(centrality::BetweennessCentrality::new().metadata().clone())?;
-    registry.register_metadata(centrality::ClosenessCentrality::new().metadata().clone())?;
-    registry.register_metadata(centrality::EigenvectorCentrality::new().metadata().clone())?;
-    registry.register_metadata(centrality::KatzCentrality::new().metadata().clone())?;
-
-    // Community detection kernels (3)
-    registry.register_metadata(community::ModularityScore::new().metadata().clone())?;
-    registry.register_metadata(community::LouvainCommunity::new().metadata().clone())?;
-    registry.register_metadata(community::LabelPropagation::new().metadata().clone())?;
-
-    // Similarity kernels (5)
-    registry.register_metadata(similarity::JaccardSimilarity::new().metadata().clone())?;
-    registry.register_metadata(similarity::CosineSimilarity::new().metadata().clone())?;
-    registry.register_metadata(similarity::AdamicAdarIndex::new().metadata().clone())?;
-    registry.register_metadata(similarity::CommonNeighbors::new().metadata().clone())?;
-    registry.register_metadata(similarity::ValueSimilarity::new().metadata().clone())?;
-
-    // Metrics kernels (5)
-    registry.register_metadata(metrics::GraphDensity::new().metadata().clone())?;
-    registry.register_metadata(metrics::AveragePathLength::new().metadata().clone())?;
-    registry.register_metadata(metrics::ClusteringCoefficient::new().metadata().clone())?;
-    registry.register_metadata(metrics::ConnectedComponents::new().metadata().clone())?;
-    registry.register_metadata(metrics::FullGraphMetrics::new().metadata().clone())?;
-
-    // Motif detection kernels (3)
-    registry.register_metadata(motif::TriangleCounting::new().metadata().clone())?;
-    registry.register_metadata(motif::MotifDetection::new().metadata().clone())?;
-    registry.register_metadata(motif::KCliqueDetection::new().metadata().clone())?;
-
-    // Topology kernels (2)
-    registry.register_metadata(topology::DegreeRatio::new().metadata().clone())?;
-    registry.register_metadata(topology::StarTopologyScore::new().metadata().clone())?;
-
-    // Cycle detection kernels (1)
-    registry.register_metadata(cycles::ShortCycleParticipation::new().metadata().clone())?;
-
-    // Path kernels (1)
-    registry.register_metadata(paths::ShortestPath::new().metadata().clone())?;
-
-    // GNN kernels (2)
-    registry.register_metadata(gnn::GNNInference::new().metadata().clone())?;
-    registry.register_metadata(gnn::GraphAttention::new().metadata().clone())?;
+    // Centrality kernels (6) — Ring: PageRank, DegreeCentrality; Batch: rest
+    registry.register_ring_metadata_from(centrality::PageRank::new)?;
+    registry.register_ring_metadata_from(centrality::DegreeCentrality::new)?;
+    registry.register_batch_typed(centrality::BetweennessCentrality::new)?;
+    registry.register_batch_typed(centrality::ClosenessCentrality::new)?;
+    registry.register_batch_typed(centrality::EigenvectorCentrality::new)?;
+    registry.register_batch_typed(centrality::KatzCentrality::new)?;
+
+    // Community detection kernels (3) — Batch (GpuKernel only)
+    registry.register_batch_metadata_from(community::ModularityScore::new)?;
+    registry.register_batch_metadata_from(community::LouvainCommunity::new)?;
+    registry.register_batch_metadata_from(community::LabelPropagation::new)?;
+
+    // Similarity kernels (5) — Batch (GpuKernel only)
+    registry.register_batch_metadata_from(similarity::JaccardSimilarity::new)?;
+    registry.register_batch_metadata_from(similarity::CosineSimilarity::new)?;
+    registry.register_batch_metadata_from(similarity::AdamicAdarIndex::new)?;
+    registry.register_batch_metadata_from(similarity::CommonNeighbors::new)?;
+    registry.register_batch_metadata_from(similarity::ValueSimilarity::new)?;
+
+    // Metrics kernels (5) — Batch (GpuKernel only)
+    registry.register_batch_metadata_from(metrics::GraphDensity::new)?;
+    registry.register_batch_metadata_from(metrics::AveragePathLength::new)?;
+    registry.register_batch_metadata_from(metrics::ClusteringCoefficient::new)?;
+    registry.register_batch_metadata_from(metrics::ConnectedComponents::new)?;
+    registry.register_batch_metadata_from(metrics::FullGraphMetrics::new)?;
+
+    // Motif detection kernels (3) — Ring: TriangleCounting; Batch: rest
+    registry.register_ring_metadata_from(motif::TriangleCounting::new)?;
+    registry.register_batch_metadata_from(motif::MotifDetection::new)?;
+    registry.register_batch_metadata_from(motif::KCliqueDetection::new)?;
+
+    // Topology kernels (2) — Ring: DegreeRatio; Batch: StarTopologyScore
+    registry.register_ring_metadata_from(topology::DegreeRatio::new)?;
+    registry.register_batch_metadata_from(topology::StarTopologyScore::new)?;
+
+    // Cycle detection kernels (1) — Batch
+    registry.register_batch_metadata_from(cycles::ShortCycleParticipation::new)?;
+
+    // Path kernels (1) — Batch
+    registry.register_batch_metadata_from(paths::ShortestPath::new)?;
+
+    // GNN kernels (2) — Batch (GpuKernel only)
+    registry.register_batch_metadata_from(gnn::GNNInference::new)?;
+    registry.register_batch_metadata_from(gnn::GraphAttention::new)?;
 
     tracing::info!("Registered 28 graph analytics kernels");
     Ok(())
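
Note on the registration change above: the calls now pass the constructor itself (`PageRank::new`) instead of `PageRank::new().metadata().clone()`. The following is a minimal sketch of how a factory-based registration method could work, not the `rustkernel_core` implementation. `Registry`, `HasMetadata`, `KernelMetadata`, `register_from`, and the `graph/shortest-path` id are hypothetical stand-ins, and the sketch takes `&mut self` where the real `KernelRegistry` is passed by shared reference.

```rust
use std::collections::HashMap;

/// Execution mode recorded for each registered kernel.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Batch,
    Ring,
}

/// Hypothetical stand-in for kernel metadata (not the real `rustkernel_core` type).
#[derive(Debug)]
struct KernelMetadata {
    id: &'static str,
}

/// Hypothetical trait: a kernel that can describe itself.
trait HasMetadata {
    fn metadata(&self) -> &KernelMetadata;
}

/// Hypothetical registry keyed by kernel id.
#[derive(Default)]
struct Registry {
    entries: HashMap<&'static str, Mode>,
}

impl Registry {
    /// Calls the factory once, reads the kernel's id, and records the mode.
    fn register_from<K: HasMetadata>(
        &mut self,
        mode: Mode,
        factory: impl FnOnce() -> K,
    ) -> Result<(), String> {
        let kernel = factory();
        let id = kernel.metadata().id;
        if self.entries.contains_key(id) {
            return Err(format!("duplicate kernel id: {}", id));
        }
        self.entries.insert(id, mode);
        Ok(())
    }

    fn register_batch_metadata_from<K: HasMetadata>(
        &mut self,
        factory: impl FnOnce() -> K,
    ) -> Result<(), String> {
        self.register_from(Mode::Batch, factory)
    }

    fn register_ring_metadata_from<K: HasMetadata>(
        &mut self,
        factory: impl FnOnce() -> K,
    ) -> Result<(), String> {
        self.register_from(Mode::Ring, factory)
    }
}

struct PageRank {
    meta: KernelMetadata,
}

impl PageRank {
    fn new() -> Self {
        Self { meta: KernelMetadata { id: "graph/pagerank" } }
    }
}

impl HasMetadata for PageRank {
    fn metadata(&self) -> &KernelMetadata {
        &self.meta
    }
}

struct ShortestPath {
    meta: KernelMetadata,
}

impl ShortestPath {
    fn new() -> Self {
        // Hypothetical id, chosen only for this example.
        Self { meta: KernelMetadata { id: "graph/shortest-path" } }
    }
}

impl HasMetadata for ShortestPath {
    fn metadata(&self) -> &KernelMetadata {
        &self.meta
    }
}

fn main() -> Result<(), String> {
    let mut registry = Registry::default();
    registry.register_ring_metadata_from(PageRank::new)?;
    registry.register_batch_metadata_from(ShortestPath::new)?;
    println!("registered {} kernels", registry.entries.len());
    Ok(())
}
```

Passing the constructor keeps each call site to one short line and lets the registry decide when the kernel is built. In this sketch the mode comes from the method name, so a comment and a call that disagree register the kernel under the mode the call names.
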
diff --git a/crates/rustkernel-ml/src/lib.rs b/crates/rustkernel-ml/src/lib.rs
index 7b62c74..fb4fd2f 100644
--- a/crates/rustkernel-ml/src/lib.rs
+++ b/crates/rustkernel-ml/src/lib.rs
@@ -74,56 +74,42 @@ pub mod prelude {
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering statistical ML kernels");
 
-    // Clustering kernels (3)
-    registry.register_metadata(clustering::KMeans::new().metadata().clone())?;
-    registry.register_metadata(clustering::DBSCAN::new().metadata().clone())?;
-    registry.register_metadata(clustering::HierarchicalClustering::new().metadata().clone())?;
+    // Clustering kernels (3) - implement BatchKernel<I, O>
+    registry.register_batch_typed(clustering::KMeans::new)?;
+    registry.register_batch_typed(clustering::DBSCAN::new)?;
+    registry.register_batch_typed(clustering::HierarchicalClustering::new)?;
 
     // Anomaly detection kernels (2)
-    registry.register_metadata(anomaly::IsolationForest::new().metadata().clone())?;
-    registry.register_metadata(anomaly::LocalOutlierFactor::new().metadata().clone())?;
+    registry.register_batch_metadata_from(anomaly::IsolationForest::new)?;
+    registry.register_batch_metadata_from(anomaly::LocalOutlierFactor::new)?;
 
     // Streaming anomaly detection kernels (2)
-    registry.register_metadata(
-        streaming::StreamingIsolationForest::new()
-            .metadata()
-            .clone(),
-    )?;
-    registry.register_metadata(streaming::AdaptiveThreshold::new().metadata().clone())?;
+    registry.register_batch_metadata_from(streaming::StreamingIsolationForest::new)?;
+    registry.register_batch_metadata_from(streaming::AdaptiveThreshold::new)?;
 
     // Ensemble kernel (1)
-    registry.register_metadata(ensemble::EnsembleVoting::new().metadata().clone())?;
+    registry.register_batch_metadata_from(ensemble::EnsembleVoting::new)?;
 
     // Regression kernels (2)
-    registry.register_metadata(regression::LinearRegression::new().metadata().clone())?;
-    registry.register_metadata(regression::RidgeRegression::new().metadata().clone())?;
+    registry.register_batch_metadata_from(regression::LinearRegression::new)?;
+    registry.register_batch_metadata_from(regression::RidgeRegression::new)?;
 
     // Explainability kernels (2)
-    registry.register_metadata(explainability::SHAPValues::new().metadata().clone())?;
-    registry.register_metadata(explainability::FeatureImportance::new().metadata().clone())?;
+    registry.register_batch_metadata_from(explainability::SHAPValues::new)?;
+    registry.register_batch_metadata_from(explainability::FeatureImportance::new)?;
 
     // NLP / LLM Integration kernels (2)
-    registry.register_metadata(nlp::EmbeddingGeneration::new().metadata().clone())?;
-    registry.register_metadata(nlp::SemanticSimilarity::new().metadata().clone())?;
+    registry.register_batch_metadata_from(nlp::EmbeddingGeneration::new)?;
+    registry.register_batch_metadata_from(nlp::SemanticSimilarity::new)?;
 
     // Federated Learning kernels (1)
-    registry.register_metadata(federated::SecureAggregation::new().metadata().clone())?;
+    registry.register_batch_metadata_from(federated::SecureAggregation::new)?;
 
     // Healthcare Analytics kernels (2)
-    registry.register_metadata(
-        healthcare::DrugInteractionPrediction::new()
-            .metadata()
-            .clone(),
-    )?;
-    registry.register_metadata(
-        healthcare::ClinicalPathwayConformance::new()
-            .metadata()
-            .clone(),
-    )?;
+    registry.register_batch_metadata_from(healthcare::DrugInteractionPrediction::new)?;
+    registry.register_batch_metadata_from(healthcare::ClinicalPathwayConformance::new)?;
 
     tracing::info!("Registered 17 statistical ML kernels");
     Ok(())
diff --git a/crates/rustkernel-orderbook/src/lib.rs b/crates/rustkernel-orderbook/src/lib.rs
index e6df335..0e547d1 100644
--- a/crates/rustkernel-orderbook/src/lib.rs
+++ b/crates/rustkernel-orderbook/src/lib.rs
@@ -39,12 +39,10 @@ pub use types::{
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering order matching kernels");
 
-    // Order matching kernel (1)
-    registry.register_metadata(matching::OrderMatchingEngine::new().metadata().clone())?;
+    // Order matching kernel (1) - Ring (also implements BatchKernel for multiple types)
+    registry.register_ring_metadata_from(matching::OrderMatchingEngine::new)?;
 
     tracing::info!("Registered 1 order matching kernel");
     Ok(())
diff --git a/crates/rustkernel-payments/src/lib.rs b/crates/rustkernel-payments/src/lib.rs
index f167fa3..10b7abf 100644
--- a/crates/rustkernel-payments/src/lib.rs
+++ b/crates/rustkernel-payments/src/lib.rs
@@ -20,15 +20,13 @@ pub use types::*;
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering payment processing kernels");
 
-    // Processing kernel (1)
-    registry.register_metadata(processing::PaymentProcessing::new().metadata().clone())?;
+    // Processing kernel (1) - Ring
+    registry.register_ring_metadata_from(processing::PaymentProcessing::new)?;
 
-    // Flow analysis kernel (1)
-    registry.register_metadata(flow::FlowAnalysis::new().metadata().clone())?;
+    // Flow analysis kernel (1) - Batch
+    registry.register_batch_metadata_from(flow::FlowAnalysis::new)?;
 
     tracing::info!("Registered 2 payment processing kernels");
     Ok(())
diff --git a/crates/rustkernel-procint/src/lib.rs b/crates/rustkernel-procint/src/lib.rs
index 4653925..d035fa4 100644
--- a/crates/rustkernel-procint/src/lib.rs
+++ b/crates/rustkernel-procint/src/lib.rs
@@ -62,34 +62,28 @@ pub use types::{
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering process intelligence kernels");
 
-    // DFG kernel (1)
-    registry.register_metadata(dfg::DFGConstruction::new().metadata().clone())?;
+    // DFG kernel (1) - Batch
+    registry.register_batch_metadata_from(dfg::DFGConstruction::new)?;
 
-    // Partial order kernel (1)
-    registry.register_metadata(
-        partial_order::PartialOrderAnalysis::new()
-            .metadata()
-            .clone(),
-    )?;
+    // Partial order kernel (1) - Batch
+    registry.register_batch_metadata_from(partial_order::PartialOrderAnalysis::new)?;
 
-    // Conformance kernel (1)
-    registry.register_metadata(conformance::ConformanceChecking::new().metadata().clone())?;
+    // Conformance kernel (1) - Ring
+    registry.register_ring_metadata_from(conformance::ConformanceChecking::new)?;
 
-    // OCPM kernel (1)
-    registry.register_metadata(ocpm::OCPMPatternMatching::new().metadata().clone())?;
+    // OCPM kernel (1) - Batch
+    registry.register_batch_metadata_from(ocpm::OCPMPatternMatching::new)?;
 
-    // Prediction kernel (1)
-    registry.register_metadata(prediction::NextActivityPrediction::new().metadata().clone())?;
+    // Prediction kernel (1) - Batch
+    registry.register_batch_metadata_from(prediction::NextActivityPrediction::new)?;
 
-    // Imputation kernel (1)
-    registry.register_metadata(imputation::EventLogImputation::new().metadata().clone())?;
+    // Imputation kernel (1) - Batch
+    registry.register_batch_metadata_from(imputation::EventLogImputation::new)?;
 
-    // Simulation kernel (1)
-    registry.register_metadata(simulation::DigitalTwin::new().metadata().clone())?;
+    // Simulation kernel (1) - Batch
+    registry.register_batch_metadata_from(simulation::DigitalTwin::new)?;
 
     tracing::info!("Registered 7 process intelligence kernels");
     Ok(())
diff --git a/crates/rustkernel-risk/src/lib.rs b/crates/rustkernel-risk/src/lib.rs
index 26c00ce..493cf89 100644
--- a/crates/rustkernel-risk/src/lib.rs
+++ b/crates/rustkernel-risk/src/lib.rs
@@ -52,20 +52,18 @@ pub use types::{
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering risk analytics kernels");
 
-    // Credit kernel (1)
-    registry.register_metadata(credit::CreditRiskScoring::new().metadata().clone())?;
+    // Credit kernel (1) - Ring
+    registry.register_ring_metadata_from(credit::CreditRiskScoring::new)?;
 
-    // Market kernels (3)
-    registry.register_metadata(market::MonteCarloVaR::new().metadata().clone())?;
-    registry.register_metadata(market::PortfolioRiskAggregation::new().metadata().clone())?;
-    registry.register_metadata(correlation::RealTimeCorrelation::new().metadata().clone())?;
+    // Market kernels (3) - Ring
+    registry.register_ring_metadata_from(market::MonteCarloVaR::new)?;
+    registry.register_ring_metadata_from(market::PortfolioRiskAggregation::new)?;
+    registry.register_ring_metadata_from(correlation::RealTimeCorrelation::new)?;
 
-    // Stress kernel (1)
-    registry.register_metadata(stress::StressTesting::new().metadata().clone())?;
+    // Stress kernel (1) - Batch
+    registry.register_batch_typed::<StressTesting, messages::StressTestingInput, messages::StressTestingOutput>(stress::StressTesting::new)?;
 
     tracing::info!("Registered 5 risk analytics kernels");
     Ok(())
diff --git a/crates/rustkernel-temporal/src/lib.rs b/crates/rustkernel-temporal/src/lib.rs
index dee1b19..ff49750 100644
--- a/crates/rustkernel-temporal/src/lib.rs
+++ b/crates/rustkernel-temporal/src/lib.rs
@@ -58,32 +58,22 @@ pub use types::{
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering temporal analysis kernels");
 
-    // Forecasting kernels (2)
-    registry.register_metadata(forecasting::ARIMAForecast::new().metadata().clone())?;
-    registry.register_metadata(forecasting::ProphetDecomposition::new().metadata().clone())?;
+    // Forecasting kernels (2) - Batch
+    registry.register_batch_typed(forecasting::ARIMAForecast::new)?;
+    registry.register_batch_typed(forecasting::ProphetDecomposition::new)?;
 
     // Detection kernels (2)
-    registry.register_metadata(detection::ChangePointDetection::new().metadata().clone())?;
-    registry.register_metadata(
-        detection::TimeSeriesAnomalyDetection::new()
-            .metadata()
-            .clone(),
-    )?;
+    registry.register_batch_typed(detection::ChangePointDetection::new)?; // Batch
+    registry.register_ring_metadata_from(detection::TimeSeriesAnomalyDetection::new)?; // Ring
 
-    // Decomposition kernels (2)
-    registry.register_metadata(
-        decomposition::SeasonalDecomposition::new()
-            .metadata()
-            .clone(),
-    )?;
-    registry.register_metadata(decomposition::TrendExtraction::new().metadata().clone())?;
+    // Decomposition kernels (2) - Batch
+    registry.register_batch_typed(decomposition::SeasonalDecomposition::new)?;
+    registry.register_batch_typed(decomposition::TrendExtraction::new)?;
 
-    // Volatility kernel (1)
-    registry.register_metadata(volatility::VolatilityAnalysis::new().metadata().clone())?;
+    // Volatility kernel (1) - Ring
+    registry.register_ring_metadata_from(volatility::VolatilityAnalysis::new)?;
 
     tracing::info!("Registered 7 temporal analysis kernels");
     Ok(())
diff --git a/crates/rustkernel-treasury/src/lib.rs b/crates/rustkernel-treasury/src/lib.rs
index db95d24..257453e 100644
--- a/crates/rustkernel-treasury/src/lib.rs
+++ b/crates/rustkernel-treasury/src/lib.rs
@@ -28,24 +28,22 @@ pub use liquidity::LiquidityOptimization;
 pub fn register_all(
     registry: &rustkernel_core::registry::KernelRegistry,
 ) -> rustkernel_core::error::Result<()> {
-    use rustkernel_core::traits::GpuKernel;
-
     tracing::info!("Registering treasury management kernels");
 
-    // Cash flow kernel (1)
-    registry.register_metadata(cashflow::CashFlowForecasting::new().metadata().clone())?;
+    // Cash flow kernel (1) - Ring
+    registry.register_ring_metadata_from(cashflow::CashFlowForecasting::new)?;
 
-    // Collateral kernel (1)
-    registry.register_metadata(collateral::CollateralOptimization::new().metadata().clone())?;
+    // Collateral kernel (1) - Ring
+    registry.register_ring_metadata_from(collateral::CollateralOptimization::new)?;
 
-    // FX kernel (1)
-    registry.register_metadata(fx::FXHedging::new().metadata().clone())?;
+    // FX kernel (1) - Ring
+    registry.register_ring_metadata_from(fx::FXHedging::new)?;
 
-    // Interest rate kernel (1)
-    registry.register_metadata(interest_rate::InterestRateRisk::new().metadata().clone())?;
+    // Interest rate kernel (1) - Ring
+    registry.register_ring_metadata_from(interest_rate::InterestRateRisk::new)?;
 
-    // Liquidity kernel (1)
-    registry.register_metadata(liquidity::LiquidityOptimization::new().metadata().clone())?;
+    // Liquidity kernel (1) - Ring
+    registry.register_ring_metadata_from(liquidity::LiquidityOptimization::new)?;
 
     tracing::info!("Registered 5 treasury management kernels");
     Ok(())
diff --git a/crates/rustkernels/src/lib.rs b/crates/rustkernels/src/lib.rs
index 80bf009..7692680 100644
--- a/crates/rustkernels/src/lib.rs
+++ b/crates/rustkernels/src/lib.rs
@@ -2,82 +2,75 @@
 //!
 //! GPU-accelerated kernel library for financial services, analytics, and compliance workloads.
 //!
-//! RustKernels is a Rust port of the DotCompute GPU kernel library, leveraging the
-//! RustCompute (RingKernel) framework for GPU-native persistent actors.
+//! RustKernels is built on [RingKernel 0.4.2](https://crates.io/crates/ringkernel-core)
+//! and provides 106 kernels across 14 domain-specific crates.
 //!
 //! ## Features
 //!
-//! - **16 Domain Categories**: Graph analytics, ML, compliance, risk, temporal analysis, and more
-//! - **173+ Kernels**: Comprehensive coverage of financial and analytical algorithms
+//! - **14 Domain Categories**: Graph analytics, ML, compliance, risk, temporal analysis, and more
+//! - **106 Kernels**: Comprehensive coverage of financial and analytical algorithms
 //! - **Dual Execution Modes**:
 //!   - **Batch**: CPU-orchestrated, 10-50μs overhead, for periodic heavy computation
 //!   - **Ring**: GPU-persistent actor, 100-500ns latency, for high-frequency operations
-//! - **Enterprise Licensing**: Domain-based licensing and feature gating
-//! - **Multi-Backend**: CUDA, WebGPU, and CPU backends via RustCompute
+//! - **Type-Erased Execution**: REST/gRPC dispatch via `TypeErasedBatchKernel`
+//! - **Enterprise Features**: Security, observability, resilience, production configuration
+//! - **Multi-Backend**: CUDA, WebGPU, and CPU backends via RingKernel
 //!
 //! ## Quick Start
 //!
 //! ```rust,ignore
-//! use rustkernel::prelude::*;
+//! use rustkernels::prelude::*;
+//! use rustkernels::graph::centrality::{BetweennessCentrality, BetweennessCentralityInput};
 //!
 //! #[tokio::main]
-//! async fn main() -> Result<()> {
-//!     // Create a kernel registry
-//!     let registry = KernelRegistry::new();
+//! async fn main() -> Result<(), Box<dyn std::error::Error>> {
+//!     let kernel = BetweennessCentrality::new();
 //!
-//!     // Get runtime with best available backend
-//!     let runtime = RingKernel::builder()
-//!         .backend(Backend::Auto)
-//!         .build()
-//!         .await?;
-//!
-//!     // Launch a specific kernel
-//!     let pagerank = runtime.launch("graph/pagerank", LaunchOptions::default()).await?;
-//!
-//!     // Use it
-//!     pagerank.send(PageRankRequest { node_id: 42, operation: PageRankOp::Query }).await?;
-//!     let response: PageRankResponse = pagerank.receive().await?;
+//!     let input = BetweennessCentralityInput {
+//!         num_nodes: 4,
+//!         edges: vec![(0, 1), (1, 2), (2, 3), (0, 3)],
+//!         normalized: true,
+//!     };
 //!
+//!     let result = kernel.execute(input).await?;
+//!     for (node, score) in result.scores.iter().enumerate() {
+//!         println!("Node {}: {:.4}", node, score);
+//!     }
 //!     Ok(())
 //! }
 //! ```
 //!
 //! ## Domain Organization
 //!
-//! Kernels are organized into domains representing different business/analytical areas:
-//!
-//! ### Priority 1 (High Value)
-//! - **GraphAnalytics**: Centrality, community detection, motifs, similarity
-//! - **StatisticalML**: Clustering, anomaly detection, regression
-//! - **Compliance**: AML, KYC, sanctions screening
-//! - **TemporalAnalysis**: Forecasting, change detection, decomposition
-//! - **RiskAnalytics**: Credit risk, VaR, portfolio risk
+//! Kernels are organized into 14 domains:
 //!
-//! ### Priority 2 (Medium)
-//! - **Banking**: Fraud detection
-//! - **BehavioralAnalytics**: Profiling, forensics
-//! - **OrderMatching**: HFT order book
-//! - **ProcessIntelligence**: Process mining
-//! - **Clearing**: Settlement, netting
-//!
-//! ### Priority 3 (Lower)
-//! - **TreasuryManagement**: Cash flow, hedging
-//! - **Accounting**: Chart of accounts, reconciliation
-//! - **PaymentProcessing**: Transaction execution
-//! - **FinancialAudit**: Feature extraction
+//! | Domain | Kernels | Description |
+//! |--------|---------|-------------|
+//! | GraphAnalytics | 28 | Centrality, GNN inference, community detection |
+//! | StatisticalML | 17 | Clustering, NLP embeddings, federated learning |
+//! | Compliance | 11 | AML, KYC, sanctions screening |
+//! | TemporalAnalysis | 7 | Forecasting, change-point detection |
+//! | RiskAnalytics | 5 | Credit risk, VaR, stress testing |
+//! | ProcessIntelligence | 7 | DFG, conformance, digital twin |
+//! | BehavioralAnalytics | 6 | Profiling, forensics |
+//! | Clearing | 5 | Netting, settlement, DVP |
+//! | Treasury | 5 | Cash flow, FX hedging |
+//! | Accounting | 9 | Network generation, reconciliation |
+//! | Payments | 2 | Payment processing |
+//! | Banking | 1 | Fraud detection |
+//! | OrderMatching | 1 | HFT order book |
+//! | FinancialAudit | 2 | Feature extraction |
 //!
 //! ## Feature Flags
 //!
-//! Enable domain crates via Cargo features:
-//!
 //! ```toml
 //! [dependencies]
-//! rustkernel = { version = "0.1", features = ["graph", "ml", "risk"] }
+//! rustkernels = { version = "0.4", features = ["graph", "ml", "risk"] }
 //! ```
 //!
 //! Available features:
 //! - `default`: graph, ml, compliance, temporal, risk
-//! - `full`: All domains
+//! - `full`: All 14 domains
 //! - Individual: `graph`, `ml`, `compliance`, `temporal`, `risk`, `banking`, etc.
 
 #![warn(missing_docs)]
@@ -161,8 +154,8 @@ pub mod version {
     /// Crate version.
     pub const VERSION: &str = env!("CARGO_PKG_VERSION");
 
-    /// Minimum supported RustCompute version.
-    pub const MIN_RINGKERNEL_VERSION: &str = "0.1.0";
+    /// Minimum supported RingKernel version.
+    pub const MIN_RINGKERNEL_VERSION: &str = "0.4.2";
 }
 
 /// Kernel catalog providing overview of all available kernels.
@@ -190,36 +183,36 @@ pub mod catalog {
             DomainInfo {
                 domain: Domain::GraphAnalytics,
                 name: "Graph Analytics",
-                description: "Centrality measures, community detection, motifs, similarity",
-                kernel_count: 15,
+                description: "Centrality, GNN inference, community detection, similarity, topology",
+                kernel_count: 28,
                 feature: "graph",
             },
             DomainInfo {
                 domain: Domain::StatisticalML,
                 name: "Statistical ML",
-                description: "Clustering, anomaly detection, regression, ensemble methods",
-                kernel_count: 6,
+                description: "Clustering, NLP embeddings, federated learning, anomaly detection, explainability",
+                kernel_count: 17,
                 feature: "ml",
             },
             DomainInfo {
                 domain: Domain::Compliance,
                 name: "Compliance",
-                description: "AML, KYC, sanctions screening, transaction monitoring",
-                kernel_count: 9,
+                description: "AML pattern detection, KYC scoring, sanctions screening, transaction monitoring",
+                kernel_count: 11,
                 feature: "compliance",
             },
             DomainInfo {
                 domain: Domain::TemporalAnalysis,
                 name: "Temporal Analysis",
-                description: "Forecasting, change detection, seasonal decomposition",
+                description: "ARIMA, Prophet decomposition, change-point detection, seasonal decomposition",
                 kernel_count: 7,
                 feature: "temporal",
             },
             DomainInfo {
                 domain: Domain::RiskAnalytics,
                 name: "Risk Analytics",
-                description: "Credit risk, Monte Carlo VaR, portfolio risk aggregation",
-                kernel_count: 4,
+                description: "Credit risk, Monte Carlo VaR, portfolio risk, stress testing, correlation",
+                kernel_count: 5,
                 feature: "risk",
             },
             DomainInfo {
@@ -232,7 +225,7 @@ pub mod catalog {
             DomainInfo {
                 domain: Domain::BehavioralAnalytics,
                 name: "Behavioral Analytics",
-                description: "User profiling, anomaly profiling, forensics",
+                description: "Profiling, anomaly profiling, forensics, event correlation",
                 kernel_count: 6,
                 feature: "behavioral",
             },
@@ -246,8 +239,8 @@ pub mod catalog {
             DomainInfo {
                 domain: Domain::ProcessIntelligence,
                 name: "Process Intelligence",
-                description: "Process mining, conformance checking, DFG construction",
-                kernel_count: 4,
+                description: "DFG construction, conformance checking, digital twin, next-activity prediction",
+                kernel_count: 7,
                 feature: "procint",
             },
             DomainInfo {
@@ -267,8 +260,8 @@ pub mod catalog {
             DomainInfo {
                 domain: Domain::Accounting,
                 name: "Accounting",
-                description: "Chart of accounts mapping, reconciliation, network analysis",
-                kernel_count: 5,
+                description: "Network generation, reconciliation, duplicate detection, currency conversion",
+                kernel_count: 9,
                 feature: "accounting",
             },
             DomainInfo {
@@ -293,6 +286,11 @@ pub mod catalog {
         domains().iter().map(|d| d.kernel_count).sum()
     }
 
+    /// Check if a specific domain is enabled via compile-time features.
+    pub fn is_domain_enabled(feature: &str) -> bool {
+        enabled_domains().contains(&feature)
+    }
+
     /// Get enabled domains based on compile-time features.
     #[allow(clippy::vec_init_then_push)]
     pub fn enabled_domains() -> Vec<&'static str> {
@@ -331,6 +329,62 @@ pub mod catalog {
     }
 }
 
+/// Register all enabled domain kernels into a registry.
+///
+/// This is the primary entrypoint for populating a `KernelRegistry` with
+/// all available kernels based on compile-time feature flags.
+///
+/// # Errors
+///
+/// Returns an error if any kernel registration fails.
+pub fn register_all(
+    registry: &rustkernel_core::registry::KernelRegistry,
+) -> rustkernel_core::error::Result<()> {
+    #[cfg(feature = "graph")]
+    rustkernel_graph::register_all(registry)?;
+
+    #[cfg(feature = "ml")]
+    rustkernel_ml::register_all(registry)?;
+
+    #[cfg(feature = "compliance")]
+    rustkernel_compliance::register_all(registry)?;
+
+    #[cfg(feature = "temporal")]
+    rustkernel_temporal::register_all(registry)?;
+
+    #[cfg(feature = "risk")]
+    rustkernel_risk::register_all(registry)?;
+
+    #[cfg(feature = "banking")]
+    rustkernel_banking::register_all(registry)?;
+
+    #[cfg(feature = "behavioral")]
+    rustkernel_behavioral::register_all(registry)?;
+
+    #[cfg(feature = "orderbook")]
+    rustkernel_orderbook::register_all(registry)?;
+
+    #[cfg(feature = "procint")]
+    rustkernel_procint::register_all(registry)?;
+
+    #[cfg(feature = "clearing")]
+    rustkernel_clearing::register_all(registry)?;
+
+    #[cfg(feature = "treasury")]
+    rustkernel_treasury::register_all(registry)?;
+
+    #[cfg(feature = "accounting")]
+    rustkernel_accounting::register_all(registry)?;
+
+    #[cfg(feature = "payments")]
+    rustkernel_payments::register_all(registry)?;
+
+    #[cfg(feature = "audit")]
+    rustkernel_audit::register_all(registry)?;
+
+    Ok(())
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;
@@ -348,13 +402,14 @@ mod tests {
     #[allow(clippy::const_is_empty)]
     fn test_version() {
         assert!(!version::VERSION.is_empty());
+        assert_eq!(version::MIN_RINGKERNEL_VERSION, "0.4.2");
     }
 
     #[test]
     fn test_catalog() {
         let domains = catalog::domains();
         assert_eq!(domains.len(), 14);
-        assert_eq!(catalog::total_kernel_count(), 72);
+        assert_eq!(catalog::total_kernel_count(), 106);
     }
 
     #[test]
@@ -364,4 +419,12 @@ mod tests {
         assert!(enabled.contains(&"graph"));
         assert!(enabled.contains(&"ml"));
     }
+
+    #[test]
+    fn test_register_all() {
+        let registry = rustkernel_core::registry::KernelRegistry::new();
+        register_all(&registry).unwrap();
+        // Verify kernels were registered (at least default features)
+        assert!(registry.total_count() > 0);
+    }
 }
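
Note on type-erased execution: the crate docs in this diff describe dispatching batch kernels from REST/gRPC without compile-time types. The sketch below shows the general technique only and is not the crate's implementation. It assumes `Box<dyn Any>` as the erased payload (the real crate may well use serialized payloads instead). `Erased`, `erase`, `DensityKernel`, `DensityInput`, and the `demo/density` key are hypothetical, and the two traits only borrow the names `BatchKernel` and `BatchKernelDyn` from the docs.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomData;

/// Typed interface, loosely modelled on the `BatchKernel<I, O>` named in the docs.
trait BatchKernel<I, O> {
    fn execute(&self, input: I) -> Result<O, String>;
}

/// Object-safe view of a kernel whose input and output types are hidden.
trait BatchKernelDyn {
    fn execute_dyn(&self, input: Box<dyn Any>) -> Result<Box<dyn Any>, String>;
}

/// Adapter that remembers `I` and `O` so it can downcast on the way in.
struct Erased<K, I, O> {
    kernel: K,
    _types: PhantomData<fn(I) -> O>,
}

/// Wraps a typed kernel in the adapter.
fn erase<K, I, O>(kernel: K) -> Erased<K, I, O>
where
    K: BatchKernel<I, O>,
{
    Erased { kernel, _types: PhantomData }
}

impl<K, I, O> BatchKernelDyn for Erased<K, I, O>
where
    K: BatchKernel<I, O>,
    I: 'static,
    O: 'static,
{
    fn execute_dyn(&self, input: Box<dyn Any>) -> Result<Box<dyn Any>, String> {
        // Recover the concrete input type, or reject the call.
        let typed = input
            .downcast::<I>()
            .map_err(|_| "input type mismatch".to_string())?;
        let output = self.kernel.execute(*typed)?;
        Ok(Box::new(output))
    }
}

/// Hypothetical input for an undirected-graph density kernel.
struct DensityInput {
    num_nodes: usize,
    num_edges: usize,
}

/// Hypothetical kernel: density = 2E / (N * (N - 1)).
struct DensityKernel;

impl BatchKernel<DensityInput, f64> for DensityKernel {
    fn execute(&self, input: DensityInput) -> Result<f64, String> {
        if input.num_nodes < 2 {
            return Err("need at least two nodes".to_string());
        }
        let n = input.num_nodes as f64;
        Ok(2.0 * input.num_edges as f64 / (n * (n - 1.0)))
    }
}

fn main() -> Result<(), String> {
    // A dispatch table can hold kernels with unrelated input/output types.
    let mut kernels: HashMap<&'static str, Box<dyn BatchKernelDyn>> = HashMap::new();
    kernels.insert(
        "demo/density",
        Box::new(erase::<_, DensityInput, f64>(DensityKernel)),
    );

    let input = DensityInput { num_nodes: 4, num_edges: 4 };
    let out = kernels["demo/density"].execute_dyn(Box::new(input))?;
    let density = out
        .downcast::<f64>()
        .map_err(|_| "output type mismatch".to_string())?;
    println!("density = {:.4}", density);
    Ok(())
}
```

The adapter lets a single map hold kernels with different signatures, at the cost of moving type errors from compile time to a runtime `Err`.
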
diff --git a/docs/src/README.md b/docs/src/README.md
index a7acda7..1ce3ad0 100644
--- a/docs/src/README.md
+++ b/docs/src/README.md
@@ -1,18 +1,18 @@
 # RustKernels
 
-**GPU-accelerated kernel library for financial services and analytics**
+**GPU-accelerated kernel library for financial services, compliance, and enterprise analytics**
 
-**Version 0.2.0** - Enterprise-ready with security, observability, and resilience
+**Version 0.4.0** | RingKernel 0.4.2 | 106 kernels | 14 domains | 19 crates
 
 ---
 
 ## Overview
 
-RustKernels provides **106 GPU-accelerated algorithms** across **14 domain-specific crates**, designed for financial services, compliance, and enterprise analytics. Ported from the DotCompute C# implementation to Rust, using the RingKernel 0.3.1 framework.
+RustKernels provides **106 GPU-accelerated algorithms** across **14 domain-specific crates**, engineered for financial services, regulatory compliance, and enterprise analytics workloads. Built on [RingKernel 0.4.2](https://crates.io/crates/ringkernel-core), it supports two execution modes: CPU-orchestrated batch execution and GPU-persistent ring execution, the latter with sub-microsecond message latency.
 
 <div class="warning">
 
-This is a specialized compute library for financial and enterprise workloads, not a general-purpose GPU compute framework.
+RustKernels is a specialized compute library for financial and enterprise workloads. It is not a general-purpose GPU compute framework.
 
 </div>
 
@@ -21,41 +21,44 @@ This is a specialized compute library for financial and enterprise workloads, no
 | Feature | Description |
 |---------|-------------|
 | **14 Domain Categories** | Graph analytics, ML, compliance, risk, treasury, and more |
-| **106 Kernels** | Comprehensive coverage of financial algorithms |
-| **Dual Execution Modes** | Batch (CPU-orchestrated) and Ring (GPU-persistent) |
-| **Enterprise Security** | Auth, RBAC, multi-tenancy, secrets management |
-| **Production Observability** | Metrics, tracing, logging, alerting |
-| **Resilience Patterns** | Circuit breakers, retry, timeouts, health checks |
-| **Service Deployment** | REST (Axum), gRPC (Tonic), Actix actors |
-| **K2K Messaging** | Cross-kernel coordination patterns |
-| **Fixed-Point Arithmetic** | Exact financial calculations |
-
-## Execution Modes
-
-Kernels operate in one of two modes:
+| **106 Kernels** | Comprehensive coverage of financial and analytical algorithms |
+| **Dual Execution Modes** | Batch (CPU-orchestrated) and Ring (GPU-persistent actor) |
+| **Type-Erased Execution** | `TypeErasedBatchKernel` enables REST/gRPC dispatch without compile-time types |
+| **Factory Registration** | `register_batch_typed()` with automatic type inference |
+| **Enterprise Security** | JWT/API key auth, RBAC, multi-tenancy, secrets management |
+| **Production Observability** | Prometheus metrics, OTLP tracing, structured logging, SLO alerting |
+| **Resilience Patterns** | Circuit breakers, retry with backoff, deadline propagation, health probes |
+| **Service Deployment** | REST (Axum), gRPC (Tonic), Tower middleware, Actix actors |
+| **K2K Messaging** | Cross-kernel coordination: iterative, scatter-gather, fan-out, pipeline |
+| **Fixed-Point Arithmetic** | GPU-compatible exact financial calculations |
+| **RingKernel 0.4.2** | Deep integration with GPU-native persistent actor runtime |
+
+## Execution Model
+
+Kernels operate in one of two modes, selected based on latency and throughput requirements:
 
 | Mode | Latency | Overhead | State Location | Best For |
 |------|---------|----------|----------------|----------|
-| **Batch** | 10-50μs | Higher | CPU memory | Heavy periodic computation |
-| **Ring** | 100-500ns | Minimal | GPU memory | High-frequency streaming |
+| **Batch** | 10–50 μs | Higher (CPU round-trip) | CPU memory | Heavy periodic computation |
+| **Ring** | 100–500 ns | Minimal (lock-free) | GPU memory | High-frequency streaming |
 
-Most kernels support both modes. Choose based on your latency requirements.
+Batch kernels implementing `BatchKernel<I, O>` can be executed directly via typed calls or through the type-erased `BatchKernelDyn` interface used by REST and gRPC endpoints. Ring kernels require the RingKernel persistent actor runtime.
 
 ## Domains at a Glance
 
 | Domain | Crate | Kernels | Description |
 |--------|-------|---------|-------------|
 | Graph Analytics | `rustkernel-graph` | 28 | PageRank, community detection, GNN inference, graph attention |
-| Statistical ML | `rustkernel-ml` | 17 | Clustering, NLP embeddings, federated learning, healthcare analytics |
-| Compliance | `rustkernel-compliance` | 11 | AML patterns, KYC, sanctions screening |
-| Temporal Analysis | `rustkernel-temporal` | 7 | Forecasting, anomaly detection, decomposition |
-| Risk Analytics | `rustkernel-risk` | 5 | Credit scoring, VaR, stress testing, correlation |
+| Statistical ML | `rustkernel-ml` | 17 | Clustering, NLP embeddings, federated learning, healthcare |
+| Compliance | `rustkernel-compliance` | 11 | AML patterns, KYC scoring, sanctions screening |
+| Temporal Analysis | `rustkernel-temporal` | 7 | ARIMA, Prophet decomposition, change-point detection |
+| Risk Analytics | `rustkernel-risk` | 5 | Credit scoring, Monte Carlo VaR, stress testing, correlation |
 | Banking | `rustkernel-banking` | 1 | Fraud pattern matching |
 | Behavioral Analytics | `rustkernel-behavioral` | 6 | Profiling, forensics, event correlation |
 | Order Matching | `rustkernel-orderbook` | 1 | Order book matching engine |
 | Process Intelligence | `rustkernel-procint` | 7 | DFG, conformance, digital twin simulation |
 | Clearing | `rustkernel-clearing` | 5 | Netting, settlement, DVP matching |
-| Treasury | `rustkernel-treasury` | 5 | Cash flow, FX hedging, liquidity |
+| Treasury | `rustkernel-treasury` | 5 | Cash flow, FX hedging, liquidity optimization |
 | Accounting | `rustkernel-accounting` | 9 | Network generation, reconciliation |
 | Payments | `rustkernel-payments` | 2 | Payment processing, flow analysis |
 | Audit | `rustkernel-audit` | 2 | Feature extraction, hypergraph construction |
@@ -66,80 +69,87 @@ Add to your `Cargo.toml`:
 
 ```toml
 [dependencies]
-rustkernels = "0.2.0"
+rustkernels = "0.4.0"
 ```
 
 Basic usage:
 
 ```rust
 use rustkernels::prelude::*;
-use rustkernels::graph::centrality::PageRank;
-
-// Create a kernel instance
-let kernel = PageRank::new();
-
-// Access kernel metadata
-let metadata = kernel.metadata();
-println!("Kernel: {}", metadata.id);
-println!("Domain: {:?}", metadata.domain);
-
-// Execute (batch mode)
-let result = kernel.execute(input).await?;
+use rustkernels::graph::centrality::{BetweennessCentrality, BetweennessCentralityInput};
+
+#[tokio::main]
+async fn main() -> Result<(), Box<dyn std::error::Error>> {
+    let kernel = BetweennessCentrality::new();
+    println!("Kernel: {}", kernel.metadata().id);
+
+    let input = BetweennessCentralityInput {
+        num_nodes: 4,
+        edges: vec![(0, 1), (1, 2), (2, 3), (0, 3)],
+        normalized: true,
+    };
+
+    let result = kernel.execute(input).await?;
+    for (node, score) in result.scores.iter().enumerate() {
+        println!("  Node {}: {:.4}", node, score);
+    }
+    Ok(())
+}
 ```
 
 ### Feature Flags
 
-Control which domains are compiled:
-
 ```toml
-# Only what you need
-rustkernels = { version = "0.2.0", features = ["graph", "risk"] }
+# Default domains (graph, ml, compliance, temporal, risk)
+rustkernels = "0.4.0"
+
+# Selective compilation
+rustkernels = { version = "0.4.0", features = ["graph", "accounting"] }
 
-# Everything
-rustkernels = { version = "0.2.0", features = ["full"] }
+# All 14 domains
+rustkernels = { version = "0.4.0", features = ["full"] }
 
 # Service deployment
-rustkernel-ecosystem = { version = "0.2.0", features = ["axum", "grpc"] }
+rustkernel-ecosystem = { version = "0.4.0", features = ["axum", "grpc"] }
 ```
 
-Default features: `graph`, `ml`, `compliance`, `temporal`, `risk`.
+## Enterprise Features
 
-## Enterprise Features (0.2.0)
-
-Version 0.2.0 introduces production-ready enterprise capabilities:
+Version 0.4.0 provides production-ready enterprise capabilities with deep RingKernel 0.4.2 integration:
 
 | Module | Features |
 |--------|----------|
 | **Security** | JWT/API key auth, RBAC, multi-tenancy, secrets management |
-| **Observability** | Prometheus metrics, OTLP tracing, structured logging, alerting |
-| **Resilience** | Circuit breakers, retry with backoff, deadline propagation, health checks |
-| **Runtime** | Lifecycle management, graceful shutdown, configuration presets |
-| **Ecosystem** | Axum REST API, Tower middleware, Tonic gRPC, Actix actors |
+| **Observability** | Prometheus metrics, OTLP tracing, structured logging, SLO alerting |
+| **Resilience** | Circuit breakers, retry with backoff, deadline propagation, health probes |
+| **Runtime** | Lifecycle state machine, graceful shutdown, configuration presets |
+| **Memory** | Size-stratified pooling, pressure handling, multi-phase reductions |
+| **Ecosystem** | Axum REST with real kernel execution, Tower middleware, Tonic gRPC, Actix actors |
 
 See [Enterprise Features](enterprise/security.md) for detailed documentation.
 
 ## Requirements
 
 - **Rust 1.85** or later
-- **RustCompute** (RingKernel framework)
-- **CUDA toolkit** (optional, falls back to CPU execution)
+- **RingKernel 0.4.2** (from [crates.io](https://crates.io/crates/ringkernel-core))
+- **CUDA toolkit** (optional; falls back to CPU execution)
 
 ## Project Structure
 
 ```
 crates/
-├── rustkernels/          # Facade crate, re-exports all domains
-├── rustkernel-core/      # Core traits, registry, enterprise modules
-│   ├── security/         # Auth, RBAC, multi-tenancy
-│   ├── observability/    # Metrics, tracing, logging
-│   ├── resilience/       # Circuit breaker, retry, health
-│   ├── runtime/          # Lifecycle, configuration
-│   ├── memory/           # Pooling, reductions
-│   └── config/           # Production configuration
-├── rustkernel-ecosystem/ # Service integrations (Axum, gRPC, Actix)
-├── rustkernel-derive/    # Procedural macros
-├── rustkernel-cli/       # Command-line interface
-└── rustkernel-{domain}/  # 14 domain-specific crates
+├── rustkernels/             # Facade crate — re-exports all domains
+├── rustkernel-core/         # Core traits, registry, enterprise modules
+│   ├── security/            # Auth, RBAC, multi-tenancy
+│   ├── observability/       # Metrics, tracing, logging
+│   ├── resilience/          # Circuit breaker, retry, health
+│   ├── runtime/             # Lifecycle, configuration
+│   ├── memory/              # Pooling, reductions
+│   └── config/              # Production configuration
+├── rustkernel-ecosystem/    # Service integrations (Axum, gRPC, Actix)
+├── rustkernel-derive/       # Procedural macros
+├── rustkernel-cli/          # Command-line interface
+└── rustkernel-{domain}/     # 14 domain-specific crates
 ```
 
 ## Building
@@ -148,12 +158,15 @@ crates/
 # Build entire workspace
 cargo build --workspace
 
-# Run all tests
+# Run all tests (895 tests)
 cargo test --workspace
 
-# Test single domain
+# Test a single domain
 cargo test --package rustkernel-graph
 
+# Lint with warnings as errors
+cargo clippy --all-targets --all-features -- -D warnings
+
 # Generate API documentation
 cargo doc --workspace --no-deps --open
 ```
@@ -166,6 +179,6 @@ Licensed under [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0). See LI
 
 <div style="text-align: center; margin-top: 2rem;">
 
-[Getting Started](getting-started/installation.md) | [Enterprise Features](enterprise/security.md) | [Kernel Catalogue](domains/README.md) | [Articles](articles/README.md)
+[Getting Started](getting-started/installation.md) | [Architecture](architecture/overview.md) | [Enterprise Features](enterprise/security.md) | [Kernel Catalogue](domains/README.md)
 
 </div>
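Note on the quickstart hunk above: the example runs `BetweennessCentrality` on a four-node cycle but does not say what scores to expect. A minimal, std-only reference for cross-checking is sketched below. It is not the RustKernels implementation: the function name `betweenness_reference` is hypothetical, and the sketch assumes the graph is undirected and unweighted and that `normalized` divides by (n-1)(n-2)/2. The crate's own conventions may differ.

```rust
use std::collections::VecDeque;

/// Hypothetical reference implementation of Brandes' algorithm.
/// Assumes an undirected, unweighted graph with node IDs in 0..num_nodes.
fn betweenness_reference(num_nodes: usize, edges: &[(usize, usize)], normalized: bool) -> Vec<f64> {
    // Build an adjacency list, inserting each edge in both directions.
    let mut adj = vec![Vec::new(); num_nodes];
    for &(u, v) in edges {
        adj[u].push(v);
        adj[v].push(u);
    }

    let mut scores = vec![0.0f64; num_nodes];
    for s in 0..num_nodes {
        // Forward phase: BFS from s, counting shortest paths (sigma).
        let mut order = Vec::new();
        let mut preds: Vec<Vec<usize>> = vec![Vec::new(); num_nodes];
        let mut sigma = vec![0.0f64; num_nodes];
        let mut dist = vec![-1i64; num_nodes];
        sigma[s] = 1.0;
        dist[s] = 0;
        let mut queue = VecDeque::from([s]);
        while let Some(v) = queue.pop_front() {
            order.push(v);
            for &w in &adj[v] {
                if dist[w] < 0 {
                    dist[w] = dist[v] + 1;
                    queue.push_back(w);
                }
                if dist[w] == dist[v] + 1 {
                    sigma[w] += sigma[v];
                    preds[w].push(v);
                }
            }
        }

        // Backward phase: accumulate dependencies in reverse BFS order.
        let mut delta = vec![0.0f64; num_nodes];
        while let Some(w) = order.pop() {
            for &v in &preds[w] {
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w]);
            }
            if w != s {
                scores[w] += delta[w];
            }
        }
    }

    // Each unordered pair was counted from both endpoints.
    for score in scores.iter_mut() {
        *score /= 2.0;
    }
    // Assumed normalization: divide by the number of node pairs excluding the node itself.
    if normalized && num_nodes > 2 {
        let scale = ((num_nodes - 1) * (num_nodes - 2)) as f64 / 2.0;
        for score in scores.iter_mut() {
            *score /= scale;
        }
    }
    scores
}

fn main() {
    // Same graph as the quickstart: a cycle over four nodes.
    let scores = betweenness_reference(4, &[(0, 1), (1, 2), (2, 3), (0, 3)], true);
    for (node, score) in scores.iter().enumerate() {
        println!("  Node {}: {:.4}", node, score);
    }
}
```

Under these assumptions all four nodes of the cycle receive the same score, 1/6, because each node carries one of the two shortest paths between its two neighbours. If the kernel reports different values, the likely cause is a different normalization or a directed interpretation of the edge list.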
diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md
index 3786644..9ec34ba 100644
--- a/docs/src/SUMMARY.md
+++ b/docs/src/SUMMARY.md
@@ -18,7 +18,7 @@
 
 ---
 
-# Enterprise Features (0.2.0)
+# Enterprise Features
 
 - [Security](enterprise/security.md)
 - [Observability](enterprise/observability.md)
diff --git a/docs/src/appendix/changelog.md b/docs/src/appendix/changelog.md
index 2bad788..37d9d23 100644
--- a/docs/src/appendix/changelog.md
+++ b/docs/src/appendix/changelog.md
@@ -9,6 +9,67 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ---
 
+## [0.4.0] - 2026-02-07
+
+### Added
+
+#### Production-Ready Kernel Execution
+- **TypeErasedBatchKernel**: Bridges typed `BatchKernel<I, O>` to `BatchKernelDyn` via JSON serialization, enabling REST/gRPC dispatch without compile-time type knowledge
+- **TypeErasedRingKernel**: Equivalent wrapper for ring kernels
+- **Factory Registration**: `register_batch_typed(factory)` with automatic type inference from `BatchKernel<I, O>` implementations
+- **Metadata Registration**: `register_batch_metadata_from(factory)` and `register_ring_metadata_from(factory)` for discovery-only registration
+- **Registry Execution**: `execute_batch(kernel_id, input_json)` convenience method for type-erased execution
+
+#### Real Ecosystem Execution (Replacing All Stubs)
+- **Axum**: `execute_kernel()` now performs real batch kernel dispatch with configurable timeout; ring kernels return HTTP 422 with guidance
+- **Tower**: `KernelService::execute()` performs real batch kernel dispatch
+- **Tonic gRPC**: `execute_kernel()` performs real execution with gRPC deadline support; exceeded deadlines return `DEADLINE_EXCEEDED`
+- **Actix**: Actor handler performs real execution via `tokio::task::block_in_place` bridge
+- **Health endpoint**: Aggregated component health with registry status and error rate
+- **Metrics endpoint**: Per-domain kernel counts, batch/ring breakdown, error rate
+
+#### Deep RingKernel 0.4.2 Integration
+- Bidirectional domain conversion (`Domain::to_ring_domain()`, `Domain::from_ring_domain()`)
+- New re-exports: `ControlBlock`, `Backend`, `KernelStatus`, `RuntimeMetrics`, `K2KConfig`, `Priority`
+- Submodule re-exports: `checkpoint`, `dispatcher`, `health`, `pubsub`
+- Ring message type ID ranges aligned with `ringkernel_core::domain::Domain` base offsets
+
+### Changed
+- Upgraded to RingKernel 0.4.2 from 0.3.1
+- All 14 domain crates now use factory-based registration (`register_batch_typed`, `register_batch_metadata_from`, `register_ring_metadata_from`)
+- Ecosystem integrations execute real kernels instead of returning mock responses
+- Updated prelude with `BatchKernelDyn`, `RingKernelDyn`, `TypeErasedBatchKernel`, `TypeErasedRingKernel`
+
+---
+
+## [0.3.0] - 2026-01-28
+
+### Added
+
+#### New Kernels (24 kernels added, bringing total to 106)
+- **Graph**: GNNInference, GraphAttention, TopologicalSort, CycleDetection, ShortestPath, BipartiteMatching, GraphColoring
+- **ML**: EmbeddingGeneration, SemanticSimilarity, SecureAggregation, DrugInteractionPrediction, ClinicalPathwayConformance, StreamingIsolationForest, AdaptiveThreshold, SHAPValues, FeatureImportance
+- **Compliance**: FlowReversalPattern, FlowSplitRatio
+- **Process Intelligence**: DigitalTwin, NextActivityPrediction, EventLogImputation
+- **Risk**: CorrelationStress
+- **Accounting**: DuplicateDetection, CurrencyConversion
+
+#### Enterprise Enhancements
+- Ring message definitions for all 14 domains with `#[derive(RingMessage)]`
+- K2K coordination patterns: `IterativeState`, `ScatterGatherState`, `FanOutTracker`, `PipelineTracker`
+- Domain-specific Ring message type ID ranges (100–799)
+
+### Changed
+- Upgraded to RingKernel 0.3.1
+- Graph analytics expanded from 21 to 28 kernels
+- ML expanded from 8 to 17 kernels
+- Risk analytics expanded from 4 to 5 kernels
+- Process intelligence expanded from 4 to 7 kernels
+- Accounting expanded from 7 to 9 kernels
+- Compliance expanded from 9 to 11 kernels
+
+---
+
 ## [0.2.0] - 2026-01-19
 
 ### Added
@@ -34,7 +95,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **Recovery Policies**: Configurable recovery strategies for kernel failures
 
 #### Runtime Lifecycle (`rustkernel-core/src/runtime/`)
-- **Lifecycle State Machine**: Starting → Running → Draining → Stopped
+- **Lifecycle State Machine**: Starting -> Running -> Draining -> Stopped
 - **Runtime Presets**: Development, production, and high-performance configurations
 - **Graceful Shutdown**: Drain period with active connection tracking
 - **Configuration Validation**: Runtime parameter validation with hot reload support
@@ -42,14 +103,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 #### Memory Management (`rustkernel-core/src/memory/`)
 - **Size-Stratified Pooling**: `KernelMemoryManager` with bucket-based allocation
 - **Pressure Handling**: Configurable thresholds with `PressureLevel` enum
-- **Multi-Phase Reductions**: `InterPhaseReduction<T>` for iterative algorithms (PageRank, K-Means)
+- **Multi-Phase Reductions**: `InterPhaseReduction<T>` for iterative algorithms
 - **Analytics Contexts**: Workload-specific buffer management via `AnalyticsContextManager`
 - **Sync Modes**: Cooperative, SoftwareBarrier, and MultiLaunch synchronization
 
 #### Production Configuration (`rustkernel-core/src/config/`)
 - **Unified Config**: `ProductionConfig` combining all enterprise settings
 - **Builder Pattern**: `ProductionConfigBuilder` with fluent API
-- **Environment Loading**: `from_env()` with RUSTKERNEL_* variable overrides
+- **Environment Loading**: `from_env()` with `RUSTKERNEL_*` variable overrides
 - **File Loading**: TOML configuration file support via `from_file()`
 
 #### Ecosystem Integrations (`rustkernel-ecosystem/`)
@@ -57,7 +118,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **Axum REST API**: `KernelRouter` with endpoints for kernels, execute, health, metrics
 - **Tower Middleware**: `TimeoutLayer`, `RateLimiterLayer`, `KernelService`
 - **gRPC Server**: `KernelGrpcServer` via Tonic
-- **Actix Actors**: `KernelActor` with message handlers for GPU-persistent actors
+- **Actix Actors**: `KernelActor` with message handlers
 
 #### Enhanced Core Traits
 - `GpuKernel`: Added `health_check()`, `shutdown()`, `refresh_config()` methods
@@ -68,9 +129,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **New Trait**: `IterativeKernel` for multi-pass algorithms
 
 #### CLI Enhancements
-- `rustkernel runtime status|show|init` - Runtime lifecycle management
-- `rustkernel health [--format json]` - Component health checks
-- `rustkernel config show|validate|generate|env` - Configuration management
+- `rustkernel runtime status|show|init` — Runtime lifecycle management
+- `rustkernel health [--format json]` — Component health checks
+- `rustkernel config show|validate|generate|env` — Configuration management
 
 ### Changed
 - Upgraded to RingKernel 0.3.1 from 0.2.0
@@ -78,10 +139,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Updated Tokio to 1.48
 - Enhanced prelude with all enterprise module exports
 
-### Documentation
-- Updated CLAUDE.md with enterprise features
-- Added code examples for all new modules
-
 ---
 
 ## [0.1.1] - 2026-01-15
@@ -104,73 +161,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `rustkernel-derive` with `#[gpu_kernel]` and `#[derive(KernelMessage)]` macros
 - `rustkernel-cli` command-line interface
 
-#### Graph Analytics (21 kernels)
-- PageRank, DegreeCentrality, BetweennessCentrality
-- ClosenessCentrality, EigenvectorCentrality, KatzCentrality
-- ModularityScore, LouvainCommunity, LabelPropagation
-- JaccardSimilarity, CosineSimilarity, AdamicAdarIndex, CommonNeighbors
-- GraphDensity, AveragePathLength, ClusteringCoefficient
-- ConnectedComponents, FullGraphMetrics
-- TriangleCounting, MotifDetection, KCliqueDetection
-
-#### Statistical ML (8 kernels)
-- KMeans, DBSCAN, HierarchicalClustering
-- IsolationForest, LocalOutlierFactor, EnsembleVoting
-- LinearRegression, RidgeRegression
-
-#### Compliance (9 kernels)
-- CircularFlowRatio, ReciprocityFlowRatio, RapidMovement
-- AMLPatternDetection, KYCScoring, EntityResolution
-- SanctionsScreening, PEPScreening, TransactionMonitoring
-
-#### Temporal Analysis (7 kernels)
-- ARIMAForecast, ProphetDecomposition, ChangePointDetection
-- TimeSeriesAnomalyDetection, SeasonalDecomposition
-- TrendExtraction, VolatilityAnalysis
-
-#### Risk Analytics (4 kernels)
-- CreditRiskScoring, MonteCarloVaR
-- PortfolioRiskAggregation, StressTesting
-
-#### Banking (1 kernel)
-- FraudPatternMatch
-
-#### Behavioral Analytics (6 kernels)
-- BehavioralProfiling, AnomalyProfiling, FraudSignatureDetection
-- CausalGraphConstruction, ForensicQueryExecution, EventCorrelationKernel
-
-#### Order Matching (1 kernel)
-- OrderMatchingEngine
-
-#### Process Intelligence (4 kernels)
-- DFGConstruction, PartialOrderAnalysis
-- ConformanceChecking, OCPMPatternMatching
-
-#### Clearing (5 kernels)
-- ClearingValidation, DVPMatching, NettingCalculation
-- SettlementExecution, ZeroBalanceFrequency
-
-#### Treasury (5 kernels)
-- CashFlowForecasting, CollateralOptimization
-- FXHedging, InterestRateRisk, LiquidityOptimization
-
-#### Accounting (7 kernels)
-- ChartOfAccountsMapping, JournalTransformation
-- GLReconciliation, NetworkAnalysis, TemporalCorrelation
-- NetworkGeneration with enhanced features:
-  - Account classification (11 classes)
-  - VAT/tax detection (EU, GST/HST rates)
-  - Transaction pattern recognition (14 patterns)
-  - Confidence boosting
-- NetworkGenerationRing (streaming mode)
-
-#### Payments (2 kernels)
-- PaymentProcessing, FlowAnalysis
-
-#### Audit (2 kernels)
-- FeatureExtraction, HypergraphConstruction
-
-### Infrastructure Features
+#### 82 Kernels across 14 Domains
+- **Graph Analytics** (21): PageRank, DegreeCentrality, BetweennessCentrality, ClosenessCentrality, EigenvectorCentrality, KatzCentrality, ModularityScore, LouvainCommunity, LabelPropagation, JaccardSimilarity, CosineSimilarity, AdamicAdarIndex, CommonNeighbors, GraphDensity, AveragePathLength, ClusteringCoefficient, ConnectedComponents, FullGraphMetrics, TriangleCounting, MotifDetection, KCliqueDetection
+- **Statistical ML** (8): KMeans, DBSCAN, HierarchicalClustering, IsolationForest, LocalOutlierFactor, EnsembleVoting, LinearRegression, RidgeRegression
+- **Compliance** (9): CircularFlowRatio, ReciprocityFlowRatio, RapidMovement, AMLPatternDetection, KYCScoring, EntityResolution, SanctionsScreening, PEPScreening, TransactionMonitoring
+- **Temporal Analysis** (7): ARIMAForecast, ProphetDecomposition, ChangePointDetection, TimeSeriesAnomalyDetection, SeasonalDecomposition, TrendExtraction, VolatilityAnalysis
+- **Risk Analytics** (4): CreditRiskScoring, MonteCarloVaR, PortfolioRiskAggregation, StressTesting
+- **Banking** (1): FraudPatternMatch
+- **Behavioral Analytics** (6): BehavioralProfiling, AnomalyProfiling, FraudSignatureDetection, CausalGraphConstruction, ForensicQueryExecution, EventCorrelationKernel
+- **Order Matching** (1): OrderMatchingEngine
+- **Process Intelligence** (4): DFGConstruction, PartialOrderAnalysis, ConformanceChecking, OCPMPatternMatching
+- **Clearing** (5): ClearingValidation, DVPMatching, NettingCalculation, SettlementExecution, ZeroBalanceFrequency
+- **Treasury** (5): CashFlowForecasting, CollateralOptimization, FXHedging, InterestRateRisk, LiquidityOptimization
+- **Accounting** (7): ChartOfAccountsMapping, JournalTransformation, GLReconciliation, NetworkAnalysis, TemporalCorrelation, NetworkGeneration, NetworkGenerationRing
+- **Payments** (2): PaymentProcessing, FlowAnalysis
+- **Audit** (2): FeatureExtraction, HypergraphConstruction
+
+#### Infrastructure Features
 - Batch and Ring execution modes
 - K2K (kernel-to-kernel) messaging patterns
 - Fixed-point arithmetic for financial precision
@@ -183,14 +190,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 | Version | Date | Highlights |
 |---------|------|------------|
-| 0.2.0 | 2026-01-19 | Enterprise features: security, observability, resilience, ecosystem |
+| 0.4.0 | 2026-02-07 | Production execution, type erasure, RingKernel 0.4.2, real ecosystem dispatch |
+| 0.3.0 | 2026-01-28 | 24 new kernels (106 total), Ring messages, K2K coordination |
+| 0.2.0 | 2026-01-19 | Enterprise features: security, observability, resilience, ecosystem crate |
 | 0.1.1 | 2026-01-15 | Crate rename to rustkernels, documentation |
-| 0.1.0 | 2026-01-12 | Initial release, 106 kernels across 14 domains |
+| 0.1.0 | 2026-01-12 | Initial release, 82 kernels across 14 domains |
 
 ---
 
 ## Migration Guides
 
+### From 0.3.x to 0.4.0
+
+1. **Registration**: Replace `register_metadata(kernel.metadata().clone())` with factory-based methods:
+   - `register_batch_typed(MyKernel::new)` for kernels implementing `BatchKernel<I, O>`
+   - `register_batch_metadata_from(MyKernel::new)` for metadata-only batch kernels
+   - `register_ring_metadata_from(MyKernel::new)` for ring kernels
+2. **Execution**: Use `registry.execute_batch(id, json_bytes)` for type-erased execution
+3. **Ecosystem**: All service endpoints now execute real kernels — remove any mock/fallback code
+
 ### From DotCompute (C#)
 
 RustKernels is a Rust port of DotCompute. Key differences:
@@ -199,5 +217,3 @@ RustKernels is a Rust port of DotCompute. Key differences:
 2. **Ownership**: Rust ownership model affects API design
 3. **Error handling**: Uses `Result<T, E>` instead of exceptions
 4. **Ring messages**: Use rkyv serialization instead of protobuf
-
-See migration guide (coming soon) for detailed instructions.
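Note on the 0.4.0 changelog entries above: the type-erased dispatch path (typed kernel, byte-level wrapper, registry lookup by ID) can be illustrated with a std-only sketch. Everything here is a simplified stand-in and not the RustKernels API: the traits `TypedKernel`, `DynKernel`, `Decode`, and `Encode`, the `Erased` wrapper, and the `Registry` type are hypothetical; execution is synchronous; and a comma-separated text codec stands in for the JSON serialization the changelog describes.

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

/// Hypothetical stand-ins for a real serialization layer such as JSON.
trait Decode: Sized {
    fn decode(bytes: &[u8]) -> Result<Self, String>;
}
trait Encode {
    fn encode(&self) -> Vec<u8>;
}

/// Typed interface (a synchronous simplification of a typed batch kernel).
trait TypedKernel<I, O> {
    fn id(&self) -> &str;
    fn execute(&self, input: I) -> Result<O, String>;
}

/// Type-erased interface: bytes in, bytes out.
trait DynKernel {
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String>;
}

/// Wrapper that remembers I and O so callers do not have to name them.
struct Erased<K, I, O> {
    kernel: K,
    _types: PhantomData<fn(I) -> O>,
}

impl<K, I, O> DynKernel for Erased<K, I, O>
where
    K: TypedKernel<I, O>,
    I: Decode,
    O: Encode,
{
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String> {
        let typed = I::decode(input)?; // bytes -> typed input
        let output = self.kernel.execute(typed)?; // typed execution
        Ok(output.encode()) // typed output -> bytes
    }
}

#[derive(Default)]
struct Registry {
    kernels: HashMap<String, Box<dyn DynKernel>>,
}

impl Registry {
    /// Registers a kernel from a factory; I and O are inferred from K's impl.
    fn register_typed<K, I, O, F>(&mut self, factory: F)
    where
        F: FnOnce() -> K,
        K: TypedKernel<I, O> + 'static,
        I: Decode + 'static,
        O: Encode + 'static,
    {
        let kernel = factory();
        let id = kernel.id().to_string();
        let erased: Erased<K, I, O> = Erased { kernel, _types: PhantomData };
        self.kernels.insert(id, Box::new(erased));
    }

    /// Looks up a kernel by ID and runs it on raw bytes.
    fn execute(&self, id: &str, input: &[u8]) -> Result<Vec<u8>, String> {
        let kernel = self.kernels.get(id).ok_or_else(|| format!("unknown kernel: {id}"))?;
        kernel.execute_dyn(input)
    }
}

// Toy codec: comma-separated integers in, decimal text out.
impl Decode for Vec<i64> {
    fn decode(bytes: &[u8]) -> Result<Self, String> {
        let text = std::str::from_utf8(bytes).map_err(|e| e.to_string())?;
        text.split(',')
            .map(|s| s.trim().parse::<i64>().map_err(|e| e.to_string()))
            .collect()
    }
}
impl Encode for i64 {
    fn encode(&self) -> Vec<u8> {
        self.to_string().into_bytes()
    }
}

/// Toy kernel that sums its input.
struct SumKernel;
impl SumKernel {
    fn new() -> Self {
        SumKernel
    }
}
impl TypedKernel<Vec<i64>, i64> for SumKernel {
    fn id(&self) -> &str {
        "demo/sum"
    }
    fn execute(&self, input: Vec<i64>) -> Result<i64, String> {
        Ok(input.iter().sum())
    }
}

fn main() {
    let mut registry = Registry::default();
    registry.register_typed(SumKernel::new);
    let out = registry.execute("demo/sum", b"1, 2, 3").unwrap();
    println!("{}", String::from_utf8(out).unwrap()); // prints "6"
}
```

The registry stores only `Box<dyn DynKernel>`, so a service endpoint can dispatch on the kernel ID string alone. In this sketch, inferring `I` and `O` from the factory relies on the kernel having exactly one `TypedKernel` implementation; a kernel with several would need explicit type arguments.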
diff --git a/docs/src/architecture/overview.md b/docs/src/architecture/overview.md
index a702e99..9f882f2 100644
--- a/docs/src/architecture/overview.md
+++ b/docs/src/architecture/overview.md
@@ -1,12 +1,12 @@
 # Architecture Overview
 
-RustKernels is designed as a modular, high-performance GPU kernel library for financial and enterprise workloads. This document explains the system architecture and key design decisions.
+RustKernels is a modular, high-performance GPU kernel library for financial and enterprise workloads. This document describes the system architecture and key design decisions.
 
 ## System Design
 
 ```
 ┌─────────────────────────────────────────────────────────────────┐
-│                        rustkernel (facade)                       │
+│                       rustkernels (facade)                       │
 │                    Re-exports all domain crates                  │
 └─────────────────────────────────────────────────────────────────┘
                                   │
@@ -19,37 +19,43 @@ RustKernels is designed as a modular, high-performance GPU kernel library for fi
 │ - Traits        │   │ - #[gpu_kernel] │   │ - CLI tool      │
 │ - Registry      │   │ - #[derive(...)]│   │ - Management    │
 │ - K2K messaging │   │                 │   │                 │
-│ - Licensing     │   │                 │   │                 │
+│ - Enterprise    │   │                 │   │                 │
+│   modules       │   │                 │   │                 │
 └─────────────────┘   └─────────────────┘   └─────────────────┘
           │
-          ▼
-┌─────────────────────────────────────────────────────────────────┐
-│                     14 Domain Crates                             │
-│                                                                  │
-│  graph  │  ml  │ compliance │ temporal │ risk │ banking │ ...   │
-│                                                                  │
-│  Each domain implements domain-specific kernels using the core   │
-│  traits and infrastructure                                       │
-└─────────────────────────────────────────────────────────────────┘
+          ├──────────────────────────────────────┐
+          │                                      │
+          ▼                                      ▼
+┌───────────────────────────────────┐   ┌─────────────────┐
+│          14 Domain Crates         │   │   rustkernel-   │
+│                                   │   │   ecosystem     │
+│  graph │ ml │ compliance │ risk   │   │                 │
+│  temporal │ banking │ procint     │   │ - Axum REST     │
+│  behavioral │ orderbook │ ...     │   │ - Tower         │
+│                                   │   │ - Tonic gRPC    │
+│  Each implements domain-specific  │   │ - Actix actors  │
+│  kernels using core traits        │   │                 │
+└───────────────────────────────────┘   └─────────────────┘
           │
           ▼
 ┌─────────────────────────────────────────────────────────────────┐
-│                    RustCompute (RingKernel)                      │
-│                    GPU execution framework                       │
+│                    RingKernel 0.4.2 (crates.io)                  │
+│          GPU-native persistent actor runtime framework           │
 └─────────────────────────────────────────────────────────────────┘
 ```
 
 ## Workspace Structure
 
-The workspace contains 18 crates organized by concern:
+The workspace contains **19 crates** organized by concern:
 
 ### Infrastructure Crates
 
 | Crate | Purpose |
 |-------|---------|
-| `rustkernel` | Facade crate, re-exports all domains |
-| `rustkernel-core` | Core traits, registry, licensing, K2K coordination |
+| `rustkernels` | Facade crate — re-exports all domains |
+| `rustkernel-core` | Core traits, registry, licensing, K2K coordination, enterprise modules |
 | `rustkernel-derive` | Procedural macros for kernel definition |
+| `rustkernel-ecosystem` | Service integrations (Axum, Tower, Tonic, Actix) |
 | `rustkernel-cli` | Command-line interface for kernel management |
 
 ### Domain Crates
@@ -58,18 +64,18 @@ The workspace contains 18 crates organized by concern:
 
 ```
 crates/
-├── rustkernel-graph/        # Graph analytics (21 kernels)
-├── rustkernel-ml/           # Statistical ML (8 kernels)
-├── rustkernel-compliance/   # AML/KYC (9 kernels)
+├── rustkernel-graph/        # Graph analytics (28 kernels)
+├── rustkernel-ml/           # Statistical ML (17 kernels)
+├── rustkernel-compliance/   # AML/KYC (11 kernels)
 ├── rustkernel-temporal/     # Time series (7 kernels)
-├── rustkernel-risk/         # Risk analytics (4 kernels)
+├── rustkernel-risk/         # Risk analytics (5 kernels)
 ├── rustkernel-banking/      # Banking (1 kernel)
 ├── rustkernel-behavioral/   # Behavioral (6 kernels)
 ├── rustkernel-orderbook/    # Order matching (1 kernel)
-├── rustkernel-procint/      # Process intelligence (4 kernels)
+├── rustkernel-procint/      # Process intelligence (7 kernels)
 ├── rustkernel-clearing/     # Clearing/settlement (5 kernels)
 ├── rustkernel-treasury/     # Treasury (5 kernels)
-├── rustkernel-accounting/   # Accounting (7 kernels)
+├── rustkernel-accounting/   # Accounting (9 kernels)
 ├── rustkernel-payments/     # Payments (2 kernels)
 └── rustkernel-audit/        # Audit (2 kernels)
 ```
@@ -84,11 +90,20 @@ The base trait for all kernels:
 
 ```rust
 pub trait GpuKernel: Send + Sync + Debug {
-    /// Returns kernel metadata (ID, domain, mode, etc.)
+    /// Returns kernel metadata (ID, domain, mode, performance targets)
     fn metadata(&self) -> &KernelMetadata;
 
     /// Validates kernel configuration
     fn validate(&self) -> Result<()>;
+
+    /// Health check (enterprise)
+    fn health_check(&self) -> HealthStatus { HealthStatus::Healthy }
+
+    /// Graceful shutdown
+    async fn shutdown(&self) -> Result<()> { Ok(()) }
+
+    /// Hot-reload configuration
+    fn refresh_config(&mut self, config: &KernelConfig) -> Result<()> { Ok(()) }
 }
 ```
 
@@ -98,11 +113,33 @@ For CPU-orchestrated batch execution:
 
 ```rust
 pub trait BatchKernel<I, O>: GpuKernel {
-    /// Execute the kernel with given input
+    /// Execute the kernel with typed input
     async fn execute(&self, input: I) -> Result<O>;
+
+    /// Execute with auth, tenant, and tracing context
+    async fn execute_with_context(&self, ctx: &ExecutionContext, input: I) -> Result<O>;
+
+    /// Validate input before execution
+    fn validate_input(&self, input: &I) -> Result<()> { Ok(()) }
+}
+```
+
+### BatchKernelDyn and TypeErasedBatchKernel
+
+For type-erased execution via REST/gRPC:
+
+```rust
+/// Dynamic dispatch trait — JSON bytes in, JSON bytes out
+pub trait BatchKernelDyn: GpuKernel {
+    async fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>>;
 }
+
+/// Bridges typed BatchKernel<I,O> to BatchKernelDyn via JSON serialization
+pub struct TypeErasedBatchKernel<K, I, O> { /* ... */ }
 ```
 
+Kernels registered via `register_batch_typed()` are automatically wrapped in `TypeErasedBatchKernel`, enabling execution through the ecosystem service layer without compile-time knowledge of input and output types.
+
 ### RingKernelHandler
 
 For GPU-persistent actor execution:
@@ -115,6 +152,9 @@ where
 {
     /// Handle a message and produce a response
     async fn handle(&self, ctx: &mut RingContext, msg: M) -> Result<R>;
+
+    /// Handle with security context
+    async fn handle_secure(&self, ctx: &mut SecureRingContext, msg: M) -> Result<R>;
 }
 ```
 
@@ -130,48 +170,65 @@ pub trait IterativeKernel<S, I, O>: GpuKernel {
     /// Perform one iteration
     async fn iterate(&self, state: &mut S, input: &I) -> Result<IterationResult<O>>;
 
-    /// Check if algorithm has converged
+    /// Check convergence
     fn converged(&self, state: &S, threshold: f64) -> bool;
 }
 ```
 
-## Kernel Metadata
+### Additional Traits
 
-Every kernel has associated metadata:
+| Trait | Purpose |
+|-------|---------|
+| `CheckpointableKernel` | Save/restore kernel state for recovery |
+| `DegradableKernel` | Graceful degradation under pressure |
 
-```rust
-pub struct KernelMetadata {
-    /// Unique identifier (e.g., "graph/pagerank")
-    pub id: String,
+## Kernel Registration
 
-    /// Execution mode
-    pub mode: KernelMode,
+The `KernelRegistry` provides three registration methods:
 
-    /// Business domain
-    pub domain: Domain,
+| Method | Use Case |
+|--------|----------|
+| `register_batch_typed(factory)` | Kernels with `BatchKernel<I, O>` — full execution support via REST/gRPC |
+| `register_batch_metadata_from(factory)` | Batch kernels that implement only `GpuKernel` (metadata and discovery, no REST/gRPC execution) |
+| `register_ring_metadata_from(factory)` | Ring kernels — metadata only (require Ring runtime for execution) |
 
-    /// Human-readable description
-    pub description: String,
+Example:
 
-    /// Expected throughput (ops/sec)
-    pub expected_throughput: u64,
+```rust
+pub fn register_all(registry: &KernelRegistry) -> Result<()> {
+    // Full execution support — callable via REST/gRPC
+    registry.register_batch_typed(BetweennessCentrality::new)?;
 
-    /// Target latency in microseconds
-    pub target_latency_us: f64,
+    // Metadata-only — discoverable but not directly executable via REST
+    registry.register_batch_metadata_from(GraphDensity::new)?;
 
-    /// Whether GPU-native execution is required
-    pub requires_gpu_native: bool,
+    // Ring kernel — requires RingKernel runtime
+    registry.register_ring_metadata_from(PageRankRing::new)?;
 
-    /// Kernel version
-    pub version: u32,
+    Ok(())
 }
 ```
 
-## K2K (Kernel-to-Kernel) Messaging
+## Kernel Metadata
 
-RustKernels supports cross-kernel coordination through K2K messaging patterns:
+Every kernel carries associated metadata:
 
-### Available Patterns
+```rust
+pub struct KernelMetadata {
+    pub id: String,                 // Unique identifier, e.g., "graph/pagerank"
+    pub mode: KernelMode,           // Batch or Ring
+    pub domain: Domain,             // Business domain
+    pub description: String,        // Human-readable description
+    pub expected_throughput: u64,   // Expected throughput (operations per second)
+    pub target_latency_us: f64,     // Target latency in microseconds
+    pub requires_gpu_native: bool,  // Whether GPU-native execution is required
+    pub version: u32,               // Kernel version
+}
+```
+
+## K2K (Kernel-to-Kernel) Messaging
+
+K2K messaging provides cross-kernel coordination patterns for multi-stage computations:
 
 | Pattern | Use Case |
 |---------|----------|
@@ -188,10 +245,7 @@ use rustkernel_core::k2k::IterativeState;
 let mut state = IterativeState::new(max_iterations);
 
 while !state.converged() {
-    // Execute iteration across kernels
     let results = execute_iteration(&mut state).await?;
-
-    // Update convergence tracking
     state.update(results.delta);
 }
 ```
@@ -206,7 +260,7 @@ rustkernel-{domain}/
 └── src/
     ├── lib.rs           # Module exports, register_all()
     ├── messages.rs      # Batch kernel input/output types
-    ├── ring_messages.rs # Ring message types
+    ├── ring_messages.rs # Ring message types with #[derive(RingMessage)]
     ├── types.rs         # Common domain types
     └── {feature}.rs     # Kernel implementations
 ```
@@ -220,28 +274,68 @@ rustkernel-graph/
     ├── messages.rs
     ├── ring_messages.rs
     ├── types.rs
-    ├── centrality.rs    # PageRank, Betweenness, etc.
+    ├── centrality.rs    # PageRank, Betweenness, Closeness, etc.
     ├── community.rs     # Louvain, Label Propagation
     ├── similarity.rs    # Jaccard, Cosine, Adamic-Adar
     ├── metrics.rs       # Density, Clustering Coefficient
-    └── motif.rs         # Triangle counting, k-cliques
+    ├── motif.rs         # Triangle counting, k-cliques
+    ├── topology.rs      # Connected components, cycles, paths
+    └── gnn.rs           # GNN inference, graph attention
 ```
 
 ## Ring Message Type IDs
 
-Each domain has a reserved range for Ring message type IDs to avoid collisions:
+Each domain has a reserved range for Ring message type IDs, aligned with `ringkernel_core::domain::Domain` base offsets (0.4.2):
+
+| Domain | Range | RingKernel Domain |
+|--------|-------|-------------------|
+| Graph Analytics | 100–199 | `GraphAnalytics` |
+| Statistical ML | 200–299 | `StatisticalML` |
+| Compliance | 300–399 | `Compliance` |
+| Risk Analytics | 400–499 | `RiskManagement` |
+| Temporal Analysis | 500–599 | `TimeSeries` |
+| Order Matching | 600–699 | `OrderMatching` |
+| Clearing | 700–799 | `Clearing` |
+
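 The table can be read as a lookup from a Ring message type ID to the domain that owns it. The following is a minimal sketch of that lookup, assuming each range is exactly 100 IDs wide as listed above. `DomainRange` and its methods are hypothetical names used only for illustration; they are not part of `rustkernel_core` or `ringkernel_core`.
 
 ```rust
 /// Hypothetical mirror of the documented ID ranges (illustration only).
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 enum DomainRange {
     GraphAnalytics,
     StatisticalMl,
     Compliance,
     RiskAnalytics,
     TemporalAnalysis,
     OrderMatching,
     Clearing,
 }
 
 impl DomainRange {
     /// Every domain listed in the table, in table order.
     const ALL: [DomainRange; 7] = [
         Self::GraphAnalytics,
         Self::StatisticalMl,
         Self::Compliance,
         Self::RiskAnalytics,
         Self::TemporalAnalysis,
         Self::OrderMatching,
         Self::Clearing,
     ];
 
     /// First ID of the reserved range, copied from the table.
     fn base(self) -> u32 {
         match self {
             Self::GraphAnalytics => 100,
             Self::StatisticalMl => 200,
             Self::Compliance => 300,
             Self::RiskAnalytics => 400,
             Self::TemporalAnalysis => 500,
             Self::OrderMatching => 600,
             Self::Clearing => 700,
         }
     }
 
     /// True if `type_id` falls inside this domain's 100-ID range.
     fn contains(self, type_id: u32) -> bool {
         (self.base()..self.base() + 100).contains(&type_id)
     }
 
     /// Finds the domain that owns `type_id`, or `None` if it is unreserved.
     fn from_type_id(type_id: u32) -> Option<DomainRange> {
         Self::ALL.iter().copied().find(|d| d.contains(type_id))
     }
 }
 ```
 
 Under these assumptions, `DomainRange::from_type_id(150)` resolves to `GraphAnalytics`, while IDs below 100 or above 799 resolve to `None`. A check like this could be run when defining new Ring message types to catch an ID placed in another domain's range.
 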
+## RingKernel 0.4.2 Integration
+
+RustKernels 0.4.0 integrates with RingKernel 0.4.2 through domain conversion, re-exported types, and re-exported submodules:
+
+### Domain Conversion
 
-| Domain | Range |
-|--------|-------|
-| Graph | 200-299 |
-| Compliance | 300-399 |
-| Temporal | 400-499 |
-| Risk | 600-699 |
-| ML | 700-799 |
+Bidirectional conversion between RustKernels and RingKernel domain types:
+
+```rust
+use rustkernel_core::domain::Domain;
+
+let domain = Domain::TemporalAnalysis;
+let ring_domain = domain.to_ring_domain();  // → ringkernel_core::domain::Domain::TimeSeries
+let back = Domain::from_ring_domain(ring_domain);  // → Domain::TemporalAnalysis
+```
+
+### Re-exports from RingKernel
+
+| Type | Description |
+|------|-------------|
+| `ControlBlock` | GPU control block for persistent kernel state |
+| `Backend` | Runtime backend selection (CUDA, CPU, WebGPU) |
+| `KernelStatus` | Detailed kernel status information |
+| `RuntimeMetrics` | Runtime performance metrics |
+| `K2KConfig` | Kernel-to-kernel messaging configuration |
+| `Priority` | Message priority levels |
+
+### Submodule Re-exports
+
+| Module | Description |
+|--------|-------------|
+| `rustkernel_core::checkpoint` | Kernel checkpointing and recovery |
+| `rustkernel_core::dispatcher` | Message dispatching |
+| `rustkernel_core::health` | Health checking (circuit breaker, degradation) |
+| `rustkernel_core::pubsub` | Pub/sub messaging patterns |
 
 ## Licensing System
 
-RustKernels includes an enterprise licensing system:
+Enterprise licensing is implemented in `rustkernel-core/src/license.rs`:
 
 - **DevelopmentLicense**: All features enabled (default for local development)
 - **ProductionLicense**: Domain-based feature gating
@@ -256,29 +350,23 @@ assert!(validator.is_domain_licensed(Domain::GraphAnalytics));
 
 ## Fixed-Point Arithmetic
 
-For GPU-compatible and exact financial calculations, Ring messages use fixed-point arithmetic:
+For GPU-compatible exact financial calculations, Ring messages use fixed-point arithmetic:
 
 ```rust
+// 8 decimal places (standard kernels)
+fn to_fixed_point(value: f64) -> i64 { (value * 100_000_000.0) as i64 }
+fn from_fixed_point(fp: i64) -> f64 { fp as f64 / 100_000_000.0 }
+
 // 18 decimal places (accounting kernels)
 const SCALE: i128 = 1_000_000_000_000_000_000;
 
 pub struct FixedPoint128 {
     pub value: i128,
 }
-
-impl FixedPoint128 {
-    pub fn from_f64(v: f64) -> Self {
-        Self { value: (v * SCALE as f64) as i128 }
-    }
-
-    pub fn to_f64(&self) -> f64 {
-        self.value as f64 / SCALE as f64
-    }
-}
 ```
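 
 The reason for working in scaled integers is that sums stay exact once values are converted. The sketch below illustrates this for the 8-decimal scale. It is an assumption-laden variant, not the crate's implementation: it rounds to the nearest unit (an `as i64` cast truncates toward zero, so a product that lands just below an integer would lose one unit), and `SCALE_8` and `checked_add_fixed` are hypothetical names.
 
 ```rust
 /// 8 decimal places, matching the standard-kernel scale shown above.
 const SCALE_8: i64 = 100_000_000;
 
 /// Converts to fixed point, rounding to the nearest unit
 /// (assumption: rounding is preferred over truncation here).
 fn to_fixed_point(value: f64) -> i64 {
     (value * SCALE_8 as f64).round() as i64
 }
 
 /// Converts back to floating point for display or reporting.
 fn from_fixed_point(fp: i64) -> f64 {
     fp as f64 / SCALE_8 as f64
 }
 
 /// Adds two fixed-point values exactly; returns `None` on overflow.
 fn checked_add_fixed(a: i64, b: i64) -> Option<i64> {
     a.checked_add(b)
 }
 ```
 
 With this sketch, `to_fixed_point(0.1) + to_fixed_point(0.2)` equals `to_fixed_point(0.3)`, whereas the same sum in `f64` does not compare equal to `0.3`. Conversions to and from `f64` remain lossy, so they are best confined to the boundaries of a computation.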
 
 ## Next Steps
 
-- [Execution Modes](execution-modes.md) - Deep dive into Batch vs Ring
-- [Kernel Catalogue](../domains/README.md) - Browse available kernels
-- [Quick Start](../getting-started/quick-start.md) - Run your first kernel
+- [Execution Modes](execution-modes.md) — Deep dive into Batch vs Ring
+- [Kernel Catalogue](../domains/README.md) — Browse available kernels
+- [Quick Start](../getting-started/quick-start.md) — Run your first kernel
diff --git a/docs/src/domains/README.md b/docs/src/domains/README.md
index 28223be..c4262ce 100644
--- a/docs/src/domains/README.md
+++ b/docs/src/domains/README.md
@@ -10,7 +10,7 @@ RustKernels provides **106 GPU-accelerated kernels** across **14 domain-specific
 | [Statistical ML](statistical-ml.md) | `rustkernel-ml` | 17 | Clustering, NLP, federated learning, healthcare |
 | [Compliance](compliance.md) | `rustkernel-compliance` | 11 | AML, KYC, sanctions screening |
 | [Temporal Analysis](temporal-analysis.md) | `rustkernel-temporal` | 7 | Forecasting, seasonality, anomalies |
-| [Risk Analytics](risk-analytics.md) | `rustkernel-risk` | 5 | Credit, market, portfolio risk |
+| [Risk Analytics](risk-analytics.md) | `rustkernel-risk` | 5 | Credit, market, portfolio risk, correlation |
 | [Banking](banking.md) | `rustkernel-banking` | 1 | Fraud pattern matching |
 | [Behavioral Analytics](behavioral-analytics.md) | `rustkernel-behavioral` | 6 | Profiling, forensics, correlation |
 | [Order Matching](order-matching.md) | `rustkernel-orderbook` | 1 | Order book matching engine |
@@ -21,40 +21,36 @@ RustKernels provides **106 GPU-accelerated kernels** across **14 domain-specific
 | [Payments](payments.md) | `rustkernel-payments` | 2 | Payment processing, flow analysis |
 | [Audit](audit.md) | `rustkernel-audit` | 2 | Feature extraction, hypergraph |
 
-## Kernels by Execution Mode
+## Execution Support
 
-### Batch-Only Kernels (35)
+Kernels fall into three registration categories based on their trait implementations:
 
-Heavy computation kernels that only support batch mode:
+### Fully Executable (via REST/gRPC)
 
-- Graph: BetweennessCentrality, FullGraphMetrics, GNNInference, GraphAttention
-- ML: DBSCAN, HierarchicalClustering, IsolationForest, SecureAggregation, DrugInteractionPrediction
-- Compliance: EntityResolution, TransactionMonitoring
-- Process: NextActivityPrediction, EventLogImputation, DigitalTwin
-- And more...
+Kernels implementing `BatchKernel<I, O>` are registered with `register_batch_typed()` and can be executed through the type-erased `BatchKernelDyn` interface used by REST and gRPC endpoints.
 
-### Ring-Only Kernels (0)
+Examples: BetweennessCentrality, KMeans, DBSCAN, KYCScoring, ARIMAForecast, StressTesting
 
-Currently all Ring-capable kernels also support Batch mode.
+### Metadata-Only (Batch)
 
-### Dual-Mode Kernels (71)
+Kernels implementing `GpuKernel` only are registered with `register_batch_metadata_from()`. They are discoverable through metadata endpoints but require direct Rust API calls for execution.
 
-Kernels supporting both Batch and Ring execution:
+Examples: GraphDensity, LouvainCommunity, IsolationForest, AMLPatternDetection
 
-- All centrality measures (PageRank, Degree, Closeness, etc.)
-- All clustering algorithms (KMeans, Louvain, etc.)
-- All risk calculations (VaR, Credit Scoring, etc.)
-- Streaming ML (StreamingIsolationForest, AdaptiveThreshold)
-- And more...
+### Ring Kernels
+
+Ring kernels are registered with `register_ring_metadata_from()`. They require the RingKernel persistent actor runtime for execution and communicate via lock-free ring buffers.
+
+Examples: PageRankRing, DegreeCentralityRing, OrderMatchingRing, NetworkGenerationRing
 
 ## Using the Catalogue
 
 Each domain page includes:
 
-1. **Domain Overview** - Purpose and key use cases
-2. **Kernel List** - All kernels with brief descriptions
-3. **Kernel Details** - For each kernel:
-   - Kernel ID and modes
+1. **Domain Overview** — Purpose and key use cases
+2. **Kernel List** — All kernels with brief descriptions
+3. **Kernel Details** — For each kernel:
+   - Kernel ID and execution mode
    - Input/output types
    - Usage examples
    - Performance characteristics
@@ -64,14 +60,14 @@ Each domain page includes:
 Enable specific domains via Cargo features:
 
 ```toml
-# Default domains
-rustkernel = "0.1.0"  # graph, ml, compliance, temporal, risk
+# Default domains (graph, ml, compliance, temporal, risk)
+rustkernels = "0.4.0"
 
 # Selective
-rustkernel = { version = "0.1.0", features = ["accounting", "treasury"] }
+rustkernels = { version = "0.4.0", features = ["accounting", "treasury"] }
 
 # All domains
-rustkernel = { version = "0.1.0", features = ["full"] }
+rustkernels = { version = "0.4.0", features = ["full"] }
 ```
 
 ## Kernel ID Convention
diff --git a/docs/src/enterprise/ecosystem.md b/docs/src/enterprise/ecosystem.md
index bcd8d75..66f85da 100644
--- a/docs/src/enterprise/ecosystem.md
+++ b/docs/src/enterprise/ecosystem.md
@@ -1,21 +1,23 @@
 # Service Deployment
 
-RustKernels 0.2.0 includes the `rustkernel-ecosystem` crate for deploying kernels as production services.
+RustKernels 0.4.0 includes the `rustkernel-ecosystem` crate for deploying kernels as production services. All service integrations perform **real kernel execution** — requests are routed through the `KernelRegistry`, dispatched to type-erased `BatchKernelDyn` implementations, and return actual computation results.
 
 ## Overview
 
-| Integration | Description |
-|-------------|-------------|
-| **Axum** | REST API endpoints |
-| **Tower** | Middleware services |
-| **Tonic** | gRPC server |
-| **Actix** | Actor-based integration |
+| Integration | Description | Execution |
+|-------------|-------------|-----------|
+| **Axum** | REST API endpoints | Real batch kernel dispatch with timeout |
+| **Tower** | Middleware services | Real batch kernel dispatch via Tower `Service` |
+| **Tonic** | gRPC server | Real batch kernel dispatch with deadline support |
+| **Actix** | Actor-based integration | Real batch kernel dispatch via actor messages |
+
+Ring kernels are discoverable through metadata endpoints but require the RingKernel persistent actor runtime for execution. REST/gRPC endpoints return an informative error (HTTP 422 / gRPC `UNIMPLEMENTED`) with guidance to use the Ring protocol.
 
 ## Installation
 
 ```toml
 [dependencies]
-rustkernel-ecosystem = { version = "0.2.0", features = ["axum", "grpc"] }
+rustkernel-ecosystem = { version = "0.4.0", features = ["axum", "grpc"] }
 ```
 
 ### Feature Flags
@@ -28,6 +30,19 @@ rustkernel-ecosystem = { version = "0.2.0", features = ["axum", "grpc"] }
 | `actix` | Actix actor integration |
 | `full` | All integrations |
 
+## How Execution Works
+
+All four integrations follow the same execution path:
+
+1. **Registry lookup** — Find the kernel by ID in the `KernelRegistry`
+2. **Mode check** — Verify the kernel is a Batch kernel (Ring kernels return an error)
+3. **Factory create** — Instantiate the kernel via its registered factory closure
+4. **JSON serialize** — Serialize the request input to JSON bytes
+5. **Type-erased dispatch** — Call `execute_dyn(&input_bytes)` on the `BatchKernelDyn` trait object
+6. **Deserialize response** — Convert the output bytes back to a JSON response
+
+The `TypeErasedBatchKernel<K, I, O>` wrapper bridges the typed `BatchKernel<I, O>` interface to the type-erased `BatchKernelDyn` trait using serde JSON serialization.
+
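 The bridging step can be sketched with the standard library alone. This is a simplified illustration, not the crate's code: the traits below are synchronous stand-ins that reuse the names `BatchKernel` and `BatchKernelDyn`, the hypothetical `FromBytes` and `ToBytes` traits replace serde JSON, and `TypeErased`, `SumKernel`, and `erase` are invented for the example.
 
 ```rust
 use std::marker::PhantomData;
 
 /// Typed interface (synchronous stand-in for `BatchKernel<I, O>`).
 trait BatchKernel<I, O> {
     fn execute(&self, input: I) -> Result<O, String>;
 }
 
 /// Type-erased interface: bytes in, bytes out.
 trait BatchKernelDyn {
     fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String>;
 }
 
 /// Hypothetical decoding and encoding traits standing in for serde.
 trait FromBytes: Sized {
     fn from_bytes(bytes: &[u8]) -> Result<Self, String>;
 }
 trait ToBytes {
     fn to_bytes(&self) -> Vec<u8>;
 }
 
 /// Wrapper that remembers the input/output types of the wrapped kernel.
 struct TypeErased<K, I, O> {
     kernel: K,
     _types: PhantomData<fn(I) -> O>,
 }
 
 impl<K, I, O> BatchKernelDyn for TypeErased<K, I, O>
 where
     K: BatchKernel<I, O>,
     I: FromBytes,
     O: ToBytes,
 {
     fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String> {
         let typed = I::from_bytes(input)?; // decode request bytes
         let output = self.kernel.execute(typed)?; // typed execution
         Ok(output.to_bytes()) // encode response bytes
     }
 }
 
 /// Toy input: comma-separated integers. Toy output: their sum.
 struct Numbers(Vec<i64>);
 struct Total(i64);
 
 impl FromBytes for Numbers {
     fn from_bytes(bytes: &[u8]) -> Result<Self, String> {
         let text = std::str::from_utf8(bytes).map_err(|e| e.to_string())?;
         let values = text
             .split(',')
             .map(|s| s.trim().parse::<i64>().map_err(|e| e.to_string()))
             .collect::<Result<Vec<i64>, String>>()?;
         Ok(Numbers(values))
     }
 }
 
 impl ToBytes for Total {
     fn to_bytes(&self) -> Vec<u8> {
         self.0.to_string().into_bytes()
     }
 }
 
 struct SumKernel;
 
 impl BatchKernel<Numbers, Total> for SumKernel {
     fn execute(&self, input: Numbers) -> Result<Total, String> {
         Ok(Total(input.0.iter().sum()))
     }
 }
 
 /// Boxes a typed kernel behind the type-erased trait object.
 fn erase<K, I, O>(kernel: K) -> Box<dyn BatchKernelDyn>
 where
     K: BatchKernel<I, O> + 'static,
     I: FromBytes + 'static,
     O: ToBytes + 'static,
 {
     Box::new(TypeErased { kernel, _types: PhantomData })
 }
 ```
 
 A caller holding only `Box<dyn BatchKernelDyn>` can then invoke `execute_dyn` without knowing the concrete input or output types, which is what lets a single registry serve kernels with different signatures. Decoding failures surface as errors before the kernel runs.
 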
 ## Axum REST API
 
 ### Quick Start
@@ -39,14 +54,14 @@ use std::sync::Arc;
 
 #[tokio::main]
 async fn main() {
-    // Create registry with kernels
+    // Create and populate registry
     let registry = Arc::new(KernelRegistry::new());
+    rustkernels::register_all(&registry).unwrap();
 
-    // Build router
+    // Build router — all endpoints perform real kernel execution
     let router = KernelRouter::new(registry, RouterConfig::default());
     let app = router.into_router();
 
-    // Serve
     let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
     axum::serve(listener, app).await.unwrap();
 }
@@ -56,11 +71,11 @@ async fn main() {
 
 | Method | Path | Description |
 |--------|------|-------------|
-| GET | `/kernels` | List available kernels |
-| GET | `/kernels/{id}` | Get kernel info |
-| POST | `/execute` | Execute a kernel |
-| GET | `/health` | Health check |
-| GET | `/metrics` | Prometheus metrics |
+| GET | `/kernels` | List available kernels with metadata |
+| GET | `/kernels/{id}` | Get kernel info and capabilities |
+| POST | `/execute` | Execute a batch kernel |
+| GET | `/health` | Aggregated health check with component status |
+| GET | `/metrics` | Prometheus-compatible metrics with per-domain breakdown |
 
 ### Execute Request
 
@@ -68,11 +83,11 @@ async fn main() {
 curl -X POST http://localhost:8080/execute \
   -H "Content-Type: application/json" \
   -d '{
-    "kernel_id": "graph/pagerank",
+    "kernel_id": "graph/betweenness-centrality",
     "input": {
-      "num_nodes": 1000,
-      "edges": [[0,1], [1,2], [2,0]],
-      "damping_factor": 0.85
+      "num_nodes": 4,
+      "edges": [[0, 1], [1, 2], [2, 3], [0, 3]],
+      "normalized": true
     }
   }'
 ```
@@ -81,16 +96,28 @@ curl -X POST http://localhost:8080/execute \
 
 ```json
 {
-  "request_id": "req-123",
-  "kernel_id": "graph/pagerank",
+  "request_id": "req-abc123",
+  "kernel_id": "graph/betweenness-centrality",
   "output": {
-    "scores": [0.33, 0.33, 0.33],
-    "iterations": 10
+    "scores": [0.3333, 0.6667, 0.6667, 0.3333]
   },
   "metadata": {
-    "duration_us": 1500,
-    "backend": "CUDA",
-    "trace_id": "abc123"
+    "duration_us": 850,
+    "backend": "CPU"
+  }
+}
+```
+
+### Health Check
+
+The health endpoint aggregates component status:
+
+```json
+{
+  "status": "healthy",
+  "components": {
+    "registry": { "status": "healthy", "kernel_count": 106 },
+    "execution": { "status": "healthy", "error_rate": 0.0 }
   }
 }
 ```
@@ -103,7 +130,7 @@ let config = RouterConfig {
     enable_metrics: true,
     enable_health: true,
     cors_enabled: true,
-    max_request_size: 10 * 1024 * 1024, // 10MB
+    max_request_size: 10 * 1024 * 1024, // 10 MB
 };
 ```
 
@@ -117,7 +144,7 @@ use tower::ServiceExt;
 
 let service = KernelService::new(registry);
 
-// Use as Tower service
+// Execute via Tower Service trait — dispatches to real kernels
 let response = service
     .ready()
     .await?
@@ -131,7 +158,6 @@ let response = service
 use rustkernel_ecosystem::tower::TimeoutLayer;
 
 let layer = TimeoutLayer::new(Duration::from_secs(30));
-
 let service = ServiceBuilder::new()
     .layer(layer)
     .service(kernel_service);
@@ -176,14 +202,15 @@ Server::builder()
     .await?;
 ```
 
+gRPC execution includes deadline support — if the client sets a gRPC deadline, the server applies it as a timeout around kernel execution. Exceeded deadlines return `DEADLINE_EXCEEDED`.
+
 ### Client Usage
 
 ```rust
-// Generated from proto
 let mut client = KernelClient::connect("http://[::1]:50051").await?;
 
 let request = tonic::Request::new(ExecuteRequest {
-    kernel_id: "graph/pagerank".to_string(),
+    kernel_id: "graph/betweenness-centrality".to_string(),
     input: serde_json::to_string(&input)?,
 });
 
@@ -195,7 +222,6 @@ let response = client.execute(request).await?;
 ```rust
 use rustkernel_ecosystem::grpc::HealthService;
 
-// gRPC health checking protocol
 Server::builder()
     .add_service(HealthService::new())
     .add_service(kernel_server.into_service())
@@ -221,10 +247,14 @@ let config = KernelActorConfig {
 let actor = KernelActor::new(registry, config);
 let addr = actor.start();
 
-// Send execution request
+// Execute — dispatches to real batch kernel
 let result = addr.send(ExecuteKernel {
-    kernel_id: "graph/pagerank".to_string(),
-    input: serde_json::json!({ ... }),
+    kernel_id: "graph/betweenness-centrality".to_string(),
+    input: serde_json::json!({
+        "num_nodes": 4,
+        "edges": [[0, 1], [1, 2], [2, 3]],
+        "normalized": true
+    }),
     metadata: Default::default(),
 }).await??;
 ```
@@ -236,7 +266,6 @@ use rustkernel_ecosystem::actix::KernelActorSupervisor;
 
 let mut supervisor = KernelActorSupervisor::new(registry);
 
-// Spawn worker pool
 for i in 0..num_workers {
     supervisor.spawn(KernelActorConfig {
         name: format!("worker-{}", i),
@@ -244,7 +273,6 @@ for i in 0..num_workers {
     });
 }
 
-// Get addresses for load balancing
 let workers = supervisor.actors();
 ```
 
@@ -252,7 +280,7 @@ let workers = supervisor.actors();
 
 | Message | Description |
 |---------|-------------|
-| `ExecuteKernel` | Execute a kernel computation |
+| `ExecuteKernel` | Execute a batch kernel computation |
 | `GetKernelInfo` | Get kernel metadata |
 | `ListKernels` | List available kernels |
 | `GetStats` | Get actor statistics |
@@ -306,7 +334,7 @@ spec:
     spec:
       containers:
       - name: rustkernels
-        image: rustkernels:0.2.0
+        image: rustkernels:0.4.0
         ports:
         - containerPort: 8080
         - containerPort: 50051
@@ -328,6 +356,6 @@ spec:
 
 ## Next Steps
 
-- [Security](security.md) - Secure your deployment
-- [Observability](observability.md) - Monitor service health
-- [Runtime](runtime.md) - Configure for production
+- [Security](security.md) — Secure your deployment
+- [Observability](observability.md) — Monitor service health
+- [Runtime](runtime.md) — Configure for production
diff --git a/docs/src/getting-started/installation.md b/docs/src/getting-started/installation.md
index e1da352..0b8f82c 100644
--- a/docs/src/getting-started/installation.md
+++ b/docs/src/getting-started/installation.md
@@ -19,29 +19,16 @@ rustup update stable
 rustc --version  # Should be 1.85.0 or higher
 ```
 
-### RustCompute Framework
+### RingKernel Framework
 
-RustKernels depends on the RustCompute (RingKernel) framework for GPU execution:
-
-```bash
-# Clone RustCompute alongside RustKernels
-cd /path/to/your/projects
-git clone https://github.com/mivertowski/RustCompute.git
-
-# Directory structure should be:
-# projects/
-# ├── RustCompute/
-# │   └── RustCompute/
-# └── RustKernels/
-#     └── RustKernels/
-```
+RustKernels depends on [RingKernel 0.4.2](https://crates.io/crates/ringkernel-core) for GPU execution. RingKernel is published on crates.io and is resolved automatically by Cargo — no manual installation is required.
 
 ### CUDA Toolkit (Optional)
 
 For GPU acceleration, install the CUDA toolkit:
 
 - **Linux**: Install via your package manager or from [NVIDIA's website](https://developer.nvidia.com/cuda-downloads)
-- **Windows**: Download installer from NVIDIA
+- **Windows**: Download the installer from NVIDIA
 - **macOS**: Not supported for CUDA (CPU fallback only)
 
 ```bash
@@ -60,18 +47,18 @@ Add to your `Cargo.toml`:
 
 ```toml
 [dependencies]
-rustkernel = "0.1.0"
+rustkernels = "0.4.0"
 ```
 
 This includes the default feature set: `graph`, `ml`, `compliance`, `temporal`, `risk`.
 
 ### Selective Installation
 
-Only include the domains you need to reduce compile time and binary size:
+Include only the domains you need to reduce compile time and binary size:
 
 ```toml
 [dependencies]
-rustkernel = { version = "0.1.0", default-features = false, features = ["graph", "accounting"] }
+rustkernels = { version = "0.4.0", default-features = false, features = ["graph", "accounting"] }
 ```
 
 ### Full Installation
@@ -80,22 +67,31 @@ Include all 14 domains:
 
 ```toml
 [dependencies]
-rustkernel = { version = "0.1.0", features = ["full"] }
+rustkernels = { version = "0.4.0", features = ["full"] }
+```
+
+### Service Deployment
+
+For deploying kernels as REST or gRPC services:
+
+```toml
+[dependencies]
+rustkernel-ecosystem = { version = "0.4.0", features = ["axum", "grpc"] }
 ```
 
 ## Available Features
 
 | Feature | Domain | Description |
 |---------|--------|-------------|
-| `graph` | Graph Analytics | Centrality, community detection, similarity |
-| `ml` | Statistical ML | Clustering, anomaly detection, regression |
+| `graph` | Graph Analytics | Centrality, community detection, GNN inference |
+| `ml` | Statistical ML | Clustering, anomaly detection, NLP embeddings |
 | `compliance` | Compliance | AML, KYC, sanctions screening |
-| `temporal` | Temporal Analysis | Forecasting, anomaly detection |
+| `temporal` | Temporal Analysis | Forecasting, anomaly detection, decomposition |
 | `risk` | Risk Analytics | Credit scoring, VaR, stress testing |
 | `banking` | Banking | Fraud pattern detection |
 | `behavioral` | Behavioral | Profiling, forensics |
 | `orderbook` | Order Matching | Order book engine |
-| `procint` | Process Intelligence | DFG, conformance checking |
+| `procint` | Process Intelligence | DFG, conformance checking, digital twin |
 | `clearing` | Clearing | Netting, settlement |
 | `treasury` | Treasury | Cash flow, FX hedging |
 | `accounting` | Accounting | Network generation, reconciliation |
@@ -118,8 +114,11 @@ cargo build --workspace
 # Build in release mode
 cargo build --workspace --release
 
-# Run tests
+# Run all tests (895 tests)
 cargo test --workspace
+
+# Lint
+cargo clippy --all-targets --all-features -- -D warnings
 ```
 
 ## Verifying Installation
@@ -128,23 +127,11 @@ Create a simple test file:
 
 ```rust
 // src/main.rs
-use rustkernel::prelude::*;
+use rustkernels::prelude::*;
 
 fn main() {
-    println!("RustKernels installed successfully!");
-
-    // List available domains
-    let domains = [
-        "Graph Analytics",
-        "Statistical ML",
-        "Compliance",
-        "Temporal Analysis",
-        "Risk Analytics",
-    ];
-
-    for domain in domains {
-        println!("  - {}", domain);
-    }
+    println!("RustKernels v0.4.0 installed successfully!");
+    println!("RingKernel 0.4.2 — GPU-native persistent actor runtime");
 }
 ```
 
@@ -156,35 +143,39 @@ cargo run
 
 ## Troubleshooting
 
-### RustCompute Not Found
-
-If you see path errors related to RustCompute:
-
-1. Ensure RustCompute is cloned at the expected location
-2. Check that the directory structure matches what's expected in `Cargo.toml`
-3. Verify the RustCompute workspace builds independently
-
 ### CUDA Not Detected
 
-If GPU execution isn't working:
+If GPU execution is not working:
 
 1. Verify CUDA installation with `nvcc --version`
 2. Check GPU availability with `nvidia-smi`
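 3. Ensure the CUDA binaries are on your `PATH` and the CUDA libraries are on your library search path (for example, `LD_LIBRARY_PATH` on Linux)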
-4. RustKernels will fall back to CPU if CUDA isn't available
+4. RustKernels falls back to CPU automatically if CUDA is not available
 
 ### Compilation Errors
 
 For Rust version issues:
 
 ```bash
-# Ensure you're on the correct toolchain
+# Ensure you are on the correct toolchain
 rustup override set stable
 rustup update
 ```
 
+### Dependency Resolution
+
+RingKernel 0.4.2 is resolved from crates.io. If you encounter resolution issues:
+
+```bash
+# Update the Cargo registry index
+cargo update
+
+# Clear the build cache if needed
+cargo clean && cargo build --workspace
+```
+
 ## Next Steps
 
-- [Quick Start](quick-start.md) - Run your first kernel
-- [Execution Modes](../architecture/execution-modes.md) - Understand Batch vs Ring modes
-- [Kernel Catalogue](../domains/README.md) - Browse available kernels
+- [Quick Start](quick-start.md) — Run your first kernel
+- [Execution Modes](../architecture/execution-modes.md) — Understand Batch vs Ring modes
+- [Kernel Catalogue](../domains/README.md) — Browse available kernels
diff --git a/docs/src/getting-started/quick-start.md b/docs/src/getting-started/quick-start.md
index 1c01104..bd60637 100644
--- a/docs/src/getting-started/quick-start.md
+++ b/docs/src/getting-started/quick-start.md
@@ -1,10 +1,10 @@
 # Quick Start
 
-Get up and running with RustKernels in 5 minutes.
+Get up and running with RustKernels in minutes.
 
 ## Your First Kernel
 
-Let's run a PageRank calculation on a simple graph.
+Run a betweenness centrality calculation on a simple graph.
 
 ### Step 1: Create a New Project
 
@@ -24,8 +24,8 @@ version = "0.1.0"
 edition = "2024"
 
 [dependencies]
-rustkernel = { version = "0.1.0", features = ["graph"] }
-tokio = { version = "1.0", features = ["full"] }
+rustkernels = { version = "0.4.0", features = ["graph"] }
+tokio = { version = "1", features = ["full"] }
 ```
 
 ### Step 3: Write Your Code
@@ -33,47 +33,35 @@ tokio = { version = "1.0", features = ["full"] }
 Edit `src/main.rs`:
 
 ```rust
-use rustkernel::prelude::*;
-use rustkernel::graph::centrality::{PageRank, PageRankInput};
+use rustkernels::prelude::*;
+use rustkernels::graph::centrality::{BetweennessCentrality, BetweennessCentralityInput};
 
 #[tokio::main]
 async fn main() -> Result<(), Box<dyn std::error::Error>> {
-    // Create a PageRank kernel
-    let kernel = PageRank::new();
+    // Create the kernel
+    let kernel = BetweennessCentrality::new();
 
     // Print kernel metadata
     let metadata = kernel.metadata();
     println!("Kernel: {}", metadata.id);
     println!("Domain: {:?}", metadata.domain);
-    println!("Mode: {:?}", metadata.mode);
+    println!("Mode:   {:?}", metadata.mode);
 
     // Prepare input: a simple 4-node graph
-    // Node 0 -> Node 1, Node 2
-    // Node 1 -> Node 2
-    // Node 2 -> Node 0, Node 3
-    // Node 3 -> Node 0
-    let input = PageRankInput {
+    let input = BetweennessCentralityInput {
         num_nodes: 4,
-        edges: vec![
-            (0, 1), (0, 2),
-            (1, 2),
-            (2, 0), (2, 3),
-            (3, 0),
-        ],
-        damping_factor: 0.85,
-        max_iterations: 100,
-        tolerance: 1e-6,
+        edges: vec![(0, 1), (1, 2), (2, 3), (0, 3)],
+        normalized: true,
     };
 
     // Execute the kernel
     let result = kernel.execute(input).await?;
 
     // Print results
-    println!("\nPageRank Scores:");
+    println!("\nBetweenness Centrality Scores:");
     for (node, score) in result.scores.iter().enumerate() {
         println!("  Node {}: {:.4}", node, score);
     }
-    println!("\nConverged in {} iterations", result.iterations);
 
     Ok(())
 }
@@ -85,140 +73,135 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
 cargo run
 ```
 
-Expected output:
+## Using the Registry
 
-```
-Kernel: graph/pagerank
-Domain: GraphAnalytics
-Mode: Batch
+For production deployments, use the `KernelRegistry` to manage kernels centrally:
+
+```rust
+use rustkernels::prelude::*;
+use rustkernel_core::registry::KernelRegistry;
+use std::sync::Arc;
+
+#[tokio::main]
+async fn main() -> Result<(), Box<dyn std::error::Error>> {
+    // Create the registry and register all domains
+    let registry = Arc::new(KernelRegistry::new());
+    rustkernels::register_all(&registry)?;
 
-PageRank Scores:
-  Node 0: 0.3682
-  Node 1: 0.1418
-  Node 2: 0.2879
-  Node 3: 0.2021
+    // Execute via type-erased interface (same path REST/gRPC uses)
+    let input_json = serde_json::to_vec(&serde_json::json!({
+        "num_nodes": 4,
+        "edges": [[0, 1], [1, 2], [2, 3], [0, 3]],
+        "normalized": true
+    }))?;
 
-Converged in 23 iterations
+    let output_json = registry.execute_batch(
+        "graph/betweenness-centrality",
+        &input_json,
+    ).await?;
+
+    let result: serde_json::Value = serde_json::from_slice(&output_json)?;
+    println!("Result: {}", serde_json::to_string_pretty(&result)?);
+
+    Ok(())
+}
 ```
 
-## Using Multiple Kernels
+## Deploying as a REST Service
 
-Combine kernels from different domains:
+Expose kernels via HTTP using the Axum integration:
 
 ```rust
-use rustkernel::prelude::*;
-use rustkernel::graph::centrality::PageRank;
-use rustkernel::graph::community::LouvainCommunity;
-use rustkernel::graph::metrics::GraphDensity;
+use rustkernel_ecosystem::axum::{KernelRouter, RouterConfig};
+use rustkernel_core::registry::KernelRegistry;
+use std::sync::Arc;
 
 #[tokio::main]
-async fn main() -> Result<(), Box<dyn std::error::Error>> {
-    // Analyze the same graph with multiple kernels
-    let edges = vec![
-        (0, 1), (0, 2), (1, 2),
-        (2, 3), (3, 4), (4, 2),
-    ];
-
-    // Centrality analysis
-    let pagerank = PageRank::new();
-    let pr_result = pagerank.execute(PageRankInput {
-        num_nodes: 5,
-        edges: edges.clone(),
-        damping_factor: 0.85,
-        max_iterations: 100,
-        tolerance: 1e-6,
-    }).await?;
-
-    // Community detection
-    let louvain = LouvainCommunity::new();
-    let community_result = louvain.execute(LouvainInput {
-        num_nodes: 5,
-        edges: edges.clone(),
-        resolution: 1.0,
-    }).await?;
-
-    // Graph metrics
-    let density = GraphDensity::new();
-    let density_result = density.execute(DensityInput {
-        num_nodes: 5,
-        num_edges: edges.len(),
-    }).await?;
-
-    println!("Analysis complete:");
-    println!("  Communities found: {}", community_result.num_communities);
-    println!("  Graph density: {:.4}", density_result.density);
-    println!("  Most central node: {}", pr_result.top_node());
+async fn main() {
+    let registry = Arc::new(KernelRegistry::new());
+    rustkernels::register_all(&registry).unwrap();
 
-    Ok(())
+    let router = KernelRouter::new(registry, RouterConfig::default());
+    let app = router.into_router();
+
+    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
+    println!("Listening on http://0.0.0.0:8080");
+    axum::serve(listener, app).await.unwrap();
 }
 ```
 
+Then call it:
+
+```bash
+curl -X POST http://localhost:8080/execute \
+  -H "Content-Type: application/json" \
+  -d '{
+    "kernel_id": "graph/betweenness-centrality",
+    "input": {
+      "num_nodes": 4,
+      "edges": [[0, 1], [1, 2], [2, 3], [0, 3]],
+      "normalized": true
+    }
+  }'
+```
+
 ## Kernel Configuration
 
-Most kernels accept configuration options:
+Most kernels accept configuration through their input types:
 
 ```rust
-use rustkernel::ml::clustering::{KMeans, KMeansConfig};
+use rustkernels::ml::clustering::{KMeans, KMeansInput};
 
-let config = KMeansConfig {
-    num_clusters: 5,
+let kernel = KMeans::new();
+let input = KMeansInput {
+    data: vec![/* data points */],
+    k: 5,
     max_iterations: 300,
     tolerance: 1e-4,
-    initialization: KMeansInit::KMeansPlusPlus,
-    ..Default::default()
 };
 
-let kernel = KMeans::with_config(config);
+let result = kernel.execute(input).await?;
 ```
 
 ## Batch vs Ring Mode
 
 ### Batch Mode (Default)
 
-CPU-orchestrated execution. Best for periodic computations:
+CPU-orchestrated execution — best for periodic computations:
 
 ```rust
-// Batch kernels implement BatchKernel trait
-let kernel = PageRank::new();
+// Batch kernels implement BatchKernel<I, O>
+let kernel = BetweennessCentrality::new();
 let result = kernel.execute(input).await?;
 ```
 
 ### Ring Mode
 
-GPU-persistent actors for streaming workloads:
+GPU-persistent actors for streaming workloads. Ring kernels require the RingKernel runtime:
 
 ```rust
-// Ring kernels implement RingKernelHandler trait
-use rustkernel::graph::centrality::PageRankRing;
-
-// Ring kernels maintain persistent GPU state
-let ring = PageRankRing::new();
-
-// Send streaming updates
-ring.add_edge(0, 1).await?;
-ring.add_edge(1, 2).await?;
-
-// Query current state
-let scores = ring.query_scores().await?;
+// Ring kernels implement RingKernelHandler<M, R>
+// They maintain persistent state in GPU memory and communicate
+// via lock-free ring buffers with sub-microsecond latency.
+// See architecture/execution-modes.md for setup details.
 ```
 
-See [Execution Modes](../architecture/execution-modes.md) for detailed comparison.
+See [Execution Modes](../architecture/execution-modes.md) for a detailed comparison.
 
 ## Error Handling
 
 RustKernels uses standard Rust error handling:
 
 ```rust
-use rustkernel::prelude::*;
-use rustkernel::error::KernelError;
+use rustkernel_core::error::KernelError;
 
 match kernel.execute(input).await {
     Ok(result) => println!("Success: {:?}", result),
-    Err(KernelError::InvalidInput(msg)) => {
+    Err(KernelError::ValidationError(msg)) => {
         eprintln!("Invalid input: {}", msg);
     }
-    Err(KernelError::ExecutionFailed(msg)) => {
-        eprintln!("Execution failed: {}", msg);
+    Err(KernelError::Timeout(duration)) => {
+        eprintln!("Timed out after {:?}", duration);
     }
     Err(e) => eprintln!("Error: {}", e),
 }
@@ -226,6 +209,7 @@ match kernel.execute(input).await {
 
 ## Next Steps
 
-- [Architecture Overview](../architecture/overview.md) - Understand the system design
-- [Kernel Catalogue](../domains/README.md) - Explore all 82 kernels
-- [Accounting Network Generation](../articles/accounting-network-generation.md) - Deep-dive article
+- [Architecture Overview](../architecture/overview.md) — Understand the system design
+- [Kernel Catalogue](../domains/README.md) — Explore all 106 kernels across 14 domains
+- [Service Deployment](../enterprise/ecosystem.md) — Deploy as REST/gRPC services
+- [Accounting Network Generation](../articles/accounting-network-generation.md) — Deep-dive article