feat(persistence): MongoDB backend with CRUD, search, history, and transactions#44

Open
doug-helios wants to merge 17 commits into main from feat/mongo

doug-helios commented Mar 2, 2026

MongoDB Persistence Backend - Complete Implementation

Overview

This PR introduces MongoDB as a primary persistence backend for the Helios FHIR Server (HFS), delivering production-ready FHIR storage with native search for the core parameter types, versioning, history, conditional operations, and a composite mode that pairs MongoDB with Elasticsearch.

Summary

MongoDB support has been implemented through a structured 6-phase roadmap plus transaction-bundle parity work, progressing from initial scaffolding to full Inferno test suite validation. The implementation provides:

  • Complete CRUD operations with tenant isolation and soft-delete semantics
  • Full versioning and history (vread, optimistic locking, instance/type/system history)
  • Native search execution for core FHIR parameter types (string, token, date, number, reference, uri)
  • Conditional operations (conditional create, update, delete)
  • Composite mode (MongoDB primary + Elasticsearch secondary for advanced search)
  • Transaction-bundle support with topology-aware session handling
  • Inferno test suite validation on replica-set deployments

Implementation Phases

Phase 1: Backend Skeleton and Feature Wiring ✅

Goal: Establish compile-time and runtime hooks for MongoDB backend

Delivered:

  • Feature-gated MongoDB module in crates/persistence/src/backends/mongodb/
  • Backend trait implementation (kind, name, capabilities, health checks)
  • Configuration structs with serde support
  • Cargo feature integration across workspace

Files:

  • crates/persistence/src/backends/mongodb/mod.rs
  • crates/persistence/src/backends/mongodb/backend.rs
  • crates/persistence/src/backends/mongodb/config.rs
  • crates/persistence/src/backends/mongodb/schema.rs

Phase 2: Core Storage Contract Parity ✅

Goal: Achieve minimum ResourceStorage parity with SQLite/PostgreSQL

Delivered:

  • Complete ResourceStorage implementation (create, read, update, delete, exists, count, read_batch, create_or_update)
  • Strict tenant isolation with tenant-scoped indexes
  • Soft-delete semantics aligned with existing backends (410 Gone behavior)
  • Collection/index bootstrap with idempotent schema initialization
  • Integration test suite (crates/persistence/tests/mongodb_tests.rs)

Key Features:

  • Document model with explicit tenant_id, resource_type, resource_id, version_id fields
  • Unique indexes for tenant + type + id combinations
  • Deterministic error mapping from MongoDB driver to StorageError
  • Cross-tenant isolation validated by negative tests
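
The tenant-scoped key and soft-delete behavior listed above can be illustrated with a small in-memory model. This is a minimal sketch, not the backend code: `InMemoryStore`, `ReadOutcome`, and the `key` helper are hypothetical names invented for illustration, and only the field names (`tenant_id`, `resource_type`, `resource_id`) come from this PR.

```rust
use std::collections::HashMap;

/// Composite key mirroring the unique tenant + type + id index.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ResourceKey {
    pub tenant_id: String,
    pub resource_type: String,
    pub resource_id: String,
}

/// Hypothetical convenience constructor for the sketch.
pub fn key(tenant_id: &str, resource_type: &str, resource_id: &str) -> ResourceKey {
    ResourceKey {
        tenant_id: tenant_id.to_string(),
        resource_type: resource_type.to_string(),
        resource_id: resource_id.to_string(),
    }
}

/// Hypothetical stored row; the real document carries more fields.
#[derive(Debug, Clone)]
pub struct StoredResource {
    pub is_deleted: bool,
    pub body: String,
}

/// Hypothetical read result; the backend itself reports `StorageError`.
#[derive(Debug, PartialEq)]
pub enum ReadOutcome {
    Found(String),
    Gone,     // soft-deleted, surfaces as HTTP 410
    NotFound, // no row for this tenant, surfaces as HTTP 404
}

#[derive(Default)]
pub struct InMemoryStore {
    rows: HashMap<ResourceKey, StoredResource>,
}

impl InMemoryStore {
    pub fn create(&mut self, key: ResourceKey, body: &str) {
        let row = StoredResource { is_deleted: false, body: body.to_string() };
        self.rows.insert(key, row);
    }

    /// Marks the row as deleted instead of removing it.
    pub fn soft_delete(&mut self, key: &ResourceKey) -> bool {
        if let Some(row) = self.rows.get_mut(key) {
            if !row.is_deleted {
                row.is_deleted = true;
                return true;
            }
        }
        false
    }

    /// The tenant is part of the key, so another tenant's id never matches.
    pub fn read(&self, key: &ResourceKey) -> ReadOutcome {
        match self.rows.get(key) {
            Some(row) if row.is_deleted => ReadOutcome::Gone,
            Some(row) => ReadOutcome::Found(row.body.clone()),
            None => ReadOutcome::NotFound,
        }
    }
}
```

Because the tenant is part of the lookup key rather than a post-filter, a read from a different tenant cannot distinguish "exists elsewhere" from "does not exist", which is the property the negative tests check.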

Phase 3: Versioning, History, and Transaction Semantics ✅

Goal: Implement FHIR version/history semantics with session-based consistency

Delivered:

  • VersionedStorage: vread, update_with_match, delete_with_match, list_versions
  • History Providers: instance_history, type_history, system_history with pagination
  • Optimistic locking: ETag/If-Match support with VersionConflict errors
  • Session-based operations: Topology-aware transaction handling (replica-set/sharded only)
  • History collection with deterministic ordering and time-based filtering

Key Decisions:

  • Conditional operations deferred to Phase 4 (search dependency)
  • Trial Use history-delete methods remain NotSupported
  • Best-effort session handling with graceful standalone fallback
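
The ETag/If-Match check behind `update_with_match` can be sketched as follows. This is an illustrative model under stated assumptions: version ids are treated as integers that increase by one per update, and `Versioned`, `UpdateError`, and `parse_if_match` are hypothetical names. Only the `VersionConflict` error name comes from this PR.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum UpdateError {
    VersionConflict { expected: u64, actual: u64 },
}

/// Hypothetical versioned record; version ids are assumed numeric.
#[derive(Debug, Clone, PartialEq)]
pub struct Versioned {
    pub version_id: u64,
    pub body: String,
}

/// Extracts the version from an If-Match value such as `W/"3"` or `"3"`.
pub fn parse_if_match(header: &str) -> Option<u64> {
    let trimmed = header.trim();
    let unweak = trimmed.strip_prefix("W/").unwrap_or(trimmed);
    unweak.trim_matches('"').parse().ok()
}

/// Applies the update only when the caller's version matches the stored one.
pub fn update_with_match(
    current: &Versioned,
    if_match: u64,
    new_body: &str,
) -> Result<Versioned, UpdateError> {
    if current.version_id != if_match {
        return Err(UpdateError::VersionConflict {
            expected: if_match,
            actual: current.version_id,
        });
    }
    Ok(Versioned {
        version_id: current.version_id + 1,
        body: new_body.to_string(),
    })
}
```

In a real backend the comparison and the write have to happen atomically (for example as a filter on the stored version in the update itself); the sketch shows only the decision.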

Phase 4: Search, Indexing, and Conditional Semantics ✅

Goal: Enable FHIR search and conditional operations with native MongoDB indexing

Delivered:

  • SearchProvider implementation for core parameter types:
    • String, token, reference, date, number, uri parameters
    • Deterministic paging and sorting with cursor support
    • search_count consistency with search results
  • Conditional operations:
    • conditional_create (zero/one/multiple match semantics)
    • conditional_update (with upsert support)
    • conditional_delete
  • Search indexing: Automatic extraction and persistence during write operations
  • SearchParameter lifecycle: Registry updates on create/update/delete
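
The zero/one/multiple match semantics of `conditional_create` follow the FHIR conditional create rules: no match creates the resource, exactly one match returns the existing resource without creating anything, and several matches fail the precondition. A minimal sketch of that decision, with `ConditionalCreateOutcome` and `decide_conditional_create` as hypothetical names:

```rust
/// Hypothetical outcome type for the sketch.
#[derive(Debug, PartialEq)]
pub enum ConditionalCreateOutcome {
    /// No match: proceed with a normal create.
    Created,
    /// Exactly one match: nothing is written, the existing id is returned.
    Exists { resource_id: String },
    /// More than one match: the request fails (HTTP 412 in FHIR).
    MultipleMatches { count: usize },
}

/// Decides the outcome from the ids returned by the search criteria.
pub fn decide_conditional_create(matches: &[String]) -> ConditionalCreateOutcome {
    match matches {
        [] => ConditionalCreateOutcome::Created,
        [only] => ConditionalCreateOutcome::Exists { resource_id: only.clone() },
        many => ConditionalCreateOutcome::MultipleMatches { count: many.len() },
    }
}
```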

Files:

  • crates/persistence/src/backends/mongodb/search_impl.rs
  • Enhanced crates/persistence/src/backends/mongodb/storage.rs

Boundaries:

  • Conditional patch: deferred
  • Advanced search (chained, reverse chaining, _include/_revinclude): partial/planned
  • Full-text search: offloaded to Elasticsearch in composite mode

Phase 5: Composite Integration (MongoDB + Elasticsearch) ✅

Goal: Provide robust primary-secondary mode mirroring existing composite backends

Delivered:

  • Composite backend wiring: MongoDB primary + Elasticsearch secondary
  • Operation ownership:
    • Write/read/history → MongoDB (primary)
    • Search → Elasticsearch (secondary)
  • search_offloaded mode: Prevents duplicate indexing when ES is configured
  • Shared SearchParameter registry: Synchronized between MongoDB and ES backends
  • Composite routing tests: Validates delegation, tenant isolation, failure handling
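
The operation ownership described above amounts to a routing table. The sketch below is illustrative only: `Operation`, `Target`, `route`, and `should_index_natively` are hypothetical names, and the actual composite backend in crates/persistence/src/backends/composite/ may be structured differently.

```rust
/// Hypothetical operation categories for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Operation {
    Create,
    Read,
    Update,
    Delete,
    VRead,
    History,
    Search,
}

/// Which backend handles a request in composite mode.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Target {
    Primary,   // MongoDB
    Secondary, // Elasticsearch
}

/// Writes, reads, and history stay on the primary; search is delegated.
pub fn route(op: Operation) -> Target {
    match op {
        Operation::Search => Target::Secondary,
        Operation::Create
        | Operation::Read
        | Operation::Update
        | Operation::Delete
        | Operation::VRead
        | Operation::History => Target::Primary,
    }
}

/// With search offloaded, the primary skips its own search index writes.
pub fn should_index_natively(search_offloaded: bool) -> bool {
    !search_offloaded
}
```

Listing every variant in `route` rather than using a wildcard arm means a newly added operation fails to compile until someone decides where it belongs.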

Runtime Configuration:

HFS_STORAGE_BACKEND=mongodb-elasticsearch
HFS_DATABASE_URL=mongodb://localhost:27017/hfs
HFS_ELASTICSEARCH_NODES=http://localhost:9200

Files:

  • crates/persistence/src/backends/composite/ (updated for MongoDB)
  • crates/rest/src/config.rs (mongodb-elasticsearch mode)
  • crates/hfs/src/main.rs (startup wiring)

Phase 6: Server Wiring, Documentation, and Release Readiness ✅

Goal: Expose MongoDB modes to HFS runtime with complete documentation

Delivered:

  • StorageBackendMode enum: Added MongoDB and MongoDBElasticsearch variants
  • Runtime startup paths: start_mongodb() and start_mongodb_elasticsearch()
  • Environment variables:
    • HFS_STORAGE_BACKEND=mongodb or mongodb-elasticsearch
    • HFS_DATABASE_URL=mongodb://...
  • Documentation updates:
    • crates/persistence/README.md capability matrix
    • ROADMAP.md persistence section
    • Operator examples and deployment guidance

Configuration Example:

// Programmatic
let config = ServerConfig {
    storage_backend: "mongodb".to_string(),
    database_url: Some("mongodb://localhost:27017/hfs".to_string()),
    ..Default::default()
};

# Environment
HFS_STORAGE_BACKEND=mongodb
HFS_DATABASE_URL=mongodb://localhost:27017/hfs
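
How the `HFS_STORAGE_BACKEND` value might map onto the new `StorageBackendMode` variants can be sketched as below. Assumptions are labeled loudly: only the two variants added by this PR are modeled, the parsing rules (trimmed, exact lowercase match) and the comma-separated node list are guesses, and `is_composite` and `validate_elasticsearch_nodes` are hypothetical helpers, not the functions in crates/rest/src/config.rs.

```rust
use std::str::FromStr;

/// Only the two variants added by this PR; the real enum has more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageBackendMode {
    MongoDB,
    MongoDBElasticsearch,
}

impl FromStr for StorageBackendMode {
    type Err = String;

    /// Assumed parsing rule: surrounding whitespace ignored, exact match.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "mongodb" => Ok(StorageBackendMode::MongoDB),
            "mongodb-elasticsearch" => Ok(StorageBackendMode::MongoDBElasticsearch),
            other => Err(format!("unsupported storage backend: {other}")),
        }
    }
}

impl StorageBackendMode {
    /// Hypothetical helper: composite modes need an Elasticsearch secondary.
    pub fn is_composite(self) -> bool {
        matches!(self, StorageBackendMode::MongoDBElasticsearch)
    }
}

/// Rejects an empty node list for composite modes.
/// Assumes `HFS_ELASTICSEARCH_NODES` is a comma-separated list.
pub fn validate_elasticsearch_nodes(
    mode: StorageBackendMode,
    nodes: Option<&str>,
) -> Result<(), String> {
    if !mode.is_composite() {
        return Ok(());
    }
    let has_node = nodes
        .map(|raw| raw.split(',').any(|node| !node.trim().is_empty()))
        .unwrap_or(false);
    if has_node {
        Ok(())
    } else {
        Err("HFS_ELASTICSEARCH_NODES must list at least one node".to_string())
    }
}
```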

Final Phase: Transaction-Bundle Parity and Inferno Validation ✅

Goal: Enable MongoDB in full Inferno test matrix with transaction-bundle support

Delivered:

  • Real transaction bundles: Atomic execution with session boundaries
  • Topology awareness: Requires replica-set or sharded deployment
  • Bundle semantics:
    • POST, PUT, DELETE, GET entry support
    • urn:uuid reference resolution
    • Conditional bundle operations (ifMatch, ifNoneExist)
    • Rollback on failure, commit on success
  • Inferno CI integration:
    • MongoDB runs in replica-set mode (rs.initiate())
    • Full backend matrix coverage alongside SQLite/PostgreSQL
    • Transaction-capable URI: mongodb://localhost:27017/?replicaSet=rs0&directConnection=true
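
The urn:uuid reference resolution listed under bundle semantics can be sketched like this. It is a simplified model, assuming the server has already assigned ids to the bundle's POST entries and collected them in a map; `resolve_reference` and `resolve_all` are hypothetical names and not the contents of `process_transaction`.

```rust
use std::collections::HashMap;

/// Rewrites a bundle-local `urn:uuid:` reference to the assigned
/// `Type/id`; any other reference is returned unchanged.
pub fn resolve_reference(
    reference: &str,
    assigned: &HashMap<String, String>,
) -> Result<String, String> {
    if reference.starts_with("urn:uuid:") {
        assigned
            .get(reference)
            .cloned()
            .ok_or_else(|| format!("unresolved bundle reference: {reference}"))
    } else {
        Ok(reference.to_string())
    }
}

/// Resolves every reference or fails as a whole, matching the
/// all-or-nothing behavior of a transaction bundle.
pub fn resolve_all(
    references: &[&str],
    assigned: &HashMap<String, String>,
) -> Result<Vec<String>, String> {
    references
        .iter()
        .map(|&reference| resolve_reference(reference, assigned))
        .collect()
}
```

Collecting into `Result<Vec<_>, _>` stops at the first unresolved reference, which is the point where a transaction would abort and roll back.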

Files:

  • crates/persistence/src/backends/mongodb/storage.rs (process_transaction)
  • .github/workflows/inferno.yml (replica-set topology)
  • crates/persistence/tests/mongodb_tests.rs (transaction coverage)

Capability Matrix

| Capability | MongoDB Standalone | MongoDB + Elasticsearch |
| --- | --- | --- |
| CRUD | ✅ Implemented | ✅ Implemented |
| Versioning (vread) | ✅ Implemented | ✅ Implemented |
| Optimistic Locking | ✅ Implemented | ✅ Implemented |
| Instance History | ✅ Implemented | ✅ Implemented |
| Type History | ✅ Implemented | ✅ Implemented |
| System History | ✅ Implemented | ✅ Implemented |
| Basic Search | ✅ Implemented (native) | ✅ Implemented (offloaded) |
| Conditional Ops | ✅ Implemented | ✅ Implemented |
| Transactions | ✅ Implemented (replica-set) | ✅ Implemented (replica-set) |
| Full-Text Search | 🔄 Planned | ✅ Implemented (ES) |
| Chained Search | 🔄 Partial | 🔄 Partial |
| _include/_revinclude | 🔄 Partial | 🔄 Partial |
| Conditional Patch | 🔄 Planned | 🔄 Planned |

Testing Coverage

Unit Tests

  • Configuration serialization/deserialization
  • Query translation for all supported parameter types
  • Tenant filter construction
  • Error mapping (MongoDB → StorageError)
  • Schema bootstrap idempotency

Integration Tests

  • Full CRUD lifecycle with tenant isolation
  • Version conflict detection and optimistic locking
  • History pagination and filtering
  • Search execution with paging/sorting
  • Conditional operation matching (zero/one/multiple)
  • Composite routing and delegation
  • Transaction bundle execution and rollback
  • SearchParameter lifecycle and registry updates

CI Coverage

# Feature-gated builds
cargo check -p helios-persistence --features mongodb
cargo check -p helios-hfs --features R4,mongodb,elasticsearch

# Integration tests
cargo test -p helios-persistence --features mongodb --test mongodb_tests

# Inferno validation
# MongoDB in replica-set mode with full test suite

Architecture

Data Model

Resources Collection:

{
  _id: ObjectId,
  tenant_id: "default",
  resource_type: "Patient",
  resource_id: "123",
  version_id: "1",
  last_updated: ISODate("2026-03-10T..."),
  is_deleted: false,
  resource: { /* FHIR resource JSON */ }
}

History Collection:

{
  _id: ObjectId,
  tenant_id: "default",
  resource_type: "Patient",
  resource_id: "123",
  version_id: "1",
  last_updated: ISODate("2026-03-10T..."),
  method: "POST",
  resource: { /* Historical FHIR resource */ }
}

Search Indexes Collection (when not offloaded):

{
  tenant_id: "default",
  resource_type: "Patient",
  resource_id: "123",
  parameter_name: "name",
  value_string: "john",
  // ... parameter-specific fields
}

Indexes

  • {tenant_id: 1, resource_type: 1, resource_id: 1} (unique, active resources)
  • {tenant_id: 1, resource_type: 1, resource_id: 1, version_id: 1} (history)
  • {tenant_id: 1, resource_type: 1, last_updated: -1} (type history)
  • {tenant_id: 1, last_updated: -1} (system history)
  • Search parameter indexes (when native search enabled)
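
The descending `last_updated` indexes support newest-first history listings. The sketch below models deterministic ordering and paging in memory. It rests on assumptions not stated in this PR: timestamps are epoch milliseconds, ties are broken by version and then resource id, and paging uses offset and count. `HistoryEntry` and `history_page` are hypothetical names.

```rust
/// Hypothetical, trimmed-down history row.
#[derive(Debug, Clone, PartialEq)]
pub struct HistoryEntry {
    pub resource_id: String,
    pub version_id: u64,
    pub last_updated_ms: i64,
}

/// Returns one page of history, newest first.
/// `since_ms` keeps entries updated at or after the given instant.
pub fn history_page(
    mut entries: Vec<HistoryEntry>,
    since_ms: Option<i64>,
    offset: usize,
    count: usize,
) -> Vec<HistoryEntry> {
    if let Some(since) = since_ms {
        entries.retain(|entry| entry.last_updated_ms >= since);
    }
    // Tie-breakers (assumed) keep the order stable across pages.
    entries.sort_by(|a, b| {
        b.last_updated_ms
            .cmp(&a.last_updated_ms)
            .then_with(|| b.version_id.cmp(&a.version_id))
            .then_with(|| a.resource_id.cmp(&b.resource_id))
    });
    entries.into_iter().skip(offset).take(count).collect()
}
```

Without a total order, two entries sharing a timestamp could swap places between requests and appear on two pages or on none.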

Configuration

Standalone MongoDB

HFS_STORAGE_BACKEND=mongodb
HFS_DATABASE_URL=mongodb://localhost:27017/hfs

MongoDB + Elasticsearch Composite

HFS_STORAGE_BACKEND=mongodb-elasticsearch
HFS_DATABASE_URL=mongodb://localhost:27017/hfs
HFS_ELASTICSEARCH_NODES=http://localhost:9200
HFS_ELASTICSEARCH_INDEX_PREFIX=hfs

Transaction Support (Replica Set)

# Start MongoDB in replica-set mode
docker run -d --name mongo-rs \
  -p 27017:27017 \
  mongo:7 --replSet rs0

# Initialize replica set
docker exec mongo-rs mongosh --eval "rs.initiate()"

# Connect with transaction-capable URI
# Quote the URI so the shell does not treat `&` as a background operator
HFS_DATABASE_URL='mongodb://localhost:27017/hfs?replicaSet=rs0&directConnection=true'
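
A deployment script or startup check could inspect the connection string for a `replicaSet` option before relying on transactions. The helper below is a hypothetical pre-flight hint and not the topology detection this PR implements, which presumably inspects the live deployment. A missing option does not prove that transactions are unavailable (a sharded cluster, for example, is reached without it).

```rust
/// Returns the `replicaSet` option from a MongoDB connection string,
/// if one is present and non-empty. Option names are matched
/// case-insensitively; values are returned as written (not URL-decoded).
pub fn replica_set_name(uri: &str) -> Option<&str> {
    // Everything after the first `?` is the option list.
    let (_, query) = uri.split_once('?')?;
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(name, value)| name.eq_ignore_ascii_case("replicaSet") && !value.is_empty())
        .map(|(_, value)| value)
}
```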

Migration Path

Existing deployments can adopt MongoDB incrementally:

  1. Evaluation: Deploy MongoDB standalone for non-production workloads
  2. Search offload: Add Elasticsearch secondary for advanced search
  3. Production: Deploy replica-set for transaction support
  4. Migration: Use FHIR $export/$import for data migration from SQLite/PostgreSQL

Breaking Changes

None. MongoDB is an additive backend option. Existing SQLite and PostgreSQL deployments are unaffected.


Documentation Updates

  • crates/persistence/README.md - Capability matrix and MongoDB status
  • ROADMAP.md - Persistence roadmap with MongoDB phases
  • crates/rest/src/config.rs - Configuration documentation
  • Phase roadmaps (phase2_roadmap.xml through phase6_roadmap.xml, final_roadmap.xml)
  • Umbrella roadmap (roadmap_mongo.xml)

Validation

All validation commands pass:

✅ cargo check -p helios-persistence --features mongodb
✅ cargo check -p helios-hfs --features R4,mongodb
✅ cargo check -p helios-hfs --features R4,sqlite,elasticsearch,postgres,mongodb
✅ cargo test -p helios-persistence --features mongodb --test mongodb_tests
✅ Inferno test suite (replica-set topology)

Roadmap Artifacts

Detailed planning and execution tracking:

  • roadmap_mongo.xml - Umbrella 6-phase roadmap (completed)
  • phase2_roadmap.xml - Core storage parity (completed)
  • phase3_roadmap.xml - Versioning and history (completed)
  • phase4_roadmap.xml - Search and conditional ops (completed)
  • phase5_roadmap.xml - Composite integration (completed)
  • phase6_roadmap.xml - Runtime wiring (completed)
  • final_roadmap.xml - Transaction-bundle parity (completed)

Future Work

Planned enhancements (out of scope for this PR):

  • Advanced chained search and reverse chaining
  • Full _include/_revinclude support
  • Conditional PATCH operations
  • Database-per-tenant architecture
  • Performance benchmarking and optimization
  • Sharded cluster deployment guidance

Contributors

Implementation followed structured phase-based delivery with comprehensive testing, documentation, and CI integration at each milestone.


Related Issues

Closes: [MongoDB persistence backend tracking issue]

doug-helios marked this pull request as draft March 2, 2026 01:29
dougc95 added 8 commits March 5, 2026 14:59
…nal operations

Add search_index collection with 11 specialized indexes for FHIR search parameters (string, token, date, number, quantity, reference, uri, composite, resource, token_display, identifier_type). Implement SearchProvider with cursor-based pagination, sort support, and count queries. Add ConditionalStorage for conditional create/update/delete with search parameter matching. Bump schema version to 4.
…pagination tests

Add 6 new integration tests covering search parameter registration/unregistration on create/update/delete (active/draft/retired status handling) and bidirectional cursor-based pagination with has_next/has_previous flags. Tests verify registry state changes and cursor roundtrip navigation.
…test coverage

Update roadmap status from planned to completed for phases 2-4 (core storage parity, versioning/history/conditional semantics, search/indexing/conditional operations). Update last_updated to 2026-03-09 and progress summary to reflect full implementation and validation through MongoDB integration tests.
Add MongoDB standalone and MongoDB+Elasticsearch composite modes to HFS server. Implement start_mongodb() and start_mongodb_elasticsearch() functions with connection string/env detection, schema initialization, and search offloading. Add empty Elasticsearch node validation for all composite modes. Update documentation to reflect MongoDB Phase 5 completion (CRUD, versioning, search, conditional ops, composite integration). Add El
…ntation

Update ROADMAP.md to reflect MongoDB standalone and MongoDB+Elasticsearch as shipped persistence options. Update persistence README with MongoDB runtime configuration examples, architecture tree, and search offloading documentation. Add phase6_roadmap.xml as detailed closure artifact. Update roadmap_mongo.xml to reference completed Phase 5/6 artifacts and synchronize progress statements across all roadmap documents.
…bundle support

Add MongoDB to Inferno workflow matrix alongside sqlite/postgres backends. Implement MongoDB replica set initialization with 60s primary election timeout and container lifecycle management. Add transaction bundle support to MongoDB backend with ClientSession-based ACID semantics, reference resolution, SearchParameter registry integration, and rollback on entry failures. Declare Transactions capability in backend
doug-helios changed the title from "feat(persistence/mongodb): complete Phase 2 core storage parity (CRUD, tenant isolation, soft-delete)" to "feat(persistence): MongoDB backend with CRUD, search, history, and transactions" Mar 10, 2026
doug-helios marked this pull request as ready for review March 10, 2026 23:18