From d92c04491c831f25456744c8765f1e953b7e8920 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Sun, 1 Mar 2026 15:57:10 -0400 Subject: [PATCH 01/17] feat: init mongo roadmap --- roadmap_mongo.xml | 276 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 276 insertions(+) create mode 100644 roadmap_mongo.xml diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml new file mode 100644 index 00000000..3e68fe94 --- /dev/null +++ b/roadmap_mongo.xml @@ -0,0 +1,276 @@ + + + + HeliosSoftware/hfs + draft + TBD + date-agnostic + 2026-03-01 + + SQLite primary, PostgreSQL primary, Elasticsearch secondary + + + + + + + + + + + + + + + + + MongoDB as a primary persistence backend in helios-persistence. + Tenant-aware CRUD, versioning, history, and conditional operation semantics compatible with existing backends. + Search strategy for MongoDB (native indexing and/or search offloading to Elasticsearch). + Composite integration for MongoDB + Elasticsearch. + Server runtime wiring via HFS_STORAGE_BACKEND options for MongoDB modes. + Test and CI coverage comparable to SQLite/PostgreSQL/Elasticsearch patterns. + + + Neo4j graph traversal optimizations in initial MongoDB delivery. + Non-FHIR custom query DSL extensions. + Full terminology service implementation beyond current external-service expectations. + Hard production SLO guarantees before baseline performance characterization is complete. + + + + + + + + + + + + + + + + + + + + + + + + + Introduce MongoDB backend module scaffolding with compile-time and runtime hooks. + + Enable mongodb module export in crates/persistence/src/backends/mod.rs. + Create crates/persistence/src/backends/mongodb with backend/config/schema skeleton. + Define MongoDB backend config with defaults and serde support. + Implement Backend trait basics (kind/name/capabilities/health checks). + Wire feature-gated compile paths for mongo in helios-persistence and helios-hfs. + + + Existing Cargo feature 'mongodb' and optional dependency in crates/persistence/Cargo.toml. + + + cargo check -p helios-persistence --features mongodb passes. + BackendKind::MongoDB compile and display behavior verified. + + + + + Reach minimum parity with SQLite/PostgreSQL ResourceStorage behavior. + + Implement create/read/update/delete/exists/count/read_batch/create_or_update. + Enforce tenant isolation in all collection queries and indexes. + Implement soft-delete semantics aligned with existing Gone behavior. + Create core collection/index strategy: resources, resource_history, search indexes (if native). + + + Phase 1 backend skeleton. + + + Mongo integration CRUD tests pass in isolation. + Tenant isolation behavior matches sqlite_tests/postgres_tests expectations. + + + + + Implement FHIR version/history semantics and concurrency expectations. + + Implement VersionedStorage (vread + update_with_match semantics). + Implement instance/type/system history providers. + Implement conditional create/update/delete and If-Match handling. + Define transaction behavior using Mongo sessions where feasible. + + + Phase 2 core contract implementation. + + + History and versioning test suites pass for MongoDB feature mode. + Explicitly documented deviations from PostgreSQL transaction semantics (if any). + + + + + Support FHIR search behavior with clear native/offloaded boundaries. + + Implement SearchParameter extraction + indexing path for Mongo resources. + Implement basic parameter types (string/token/date/number/quantity/reference/uri/composite) per priority. + Support paging and sorting; define practical limits. + Implement full-text path via Mongo text indexes OR formalize Elasticsearch offload-first strategy. + Define support levels (implemented/partial/planned) in capability matrix and docs. + + + Phase 2 and 3 collections + version model. + + + Search contract tests pass for implemented parameter classes. + Capability matrix updated with truthful MongoDB support levels. + + + + + Provide robust primary-secondary mode mirroring sqlite-elasticsearch and postgres-elasticsearch. + + Implement Mongo search_offloaded mode to avoid duplicate indexing when ES secondary is configured. + Create composite wiring with Mongo primary and Elasticsearch search backend. + Ensure search registry sharing between Mongo backend and ES backend initialization. + Validate write-primary/read-primary/search-secondary routing and sync behavior. + + + Phase 4 search model clarity. + + + Composite tests verify routing and result consistency for Mongo + Elasticsearch. + Startup path for mongo-elasticsearch mode is implemented and feature-gated. + + + + + Expose Mongo modes to HFS runtime and document operational guidance. + + Add StorageBackendMode values for mongodb and mongodb-elasticsearch in crates/rest/src/config.rs. + Add start_mongodb and start_mongodb_elasticsearch flows in crates/hfs/src/main.rs. + Update persistence README capability matrix and role matrix to reflect implemented Mongo status. + Update top-level ROADMAP.md persistence section when milestones ship. + Document deployment examples, environment variables, and feature flags. + + + Phases 1 through 5 complete or explicitly scoped. + + + HFS_STORAGE_BACKEND accepts mongodb and mongodb-elasticsearch values. + All relevant docs and examples are consistent with actual implementation. + + + + + + + Prefer parity with existing SQLite/PostgreSQL behavioral contracts over backend-specific shortcuts. + Use capability-driven tests to skip only what is explicitly planned/not planned. + Run fast unit coverage first, then containerized integration, then full regression. + + + + + Mongo config defaults + serde roundtrip (mirror Postgres/ES config tests). + Backend capability declarations and support checks. + Query translation tests (FHIR search params to Mongo query documents). + Error conversion tests (mongodb::error::Error to StorageError). + + + All unit tests deterministic and runnable without Docker. + Coverage includes every capability claimed as implemented/partial. + + + + + + Create crates/persistence/tests/mongodb_tests.rs analogous to postgres_tests.rs and elasticsearch_tests.rs. + Use shared Mongo container lifecycle for speed and isolation via unique tenant IDs per test. + CRUD, tenant isolation, version increments, delete semantics (Gone/not found expectations). + History and conditional operation behavior for implemented levels. + + + Mongo integration suite passes on self-hosted CI with Docker/testcontainers. + No cross-test data contamination detected. + + + + + + Reuse tests/common harness, fixtures, assertions, and capability matrix for Mongo runs. + Validate parity with SQLite/PostgreSQL for tenant isolation, versioning, and error semantics. + + + All contract tests for implemented Mongo capabilities pass without Mongo-only exceptions. + + + + + + Add/extend composite tests to cover Mongo primary + ES search secondary. + Verify write/read/search ownership split and provider delegation. + Verify index synchronization behavior and eventual consistency expectations. + + + Composite routing and result-merging tests pass for Mongo+ES mode. + No duplicated search indexing in Mongo when offloaded. + + + + + + Compile + unit tests for mongodb feature. + Mongo integration tests (Docker/testcontainers). + All-features/full-workspace regression. + + + Add mongodb-featured test invocations in CI jobs. + Maintain container cleanup parity with existing label-based cleanup steps. + Ensure failures in Mongo-specific stage block merge when feature is enabled. + + + + + + Baseline latency benchmarks for create/read/search operations. + Concurrency soak tests across multiple tenants and mixed read/write load. + Migration/index evolution tests for backward-compatible rollout behavior. + + + Benchmarks show no critical regressions versus declared targets for initial release. + No data corruption or tenant leakage under concurrent load tests. + + + + + + + Mongo transaction semantics may diverge from ACID expectations used by PostgreSQL/SQLite flows. + Document support level clearly; enforce behavior with dedicated transaction and rollback tests. + + + FHIR search parity gaps due to complex chained/reverse chained parameter translation. + Ship basic search parity first; mark advanced capabilities as partial/planned with explicit tests. + + + Dual indexing complexity when Mongo native search and Elasticsearch offloading coexist. + Use explicit search_offloaded controls and test for duplicated/stale index behavior. + + + CI instability from container startup/resource constraints on self-hosted runners. + Shared container lifecycle, timeouts, and explicit cleanup steps aligned with current CI patterns. + + + + + MongoDB backend module is enabled and feature-gated with stable compile and runtime startup paths. + MongoDB mode is selectable via HFS_STORAGE_BACKEND and validated in configuration parsing tests. + Core CRUD/version/history/tenant behavior passes defined contract tests for implemented capabilities. + MongoDB + Elasticsearch composite path is implemented with routing and sync tests passing. + Capability matrix and roadmap documentation reflect actual support levels (no aspirational mismatch). + CI includes Mongo-targeted stages and remains green for required feature sets. + + From a3a4bdf06efb2ebe4836b9713c04ba3c16c7b1a8 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Sun, 1 Mar 2026 18:15:44 -0400 Subject: [PATCH 02/17] feat: add mongodb backend phase1 --- crates/persistence/README.md | 163 ++++---- crates/persistence/src/backends/mod.rs | 4 +- .../src/backends/mongodb/backend.rs | 379 ++++++++++++++++++ .../persistence/src/backends/mongodb/mod.rs | 17 + .../src/backends/mongodb/schema.rs | 19 + roadmap_mongo.xml | 5 +- 6 files changed, 506 insertions(+), 81 deletions(-) create mode 100644 crates/persistence/src/backends/mongodb/backend.rs create mode 100644 crates/persistence/src/backends/mongodb/mod.rs create mode 100644 crates/persistence/src/backends/mongodb/schema.rs diff --git a/crates/persistence/README.md b/crates/persistence/README.md index 89465d33..35a13e30 100644 --- a/crates/persistence/README.md +++ b/crates/persistence/README.md @@ -128,6 +128,10 @@ helios-persistence/ │ │ │ └── search/ # Search query building │ │ │ ├── query_builder.rs # SQL with $N params, ILIKE, TIMESTAMPTZ │ │ │ └── writer.rs # Search index writer +│ │ ├── mongodb/ # MongoDB backend (phase 1 scaffold) +│ │ │ ├── backend.rs # MongoBackend + MongoBackendConfig +│ │ │ ├── schema.rs # Schema/index bootstrap placeholders +│ │ │ └── mod.rs # Module wiring and re-exports │ │ └── elasticsearch/ # Search-optimized secondary backend │ │ ├── backend.rs # ElasticsearchBackend with config │ │ ├── storage.rs # ResourceStorage for sync support @@ -297,79 +301,81 @@ The matrix below shows which FHIR operations each backend supports. This reflect **Legend:** ✓ Implemented | ◐ Partial | ○ Planned | ✗ Not planned | † Requires external service -| Feature | SQLite | PostgreSQL | MongoDB | Cassandra | Neo4j | Elasticsearch | S3 | -|---------|--------|------------|---------|-----------|-------|---------------|-----| -| **Core Operations** | -| [CRUD](https://build.fhir.org/http.html#crud) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | -| [Versioning (vread)](https://build.fhir.org/http.html#vread) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | -| [Optimistic Locking](https://build.fhir.org/http.html#concurrency) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ✗ | -| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ○ | -| [Type History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | -| [System History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | -| [Batch Bundles](https://build.fhir.org/http.html#batch) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | -| [Transaction Bundles](https://build.fhir.org/http.html#transaction) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | -| [Conditional Operations](https://build.fhir.org/http.html#cond-update) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | -| [Conditional Patch](https://build.fhir.org/http.html#patch) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | -| [Delete History](https://build.fhir.org/http.html#delete) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | -| **Multitenancy** | -| Shared Schema | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | -| Schema-per-Tenant | ✗ | ○ | ○ | ✗ | ✗ | ✗ | ✗ | -| Database-per-Tenant | ✓ | ○ | ○ | ○ | ○ | ○ | ○ | -| Row-Level Security | ✗ | ○ | ✗ | ✗ | ✗ | ✗ | ✗ | -| **[Search Parameters](https://build.fhir.org/search.html#ptypes)** | -| [String](https://build.fhir.org/search.html#string) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| [Token](https://build.fhir.org/search.html#token) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ✗ | -| [Reference](https://build.fhir.org/search.html#reference) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| [Date](https://build.fhir.org/search.html#date) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | -| [Number](https://build.fhir.org/search.html#number) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ○ | -| [Quantity](https://build.fhir.org/search.html#quantity) | ✓ | ✓ | ○ | ✗ | ✗ | ✓ | ○ | -| [URI](https://build.fhir.org/search.html#uri) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | -| [Composite](https://build.fhir.org/search.html#composite) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ✗ | -| **[Search Modifiers](https://build.fhir.org/search.html#modifiers)** | -| [:exact](https://build.fhir.org/search.html#modifiers) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | -| [:contains](https://build.fhir.org/search.html#modifiers) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| [:text](https://build.fhir.org/search.html#modifiers) (full-text) | ✓ | ◐ | ○ | ✗ | ✗ | ✓ | ✗ | -| [:not](https://build.fhir.org/search.html#modifiers) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ○ | -| [:missing](https://build.fhir.org/search.html#modifiers) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ○ | -| [:above / :below](https://build.fhir.org/search.html#modifiers) | ✗ | †○ | †○ | ✗ | ○ | ✓ | ✗ | -| [:in / :not-in](https://build.fhir.org/search.html#modifiers) | ✗ | †○ | †○ | ✗ | ○ | †○ | ✗ | -| [:of-type](https://build.fhir.org/search.html#modifiers) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ✗ | -| [:text-advanced](https://build.fhir.org/search.html#modifiertextadvanced) | ✓ | †○ | †○ | ✗ | ✗ | ✓ | ✗ | -| **[Special Parameters](https://build.fhir.org/search.html#all)** | -| [_text](https://build.fhir.org/search.html#_text) (narrative search) | ✓ | ◐ | ○ | ✗ | ✗ | ✓ | ✗ | -| [_content](https://build.fhir.org/search.html#_content) (full content) | ✓ | ◐ | ○ | ✗ | ✗ | ✓ | ✗ | -| [_filter](https://build.fhir.org/search.html#_filter) (advanced filtering) | ✓ | ○ | ○ | ✗ | ○ | ○ | ✗ | -| **Advanced Search** | -| [Chained Parameters](https://build.fhir.org/search.html#chaining) | ✓ | ◐ | ○ | ✗ | ○ | ✗ | ✗ | -| [Reverse Chaining (_has)](https://build.fhir.org/search.html#has) | ✓ | ◐ | ○ | ✗ | ○ | ✗ | ✗ | -| [_include](https://build.fhir.org/search.html#include) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| [_revinclude](https://build.fhir.org/search.html#revinclude) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| **[Pagination](https://build.fhir.org/http.html#paging)** | -| Offset | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| Cursor (keyset) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | -| **[Sorting](https://build.fhir.org/search.html#sort)** | -| Single field | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| Multiple fields | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| **[Bulk Operations](https://hl7.org/fhir/uv/bulkdata/)** | -| [Bulk Export](https://hl7.org/fhir/uv/bulkdata/export.html) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | -| [Bulk Submit](https://hackmd.io/@argonaut/rJoqHZrPle) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | +> **MongoDB Status:** Phase 1 backend scaffolding is implemented (module export, config, and `Backend` trait wiring). Capability rows below remain `○` until Phase 2+ storage/search behavior is implemented. + +| Feature | SQLite | PostgreSQL | MongoDB | Cassandra | Neo4j | Elasticsearch | S3 | +| --------------------------------------------------------------------------- | ------ | ---------- | ------- | --------- | ----- | ------------- | --- | +| **Core Operations** | +| [CRUD](https://build.fhir.org/http.html#crud) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| [Versioning (vread)](https://build.fhir.org/http.html#vread) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | +| [Optimistic Locking](https://build.fhir.org/http.html#concurrency) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ✗ | +| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ○ | +| [Type History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [System History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [Batch Bundles](https://build.fhir.org/http.html#batch) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | +| [Transaction Bundles](https://build.fhir.org/http.html#transaction) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [Conditional Operations](https://build.fhir.org/http.html#cond-update) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | +| [Conditional Patch](https://build.fhir.org/http.html#patch) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | +| [Delete History](https://build.fhir.org/http.html#delete) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| **Multitenancy** | +| Shared Schema | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| Schema-per-Tenant | ✗ | ○ | ○ | ✗ | ✗ | ✗ | ✗ | +| Database-per-Tenant | ✓ | ○ | ○ | ○ | ○ | ○ | ○ | +| Row-Level Security | ✗ | ○ | ✗ | ✗ | ✗ | ✗ | ✗ | +| **[Search Parameters](https://build.fhir.org/search.html#ptypes)** | +| [String](https://build.fhir.org/search.html#string) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| [Token](https://build.fhir.org/search.html#token) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ✗ | +| [Reference](https://build.fhir.org/search.html#reference) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| [Date](https://build.fhir.org/search.html#date) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| [Number](https://build.fhir.org/search.html#number) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ○ | +| [Quantity](https://build.fhir.org/search.html#quantity) | ✓ | ✓ | ○ | ✗ | ✗ | ✓ | ○ | +| [URI](https://build.fhir.org/search.html#uri) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| [Composite](https://build.fhir.org/search.html#composite) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ✗ | +| **[Search Modifiers](https://build.fhir.org/search.html#modifiers)** | +| [:exact](https://build.fhir.org/search.html#modifiers) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| [:contains](https://build.fhir.org/search.html#modifiers) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| [:text](https://build.fhir.org/search.html#modifiers) (full-text) | ✓ | ◐ | ○ | ✗ | ✗ | ✓ | ✗ | +| [:not](https://build.fhir.org/search.html#modifiers) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ○ | +| [:missing](https://build.fhir.org/search.html#modifiers) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ○ | +| [:above / :below](https://build.fhir.org/search.html#modifiers) | ✗ | †○ | †○ | ✗ | ○ | ✓ | ✗ | +| [:in / :not-in](https://build.fhir.org/search.html#modifiers) | ✗ | †○ | †○ | ✗ | ○ | †○ | ✗ | +| [:of-type](https://build.fhir.org/search.html#modifiers) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ✗ | +| [:text-advanced](https://build.fhir.org/search.html#modifiertextadvanced) | ✓ | †○ | †○ | ✗ | ✗ | ✓ | ✗ | +| **[Special Parameters](https://build.fhir.org/search.html#all)** | +| [\_text](https://build.fhir.org/search.html#_text) (narrative search) | ✓ | ◐ | ○ | ✗ | ✗ | ✓ | ✗ | +| [\_content](https://build.fhir.org/search.html#_content) (full content) | ✓ | ◐ | ○ | ✗ | ✗ | ✓ | ✗ | +| [\_filter](https://build.fhir.org/search.html#_filter) (advanced filtering) | ✓ | ○ | ○ | ✗ | ○ | ○ | ✗ | +| **Advanced Search** | +| [Chained Parameters](https://build.fhir.org/search.html#chaining) | ✓ | ◐ | ○ | ✗ | ○ | ✗ | ✗ | +| [Reverse Chaining (\_has)](https://build.fhir.org/search.html#has) | ✓ | ◐ | ○ | ✗ | ○ | ✗ | ✗ | +| [\_include](https://build.fhir.org/search.html#include) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| [\_revinclude](https://build.fhir.org/search.html#revinclude) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| **[Pagination](https://build.fhir.org/http.html#paging)** | +| Offset | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| Cursor (keyset) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| **[Sorting](https://build.fhir.org/search.html#sort)** | +| Single field | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| Multiple fields | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| **[Bulk Operations](https://hl7.org/fhir/uv/bulkdata/)** | +| [Bulk Export](https://hl7.org/fhir/uv/bulkdata/export.html) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | +| [Bulk Submit](https://hackmd.io/@argonaut/rJoqHZrPle) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | ### Primary/Secondary Role Matrix Backends can serve as primary (CRUD, versioning, transactions) or secondary (optimized for specific query patterns). When a secondary search backend is configured, the primary backend's search indexing is automatically disabled to avoid data duplication. -| Configuration | Primary | Secondary | Status | Use Case | -|---|---|---|---|---| -| SQLite alone | SQLite | — | ✓ Implemented | Development, testing, small deployments | -| SQLite + Elasticsearch | SQLite | Elasticsearch (search) | ✓ Implemented | Small prod with robust search | -| PostgreSQL alone | PostgreSQL | — | ✓ Implemented | Production OLTP | -| PostgreSQL + Elasticsearch | PostgreSQL | Elasticsearch (search) | ✓ Implemented | OLTP + advanced search | -| PostgreSQL + Neo4j | PostgreSQL | Neo4j (graph) | Planned | Graph-heavy queries | -| Cassandra alone | Cassandra | — | Planned | High write throughput | -| Cassandra + Elasticsearch | Cassandra | Elasticsearch (search) | Planned | Write-heavy + search | -| MongoDB alone | MongoDB | — | Planned | Document-centric | -| S3 alone | S3 | — | Planned | Archival/bulk storage | -| S3 + Elasticsearch | S3 | Elasticsearch (search) | Planned | Large-scale + search | +| Configuration | Primary | Secondary | Status | Use Case | +| -------------------------- | ---------- | ---------------------- | -------------------------------- | --------------------------------------- | +| SQLite alone | SQLite | — | ✓ Implemented | Development, testing, small deployments | +| SQLite + Elasticsearch | SQLite | Elasticsearch (search) | ✓ Implemented | Small prod with robust search | +| PostgreSQL alone | PostgreSQL | — | ✓ Implemented | Production OLTP | +| PostgreSQL + Elasticsearch | PostgreSQL | Elasticsearch (search) | ✓ Implemented | OLTP + advanced search | +| PostgreSQL + Neo4j | PostgreSQL | Neo4j (graph) | Planned | Graph-heavy queries | +| Cassandra alone | Cassandra | — | Planned | High write throughput | +| Cassandra + Elasticsearch | Cassandra | Elasticsearch (search) | Planned | Write-heavy + search | +| MongoDB alone | MongoDB | — | ◐ In progress (Phase 1 scaffold) | Document-centric | +| S3 alone | S3 | — | Planned | Archival/bulk storage | +| S3 + Elasticsearch | S3 | Elasticsearch (search) | Planned | Large-scale + search | ### Backend Selection Guide @@ -774,7 +780,8 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp ### Phase 5+: Additional Backends (Planned) - [ ] Cassandra backend (wide-column, partition keys) -- [ ] MongoDB backend (document storage, aggregation) +- [x] MongoDB Phase 1 scaffold (module wiring, config, Backend trait baseline, schema placeholders) +- [ ] MongoDB core storage/search implementation (CRUD, history, transactions, query execution) - [ ] Neo4j backend (graph queries, Cypher) - [ ] S3 backend (bulk export, object storage) @@ -804,14 +811,16 @@ The composite storage layer enables polyglot persistence by coordinating multipl ### Valid Backend Configurations -| Configuration | Primary | Secondary(s) | Status | Use Case | -|---------------|---------|--------------|--------|----------| -| SQLite-only | SQLite | None | ✓ Implemented | Development, small deployments | -| SQLite + ES | SQLite | Elasticsearch | ✓ Implemented | Small prod with robust search | -| PostgreSQL-only | PostgreSQL | None | ✓ Implemented | Production OLTP | -| PostgreSQL + ES | PostgreSQL | Elasticsearch | ✓ Implemented | OLTP + advanced search | -| PostgreSQL + Neo4j | PostgreSQL | Neo4j | Planned | Graph-heavy queries | -| S3 + ES | S3 | Elasticsearch | Planned | Large-scale, cheap storage | +| Configuration | Primary | Secondary(s) | Status | Use Case | +| ------------------ | ---------- | ------------- | ------------- | ------------------------------ | +| SQLite-only | SQLite | None | ✓ Implemented | Development, small deployments | +| SQLite + ES | SQLite | Elasticsearch | ✓ Implemented | Small prod with robust search | +| PostgreSQL-only | PostgreSQL | None | ✓ Implemented | Production OLTP | +| PostgreSQL + ES | PostgreSQL | Elasticsearch | ✓ Implemented | OLTP + advanced search | +| PostgreSQL + Neo4j | PostgreSQL | Neo4j | Planned | Graph-heavy queries | +| S3 + ES | S3 | Elasticsearch | Planned | Large-scale, cheap storage | + +> **MongoDB Note:** A Phase 1 scaffold exists under `src/backends/mongodb`, but runtime `HFS_STORAGE_BACKEND` modes for MongoDB are not yet enabled (planned in a later phase). ### Quick Start diff --git a/crates/persistence/src/backends/mod.rs b/crates/persistence/src/backends/mod.rs index dca84266..4e6547c5 100644 --- a/crates/persistence/src/backends/mod.rs +++ b/crates/persistence/src/backends/mod.rs @@ -41,8 +41,8 @@ pub mod postgres; // #[cfg(feature = "cassandra")] // pub mod cassandra; // -// #[cfg(feature = "mongodb")] -// pub mod mongodb; +#[cfg(feature = "mongodb")] +pub mod mongodb; // // #[cfg(feature = "neo4j")] // pub mod neo4j; diff --git a/crates/persistence/src/backends/mongodb/backend.rs b/crates/persistence/src/backends/mongodb/backend.rs new file mode 100644 index 00000000..a4bff1a6 --- /dev/null +++ b/crates/persistence/src/backends/mongodb/backend.rs @@ -0,0 +1,379 @@ +//! MongoDB backend implementation (phase 1 scaffold). + +use std::fmt::Debug; +use std::path::PathBuf; +use std::sync::Arc; + +use async_trait::async_trait; +use parking_lot::RwLock; +use serde::{Deserialize, Serialize}; + +use helios_fhir::FhirVersion; + +use crate::core::{Backend, BackendCapability, BackendKind}; +use crate::error::{BackendError, StorageError, StorageResult}; +use crate::search::{SearchParameterExtractor, SearchParameterLoader, SearchParameterRegistry}; + +use super::schema; + +/// MongoDB backend for FHIR resource storage. +/// +/// This is a phase 1 scaffold that provides backend wiring, configuration, +/// and trait-level integration. Resource CRUD/search/history are implemented +/// in later phases. +pub struct MongoBackend { + config: MongoBackendConfig, + /// Search parameter registry (in-memory cache of active parameters). + search_registry: Arc>, + /// Extractor for deriving searchable values from resources. + search_extractor: Arc, +} + +impl Debug for MongoBackend { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + f.debug_struct("MongoBackend") + .field("config", &self.config) + .field("search_registry_len", &self.search_registry.read().len()) + .finish_non_exhaustive() + } +} + +/// Configuration for the MongoDB backend. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct MongoBackendConfig { + /// MongoDB connection string. + #[serde(default = "default_connection_string")] + pub connection_string: String, + + /// MongoDB database name used by this backend. + #[serde(default = "default_database_name")] + pub database_name: String, + + /// Maximum number of connections in the driver pool. + #[serde(default = "default_max_connections")] + pub max_connections: u32, + + /// Connection timeout in milliseconds. + #[serde(default = "default_connect_timeout_ms")] + pub connect_timeout_ms: u64, + + /// FHIR version for this backend instance. + #[serde(default)] + pub fhir_version: FhirVersion, + + /// Directory containing FHIR SearchParameter spec files. + #[serde(default)] + pub data_dir: Option, + + /// When true, search indexing is offloaded to a secondary backend. + #[serde(default)] + pub search_offloaded: bool, +} + +fn default_connection_string() -> String { + "mongodb://localhost:27017".to_string() +} + +fn default_database_name() -> String { + "helios".to_string() +} + +fn default_max_connections() -> u32 { + 10 +} + +fn default_connect_timeout_ms() -> u64 { + 5000 +} + +impl Default for MongoBackendConfig { + fn default() -> Self { + Self { + connection_string: default_connection_string(), + database_name: default_database_name(), + max_connections: default_max_connections(), + connect_timeout_ms: default_connect_timeout_ms(), + fhir_version: FhirVersion::default(), + data_dir: None, + search_offloaded: false, + } + } +} + +impl MongoBackend { + /// Creates a new MongoDB backend from the provided configuration. + pub fn new(config: MongoBackendConfig) -> StorageResult { + Self::validate_connection_string(&config.connection_string)?; + + let search_registry = Arc::new(RwLock::new(SearchParameterRegistry::new())); + Self::initialize_search_registry(&search_registry, &config); + let search_extractor = Arc::new(SearchParameterExtractor::new(search_registry.clone())); + + Ok(Self { + config, + search_registry, + search_extractor, + }) + } + + /// Creates a backend from a MongoDB connection string. + pub fn from_connection_string(connection_string: impl Into) -> StorageResult { + let config = MongoBackendConfig { + connection_string: connection_string.into(), + ..Default::default() + }; + Self::new(config) + } + + /// Creates a backend from environment variables. + /// + /// Supported variables: + /// - `HFS_MONGODB_URL` (preferred) + /// - `HFS_MONGODB_URI` (alias) + /// - `HFS_DATABASE_URL` (fallback) + /// - `HFS_MONGODB_DATABASE` (default: `helios`) + /// - `HFS_MONGODB_MAX_CONNECTIONS` (default: `10`) + /// - `HFS_MONGODB_CONNECT_TIMEOUT_MS` (default: `5000`) + pub fn from_env() -> StorageResult { + let connection_string = std::env::var("HFS_MONGODB_URL") + .or_else(|_| std::env::var("HFS_MONGODB_URI")) + .or_else(|_| std::env::var("HFS_DATABASE_URL")) + .unwrap_or_else(|_| default_connection_string()); + + let database_name = + std::env::var("HFS_MONGODB_DATABASE").unwrap_or_else(|_| default_database_name()); + + let max_connections = std::env::var("HFS_MONGODB_MAX_CONNECTIONS") + .ok() + .and_then(|v| v.parse::().ok()) + .unwrap_or_else(default_max_connections); + + let connect_timeout_ms = std::env::var("HFS_MONGODB_CONNECT_TIMEOUT_MS") + .ok() + .and_then(|v| v.parse::().ok()) + .unwrap_or_else(default_connect_timeout_ms); + + let config = MongoBackendConfig { + connection_string, + database_name, + max_connections, + connect_timeout_ms, + ..Default::default() + }; + + Self::new(config) + } + + fn validate_connection_string(connection_string: &str) -> StorageResult<()> { + let uri = connection_string.trim(); + if uri.is_empty() { + return Err(StorageError::Backend(BackendError::ConnectionFailed { + backend_name: "mongodb".to_string(), + message: "MongoDB connection string cannot be empty".to_string(), + })); + } + + if !Self::looks_like_mongodb_uri(uri) { + tracing::warn!( + uri = %uri, + "MongoDB connection string does not start with mongodb:// or mongodb+srv://" + ); + } + + Ok(()) + } + + fn looks_like_mongodb_uri(connection_string: &str) -> bool { + connection_string.starts_with("mongodb://") + || connection_string.starts_with("mongodb+srv://") + } + + fn initialize_search_registry( + registry: &Arc>, + config: &MongoBackendConfig, + ) { + let loader = SearchParameterLoader::new(config.fhir_version); + let mut reg = registry.write(); + + let mut fallback_count = 0; + let mut spec_count = 0; + let mut spec_file: Option = None; + let mut custom_count = 0; + let mut custom_files: Vec = Vec::new(); + + // 1. Load minimal embedded fallback params. + match loader.load_embedded() { + Ok(params) => { + for param in params { + if reg.register(param).is_ok() { + fallback_count += 1; + } + } + } + Err(e) => { + tracing::error!("Failed to load embedded SearchParameters: {}", e); + } + } + + // 2. Load spec file params. + let data_dir = config + .data_dir + .clone() + .unwrap_or_else(|| PathBuf::from("./data")); + let spec_filename = loader.spec_filename(); + let spec_path = data_dir.join(spec_filename); + match loader.load_from_spec_file(&data_dir) { + Ok(params) => { + for param in params { + if reg.register(param).is_ok() { + spec_count += 1; + } + } + if spec_count > 0 { + spec_file = Some(spec_path); + } + } + Err(e) => { + tracing::warn!( + "Could not load spec SearchParameters from {}: {}. Using minimal fallback.", + spec_path.display(), + e + ); + } + } + + // 3. Load custom SearchParameters. + match loader.load_custom_from_directory_with_files(&data_dir) { + Ok((params, files)) => { + for param in params { + if reg.register(param).is_ok() { + custom_count += 1; + } + } + custom_files = files; + } + Err(e) => { + tracing::warn!( + "Error loading custom SearchParameters from {}: {}", + data_dir.display(), + e + ); + } + } + + let resource_type_count = reg.resource_types().len(); + let spec_info = spec_file + .map(|p| format!(" from {}", p.display())) + .unwrap_or_default(); + let custom_info = if custom_files.is_empty() { + String::new() + } else { + format!(" [{}]", custom_files.join(", ")) + }; + + tracing::info!( + "MongoDB SearchParameter registry initialized: {} total ({} spec{}, {} fallback, {} custom{}) covering {} resource types", + reg.len(), + spec_count, + spec_info, + fallback_count, + custom_count, + custom_info, + resource_type_count + ); + } + + /// Initializes the MongoDB schema/index bootstrap for this backend. + pub fn init_schema(&self) -> StorageResult<()> { + schema::initialize_schema(&self.config) + } + + /// Returns the backend configuration. + pub fn config(&self) -> &MongoBackendConfig { + &self.config + } + + /// Returns a reference to the search parameter registry. + pub fn search_registry(&self) -> &Arc> { + &self.search_registry + } + + /// Returns a reference to the search parameter extractor. + pub fn search_extractor(&self) -> &Arc { + &self.search_extractor + } + + /// Returns whether search indexing is offloaded to a secondary backend. + pub fn is_search_offloaded(&self) -> bool { + self.config.search_offloaded + } + + /// Sets the search-offloaded flag. + pub fn set_search_offloaded(&mut self, offloaded: bool) { + self.config.search_offloaded = offloaded; + } +} + +/// Placeholder Mongo connection type used by the phase 1 backend scaffold. +#[derive(Debug)] +pub struct MongoConnection; + +#[async_trait] +impl Backend for MongoBackend { + type Connection = MongoConnection; + + fn kind(&self) -> BackendKind { + BackendKind::MongoDB + } + + fn name(&self) -> &'static str { + "mongodb" + } + + fn supports(&self, _capability: BackendCapability) -> bool { + false + } + + fn capabilities(&self) -> Vec { + Vec::new() + } + + async fn acquire(&self) -> Result { + Err(BackendError::Unavailable { + backend_name: "mongodb".to_string(), + message: "MongoDB phase 1 scaffold does not expose pooled connections yet" + .to_string(), + }) + } + + async fn release(&self, _conn: Self::Connection) { + // No-op in phase 1 scaffold. + } + + async fn health_check(&self) -> Result<(), BackendError> { + if Self::looks_like_mongodb_uri(&self.config.connection_string) { + Ok(()) + } else { + Err(BackendError::Unavailable { + backend_name: "mongodb".to_string(), + message: "Invalid MongoDB connection string format".to_string(), + }) + } + } + + async fn initialize(&self) -> Result<(), BackendError> { + self.init_schema().map_err(|e| BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to initialize schema: {}", e), + source: None, + }) + } + + async fn migrate(&self) -> Result<(), BackendError> { + schema::migrate_schema(&self.config).map_err(|e| BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to run migrations: {}", e), + source: None, + }) + } +} diff --git a/crates/persistence/src/backends/mongodb/mod.rs b/crates/persistence/src/backends/mongodb/mod.rs new file mode 100644 index 00000000..6dfeb159 --- /dev/null +++ b/crates/persistence/src/backends/mongodb/mod.rs @@ -0,0 +1,17 @@ +//! MongoDB backend implementation (phase 1 scaffold). +//! +//! This module introduces the MongoDB backend wiring and baseline driver +//! abstraction required for feature-gated compilation. +//! +//! Phase 1 scope intentionally focuses on: +//! - backend/config scaffolding +//! - core [`crate::core::Backend`] trait implementation +//! - schema/bootstrap placeholders +//! +//! Full resource storage, history, search execution, and composite runtime +//! integration are implemented in later roadmap phases. + +mod backend; +pub(crate) mod schema; + +pub use backend::{MongoBackend, MongoBackendConfig}; diff --git a/crates/persistence/src/backends/mongodb/schema.rs b/crates/persistence/src/backends/mongodb/schema.rs new file mode 100644 index 00000000..10d2e9ed --- /dev/null +++ b/crates/persistence/src/backends/mongodb/schema.rs @@ -0,0 +1,19 @@ +//! MongoDB schema/bootstrap helpers (phase 1 scaffold). + +use crate::error::StorageResult; + +use super::backend::MongoBackendConfig; + +/// Initialize MongoDB collections/indexes required by the backend. +/// +/// Phase 1 behavior: configuration-level bootstrap placeholder. +pub fn initialize_schema(_config: &MongoBackendConfig) -> StorageResult<()> { + Ok(()) +} + +/// Run pending MongoDB schema/index migrations. +/// +/// Phase 1 behavior: no-op placeholder. +pub fn migrate_schema(_config: &MongoBackendConfig) -> StorageResult<()> { + Ok(()) +} diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml index 3e68fe94..647f1696 100644 --- a/roadmap_mongo.xml +++ b/roadmap_mongo.xml @@ -2,10 +2,11 @@ HeliosSoftware/hfs - draft + in-progress TBD date-agnostic 2026-03-01 + Phase 1 completed; Phase 2 is next. SQLite primary, PostgreSQL primary, Elasticsearch secondary @@ -59,7 +60,7 @@ - + Introduce MongoDB backend module scaffolding with compile-time and runtime hooks. Enable mongodb module export in crates/persistence/src/backends/mod.rs. From 061e93fc3f9b5d126da85c78735ce9f6d28b36e2 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Sun, 1 Mar 2026 18:32:46 -0400 Subject: [PATCH 03/17] feat: add mongodb phase2 detailed roadmap --- phase2_roadmap.xml | 238 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 238 insertions(+) create mode 100644 phase2_roadmap.xml diff --git a/phase2_roadmap.xml b/phase2_roadmap.xml new file mode 100644 index 00000000..8a0dfaf8 --- /dev/null +++ b/phase2_roadmap.xml @@ -0,0 +1,238 @@ + + + + HeliosSoftware/hfs + planned + TBD + 2 + + + 2026-03-01 + + + + + + Feature-gated MongoDB backend module export is enabled. + Mongo backend scaffolding exists (backend/config/schema wiring). + Core Backend trait integration is compile-safe. + Schema/bootstrap and migration paths exist as placeholders. + + + MongoDB runtime mode selection in HFS storage config is not enabled yet. + No ResourceStorage CRUD implementation exists yet. + No Mongo integration test suite exists yet. + + + + + + Deliver minimum MongoDB parity for ResourceStorage behavior while preserving strict tenant isolation and soft-delete/Gone semantics. + + Implement Mongo ResourceStorage contract methods for create/read/update/delete/exists/count/read_batch/create_or_update. + Enforce tenant isolation in every query path and index strategy. + Implement soft-delete semantics aligned with existing backend behavior and error contracts. + Replace schema placeholders with collection/index bootstrap logic required for Phase 2. + + + VersionedStorage parity (vread, update_with_match). + Instance/type/system history providers. + TransactionProvider parity and session-based ACID guarantees. + Advanced search execution, chained search, reverse chaining, and include/revinclude behavior. + Composite MongoDB + Elasticsearch runtime routing. + + + + + + + + + + + + + + + + + + + + + + + + + + Define stable Mongo persistence layout that supports ResourceStorage semantics and future phase expansion. + + Define canonical live-resource document shape with explicit fields for tenant_id, resource_type, resource_id, version_id, last_updated, is_deleted, and resource payload. + Define resource_history document shape and write strategy that does not block Phase 3 history provider implementation. + Create required indexes for tenant-scoped lookups and uniqueness (tenant_id + resource_type + resource_id for active records). + Decide and document whether soft delete retains unique-key occupancy or allows recreation via version bump policy. + Document collection naming conventions and migration-safe index names. + + + Concrete collection/index bootstrap in schema helpers. + Documented mapping from FHIR resource identity to Mongo keys. + + + + + Implement Phase 2 core storage methods with parity-focused behavior and error mapping. + + Introduce Mongo connection/client acquisition path suitable for storage operations (replacing Phase 1 unavailable acquire behavior for this phase scope). + Implement create semantics with conflict detection and deterministic ID handling. + Implement read semantics including not-found vs gone distinction. + Implement update semantics for existing resources with deterministic metadata updates. + Implement delete semantics using soft-delete/tombstone behavior aligned to existing backends. + Implement exists/count/read_batch/create_or_update helper methods with tenant scope guarantees. + Map Mongo driver errors into existing StorageError/BackendError variants consistently. + + + Mongo ResourceStorage parity for Phase 2 method set. + Consistent error behavior for contract tests and API consumers. + + + + + Guarantee strict tenant isolation in read/write operations and query helpers. + + Introduce shared tenant filter builder utilities to avoid missed tenant predicates. + Require tenant_id in every CRUD/read_batch/count query and write path. + Ensure indexes are tenant-first where query cardinality and safety require it. + Add cross-tenant negative tests for read, update, delete, count, and batch reads. + + + Cross-tenant leakage prevention validated by tests. + + + + + Match existing backend behavior for deleted resource visibility and error semantics. + + Define deleted-state fields (is_deleted/deleted_at/deleted_by_version as needed) for deterministic behavior. + Ensure normal read paths return Gone-compatible outcomes for soft-deleted resources. + Ensure update/create_or_update behavior on deleted resources follows existing backend contract expectations. + Add regression tests for repeated delete and delete-after-update edge cases. + + + Soft-delete behavior parity with SQLite/PostgreSQL expectations for Phase 2 scope. + + + + + Turn Phase 1 schema placeholders into deterministic schema/index bootstrap and migration entry points. + + Implement initialize_schema to create required collections/indexes idempotently. + Implement migrate_schema skeleton with migration version tracking strategy for Mongo indexes. + Add tests for initialize/migrate idempotency and startup safety. + Document migration assumptions and rollback limitations for Mongo index evolution. + + + Deterministic schema bootstrap/migration behavior for Phase 2 and future phases. + + + + + + + Resource document mapping tests (metadata field population and serialization invariants). + Tenant filter builder tests proving tenant predicate inclusion in every query constructor. + Soft-delete state transition tests (active -> deleted -> repeated delete handling). + Schema initialization and migration idempotency tests. + Error conversion tests from Mongo driver errors to StorageError/BackendError. + + + + Create and read round-trip under a single tenant. + Update behavior with immutable identity and mutable payload checks. + Delete and post-delete read behavior (Gone/not found contract). + exists/count/read_batch/create_or_update behavior under realistic tenant-scoped datasets. + Cross-tenant isolation: no access to another tenant records across all supported operations. + Bootstrap and migration execution against fresh and pre-initialized Mongo databases. + + + + Reuse existing persistence test harness and assertions where possible. + Compare Mongo outcomes against SQLite/PostgreSQL expected behavior for methods in scope. + Document any unavoidable deviations before marking the phase complete. + + + + + cargo check -p helios-persistence --features mongodb + cargo check -p helios-rest --features mongodb + cargo check -p helios-hfs --features mongodb + cargo check -p helios-persistence --features "sqlite,postgres,elasticsearch,mongodb" + cargo test -p helios-persistence --features mongodb --test mongodb_tests + cargo test -p helios-persistence --features mongodb mongodb:: + + + + + WS1.1-WS1.5, WS5.1 + Schema bootstrap creates required collections/indexes idempotently. + + + + WS2.1-WS2.5 + create/read/update/delete integration tests pass for single tenant. + + + + WS2.6-WS2.7, WS3.1-WS3.4, WS4.1-WS4.4 + exists/count/read_batch/create_or_update and cross-tenant tests pass. + + + + WS5.2-WS5.4 and status alignment updates + README and capability matrix reflect truthful post-Phase-2 status. + + + + + + + + + + + + + + + Update MongoDB rows in persistence README capability matrix to reflect only capabilities completed in this phase. + Update primary/secondary role matrix status from Phase 1 scaffold wording to Phase 2 wording after tests pass. + Keep all non-implemented capability rows as planned/partial exactly as supported. + + + + + Tenant leakage due to missing tenant filters in one or more query paths. + Centralize tenant filter construction; enforce with negative cross-tenant tests for every operation. + + + Soft-delete behavior diverges from existing Gone semantics. + Mirror sqlite/postgres behavior via contract tests before marking parity complete. + + + Unique index design conflicts with soft-delete and recreation scenarios. + Explicitly define active/deleted uniqueness policy and test both conflict and recreation paths. + + + Schema bootstrap or migration logic is not idempotent across repeated startup runs. + Require repeated initialize/migrate test passes against both fresh and pre-initialized databases. + + + + + All Phase 2 in-scope ResourceStorage methods are implemented and covered by Mongo integration tests. + Tenant isolation behavior matches established sqlite/postgres contract expectations for in-scope methods. + Soft-delete and Gone semantics are validated by regression tests. + Schema bootstrap and migration routines are idempotent and safe to execute at startup. + Validation commands run green for mongodb-only and mixed-feature builds. + Documentation and capability matrix reflect actual support levels with no aspirational mismatch. + + From bd55590dbcd6de79c4c1d20e4bcf38ab562619b2 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Sun, 1 Mar 2026 20:46:09 -0400 Subject: [PATCH 04/17] feat: implement mongodb phase2 core storage with CRUD/tenant isolation/soft-delete --- crates/persistence/README.md | 15 +- .../src/backends/mongodb/backend.rs | 131 +++- .../persistence/src/backends/mongodb/mod.rs | 19 +- .../src/backends/mongodb/schema.rs | 198 +++++- .../src/backends/mongodb/storage.rs | 585 ++++++++++++++++++ .../persistence/tests/common/capabilities.rs | 12 +- crates/persistence/tests/mongodb_tests.rs | 261 ++++++++ phase2_roadmap.xml | 184 ++++-- 8 files changed, 1287 insertions(+), 118 deletions(-) create mode 100644 crates/persistence/src/backends/mongodb/storage.rs create mode 100644 crates/persistence/tests/mongodb_tests.rs diff --git a/crates/persistence/README.md b/crates/persistence/README.md index 35a13e30..84364f04 100644 --- a/crates/persistence/README.md +++ b/crates/persistence/README.md @@ -184,6 +184,7 @@ helios-persistence/ ├── composite_polyglot_tests.rs # Multi-backend tests ├── sqlite_tests.rs # SQLite backend tests ├── postgres_tests.rs # PostgreSQL backend tests + ├── mongodb_tests.rs # MongoDB backend tests └── elasticsearch_tests.rs # Elasticsearch backend tests ``` @@ -306,7 +307,7 @@ The matrix below shows which FHIR operations each backend supports. This reflect | Feature | SQLite | PostgreSQL | MongoDB | Cassandra | Neo4j | Elasticsearch | S3 | | --------------------------------------------------------------------------- | ------ | ---------- | ------- | --------- | ----- | ------------- | --- | | **Core Operations** | -| [CRUD](https://build.fhir.org/http.html#crud) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| [CRUD](https://build.fhir.org/http.html#crud) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | | [Versioning (vread)](https://build.fhir.org/http.html#vread) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | | [Optimistic Locking](https://build.fhir.org/http.html#concurrency) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ✗ | | [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ○ | @@ -318,7 +319,7 @@ The matrix below shows which FHIR operations each backend supports. This reflect | [Conditional Patch](https://build.fhir.org/http.html#patch) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | | [Delete History](https://build.fhir.org/http.html#delete) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | | **Multitenancy** | -| Shared Schema | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| Shared Schema | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | | Schema-per-Tenant | ✗ | ○ | ○ | ✗ | ✗ | ✗ | ✗ | | Database-per-Tenant | ✓ | ○ | ○ | ○ | ○ | ○ | ○ | | Row-Level Security | ✗ | ○ | ✗ | ✗ | ✗ | ✗ | ✗ | @@ -636,7 +637,7 @@ let composite = CompositeStorage::new(config, backends)? - [x] ResourceStorage trait (CRUD operations) - [x] VersionedStorage trait (vread, If-Match) - [x] History provider traits (instance, type, system) -- [x] Search provider traits (basic, chained, _include, terminology) +- [x] Search provider traits (basic, chained, \_include, terminology) - [x] Transaction traits (ACID, bundles) - [x] Capabilities trait (CapabilityStatement generation) @@ -779,9 +780,11 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] ReindexableStorage implementation ### Phase 5+: Additional Backends (Planned) + - [ ] Cassandra backend (wide-column, partition keys) -- [x] MongoDB Phase 1 scaffold (module wiring, config, Backend trait baseline, schema placeholders) -- [ ] MongoDB core storage/search implementation (CRUD, history, transactions, query execution) +- [x] MongoDB Phase 1 scaffold (module wiring, config, Backend trait baseline) +- [x] MongoDB Phase 2 core storage parity (CRUD/count/read_batch/create_or_update, tenant isolation, soft-delete, schema bootstrap) +- [ ] MongoDB Phase 3+ advanced semantics (versioning/history/conditional/transactions/search execution) - [ ] Neo4j backend (graph queries, Cypher) - [ ] S3 backend (bulk export, object storage) @@ -820,7 +823,7 @@ The composite storage layer enables polyglot persistence by coordinating multipl | PostgreSQL + Neo4j | PostgreSQL | Neo4j | Planned | Graph-heavy queries | | S3 + ES | S3 | Elasticsearch | Planned | Large-scale, cheap storage | -> **MongoDB Note:** A Phase 1 scaffold exists under `src/backends/mongodb`, but runtime `HFS_STORAGE_BACKEND` modes for MongoDB are not yet enabled (planned in a later phase). +> **MongoDB Note:** Phase 2 core storage is implemented under `src/backends/mongodb`, but runtime `HFS_STORAGE_BACKEND` modes for MongoDB are not yet enabled (planned in a later phase). ### Quick Start diff --git a/crates/persistence/src/backends/mongodb/backend.rs b/crates/persistence/src/backends/mongodb/backend.rs index a4bff1a6..4dd6802d 100644 --- a/crates/persistence/src/backends/mongodb/backend.rs +++ b/crates/persistence/src/backends/mongodb/backend.rs @@ -1,10 +1,16 @@ -//! MongoDB backend implementation (phase 1 scaffold). +//! MongoDB backend implementation. use std::fmt::Debug; use std::path::PathBuf; use std::sync::Arc; +use std::time::Duration; use async_trait::async_trait; +use mongodb::{ + Client, Database, + bson::doc, + options::ClientOptions, +}; use parking_lot::RwLock; use serde::{Deserialize, Serialize}; @@ -18,9 +24,10 @@ use super::schema; /// MongoDB backend for FHIR resource storage. /// -/// This is a phase 1 scaffold that provides backend wiring, configuration, -/// and trait-level integration. Resource CRUD/search/history are implemented -/// in later phases. +/// The phase 2 implementation provides backend wiring, schema bootstrap, +/// and core ResourceStorage behavior for CRUD/count + tenant isolation. +/// +/// Versioned/history/search/composite behavior is implemented in later phases. pub struct MongoBackend { config: MongoBackendConfig, /// Search parameter registry (in-memory cache of active parameters). @@ -101,6 +108,9 @@ impl Default for MongoBackendConfig { } impl MongoBackend { + pub(crate) const RESOURCES_COLLECTION: &'static str = "resources"; + pub(crate) const RESOURCE_HISTORY_COLLECTION: &'static str = "resource_history"; + /// Creates a new MongoDB backend from the provided configuration. pub fn new(config: MongoBackendConfig) -> StorageResult { Self::validate_connection_string(&config.connection_string)?; @@ -284,8 +294,39 @@ impl MongoBackend { } /// Initializes the MongoDB schema/index bootstrap for this backend. - pub fn init_schema(&self) -> StorageResult<()> { - schema::initialize_schema(&self.config) + pub async fn init_schema(&self) -> StorageResult<()> { + let db = self.get_database().await?; + schema::initialize_schema_async(&db).await + } + + /// Creates a MongoDB client from backend configuration. + pub(crate) async fn get_client(&self) -> StorageResult { + let mut client_options = ClientOptions::parse(&self.config.connection_string) + .await + .map_err(|e| { + StorageError::Backend(BackendError::ConnectionFailed { + backend_name: "mongodb".to_string(), + message: e.to_string(), + }) + })?; + + client_options.max_pool_size = Some(self.config.max_connections); + client_options.connect_timeout = Some(Duration::from_millis(self.config.connect_timeout_ms)); + client_options.app_name = Some("helios-persistence".to_string()); + + Client::with_options(client_options).map_err(|e| { + StorageError::Backend(BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to create MongoDB client: {}", e), + source: None, + }) + }) + } + + /// Returns the configured MongoDB database handle. + pub(crate) async fn get_database(&self) -> StorageResult { + let client = self.get_client().await?; + Ok(client.database(&self.config.database_name)) } /// Returns the backend configuration. @@ -314,9 +355,19 @@ impl MongoBackend { } } -/// Placeholder Mongo connection type used by the phase 1 backend scaffold. -#[derive(Debug)] -pub struct MongoConnection; +/// Connection wrapper for MongoDB. +#[derive(Clone)] +pub struct MongoConnection { + pub(crate) database: Database, +} + +impl Debug for MongoConnection { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + f.debug_struct("MongoConnection") + .field("database", &self.database.name()) + .finish_non_exhaustive() + } +} #[async_trait] impl Backend for MongoBackend { @@ -330,39 +381,61 @@ impl Backend for MongoBackend { "mongodb" } - fn supports(&self, _capability: BackendCapability) -> bool { - false + fn supports(&self, capability: BackendCapability) -> bool { + matches!( + capability, + BackendCapability::Crud | BackendCapability::SharedSchema + ) } fn capabilities(&self) -> Vec { - Vec::new() + vec![BackendCapability::Crud, BackendCapability::SharedSchema] } async fn acquire(&self) -> Result { - Err(BackendError::Unavailable { - backend_name: "mongodb".to_string(), - message: "MongoDB phase 1 scaffold does not expose pooled connections yet" - .to_string(), - }) + let client = self + .get_client() + .await + .map_err(|e| BackendError::ConnectionFailed { + backend_name: "mongodb".to_string(), + message: e.to_string(), + })?; + let database = client.database(&self.config.database_name); + Ok(MongoConnection { database }) } async fn release(&self, _conn: Self::Connection) { - // No-op in phase 1 scaffold. + // MongoDB connection pooling is managed by the client internally. } async fn health_check(&self) -> Result<(), BackendError> { - if Self::looks_like_mongodb_uri(&self.config.connection_string) { - Ok(()) - } else { - Err(BackendError::Unavailable { + if !Self::looks_like_mongodb_uri(&self.config.connection_string) { + return Err(BackendError::Unavailable { backend_name: "mongodb".to_string(), message: "Invalid MongoDB connection string format".to_string(), - }) + }); } + + let db = self + .get_database() + .await + .map_err(|e| BackendError::Unavailable { + backend_name: "mongodb".to_string(), + message: format!("Unable to create database handle: {}", e), + })?; + + db.run_command(doc! { "ping": 1_i32 }) + .await + .map_err(|e| BackendError::Unavailable { + backend_name: "mongodb".to_string(), + message: format!("Health check failed: {}", e), + })?; + + Ok(()) } async fn initialize(&self) -> Result<(), BackendError> { - self.init_schema().map_err(|e| BackendError::Internal { + self.init_schema().await.map_err(|e| BackendError::Internal { backend_name: "mongodb".to_string(), message: format!("Failed to initialize schema: {}", e), source: None, @@ -370,7 +443,15 @@ impl Backend for MongoBackend { } async fn migrate(&self) -> Result<(), BackendError> { - schema::migrate_schema(&self.config).map_err(|e| BackendError::Internal { + let db = self.get_database().await.map_err(|e| BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to acquire database for migration: {}", e), + source: None, + })?; + + schema::migrate_schema_async(&db) + .await + .map_err(|e| BackendError::Internal { backend_name: "mongodb".to_string(), message: format!("Failed to run migrations: {}", e), source: None, diff --git a/crates/persistence/src/backends/mongodb/mod.rs b/crates/persistence/src/backends/mongodb/mod.rs index 6dfeb159..d50ba1dd 100644 --- a/crates/persistence/src/backends/mongodb/mod.rs +++ b/crates/persistence/src/backends/mongodb/mod.rs @@ -1,17 +1,18 @@ -//! MongoDB backend implementation (phase 1 scaffold). +//! MongoDB backend implementation. //! -//! This module introduces the MongoDB backend wiring and baseline driver -//! abstraction required for feature-gated compilation. +//! This module provides MongoDB backend wiring, schema bootstrap helpers, +//! and core storage contract support. //! -//! Phase 1 scope intentionally focuses on: -//! - backend/config scaffolding -//! - core [`crate::core::Backend`] trait implementation -//! - schema/bootstrap placeholders +//! Phase 2 scope focuses on: +//! - backend/config wiring and health checks +//! - core [`crate::core::ResourceStorage`] contract parity for CRUD/count +//! - tenant isolation and soft-delete semantics +//! - schema/index bootstrap foundations //! -//! Full resource storage, history, search execution, and composite runtime -//! integration are implemented in later roadmap phases. +//! Versioned/history/search/composite behavior remains part of later phases. mod backend; pub(crate) mod schema; +mod storage; pub use backend::{MongoBackend, MongoBackendConfig}; diff --git a/crates/persistence/src/backends/mongodb/schema.rs b/crates/persistence/src/backends/mongodb/schema.rs index 10d2e9ed..741281fe 100644 --- a/crates/persistence/src/backends/mongodb/schema.rs +++ b/crates/persistence/src/backends/mongodb/schema.rs @@ -1,19 +1,203 @@ -//! MongoDB schema/bootstrap helpers (phase 1 scaffold). +//! MongoDB schema/bootstrap helpers. -use crate::error::StorageResult; +use mongodb::{ + Client, Collection, Database, IndexModel, + bson::{Document, doc}, + options::{ClientOptions, IndexOptions}, +}; +use tokio::runtime::RuntimeFlavor; + +use crate::error::{BackendError, StorageError, StorageResult}; use super::backend::MongoBackendConfig; +/// Current MongoDB schema version. +pub const SCHEMA_VERSION: i32 = 2; + /// Initialize MongoDB collections/indexes required by the backend. /// -/// Phase 1 behavior: configuration-level bootstrap placeholder. -pub fn initialize_schema(_config: &MongoBackendConfig) -> StorageResult<()> { - Ok(()) +/// Prefer using [`initialize_schema_async`] from async contexts. +#[allow(dead_code)] +pub fn initialize_schema(config: &MongoBackendConfig) -> StorageResult<()> { + run_with_runtime(async { + let client = create_client(config).await?; + let db = client.database(&config.database_name); + initialize_schema_async(&db).await + }) } /// Run pending MongoDB schema/index migrations. /// -/// Phase 1 behavior: no-op placeholder. -pub fn migrate_schema(_config: &MongoBackendConfig) -> StorageResult<()> { +/// Prefer using [`migrate_schema_async`] from async contexts. +#[allow(dead_code)] +pub fn migrate_schema(config: &MongoBackendConfig) -> StorageResult<()> { + run_with_runtime(async { + let client = create_client(config).await?; + let db = client.database(&config.database_name); + migrate_schema_async(&db).await + }) +} + +/// Initialize the MongoDB schema and indexes asynchronously. +pub async fn initialize_schema_async(database: &Database) -> StorageResult<()> { + ensure_resources_indexes(database).await?; + ensure_history_indexes(database).await?; + set_schema_version(database, SCHEMA_VERSION).await?; + Ok(()) +} + +/// Run pending MongoDB schema/index migrations asynchronously. +pub async fn migrate_schema_async(database: &Database) -> StorageResult<()> { + let current = get_schema_version(database).await?; + if current < SCHEMA_VERSION { + ensure_resources_indexes(database).await?; + ensure_history_indexes(database).await?; + set_schema_version(database, SCHEMA_VERSION).await?; + } + Ok(()) +} + +#[allow(dead_code)] +async fn create_client(config: &MongoBackendConfig) -> StorageResult { + let mut options = ClientOptions::parse(&config.connection_string) + .await + .map_err(|e| { + StorageError::Backend(BackendError::ConnectionFailed { + backend_name: "mongodb".to_string(), + message: e.to_string(), + }) + })?; + + options.max_pool_size = Some(config.max_connections); + options.connect_timeout = Some(std::time::Duration::from_millis(config.connect_timeout_ms)); + options.app_name = Some("helios-persistence".to_string()); + + Client::with_options(options).map_err(|e| { + StorageError::Backend(BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to create MongoDB client: {}", e), + source: None, + }) + }) +} + +async fn ensure_resources_indexes(database: &Database) -> StorageResult<()> { + let resources = database.collection::("resources"); + + create_index( + &resources, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "id": 1_i32 }, + "idx_resources_identity", + true, + ) + .await?; + + create_index( + &resources, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "is_deleted": 1_i32 }, + "idx_resources_type_deleted", + false, + ) + .await?; + + create_index( + &resources, + doc! { "tenant_id": 1_i32, "last_updated": -1_i32 }, + "idx_resources_updated", + false, + ) + .await?; + + Ok(()) +} + +async fn ensure_history_indexes(database: &Database) -> StorageResult<()> { + let history = database.collection::("resource_history"); + + create_index( + &history, + doc! { + "tenant_id": 1_i32, + "resource_type": 1_i32, + "id": 1_i32, + "version_id": 1_i32 + }, + "idx_history_identity", + true, + ) + .await?; + + create_index( + &history, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "id": 1_i32, "last_updated": -1_i32 }, + "idx_history_resource_updated", + false, + ) + .await?; + + Ok(()) +} + +async fn create_index( + collection: &Collection, + keys: Document, + name: &str, + unique: bool, +) -> StorageResult<()> { + let options = IndexOptions::builder() + .name(Some(name.to_string())) + .unique(Some(unique)) + .build(); + + let model = IndexModel::builder().keys(keys).options(Some(options)).build(); + collection.create_index(model).await?; Ok(()) } + +async fn get_schema_version(database: &Database) -> StorageResult { + let collection = database.collection::("schema_version"); + let doc = collection.find_one(doc! { "_id": "schema_version" }).await?; + let version = doc + .and_then(|d| d.get_i32("version").ok()) + .unwrap_or(0_i32); + Ok(version) +} + +async fn set_schema_version(database: &Database, version: i32) -> StorageResult<()> { + let collection = database.collection::("schema_version"); + collection.delete_many(doc! { "_id": "schema_version" }).await?; + collection + .insert_one(doc! { + "_id": "schema_version", + "version": version, + }) + .await?; + Ok(()) +} + +#[allow(dead_code)] +fn run_with_runtime(future: F) -> StorageResult<()> +where + F: std::future::Future>, +{ + if let Ok(handle) = tokio::runtime::Handle::try_current() { + match handle.runtime_flavor() { + RuntimeFlavor::MultiThread => tokio::task::block_in_place(|| handle.block_on(future)), + RuntimeFlavor::CurrentThread => Err(StorageError::Backend(BackendError::Internal { + backend_name: "mongodb".to_string(), + message: "Cannot run synchronous MongoDB schema initialization inside a current-thread runtime; call Backend::initialize().await instead".to_string(), + source: None, + })), + _ => tokio::task::block_in_place(|| handle.block_on(future)), + } + } else { + let rt = tokio::runtime::Runtime::new().map_err(|e| { + StorageError::Backend(BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to create runtime for schema initialization: {}", e), + source: None, + }) + })?; + rt.block_on(future) + } +} diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs new file mode 100644 index 00000000..b2c19169 --- /dev/null +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -0,0 +1,585 @@ +//! ResourceStorage implementation for MongoDB. + +use async_trait::async_trait; +use chrono::{DateTime, Utc}; +use helios_fhir::FhirVersion; +use mongodb::{ + bson::{self, Bson, DateTime as BsonDateTime, Document, doc}, + error::Error as MongoError, +}; +use serde_json::Value; + +use crate::core::ResourceStorage; +use crate::error::{BackendError, ConcurrencyError, ResourceError, StorageError, StorageResult}; +use crate::tenant::TenantContext; +use crate::types::StoredResource; + +use super::MongoBackend; + +fn internal_error(message: String) -> StorageError { + StorageError::Backend(BackendError::Internal { + backend_name: "mongodb".to_string(), + message, + source: None, + }) +} + +fn serialization_error(message: String) -> StorageError { + StorageError::Backend(BackendError::SerializationError { message }) +} + +fn is_duplicate_key_error(err: &MongoError) -> bool { + err.to_string().contains("E11000") +} + +fn ensure_resource_identity(resource_type: &str, id: &str, resource: &mut Value) { + if let Some(obj) = resource.as_object_mut() { + obj.insert( + "resourceType".to_string(), + Value::String(resource_type.to_string()), + ); + obj.insert("id".to_string(), Value::String(id.to_string())); + } +} + +fn value_to_document(value: &Value) -> StorageResult { + let bson = bson::to_bson(value) + .map_err(|e| serialization_error(format!("Failed to serialize resource: {}", e)))?; + match bson { + Bson::Document(doc) => Ok(doc), + _ => Err(serialization_error( + "Resource payload must serialize to a BSON document".to_string(), + )), + } +} + +fn document_to_value(doc: &Document) -> StorageResult { + bson::from_bson::(Bson::Document(doc.clone())) + .map_err(|e| serialization_error(format!("Failed to deserialize resource: {}", e))) +} + +fn bson_to_chrono(dt: &BsonDateTime) -> DateTime { + DateTime::::from_timestamp_millis(dt.timestamp_millis()).unwrap_or_else(Utc::now) +} + +fn chrono_to_bson(dt: DateTime) -> BsonDateTime { + BsonDateTime::from_millis(dt.timestamp_millis()) +} + +fn next_version(version: &str) -> StorageResult { + let parsed = version.parse::().map_err(|e| { + serialization_error(format!("Invalid version value '{}': {}", version, e)) + })?; + Ok((parsed + 1).to_string()) +} + +fn extract_deleted_at(doc: &Document) -> Option> { + match doc.get("deleted_at") { + Some(Bson::DateTime(dt)) => Some(bson_to_chrono(dt)), + _ => None, + } +} + +fn extract_created_at(doc: &Document, fallback: DateTime) -> DateTime { + doc.get_datetime("created_at") + .map(bson_to_chrono) + .unwrap_or(fallback) +} + +fn extract_last_updated(doc: &Document, fallback: DateTime) -> DateTime { + doc.get_datetime("last_updated") + .map(bson_to_chrono) + .unwrap_or(fallback) +} + +fn extract_fhir_version(doc: &Document, fallback: FhirVersion) -> FhirVersion { + doc.get_str("fhir_version") + .ok() + .and_then(FhirVersion::from_storage) + .unwrap_or(fallback) +} + +#[async_trait] +impl ResourceStorage for MongoBackend { + fn backend_name(&self) -> &'static str { + "mongodb" + } + + async fn create( + &self, + tenant: &TenantContext, + resource_type: &str, + resource: Value, + fhir_version: FhirVersion, + ) -> StorageResult { + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + // Extract or generate ID + let id = resource + .get("id") + .and_then(|v| v.as_str()) + .map(String::from) + .unwrap_or_else(|| uuid::Uuid::new_v4().to_string()); + + // Check if resource already exists (including deleted resources). + let existing = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": &id, + }) + .await + .map_err(|e| internal_error(format!("Failed to check existence: {}", e)))?; + + if existing.is_some() { + return Err(StorageError::Resource(ResourceError::AlreadyExists { + resource_type: resource_type.to_string(), + id, + })); + } + + let mut resource = resource; + ensure_resource_identity(resource_type, &id, &mut resource); + + let payload = value_to_document(&resource)?; + + let now = Utc::now(); + let now_bson = chrono_to_bson(now); + let version_id = "1".to_string(); + let fhir_version_str = fhir_version.as_mime_param().to_string(); + + let resource_doc = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": &id, + "version_id": &version_id, + "data": Bson::Document(payload.clone()), + "created_at": now_bson, + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": &fhir_version_str, + }; + + resources.insert_one(resource_doc).await.map_err(|e| { + if is_duplicate_key_error(&e) { + StorageError::Resource(ResourceError::AlreadyExists { + resource_type: resource_type.to_string(), + id: id.clone(), + }) + } else { + internal_error(format!("Failed to insert resource: {}", e)) + } + })?; + + let history_doc = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": &id, + "version_id": &version_id, + "data": Bson::Document(payload), + "created_at": now_bson, + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": fhir_version_str, + }; + + history.insert_one(history_doc).await.map_err(|e| { + internal_error(format!("Failed to insert resource history: {}", e)) + })?; + + Ok(StoredResource::from_storage( + resource_type, + &id, + version_id, + tenant.tenant_id().clone(), + resource, + now, + now, + None, + fhir_version, + )) + } + + async fn create_or_update( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + resource: Value, + fhir_version: FhirVersion, + ) -> StorageResult<(StoredResource, bool)> { + let existing = self.read(tenant, resource_type, id).await?; + + if let Some(current) = existing { + let updated = self.update(tenant, ¤t, resource).await?; + Ok((updated, false)) + } else { + let mut resource = resource; + if let Some(obj) = resource.as_object_mut() { + obj.insert("id".to_string(), Value::String(id.to_string())); + } + let created = self + .create(tenant, resource_type, resource, fhir_version) + .await?; + Ok((created, true)) + } + } + + async fn read( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + ) -> StorageResult> { + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let maybe_doc = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + }) + .await + .map_err(|e| internal_error(format!("Failed to read resource: {}", e)))?; + + let Some(doc) = maybe_doc else { + return Ok(None); + }; + + let is_deleted = doc.get_bool("is_deleted").unwrap_or(false); + if is_deleted { + return Err(StorageError::Resource(ResourceError::Gone { + resource_type: resource_type.to_string(), + id: id.to_string(), + deleted_at: extract_deleted_at(&doc), + })); + } + + let version_id = doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing version_id: {}", e)))? + .to_string(); + + let payload = doc + .get_document("data") + .map_err(|e| internal_error(format!("Missing resource payload: {}", e)))?; + let content = document_to_value(payload)?; + + let now = Utc::now(); + let created_at = extract_created_at(&doc, now); + let last_updated = extract_last_updated(&doc, now); + let fhir_version = extract_fhir_version(&doc, FhirVersion::default()); + + Ok(Some(StoredResource::from_storage( + resource_type, + id, + version_id, + tenant.tenant_id().clone(), + content, + created_at, + last_updated, + None, + fhir_version, + ))) + } + + async fn update( + &self, + tenant: &TenantContext, + current: &StoredResource, + resource: Value, + ) -> StorageResult { + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + let resource_type = current.resource_type(); + let id = current.id(); + + let maybe_existing = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }) + .await + .map_err(|e| internal_error(format!("Failed to load current resource: {}", e)))?; + + let Some(existing_doc) = maybe_existing else { + return Err(StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + })); + }; + + let actual_version = existing_doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing current version: {}", e)))? + .to_string(); + + if actual_version != current.version_id() { + return Err(StorageError::Concurrency(ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: current.version_id().to_string(), + actual_version, + })); + } + + let new_version = next_version(current.version_id())?; + + let mut resource = resource; + ensure_resource_identity(resource_type, id, &mut resource); + let payload = value_to_document(&resource)?; + + let now = Utc::now(); + let now_bson = chrono_to_bson(now); + let fhir_version = current.fhir_version(); + let fhir_version_str = fhir_version.as_mime_param().to_string(); + + let update_result = resources + .update_one( + doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": current.version_id(), + "is_deleted": false, + }, + doc! { + "$set": { + "version_id": &new_version, + "data": Bson::Document(payload.clone()), + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": &fhir_version_str, + } + }, + ) + .await + .map_err(|e| internal_error(format!("Failed to update resource: {}", e)))?; + + if update_result.matched_count == 0 { + let latest = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + }) + .await + .map_err(|e| internal_error(format!("Failed to reload version conflict state: {}", e)))?; + + let actual = latest + .as_ref() + .and_then(|d| d.get_str("version_id").ok()) + .unwrap_or("unknown") + .to_string(); + + return Err(StorageError::Concurrency(ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: current.version_id().to_string(), + actual_version: actual, + })); + } + + let created_at = extract_created_at(&existing_doc, now); + + let history_doc = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": &new_version, + "data": Bson::Document(payload), + "created_at": chrono_to_bson(created_at), + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": fhir_version_str, + }; + + history.insert_one(history_doc).await.map_err(|e| { + internal_error(format!("Failed to insert updated history row: {}", e)) + })?; + + Ok(StoredResource::from_storage( + resource_type, + id, + new_version, + tenant.tenant_id().clone(), + resource, + created_at, + now, + None, + fhir_version, + )) + } + + async fn delete( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + ) -> StorageResult<()> { + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let maybe_existing = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }) + .await + .map_err(|e| internal_error(format!("Failed to check resource before delete: {}", e)))?; + + let Some(existing_doc) = maybe_existing else { + return Err(StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + })); + }; + + let current_version = existing_doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing current version: {}", e)))? + .to_string(); + let new_version = next_version(¤t_version)?; + + let payload = existing_doc + .get_document("data") + .map_err(|e| internal_error(format!("Missing resource payload: {}", e)))? + .clone(); + let fhir_version = existing_doc + .get_str("fhir_version") + .unwrap_or("4.0") + .to_string(); + let created_at = extract_created_at(&existing_doc, Utc::now()); + + let now = Utc::now(); + let now_bson = chrono_to_bson(now); + + let update_result = resources + .update_one( + doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": ¤t_version, + "is_deleted": false, + }, + doc! { + "$set": { + "version_id": &new_version, + "is_deleted": true, + "deleted_at": now_bson, + "last_updated": now_bson, + } + }, + ) + .await + .map_err(|e| internal_error(format!("Failed to soft-delete resource: {}", e)))?; + + if update_result.matched_count == 0 { + return Err(StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + })); + } + + let history_doc = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": &new_version, + "data": Bson::Document(payload), + "created_at": chrono_to_bson(created_at), + "last_updated": now_bson, + "is_deleted": true, + "deleted_at": now_bson, + "fhir_version": fhir_version, + }; + + history.insert_one(history_doc).await.map_err(|e| { + internal_error(format!("Failed to insert deletion history row: {}", e)) + })?; + + Ok(()) + } + + async fn exists( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + ) -> StorageResult { + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let count = resources + .count_documents(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }) + .await + .map_err(|e| internal_error(format!("Failed to check resource existence: {}", e)))?; + + Ok(count > 0) + } + + async fn read_batch( + &self, + tenant: &TenantContext, + resource_type: &str, + ids: &[&str], + ) -> StorageResult> { + let mut resources = Vec::with_capacity(ids.len()); + + for id in ids { + if let Some(resource) = self.read(tenant, resource_type, id).await? { + resources.push(resource); + } + } + + Ok(resources) + } + + async fn count( + &self, + tenant: &TenantContext, + resource_type: Option<&str>, + ) -> StorageResult { + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let mut filter = doc! { + "tenant_id": tenant_id, + "is_deleted": false, + }; + + if let Some(resource_type) = resource_type { + filter.insert("resource_type", resource_type); + } + + resources + .count_documents(filter) + .await + .map_err(|e| internal_error(format!("Failed to count resources: {}", e))) + } +} diff --git a/crates/persistence/tests/common/capabilities.rs b/crates/persistence/tests/common/capabilities.rs index 717b4637..e37c77fe 100644 --- a/crates/persistence/tests/common/capabilities.rs +++ b/crates/persistence/tests/common/capabilities.rs @@ -158,7 +158,7 @@ impl CapabilityMatrix { matrix.set_backend_capabilities( BackendKind::MongoDB, vec![ - (BackendCapability::Crud, SupportLevel::Planned), + (BackendCapability::Crud, SupportLevel::Implemented), (BackendCapability::Versioning, SupportLevel::Planned), (BackendCapability::InstanceHistory, SupportLevel::Planned), (BackendCapability::TypeHistory, SupportLevel::Planned), @@ -166,19 +166,19 @@ impl CapabilityMatrix { (BackendCapability::BasicSearch, SupportLevel::Planned), (BackendCapability::DateSearch, SupportLevel::Planned), (BackendCapability::ReferenceSearch, SupportLevel::Planned), - (BackendCapability::ChainedSearch, SupportLevel::Partial), - (BackendCapability::ReverseChaining, SupportLevel::Partial), + (BackendCapability::ChainedSearch, SupportLevel::Planned), + (BackendCapability::ReverseChaining, SupportLevel::Planned), (BackendCapability::Include, SupportLevel::Planned), (BackendCapability::Revinclude, SupportLevel::Planned), - (BackendCapability::FullTextSearch, SupportLevel::Implemented), - (BackendCapability::TerminologySearch, SupportLevel::RequiresExternalService), + (BackendCapability::FullTextSearch, SupportLevel::Planned), + (BackendCapability::TerminologySearch, SupportLevel::Planned), (BackendCapability::Transactions, SupportLevel::Planned), (BackendCapability::OptimisticLocking, SupportLevel::Planned), (BackendCapability::CursorPagination, SupportLevel::Planned), (BackendCapability::OffsetPagination, SupportLevel::Planned), (BackendCapability::Sorting, SupportLevel::Planned), (BackendCapability::BulkExport, SupportLevel::Planned), - (BackendCapability::SharedSchema, SupportLevel::Planned), + (BackendCapability::SharedSchema, SupportLevel::Implemented), (BackendCapability::SchemaPerTenant, SupportLevel::NotPlanned), (BackendCapability::DatabasePerTenant, SupportLevel::Planned), ], diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs new file mode 100644 index 00000000..708213f3 --- /dev/null +++ b/crates/persistence/tests/mongodb_tests.rs @@ -0,0 +1,261 @@ +//! MongoDB backend tests. +//! +//! Run compile-only/unit tests with: +//! `cargo test -p helios-persistence --features mongodb --test mongodb_tests` +//! +//! To run integration tests that hit a real MongoDB instance, set: +//! `HFS_TEST_MONGODB_URL=mongodb://localhost:27017` + +#![cfg(feature = "mongodb")] + +use helios_fhir::FhirVersion; +use helios_persistence::backends::mongodb::{MongoBackend, MongoBackendConfig}; +use helios_persistence::core::{Backend, BackendCapability, BackendKind, ResourceStorage}; +use helios_persistence::error::{ResourceError, StorageError}; +use helios_persistence::tenant::{TenantContext, TenantId, TenantPermissions}; +use serde_json::json; + +#[test] +fn test_mongodb_config_defaults() { + let config = MongoBackendConfig::default(); + assert_eq!(config.connection_string, "mongodb://localhost:27017"); + assert_eq!(config.database_name, "helios"); + assert_eq!(config.max_connections, 10); + assert_eq!(config.connect_timeout_ms, 5000); + assert!(!config.search_offloaded); + assert_eq!(config.fhir_version, FhirVersion::default()); +} + +#[test] +fn test_mongodb_config_serialization() { + let config = MongoBackendConfig { + connection_string: "mongodb://mongo.test:27018".to_string(), + database_name: "phase2".to_string(), + max_connections: 24, + connect_timeout_ms: 7000, + ..Default::default() + }; + + let serialized = serde_json::to_string(&config).unwrap(); + let decoded: MongoBackendConfig = serde_json::from_str(&serialized).unwrap(); + + assert_eq!(decoded.connection_string, "mongodb://mongo.test:27018"); + assert_eq!(decoded.database_name, "phase2"); + assert_eq!(decoded.max_connections, 24); + assert_eq!(decoded.connect_timeout_ms, 7000); +} + +#[test] +fn test_mongodb_backend_kind_display() { + assert_eq!(BackendKind::MongoDB.to_string(), "mongodb"); +} + +#[test] +fn test_mongodb_phase2_capabilities() { + let backend = MongoBackend::new(MongoBackendConfig::default()).unwrap(); + + assert_eq!(backend.kind(), BackendKind::MongoDB); + assert_eq!(backend.name(), "mongodb"); + + assert!(backend.supports(BackendCapability::Crud)); + assert!(backend.supports(BackendCapability::SharedSchema)); + + assert!(!backend.supports(BackendCapability::Versioning)); + assert!(!backend.supports(BackendCapability::BasicSearch)); + assert!(!backend.supports(BackendCapability::Transactions)); +} + +fn test_mongo_url() -> Option { + std::env::var("HFS_TEST_MONGODB_URL").ok() +} + +fn create_tenant(tenant_id: &str) -> TenantContext { + TenantContext::new(TenantId::new(tenant_id), TenantPermissions::full_access()) +} + +async fn create_backend(test_name: &str) -> Option { + let connection_string = test_mongo_url()?; + let config = MongoBackendConfig { + connection_string, + database_name: format!( + "hfs_phase2_mongo_{}_{}", + test_name, + uuid::Uuid::new_v4().simple() + ), + ..Default::default() + }; + + let backend = MongoBackend::new(config).ok()?; + backend.initialize().await.ok()?; + Some(backend) +} + +#[tokio::test] +async fn mongodb_integration_create_read_update_delete() { + let Some(backend) = create_backend("crud").await else { + eprintln!("Skipping mongodb_integration_create_read_update_delete (set HFS_TEST_MONGODB_URL)"); + return; + }; + + let tenant = create_tenant("tenant-a"); + + let created = backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "name": [{"family": "Phase2"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let read = backend.read(&tenant, "Patient", created.id()).await.unwrap(); + assert!(read.is_some()); + + let updated = backend + .update( + &tenant, + &created, + json!({ + "resourceType": "Patient", + "name": [{"family": "Updated"}] + }), + ) + .await + .unwrap(); + + assert_eq!(updated.version_id(), "2"); + assert_eq!(updated.content()["name"][0]["family"], "Updated"); + + backend + .delete(&tenant, "Patient", updated.id()) + .await + .unwrap(); + + let read_after_delete = backend.read(&tenant, "Patient", updated.id()).await; + assert!(matches!( + read_after_delete, + Err(StorageError::Resource(ResourceError::Gone { .. })) + )); +} + +#[tokio::test] +async fn mongodb_integration_tenant_isolation() { + let Some(backend) = create_backend("tenant").await else { + eprintln!("Skipping mongodb_integration_tenant_isolation (set HFS_TEST_MONGODB_URL)"); + return; + }; + + let tenant_a = create_tenant("tenant-a"); + let tenant_b = create_tenant("tenant-b"); + + let created = backend + .create( + &tenant_a, + "Patient", + json!({ + "resourceType": "Patient", + "id": "shared-id", + "name": [{"family": "TenantA"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let read_a = backend.read(&tenant_a, "Patient", created.id()).await.unwrap(); + assert!(read_a.is_some()); + + let read_b = backend.read(&tenant_b, "Patient", created.id()).await.unwrap(); + assert!(read_b.is_none()); + + let exists_a = backend.exists(&tenant_a, "Patient", created.id()).await.unwrap(); + let exists_b = backend.exists(&tenant_b, "Patient", created.id()).await.unwrap(); + assert!(exists_a); + assert!(!exists_b); +} + +#[tokio::test] +async fn mongodb_integration_count_and_batch() { + let Some(backend) = create_backend("count_batch").await else { + eprintln!("Skipping mongodb_integration_count_and_batch (set HFS_TEST_MONGODB_URL)"); + return; + }; + + let tenant = create_tenant("tenant-count"); + + let mut ids = Vec::new(); + for idx in 0..3 { + let created = backend + .create( + &tenant, + "Observation", + json!({ + "resourceType": "Observation", + "id": format!("obs-{}", idx), + "status": "final" + }), + FhirVersion::default(), + ) + .await + .unwrap(); + ids.push(created.id().to_string()); + } + + let count = backend.count(&tenant, Some("Observation")).await.unwrap(); + assert_eq!(count, 3); + + let id_refs: Vec<&str> = ids.iter().map(String::as_str).collect(); + let batch = backend + .read_batch(&tenant, "Observation", &id_refs) + .await + .unwrap(); + assert_eq!(batch.len(), 3); +} + +#[tokio::test] +async fn mongodb_integration_create_or_update() { + let Some(backend) = create_backend("create_or_update").await else { + eprintln!("Skipping mongodb_integration_create_or_update (set HFS_TEST_MONGODB_URL)"); + return; + }; + + let tenant = create_tenant("tenant-cou"); + + let (created, was_created) = backend + .create_or_update( + &tenant, + "Patient", + "patient-1", + json!({ + "resourceType": "Patient", + "name": [{"family": "First"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + assert!(was_created); + assert_eq!(created.version_id(), "1"); + + let (updated, was_created_again) = backend + .create_or_update( + &tenant, + "Patient", + "patient-1", + json!({ + "resourceType": "Patient", + "name": [{"family": "Second"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + assert!(!was_created_again); + assert_eq!(updated.version_id(), "2"); +} diff --git a/phase2_roadmap.xml b/phase2_roadmap.xml index 8a0dfaf8..28889a64 100644 --- a/phase2_roadmap.xml +++ b/phase2_roadmap.xml @@ -1,13 +1,15 @@ - + HeliosSoftware/hfs - planned + completed TBD 2 2026-03-01 + Phase 2 completed: core ResourceStorage parity, tenant isolation, soft-delete + semantics, schema bootstrap, tests, and docs shipped. @@ -19,26 +21,32 @@ Schema/bootstrap and migration paths exist as placeholders. - MongoDB runtime mode selection in HFS storage config is not enabled yet. - No ResourceStorage CRUD implementation exists yet. - No Mongo integration test suite exists yet. + At Phase 1 completion, MongoDB runtime mode selection in HFS storage config was + not enabled yet. + At Phase 1 completion, no ResourceStorage CRUD implementation existed yet. + At Phase 1 completion, no Mongo integration test suite existed yet. - - Deliver minimum MongoDB parity for ResourceStorage behavior while preserving strict tenant isolation and soft-delete/Gone semantics. + + Delivered minimum MongoDB parity for ResourceStorage behavior while preserving strict + tenant isolation and soft-delete/Gone semantics. - Implement Mongo ResourceStorage contract methods for create/read/update/delete/exists/count/read_batch/create_or_update. + Implement Mongo ResourceStorage contract methods for + create/read/update/delete/exists/count/read_batch/create_or_update. Enforce tenant isolation in every query path and index strategy. - Implement soft-delete semantics aligned with existing backend behavior and error contracts. - Replace schema placeholders with collection/index bootstrap logic required for Phase 2. + Implement soft-delete semantics aligned with existing backend behavior and + error contracts. + Replace schema placeholders with collection/index bootstrap logic required + for Phase 2. VersionedStorage parity (vread, update_with_match). Instance/type/system history providers. TransactionProvider parity and session-based ACID guarantees. - Advanced search execution, chained search, reverse chaining, and include/revinclude behavior. + Advanced search execution, chained search, reverse chaining, and include/revinclude + behavior. Composite MongoDB + Elasticsearch runtime routing. @@ -64,13 +72,19 @@ - - Define stable Mongo persistence layout that supports ResourceStorage semantics and future phase expansion. + + Define stable Mongo persistence layout that supports ResourceStorage semantics and + future phase expansion. - Define canonical live-resource document shape with explicit fields for tenant_id, resource_type, resource_id, version_id, last_updated, is_deleted, and resource payload. - Define resource_history document shape and write strategy that does not block Phase 3 history provider implementation. - Create required indexes for tenant-scoped lookups and uniqueness (tenant_id + resource_type + resource_id for active records). - Decide and document whether soft delete retains unique-key occupancy or allows recreation via version bump policy. + Define canonical live-resource document shape with explicit fields for + tenant_id, resource_type, resource_id, version_id, last_updated, is_deleted, and resource + payload. + Define resource_history document shape and write strategy that does not + block Phase 3 history provider implementation. + Create required indexes for tenant-scoped lookups and uniqueness (tenant_id + + resource_type + resource_id for active records). + Decide and document whether soft delete retains unique-key occupancy or + allows recreation via version bump policy. Document collection naming conventions and migration-safe index names. @@ -79,16 +93,23 @@ - + Implement Phase 2 core storage methods with parity-focused behavior and error mapping. - Introduce Mongo connection/client acquisition path suitable for storage operations (replacing Phase 1 unavailable acquire behavior for this phase scope). - Implement create semantics with conflict detection and deterministic ID handling. + Introduce Mongo connection/client acquisition path suitable for storage + operations (replacing Phase 1 unavailable acquire behavior for this phase scope). + Implement create semantics with conflict detection and deterministic ID + handling. Implement read semantics including not-found vs gone distinction. - Implement update semantics for existing resources with deterministic metadata updates. - Implement delete semantics using soft-delete/tombstone behavior aligned to existing backends. - Implement exists/count/read_batch/create_or_update helper methods with tenant scope guarantees. - Map Mongo driver errors into existing StorageError/BackendError variants consistently. + Implement update semantics for existing resources with deterministic + metadata updates. + Implement delete semantics using soft-delete/tombstone behavior aligned to + existing backends. + Implement exists/count/read_batch/create_or_update helper methods with + tenant scope guarantees. + Map Mongo driver errors into existing StorageError/BackendError variants + consistently. Mongo ResourceStorage parity for Phase 2 method set. @@ -96,51 +117,68 @@ - + Guarantee strict tenant isolation in read/write operations and query helpers. - Introduce shared tenant filter builder utilities to avoid missed tenant predicates. + Introduce shared tenant filter builder utilities to avoid missed tenant + predicates. Require tenant_id in every CRUD/read_batch/count query and write path. - Ensure indexes are tenant-first where query cardinality and safety require it. - Add cross-tenant negative tests for read, update, delete, count, and batch reads. + Ensure indexes are tenant-first where query cardinality and safety require + it. + Add cross-tenant negative tests for read, update, delete, count, and batch + reads. Cross-tenant leakage prevention validated by tests. - + Match existing backend behavior for deleted resource visibility and error semantics. - Define deleted-state fields (is_deleted/deleted_at/deleted_by_version as needed) for deterministic behavior. - Ensure normal read paths return Gone-compatible outcomes for soft-deleted resources. - Ensure update/create_or_update behavior on deleted resources follows existing backend contract expectations. - Add regression tests for repeated delete and delete-after-update edge cases. + Define deleted-state fields (is_deleted/deleted_at/deleted_by_version as + needed) for deterministic behavior. + Ensure normal read paths return Gone-compatible outcomes for soft-deleted + resources. + Ensure update/create_or_update behavior on deleted resources follows + existing backend contract expectations. + Add regression tests for repeated delete and delete-after-update edge + cases. - Soft-delete behavior parity with SQLite/PostgreSQL expectations for Phase 2 scope. + Soft-delete behavior parity with SQLite/PostgreSQL expectations for Phase 2 + scope. - - Turn Phase 1 schema placeholders into deterministic schema/index bootstrap and migration entry points. + + Turn Phase 1 schema placeholders into deterministic schema/index bootstrap and migration + entry points. - Implement initialize_schema to create required collections/indexes idempotently. - Implement migrate_schema skeleton with migration version tracking strategy for Mongo indexes. + Implement initialize_schema to create required collections/indexes + idempotently. + Implement migrate_schema skeleton with migration version tracking strategy + for Mongo indexes. Add tests for initialize/migrate idempotency and startup safety. - Document migration assumptions and rollback limitations for Mongo index evolution. + Document migration assumptions and rollback limitations for Mongo index + evolution. - Deterministic schema bootstrap/migration behavior for Phase 2 and future phases. + Deterministic schema bootstrap/migration behavior for Phase 2 and future + phases. - Resource document mapping tests (metadata field population and serialization invariants). - Tenant filter builder tests proving tenant predicate inclusion in every query constructor. - Soft-delete state transition tests (active -> deleted -> repeated delete handling). + Resource document mapping tests (metadata field population and serialization + invariants). + Tenant filter builder tests proving tenant predicate inclusion in every query + constructor. + Soft-delete state transition tests (active -> deleted -> repeated delete + handling). Schema initialization and migration idempotency tests. Error conversion tests from Mongo driver errors to StorageError/BackendError. @@ -149,14 +187,18 @@ Create and read round-trip under a single tenant. Update behavior with immutable identity and mutable payload checks. Delete and post-delete read behavior (Gone/not found contract). - exists/count/read_batch/create_or_update behavior under realistic tenant-scoped datasets. - Cross-tenant isolation: no access to another tenant records across all supported operations. - Bootstrap and migration execution against fresh and pre-initialized Mongo databases. + exists/count/read_batch/create_or_update behavior under realistic tenant-scoped + datasets. + Cross-tenant isolation: no access to another tenant records across all + supported operations. + Bootstrap and migration execution against fresh and pre-initialized Mongo + databases. Reuse existing persistence test harness and assertions where possible. - Compare Mongo outcomes against SQLite/PostgreSQL expected behavior for methods in scope. + Compare Mongo outcomes against SQLite/PostgreSQL expected behavior for methods in + scope. Document any unavoidable deviations before marking the phase complete. @@ -165,28 +207,30 @@ cargo check -p helios-persistence --features mongodb cargo check -p helios-rest --features mongodb cargo check -p helios-hfs --features mongodb - cargo check -p helios-persistence --features "sqlite,postgres,elasticsearch,mongodb" + cargo check -p helios-persistence --features + "sqlite,postgres,elasticsearch,mongodb" cargo test -p helios-persistence --features mongodb --test mongodb_tests cargo test -p helios-persistence --features mongodb mongodb:: - + WS1.1-WS1.5, WS5.1 Schema bootstrap creates required collections/indexes idempotently. - + WS2.1-WS2.5 create/read/update/delete integration tests pass for single tenant. - + WS2.6-WS2.7, WS3.1-WS3.4, WS4.1-WS4.4 exists/count/read_batch/create_or_update and cross-tenant tests pass. - + WS5.2-WS5.4 and status alignment updates README and capability matrix reflect truthful post-Phase-2 status. @@ -203,15 +247,18 @@ - Update MongoDB rows in persistence README capability matrix to reflect only capabilities completed in this phase. - Update primary/secondary role matrix status from Phase 1 scaffold wording to Phase 2 wording after tests pass. + Update MongoDB rows in persistence README capability matrix to reflect only capabilities + completed in this phase. + Update primary/secondary role matrix status from Phase 1 scaffold wording to Phase 2 + wording after tests pass. Keep all non-implemented capability rows as planned/partial exactly as supported. Tenant leakage due to missing tenant filters in one or more query paths. - Centralize tenant filter construction; enforce with negative cross-tenant tests for every operation. + Centralize tenant filter construction; enforce with negative cross-tenant tests + for every operation. Soft-delete behavior diverges from existing Gone semantics. @@ -219,20 +266,27 @@ Unique index design conflicts with soft-delete and recreation scenarios. - Explicitly define active/deleted uniqueness policy and test both conflict and recreation paths. + Explicitly define active/deleted uniqueness policy and test both conflict and + recreation paths. - Schema bootstrap or migration logic is not idempotent across repeated startup runs. - Require repeated initialize/migrate test passes against both fresh and pre-initialized databases. + Schema bootstrap or migration logic is not idempotent across repeated startup + runs. + Require repeated initialize/migrate test passes against both fresh and + pre-initialized databases. - All Phase 2 in-scope ResourceStorage methods are implemented and covered by Mongo integration tests. - Tenant isolation behavior matches established sqlite/postgres contract expectations for in-scope methods. - Soft-delete and Gone semantics are validated by regression tests. - Schema bootstrap and migration routines are idempotent and safe to execute at startup. - Validation commands run green for mongodb-only and mixed-feature builds. - Documentation and capability matrix reflect actual support levels with no aspirational mismatch. + All Phase 2 in-scope ResourceStorage methods are implemented and covered + by Mongo integration tests. + Tenant isolation behavior matches established sqlite/postgres contract + expectations for in-scope methods. + Soft-delete and Gone semantics are validated by regression tests. + Schema bootstrap and migration routines are idempotent and safe to + execute at startup. + Validation commands run green for mongodb-only and mixed-feature builds. + Documentation and capability matrix reflect actual support levels with + no aspirational mismatch. - + \ No newline at end of file From fe77af8e8983f938a3ef0976e486459115e1e36a Mon Sep 17 00:00:00 2001 From: dougc95 Date: Sun, 1 Mar 2026 21:01:28 -0400 Subject: [PATCH 05/17] feat: refactor mongodb test db naming with length validation and error handling --- crates/persistence/tests/mongodb_tests.rs | 39 +++++++++++++++++++---- 1 file changed, 32 insertions(+), 7 deletions(-) diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index 708213f3..f14a687d 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -15,6 +15,18 @@ use helios_persistence::error::{ResourceError, StorageError}; use helios_persistence::tenant::{TenantContext, TenantId, TenantPermissions}; use serde_json::json; +const MONGODB_MAX_DATABASE_NAME_LEN: usize = 63; +const MONGODB_TEST_DB_PREFIX: &str = "hfs_phase2_mongo_"; + +fn build_test_database_name(test_name: &str) -> String { + let suffix = uuid::Uuid::new_v4().simple().to_string(); + let reserved_len = MONGODB_TEST_DB_PREFIX.len() + 1 + suffix.len(); + let max_test_name_len = MONGODB_MAX_DATABASE_NAME_LEN.saturating_sub(reserved_len); + let truncated_test_name: String = test_name.chars().take(max_test_name_len).collect(); + + format!("{MONGODB_TEST_DB_PREFIX}{truncated_test_name}_{suffix}") +} + #[test] fn test_mongodb_config_defaults() { let config = MongoBackendConfig::default(); @@ -50,6 +62,17 @@ fn test_mongodb_backend_kind_display() { assert_eq!(BackendKind::MongoDB.to_string(), "mongodb"); } +#[test] +fn test_mongodb_integration_database_name_within_limit() { + let db_name = build_test_database_name("create_or_update"); + + assert!(db_name.len() <= MONGODB_MAX_DATABASE_NAME_LEN); + + let (name_without_uuid, uuid_suffix) = db_name.rsplit_once('_').unwrap(); + assert!(name_without_uuid.starts_with(MONGODB_TEST_DB_PREFIX)); + assert_eq!(uuid_suffix.len(), 32); +} + #[test] fn test_mongodb_phase2_capabilities() { let backend = MongoBackend::new(MongoBackendConfig::default()).unwrap(); @@ -75,18 +98,20 @@ fn create_tenant(tenant_id: &str) -> TenantContext { async fn create_backend(test_name: &str) -> Option { let connection_string = test_mongo_url()?; + let config = MongoBackendConfig { connection_string, - database_name: format!( - "hfs_phase2_mongo_{}_{}", - test_name, - uuid::Uuid::new_v4().simple() - ), + database_name: build_test_database_name(test_name), ..Default::default() }; - let backend = MongoBackend::new(config).ok()?; - backend.initialize().await.ok()?; + let backend = MongoBackend::new(config) + .expect("failed to create MongoBackend for mongodb integration tests"); + backend + .initialize() + .await + .expect("failed to initialize MongoDB schema for integration tests"); + Some(backend) } From 3207f501ffede6fda16c5bdc5b0da6e06f977681 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Sun, 1 Mar 2026 21:25:49 -0400 Subject: [PATCH 06/17] feat: apply rustfmt formatting to mongodb backend code [skip ci] --- .../src/backends/mongodb/backend.rs | 42 +++++++------ .../src/backends/mongodb/schema.rs | 17 +++-- .../src/backends/mongodb/storage.rs | 63 +++++++++++-------- crates/persistence/tests/mongodb_tests.rs | 29 +++++++-- 4 files changed, 93 insertions(+), 58 deletions(-) diff --git a/crates/persistence/src/backends/mongodb/backend.rs b/crates/persistence/src/backends/mongodb/backend.rs index 4dd6802d..0e5d4f47 100644 --- a/crates/persistence/src/backends/mongodb/backend.rs +++ b/crates/persistence/src/backends/mongodb/backend.rs @@ -6,11 +6,7 @@ use std::sync::Arc; use std::time::Duration; use async_trait::async_trait; -use mongodb::{ - Client, Database, - bson::doc, - options::ClientOptions, -}; +use mongodb::{Client, Database, bson::doc, options::ClientOptions}; use parking_lot::RwLock; use serde::{Deserialize, Serialize}; @@ -311,7 +307,8 @@ impl MongoBackend { })?; client_options.max_pool_size = Some(self.config.max_connections); - client_options.connect_timeout = Some(Duration::from_millis(self.config.connect_timeout_ms)); + client_options.connect_timeout = + Some(Duration::from_millis(self.config.connect_timeout_ms)); client_options.app_name = Some("helios-persistence".to_string()); Client::with_options(client_options).map_err(|e| { @@ -435,26 +432,31 @@ impl Backend for MongoBackend { } async fn initialize(&self) -> Result<(), BackendError> { - self.init_schema().await.map_err(|e| BackendError::Internal { - backend_name: "mongodb".to_string(), - message: format!("Failed to initialize schema: {}", e), - source: None, - }) + self.init_schema() + .await + .map_err(|e| BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to initialize schema: {}", e), + source: None, + }) } async fn migrate(&self) -> Result<(), BackendError> { - let db = self.get_database().await.map_err(|e| BackendError::Internal { - backend_name: "mongodb".to_string(), - message: format!("Failed to acquire database for migration: {}", e), - source: None, - })?; + let db = self + .get_database() + .await + .map_err(|e| BackendError::Internal { + backend_name: "mongodb".to_string(), + message: format!("Failed to acquire database for migration: {}", e), + source: None, + })?; schema::migrate_schema_async(&db) .await .map_err(|e| BackendError::Internal { - backend_name: "mongodb".to_string(), - message: format!("Failed to run migrations: {}", e), - source: None, - }) + backend_name: "mongodb".to_string(), + message: format!("Failed to run migrations: {}", e), + source: None, + }) } } diff --git a/crates/persistence/src/backends/mongodb/schema.rs b/crates/persistence/src/backends/mongodb/schema.rs index 741281fe..d197366b 100644 --- a/crates/persistence/src/backends/mongodb/schema.rs +++ b/crates/persistence/src/backends/mongodb/schema.rs @@ -149,23 +149,28 @@ async fn create_index( .unique(Some(unique)) .build(); - let model = IndexModel::builder().keys(keys).options(Some(options)).build(); + let model = IndexModel::builder() + .keys(keys) + .options(Some(options)) + .build(); collection.create_index(model).await?; Ok(()) } async fn get_schema_version(database: &Database) -> StorageResult { let collection = database.collection::("schema_version"); - let doc = collection.find_one(doc! { "_id": "schema_version" }).await?; - let version = doc - .and_then(|d| d.get_i32("version").ok()) - .unwrap_or(0_i32); + let doc = collection + .find_one(doc! { "_id": "schema_version" }) + .await?; + let version = doc.and_then(|d| d.get_i32("version").ok()).unwrap_or(0_i32); Ok(version) } async fn set_schema_version(database: &Database, version: i32) -> StorageResult<()> { let collection = database.collection::("schema_version"); - collection.delete_many(doc! { "_id": "schema_version" }).await?; + collection + .delete_many(doc! { "_id": "schema_version" }) + .await?; collection .insert_one(doc! { "_id": "schema_version", diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs index b2c19169..4c2b3216 100644 --- a/crates/persistence/src/backends/mongodb/storage.rs +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -67,9 +67,9 @@ fn chrono_to_bson(dt: DateTime) -> BsonDateTime { } fn next_version(version: &str) -> StorageResult { - let parsed = version.parse::().map_err(|e| { - serialization_error(format!("Invalid version value '{}': {}", version, e)) - })?; + let parsed = version + .parse::() + .map_err(|e| serialization_error(format!("Invalid version value '{}': {}", version, e)))?; Ok((parsed + 1).to_string()) } @@ -188,9 +188,10 @@ impl ResourceStorage for MongoBackend { "fhir_version": fhir_version_str, }; - history.insert_one(history_doc).await.map_err(|e| { - internal_error(format!("Failed to insert resource history: {}", e)) - })?; + history + .insert_one(history_doc) + .await + .map_err(|e| internal_error(format!("Failed to insert resource history: {}", e)))?; Ok(StoredResource::from_storage( resource_type, @@ -326,12 +327,14 @@ impl ResourceStorage for MongoBackend { .to_string(); if actual_version != current.version_id() { - return Err(StorageError::Concurrency(ConcurrencyError::VersionConflict { - resource_type: resource_type.to_string(), - id: id.to_string(), - expected_version: current.version_id().to_string(), - actual_version, - })); + return Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: current.version_id().to_string(), + actual_version, + }, + )); } let new_version = next_version(current.version_id())?; @@ -376,7 +379,9 @@ impl ResourceStorage for MongoBackend { "id": id, }) .await - .map_err(|e| internal_error(format!("Failed to reload version conflict state: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to reload version conflict state: {}", e)) + })?; let actual = latest .as_ref() @@ -384,12 +389,14 @@ impl ResourceStorage for MongoBackend { .unwrap_or("unknown") .to_string(); - return Err(StorageError::Concurrency(ConcurrencyError::VersionConflict { - resource_type: resource_type.to_string(), - id: id.to_string(), - expected_version: current.version_id().to_string(), - actual_version: actual, - })); + return Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: current.version_id().to_string(), + actual_version: actual, + }, + )); } let created_at = extract_created_at(&existing_doc, now); @@ -407,9 +414,10 @@ impl ResourceStorage for MongoBackend { "fhir_version": fhir_version_str, }; - history.insert_one(history_doc).await.map_err(|e| { - internal_error(format!("Failed to insert updated history row: {}", e)) - })?; + history + .insert_one(history_doc) + .await + .map_err(|e| internal_error(format!("Failed to insert updated history row: {}", e)))?; Ok(StoredResource::from_storage( resource_type, @@ -443,7 +451,9 @@ impl ResourceStorage for MongoBackend { "is_deleted": false, }) .await - .map_err(|e| internal_error(format!("Failed to check resource before delete: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to check resource before delete: {}", e)) + })?; let Some(existing_doc) = maybe_existing else { return Err(StorageError::Resource(ResourceError::NotFound { @@ -512,9 +522,10 @@ impl ResourceStorage for MongoBackend { "fhir_version": fhir_version, }; - history.insert_one(history_doc).await.map_err(|e| { - internal_error(format!("Failed to insert deletion history row: {}", e)) - })?; + history + .insert_one(history_doc) + .await + .map_err(|e| internal_error(format!("Failed to insert deletion history row: {}", e)))?; Ok(()) } diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index f14a687d..beb1afdf 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -118,7 +118,9 @@ async fn create_backend(test_name: &str) -> Option { #[tokio::test] async fn mongodb_integration_create_read_update_delete() { let Some(backend) = create_backend("crud").await else { - eprintln!("Skipping mongodb_integration_create_read_update_delete (set HFS_TEST_MONGODB_URL)"); + eprintln!( + "Skipping mongodb_integration_create_read_update_delete (set HFS_TEST_MONGODB_URL)" + ); return; }; @@ -137,7 +139,10 @@ async fn mongodb_integration_create_read_update_delete() { .await .unwrap(); - let read = backend.read(&tenant, "Patient", created.id()).await.unwrap(); + let read = backend + .read(&tenant, "Patient", created.id()) + .await + .unwrap(); assert!(read.is_some()); let updated = backend @@ -191,14 +196,26 @@ async fn mongodb_integration_tenant_isolation() { .await .unwrap(); - let read_a = backend.read(&tenant_a, "Patient", created.id()).await.unwrap(); + let read_a = backend + .read(&tenant_a, "Patient", created.id()) + .await + .unwrap(); assert!(read_a.is_some()); - let read_b = backend.read(&tenant_b, "Patient", created.id()).await.unwrap(); + let read_b = backend + .read(&tenant_b, "Patient", created.id()) + .await + .unwrap(); assert!(read_b.is_none()); - let exists_a = backend.exists(&tenant_a, "Patient", created.id()).await.unwrap(); - let exists_b = backend.exists(&tenant_b, "Patient", created.id()).await.unwrap(); + let exists_a = backend + .exists(&tenant_a, "Patient", created.id()) + .await + .unwrap(); + let exists_b = backend + .exists(&tenant_b, "Patient", created.id()) + .await + .unwrap(); assert!(exists_a); assert!(!exists_b); } From e6b09d4c6f13066c9ff6a49489bc80d44ffe7137 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Thu, 5 Mar 2026 14:59:01 -0400 Subject: [PATCH 07/17] feat(persistence): MongoDB Phase 3 - topology-aware session handling --- crates/persistence/README.md | 221 ++-- .../src/backends/mongodb/backend.rs | 25 +- .../persistence/src/backends/mongodb/mod.rs | 8 +- .../src/backends/mongodb/schema.rs | 18 +- .../src/backends/mongodb/storage.rs | 986 ++++++++++++++++-- .../persistence/tests/common/capabilities.rs | 10 +- crates/persistence/tests/mongodb_tests.rs | 302 +++++- phase3_roadmap.xml | 374 +++++++ roadmap_mongo.xml | 120 ++- 9 files changed, 1814 insertions(+), 250 deletions(-) create mode 100644 phase3_roadmap.xml diff --git a/crates/persistence/README.md b/crates/persistence/README.md index 84364f04..777ff3f1 100644 --- a/crates/persistence/README.md +++ b/crates/persistence/README.md @@ -8,14 +8,14 @@ Traditional FHIR server implementations force all resources into a single databa **Polyglot persistence** is an architectural approach where different types of data and operations are routed to the storage technologies best suited for how that data will be accessed. Rather than accepting compromise, this pattern leverages specialized storage systems optimized for specific workloads: -| Workload | Optimal Technology | Why | -|----------|-------------------|-----| -| ACID transactions | PostgreSQL | Strong consistency guarantees | -| Document storage | MongoDB | Natural alignment with FHIR's resource model | -| Relationship traversal | Neo4j | Efficient graph queries for references | -| Full-text search | Elasticsearch | Optimized inverted indexes | -| Semantic search | Vector databases | Embedding similarity for clinical matching | -| Bulk analytics & ML | Object Storage | Cost-effective columnar storage | +| Workload | Optimal Technology | Why | +| ---------------------- | ------------------ | -------------------------------------------- | +| ACID transactions | PostgreSQL | Strong consistency guarantees | +| Document storage | MongoDB | Natural alignment with FHIR's resource model | +| Relationship traversal | Neo4j | Efficient graph queries for references | +| Full-text search | Elasticsearch | Optimized inverted indexes | +| Semantic search | Vector databases | Embedding similarity for clinical matching | +| Bulk analytics & ML | Object Storage | Cost-effective columnar storage | ## Polyglot Query Example @@ -26,6 +26,7 @@ GET /Observation?patient.name:contains=smith&_text=cardiac&code:below=http://loi ``` This query requires: + 1. **Chained search** (`patient.name:contains=smith`) - Find observations where the referenced patient's name contains "smith" 2. **Full-text search** (`_text=cardiac`) - Search narrative text for "cardiac" 3. **Terminology subsumption** (`code:below=LOINC|8867-4`) - Find codes that are descendants of heart rate @@ -214,7 +215,7 @@ Backend (connection management, capabilities) - **Multiple Backends**: SQLite, PostgreSQL, Cassandra, MongoDB, Neo4j, Elasticsearch, S3 - **Multitenancy**: Three isolation strategies with type-level enforcement -- **Full FHIR Search**: All parameter types, modifiers, chaining, _include/_revinclude +- **Full FHIR Search**: All parameter types, modifiers, chaining, \_include/\_revinclude - **Versioning**: Complete resource history with optimistic locking - **Transactions**: ACID transactions with FHIR bundle support - **Capability Discovery**: Runtime introspection of backend capabilities @@ -225,11 +226,11 @@ All storage operations require a `TenantContext`, ensuring tenant isolation at t ### Tenancy Strategies -| Strategy | Isolation | Use Case | -|----------|-----------|----------| -| **Shared Schema** | `tenant_id` column + optional RLS | Multi-tenant SaaS with shared infrastructure | -| **Schema-per-Tenant** | PostgreSQL schemas | Logical isolation with shared database | -| **Database-per-Tenant** | Separate databases | Complete isolation for compliance | +| Strategy | Isolation | Use Case | +| ----------------------- | --------------------------------- | -------------------------------------------- | +| **Shared Schema** | `tenant_id` column + optional RLS | Multi-tenant SaaS with shared infrastructure | +| **Schema-per-Tenant** | PostgreSQL schemas | Logical isolation with shared database | +| **Database-per-Tenant** | Separate databases | Complete isolation for compliance | ### Hierarchical Tenants @@ -302,17 +303,17 @@ The matrix below shows which FHIR operations each backend supports. This reflect **Legend:** ✓ Implemented | ◐ Partial | ○ Planned | ✗ Not planned | † Requires external service -> **MongoDB Status:** Phase 1 backend scaffolding is implemented (module export, config, and `Backend` trait wiring). Capability rows below remain `○` until Phase 2+ storage/search behavior is implemented. +> **MongoDB Status:** Phase 3 core storage semantics are implemented (CRUD, vread/history, optimistic locking, tenant isolation), including best-effort session-backed consistency for multi-write flows where deployment topology permits. Search and conditional operations remain planned, and full transaction bundle semantics remain planned. | Feature | SQLite | PostgreSQL | MongoDB | Cassandra | Neo4j | Elasticsearch | S3 | | --------------------------------------------------------------------------- | ------ | ---------- | ------- | --------- | ----- | ------------- | --- | | **Core Operations** | | [CRUD](https://build.fhir.org/http.html#crud) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | -| [Versioning (vread)](https://build.fhir.org/http.html#vread) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | -| [Optimistic Locking](https://build.fhir.org/http.html#concurrency) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ✗ | -| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ○ | ○ | ✗ | ○ | -| [Type History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | -| [System History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [Versioning (vread)](https://build.fhir.org/http.html#vread) | ✓ | ✓ | ✓ | ○ | ○ | ○ | ○ | +| [Optimistic Locking](https://build.fhir.org/http.html#concurrency) | ✓ | ✓ | ✓ | ○ | ○ | ✗ | ✗ | +| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ○ | ○ | ✗ | ○ | +| [Type History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | +| [System History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | | [Batch Bundles](https://build.fhir.org/http.html#batch) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | | [Transaction Bundles](https://build.fhir.org/http.html#transaction) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | | [Conditional Operations](https://build.fhir.org/http.html#cond-update) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | @@ -365,42 +366,42 @@ The matrix below shows which FHIR operations each backend supports. This reflect Backends can serve as primary (CRUD, versioning, transactions) or secondary (optimized for specific query patterns). When a secondary search backend is configured, the primary backend's search indexing is automatically disabled to avoid data duplication. -| Configuration | Primary | Secondary | Status | Use Case | -| -------------------------- | ---------- | ---------------------- | -------------------------------- | --------------------------------------- | -| SQLite alone | SQLite | — | ✓ Implemented | Development, testing, small deployments | -| SQLite + Elasticsearch | SQLite | Elasticsearch (search) | ✓ Implemented | Small prod with robust search | -| PostgreSQL alone | PostgreSQL | — | ✓ Implemented | Production OLTP | -| PostgreSQL + Elasticsearch | PostgreSQL | Elasticsearch (search) | ✓ Implemented | OLTP + advanced search | -| PostgreSQL + Neo4j | PostgreSQL | Neo4j (graph) | Planned | Graph-heavy queries | -| Cassandra alone | Cassandra | — | Planned | High write throughput | -| Cassandra + Elasticsearch | Cassandra | Elasticsearch (search) | Planned | Write-heavy + search | -| MongoDB alone | MongoDB | — | ◐ In progress (Phase 1 scaffold) | Document-centric | -| S3 alone | S3 | — | Planned | Archival/bulk storage | -| S3 + Elasticsearch | S3 | Elasticsearch (search) | Planned | Large-scale + search | +| Configuration | Primary | Secondary | Status | Use Case | +| -------------------------- | ---------- | ---------------------- | ------------------------------------ | --------------------------------------- | +| SQLite alone | SQLite | — | ✓ Implemented | Development, testing, small deployments | +| SQLite + Elasticsearch | SQLite | Elasticsearch (search) | ✓ Implemented | Small prod with robust search | +| PostgreSQL alone | PostgreSQL | — | ✓ Implemented | Production OLTP | +| PostgreSQL + Elasticsearch | PostgreSQL | Elasticsearch (search) | ✓ Implemented | OLTP + advanced search | +| PostgreSQL + Neo4j | PostgreSQL | Neo4j (graph) | Planned | Graph-heavy queries | +| Cassandra alone | Cassandra | — | Planned | High write throughput | +| Cassandra + Elasticsearch | Cassandra | Elasticsearch (search) | Planned | Write-heavy + search | +| MongoDB alone | MongoDB | — | ◐ In progress (Phase 3 core storage) | Document-centric | +| S3 alone | S3 | — | Planned | Archival/bulk storage | +| S3 + Elasticsearch | S3 | Elasticsearch (search) | Planned | Large-scale + search | ### Backend Selection Guide -| Use Case | Recommended Backend | Rationale | -|----------|---------------------|-----------| -| Development & Testing | SQLite | Zero configuration, in-memory mode | -| Production OLTP | PostgreSQL | ACID transactions, JSONB, mature ecosystem | -| Document-centric | MongoDB | Natural FHIR alignment, flexible schema | -| Graph queries | Neo4j | Efficient relationship traversal | -| Full-text search | Elasticsearch | Optimized inverted indexes, analyzers | -| Bulk analytics | S3 + Parquet | Cost-effective, columnar, ML-ready | -| High write throughput | Cassandra | Distributed writes, eventual consistency | +| Use Case | Recommended Backend | Rationale | +| --------------------- | ------------------- | ------------------------------------------ | +| Development & Testing | SQLite | Zero configuration, in-memory mode | +| Production OLTP | PostgreSQL | ACID transactions, JSONB, mature ecosystem | +| Document-centric | MongoDB | Natural FHIR alignment, flexible schema | +| Graph queries | Neo4j | Efficient relationship traversal | +| Full-text search | Elasticsearch | Optimized inverted indexes, analyzers | +| Bulk analytics | S3 + Parquet | Cost-effective, columnar, ML-ready | +| High write throughput | Cassandra | Distributed writes, eventual consistency | ### Feature Flags -| Feature | Description | Driver | -|---------|-------------|--------| -| `sqlite` (default) | SQLite (in-memory and file) | rusqlite | -| `postgres` | PostgreSQL with JSONB | tokio-postgres | -| `cassandra` | Apache Cassandra | cdrs-tokio | -| `mongodb` | MongoDB document store | mongodb | -| `neo4j` | Neo4j graph database | neo4rs | -| `elasticsearch` | Elasticsearch search | elasticsearch | -| `s3` | AWS S3 object storage | object_store | +| Feature | Description | Driver | +| ------------------ | --------------------------- | -------------- | +| `sqlite` (default) | SQLite (in-memory and file) | rusqlite | +| `postgres` | PostgreSQL with JSONB | tokio-postgres | +| `cassandra` | Apache Cassandra | cdrs-tokio | +| `mongodb` | MongoDB document store | mongodb | +| `neo4j` | Neo4j graph database | neo4rs | +| `elasticsearch` | Elasticsearch search | elasticsearch | +| `s3` | AWS S3 object storage | object_store | ## Building & Running Storage Backends @@ -548,22 +549,23 @@ let config = ElasticsearchConfig { }; ``` -| Option | Default | Description | -|--------|---------|-------------| -| `nodes` | `["http://localhost:9200"]` | Elasticsearch node URLs | -| `index_prefix` | `"hfs"` | Prefix for all index names | -| `username` / `password` | `None` | Basic authentication credentials | -| `timeout` | `30s` | Request timeout | -| `number_of_shards` | `1` | Number of primary shards per index | -| `number_of_replicas` | `1` | Number of replica shards per index | -| `max_result_window` | `10000` | Maximum `from + size` for offset pagination | -| `refresh_interval` | `"1s"` | How often new documents become searchable | +| Option | Default | Description | +| ----------------------- | --------------------------- | ------------------------------------------- | +| `nodes` | `["http://localhost:9200"]` | Elasticsearch node URLs | +| `index_prefix` | `"hfs"` | Prefix for all index names | +| `username` / `password` | `None` | Basic authentication credentials | +| `timeout` | `30s` | Request timeout | +| `number_of_shards` | `1` | Number of primary shards per index | +| `number_of_replicas` | `1` | Number of replica shards per index | +| `max_result_window` | `10000` | Maximum `from + size` for offset pagination | +| `refresh_interval` | `"1s"` | How often new documents become searchable | ### Index Structure Each tenant + resource type combination gets its own index: `{prefix}_{tenant_id}_{resource_type}` (e.g., `hfs_acme_patient`). Documents contain: + - **Metadata**: `resource_type`, `resource_id`, `tenant_id`, `version_id`, `last_updated`, `is_deleted` - **Content**: Raw FHIR JSON (stored but not indexed) - **Full-text fields**: `narrative_text` (from `text.div`), `content_text` (all string values) @@ -626,6 +628,7 @@ let composite = CompositeStorage::new(config, backends)? ## Implementation Status ### Phase 1: Core Types ✓ + - [x] Error types with comprehensive variants - [x] Tenant types (TenantId, TenantContext, TenantPermissions) - [x] Stored resource types with versioning metadata @@ -633,6 +636,7 @@ let composite = CompositeStorage::new(config, backends)? - [x] Pagination types (cursor and offset) ### Phase 2: Core Traits ✓ + - [x] Backend trait with capability discovery - [x] ResourceStorage trait (CRUD operations) - [x] VersionedStorage trait (vread, If-Match) @@ -642,11 +646,13 @@ let composite = CompositeStorage::new(config, backends)? - [x] Capabilities trait (CapabilityStatement generation) ### Phase 3: Tenancy Strategies ✓ + - [x] Shared schema strategy with RLS support - [x] Schema-per-tenant strategy with PostgreSQL search_path - [x] Database-per-tenant strategy with pool management ### Phase 4: SQLite Backend ✓ + - [x] Connection pooling (r2d2) - [x] Schema migrations - [x] ResourceStorage implementation @@ -662,6 +668,7 @@ FHIR [transaction](https://build.fhir.org/http.html#transaction) and [batch](htt > **Backend Support:** Transaction bundles require ACID support. SQLite supports transactions. Cassandra, Elasticsearch, and S3 do not support transactions (batch only). See the capability matrix above. **Implemented Features:** + - [x] **Transaction bundles** - Atomic all-or-nothing processing with automatic rollback on failure - [x] **Batch bundles** - Independent entry processing (failures don't affect other entries) - [x] **Processing order** - Entries processed per FHIR spec: DELETE → POST → PUT/PATCH → GET @@ -673,27 +680,29 @@ FHIR [transaction](https://build.fhir.org/http.html#transaction) and [batch](htt **Not Yet Implemented:** -| Gap | Description | Spec Reference | -|-----|-------------|----------------| -| Conditional reference resolution | References like `Patient?identifier=12345` should resolve via search | [Transaction](https://build.fhir.org/http.html#trules) | -| PATCH method | PATCH operations in bundle entries return 501 | [Patch](https://build.fhir.org/http.html#patch) | -| Duplicate resource detection | Same resource appearing twice in transaction should fail | [Transaction](https://build.fhir.org/http.html#trules) | -| Prefer header handling | `return=minimal`, `return=representation`, `return=OperationOutcome` | [Prefer](https://build.fhir.org/http.html#return) | -| History bundle acceptance | Servers SHOULD accept history bundles for replay | [History](https://build.fhir.org/http.html#history) | -| Version-specific references | `resolve-as-version-specific` extension support | [References](https://build.fhir.org/http.html#trules) | -| lastModified in response | Bundle entry responses should include lastModified | [Transaction](https://build.fhir.org/http.html#transaction-response) | +| Gap | Description | Spec Reference | +| -------------------------------- | -------------------------------------------------------------------- | -------------------------------------------------------------------- | +| Conditional reference resolution | References like `Patient?identifier=12345` should resolve via search | [Transaction](https://build.fhir.org/http.html#trules) | +| PATCH method | PATCH operations in bundle entries return 501 | [Patch](https://build.fhir.org/http.html#patch) | +| Duplicate resource detection | Same resource appearing twice in transaction should fail | [Transaction](https://build.fhir.org/http.html#trules) | +| Prefer header handling | `return=minimal`, `return=representation`, `return=OperationOutcome` | [Prefer](https://build.fhir.org/http.html#return) | +| History bundle acceptance | Servers SHOULD accept history bundles for replay | [History](https://build.fhir.org/http.html#history) | +| Version-specific references | `resolve-as-version-specific` extension support | [References](https://build.fhir.org/http.html#trules) | +| lastModified in response | Bundle entry responses should include lastModified | [Transaction](https://build.fhir.org/http.html#transaction-response) | #### SQLite Search Implementation ✓ The SQLite backend includes a complete FHIR search implementation using pre-computed indexes: **Search Parameter Registry & Extraction:** + - [x] `SearchParameterRegistry` - In-memory cache of active SearchParameter definitions - [x] `SearchParameterLoader` - Loads embedded R4 standard parameters at startup - [x] `SearchParameterExtractor` - FHIRPath-based value extraction using `helios-fhirpath` - [x] Dynamic SearchParameter handling - POST/PUT/DELETE to SearchParameter updates the registry **Search Index & Query:** + - [x] Pre-computed `search_index` table for fast queries - [x] All 8 parameter type handlers (string, token, date, number, quantity, reference, URI, composite) - [x] Modifier support (:exact, :contains, :missing, :not, :identifier, :below, :above) @@ -703,6 +712,7 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] Single-field sorting **Full-Text Search (FTS5):** + - [x] `resource_fts` FTS5 virtual table for full-text indexing - [x] Narrative text extraction from `text.div` with HTML stripping - [x] Full content extraction from all resource string values @@ -712,12 +722,13 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - Porter stemming (e.g., "run" matches "running") - Boolean operators (AND, OR, NOT) - Phrase matching ("heart failure") - - Prefix search (cardio*) + - Prefix search (cardio\*) - Proximity matching (NEAR operator) - [x] Porter stemmer tokenization for improved search quality - [x] Automatic FTS indexing on resource create/update/delete **Chained Parameters & Reverse Chaining:** + - [x] N-level forward chains (e.g., `Observation?subject.organization.name=Hospital`) - [x] Nested reverse chains / `_has` (e.g., `Patient?_has:Observation:subject:code=1234-5`) - [x] Type modifiers for ambiguous references (e.g., `subject:Patient.name=Smith`) @@ -726,23 +737,26 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] Configurable depth limits (default: 4, max: 8) **Reindexing:** + - [x] `ReindexableStorage` trait for backend-agnostic reindexing - [x] `ReindexOperation` with background task execution - [x] Progress tracking and cancellation support - [ ] `$reindex` HTTP endpoint (planned for server layer) **Capability Reporting:** + - [x] `SearchCapabilityProvider` implementation - [x] Runtime capability discovery from registry **Bulk Operations:** + - [x] `BulkExportStorage` trait implementation (FHIR Bulk Data Access IG) - System-level export (`/$export`) - Patient-level export (`/Patient/$export`) - Group-level export (`/Group/[id]/$export`) - Job lifecycle management (pending, in-progress, completed, failed, cancelled) - Streaming NDJSON batch generation - - Type filtering and _since parameter support + - Type filtering and \_since parameter support - [x] `BulkSubmitProvider` trait implementation (FHIR Bulk Submit) - Submission lifecycle management - Manifest creation and management @@ -751,6 +765,7 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] Schema migration v5 to v6 with 7 new tables for bulk operations ### Phase 5: Elasticsearch Backend ✓ + - [x] Backend structure with connection management and health checks - [x] Index schema and mappings (nested objects for multi-value search params) - [x] ResourceStorage implementation for composite sync support @@ -764,6 +779,7 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] Search offloading: when Elasticsearch is the search secondary, the primary backend skips search index population ### Phase 5b: PostgreSQL Backend ✓ + - [x] Connection pooling (deadpool-postgres) - [x] Schema migrations with JSONB storage - [x] ResourceStorage implementation (CRUD) @@ -772,7 +788,7 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] TransactionProvider with configurable isolation levels - [x] Conditional operations (conditional create/update/delete) - [x] SearchProvider with all parameter types -- [x] ChainedSearchProvider and reverse chaining (_has) +- [x] ChainedSearchProvider and reverse chaining (\_has) - [x] Full-text search (tsvector/tsquery) - [x] `_include` and `_revinclude` resolution - [x] BulkExportStorage and BulkSubmitProvider @@ -789,6 +805,7 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [ ] S3 backend (bulk export, object storage) ### Phase 6: Composite Storage ✓ + - [x] Query analysis and feature detection - [x] Multi-backend coordination with primary-secondary model - [x] Cost-based query routing @@ -850,14 +867,14 @@ let prod_config = CompositeConfigBuilder::new() Queries are automatically analyzed and routed to optimal backends: -| Feature | Detection | Routed To | -|---------|-----------|-----------| -| Basic search | Standard parameters | Primary | -| Chained parameters | `patient.name=Smith` | Graph backend | -| Full-text | `_text`, `_content` | Search backend | -| Terminology | `:above`, `:below`, `:in` | Terminology backend | -| Writes | All mutations | Primary only | -| _include/_revinclude | Include directives | Primary | +| Feature | Detection | Routed To | +| ---------------------- | ------------------------- | ------------------- | +| Basic search | Standard parameters | Primary | +| Chained parameters | `patient.name=Smith` | Graph backend | +| Full-text | `_text`, `_content` | Search backend | +| Terminology | `:above`, `:below`, `:in` | Terminology backend | +| Writes | All mutations | Primary only | +| \_include/\_revinclude | Include directives | Primary | ```rust use helios_persistence::composite::{QueryAnalyzer, QueryFeature}; @@ -878,20 +895,20 @@ println!("Complexity: {}", analysis.complexity_score); When queries span multiple backends, results are merged using configurable strategies: -| Strategy | Behavior | Use Case | -|----------|----------|----------| -| **Intersection** | Results must match all backends (AND) | Restrictive queries | -| **Union** | Results from any backend (OR) | Inclusive queries | -| **PrimaryEnriched** | Primary results with metadata from secondaries | Standard search | -| **SecondaryFiltered** | Filter secondary results through primary | Search-heavy queries | +| Strategy | Behavior | Use Case | +| --------------------- | ---------------------------------------------- | -------------------- | +| **Intersection** | Results must match all backends (AND) | Restrictive queries | +| **Union** | Results from any backend (OR) | Inclusive queries | +| **PrimaryEnriched** | Primary results with metadata from secondaries | Standard search | +| **SecondaryFiltered** | Filter secondary results through primary | Search-heavy queries | ### Synchronization Modes -| Mode | Latency | Consistency | Use Case | -|------|---------|-------------|----------| -| **Synchronous** | Higher | Strong | Critical data requiring consistency | -| **Asynchronous** | Lower | Eventual | Read-heavy workloads | -| **Hybrid** | Balanced | Configurable | Search indexes sync, others async | +| Mode | Latency | Consistency | Use Case | +| ---------------- | -------- | ------------ | ----------------------------------- | +| **Synchronous** | Higher | Strong | Critical data requiring consistency | +| **Asynchronous** | Lower | Eventual | Read-heavy workloads | +| **Hybrid** | Balanced | Configurable | Search indexes sync, others async | ```rust use helios_persistence::composite::SyncMode; @@ -974,15 +991,15 @@ ADVISOR_HOST=0.0.0.0 ADVISOR_PORT=9000 ./target/debug/config-advisor #### API Endpoints -| Endpoint | Method | Description | -|----------|--------|-------------| -| `/health` | GET | Health check | -| `/backends` | GET | List available backend types | -| `/backends/{kind}` | GET | Get capabilities for a backend type | -| `/analyze` | POST | Analyze a configuration | -| `/validate` | POST | Validate a configuration | -| `/suggest` | POST | Get optimization suggestions | -| `/simulate` | POST | Simulate query routing | +| Endpoint | Method | Description | +| ------------------ | ------ | ----------------------------------- | +| `/health` | GET | Health check | +| `/backends` | GET | List available backend types | +| `/backends/{kind}` | GET | Get capabilities for a backend type | +| `/analyze` | POST | Analyze a configuration | +| `/validate` | POST | Validate a configuration | +| `/suggest` | POST | Get optimization suggestions | +| `/simulate` | POST | Simulate query routing | #### Example: Analyze Configuration @@ -1061,21 +1078,25 @@ let config = CompositeConfigBuilder::new() ### Troubleshooting **Query not routing to expected backend:** + - Enable debug logging: `RUST_LOG=helios_persistence::composite=debug` - Use the analyzer to inspect detected features: `analyzer.analyze(&query)` - Check backend capabilities match required features **High sync lag:** + - Reduce batch size in SyncConfig - Increase sync workers - Consider synchronous mode for critical data **Failover not triggering:** + - Check health check interval isn't too long - Verify failure threshold is appropriate - Ensure failover_to targets are configured **Cost estimates seem wrong:** + - Run Criterion benchmarks to calibrate costs - Use `with_benchmarks()` on CostEstimator - Check feature multipliers in CostConfig diff --git a/crates/persistence/src/backends/mongodb/backend.rs b/crates/persistence/src/backends/mongodb/backend.rs index 0e5d4f47..1abe0a8a 100644 --- a/crates/persistence/src/backends/mongodb/backend.rs +++ b/crates/persistence/src/backends/mongodb/backend.rs @@ -20,10 +20,11 @@ use super::schema; /// MongoDB backend for FHIR resource storage. /// -/// The phase 2 implementation provides backend wiring, schema bootstrap, -/// and core ResourceStorage behavior for CRUD/count + tenant isolation. +/// The Phase 3 implementation provides backend wiring, schema bootstrap, +/// core ResourceStorage behavior for CRUD/count + tenant isolation, +/// [`crate::core::VersionedStorage`] support, and history providers. /// -/// Versioned/history/search/composite behavior is implemented in later phases. +/// Search/composite behavior and conditional operations remain in later phases. pub struct MongoBackend { config: MongoBackendConfig, /// Search parameter registry (in-memory cache of active parameters). @@ -381,12 +382,26 @@ impl Backend for MongoBackend { fn supports(&self, capability: BackendCapability) -> bool { matches!( capability, - BackendCapability::Crud | BackendCapability::SharedSchema + BackendCapability::Crud + | BackendCapability::Versioning + | BackendCapability::InstanceHistory + | BackendCapability::TypeHistory + | BackendCapability::SystemHistory + | BackendCapability::OptimisticLocking + | BackendCapability::SharedSchema ) } fn capabilities(&self) -> Vec { - vec![BackendCapability::Crud, BackendCapability::SharedSchema] + vec![ + BackendCapability::Crud, + BackendCapability::Versioning, + BackendCapability::InstanceHistory, + BackendCapability::TypeHistory, + BackendCapability::SystemHistory, + BackendCapability::OptimisticLocking, + BackendCapability::SharedSchema, + ] } async fn acquire(&self) -> Result { diff --git a/crates/persistence/src/backends/mongodb/mod.rs b/crates/persistence/src/backends/mongodb/mod.rs index d50ba1dd..b7dd0555 100644 --- a/crates/persistence/src/backends/mongodb/mod.rs +++ b/crates/persistence/src/backends/mongodb/mod.rs @@ -1,15 +1,17 @@ //! MongoDB backend implementation. //! //! This module provides MongoDB backend wiring, schema bootstrap helpers, -//! and core storage contract support. +//! and storage contract support through Phase 3. //! -//! Phase 2 scope focuses on: +//! Phase 3 scope currently includes: //! - backend/config wiring and health checks //! - core [`crate::core::ResourceStorage`] contract parity for CRUD/count +//! - [`crate::core::VersionedStorage`] for vread and If-Match update/delete +//! - history providers for instance/type/system history retrieval //! - tenant isolation and soft-delete semantics //! - schema/index bootstrap foundations //! -//! Versioned/history/search/composite behavior remains part of later phases. +//! Search/composite behavior and conditional operations remain part of later phases. mod backend; pub(crate) mod schema; diff --git a/crates/persistence/src/backends/mongodb/schema.rs b/crates/persistence/src/backends/mongodb/schema.rs index d197366b..d0f953a6 100644 --- a/crates/persistence/src/backends/mongodb/schema.rs +++ b/crates/persistence/src/backends/mongodb/schema.rs @@ -12,7 +12,7 @@ use crate::error::{BackendError, StorageError, StorageResult}; use super::backend::MongoBackendConfig; /// Current MongoDB schema version. -pub const SCHEMA_VERSION: i32 = 2; +pub const SCHEMA_VERSION: i32 = 3; /// Initialize MongoDB collections/indexes required by the backend. /// @@ -135,6 +135,22 @@ async fn ensure_history_indexes(database: &Database) -> StorageResult<()> { ) .await?; + create_index( + &history, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "last_updated": -1_i32, "id": -1_i32 }, + "idx_history_type_updated", + false, + ) + .await?; + + create_index( + &history, + doc! { "tenant_id": 1_i32, "last_updated": -1_i32, "resource_type": -1_i32, "id": -1_i32 }, + "idx_history_system_updated", + false, + ) + .await?; + Ok(()) } diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs index 4c2b3216..62ff1648 100644 --- a/crates/persistence/src/backends/mongodb/storage.rs +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -4,15 +4,19 @@ use async_trait::async_trait; use chrono::{DateTime, Utc}; use helios_fhir::FhirVersion; use mongodb::{ + ClientSession, Cursor, bson::{self, Bson, DateTime as BsonDateTime, Document, doc}, error::Error as MongoError, }; use serde_json::Value; -use crate::core::ResourceStorage; +use crate::core::{ + HistoryEntry, HistoryMethod, HistoryPage, HistoryParams, InstanceHistoryProvider, + ResourceStorage, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, normalize_etag, +}; use crate::error::{BackendError, ConcurrencyError, ResourceError, StorageError, StorageResult}; use crate::tenant::TenantContext; -use crate::types::StoredResource; +use crate::types::{CursorValue, Page, PageCursor, PageInfo, StoredResource}; use super::MongoBackend; @@ -99,6 +103,251 @@ fn extract_fhir_version(doc: &Document, fallback: FhirVersion) -> FhirVersion { .unwrap_or(fallback) } +fn parse_version_id(version_id: &str) -> i64 { + version_id.parse::().unwrap_or(0) +} + +fn history_method_for(version_id: &str, is_deleted: bool) -> HistoryMethod { + if is_deleted { + HistoryMethod::Delete + } else if version_id == "1" { + HistoryMethod::Post + } else { + HistoryMethod::Put + } +} + +fn apply_history_params_filter(filter: &mut Document, params: &HistoryParams) { + if !params.include_deleted { + filter.insert("is_deleted", false); + } + + let mut last_updated = Document::new(); + if let Some(since) = params.since { + last_updated.insert("$gte", chrono_to_bson(since)); + } + if let Some(before) = params.before { + last_updated.insert("$lt", chrono_to_bson(before)); + } + + if !last_updated.is_empty() { + filter.insert("last_updated", Bson::Document(last_updated)); + } +} + +async fn collect_documents(mut cursor: Cursor) -> StorageResult> { + let mut docs = Vec::new(); + while cursor + .advance() + .await + .map_err(|e| internal_error(format!("Failed to advance MongoDB cursor: {}", e)))? + { + let doc = cursor.deserialize_current().map_err(|e| { + internal_error(format!("Failed to deserialize MongoDB document: {}", e)) + })?; + docs.push(doc); + } + Ok(docs) +} + +fn parse_cursor_version(params: &HistoryParams) -> Option { + let cursor = params.pagination.cursor_value()?; + let value = cursor.sort_values().first()?; + match value { + CursorValue::String(version) => version.parse::().ok(), + CursorValue::Number(version) => Some(*version), + _ => None, + } +} + +fn parse_cursor_timestamp(value: &str) -> Option> { + DateTime::parse_from_rfc3339(value) + .ok() + .map(|dt| dt.with_timezone(&Utc)) +} + +fn parse_type_history_cursor(params: &HistoryParams) -> Option<(DateTime, String)> { + let cursor = params.pagination.cursor_value()?; + let sort_values = cursor.sort_values(); + if sort_values.len() < 2 { + return None; + } + + let timestamp = match sort_values.first()? { + CursorValue::String(value) => parse_cursor_timestamp(value)?, + _ => return None, + }; + + let id = match sort_values.get(1)? { + CursorValue::String(value) => value.clone(), + _ => return None, + }; + + Some((timestamp, id)) +} + +fn parse_system_history_cursor(params: &HistoryParams) -> Option<(DateTime, String, String)> { + let cursor = params.pagination.cursor_value()?; + let sort_values = cursor.sort_values(); + if sort_values.len() < 3 { + return None; + } + + let timestamp = match sort_values.first()? { + CursorValue::String(value) => parse_cursor_timestamp(value)?, + _ => return None, + }; + + let resource_type = match sort_values.get(1)? { + CursorValue::String(value) => value.clone(), + _ => return None, + }; + + let id = match sort_values.get(2)? { + CursorValue::String(value) => value.clone(), + _ => return None, + }; + + Some((timestamp, resource_type, id)) +} + +#[derive(Debug, Clone)] +struct ParsedHistoryRow { + resource_type: String, + id: String, + version_id: String, + content: Value, + last_updated: DateTime, + is_deleted: bool, + deleted_at: Option>, + fhir_version: FhirVersion, +} + +impl ParsedHistoryRow { + fn into_stored_resource(self, tenant: &TenantContext) -> StoredResource { + StoredResource::from_storage( + &self.resource_type, + &self.id, + &self.version_id, + tenant.tenant_id().clone(), + self.content, + self.last_updated, + self.last_updated, + self.deleted_at, + self.fhir_version, + ) + } + + fn into_history_entry(self, tenant: &TenantContext) -> HistoryEntry { + let method = history_method_for(&self.version_id, self.is_deleted); + let timestamp = self.last_updated; + let resource = self.into_stored_resource(tenant); + + HistoryEntry { + resource, + method, + timestamp, + } + } +} + +fn parse_history_row( + doc: &Document, + fallback_resource_type: Option<&str>, + fallback_id: Option<&str>, +) -> StorageResult { + let resource_type = doc + .get_str("resource_type") + .ok() + .map(str::to_string) + .or_else(|| fallback_resource_type.map(str::to_string)) + .ok_or_else(|| internal_error("Missing resource_type in history document".to_string()))?; + + let id = doc + .get_str("id") + .ok() + .map(str::to_string) + .or_else(|| fallback_id.map(str::to_string)) + .ok_or_else(|| internal_error("Missing id in history document".to_string()))?; + + let version_id = doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing history version_id: {}", e)))? + .to_string(); + + let payload = doc + .get_document("data") + .map_err(|e| internal_error(format!("Missing history payload: {}", e)))?; + let content = document_to_value(payload)?; + + let now = Utc::now(); + let last_updated = extract_last_updated(doc, now); + let is_deleted = doc.get_bool("is_deleted").unwrap_or(false); + let deleted_at = extract_deleted_at(doc).or(if is_deleted { Some(last_updated) } else { None }); + let fhir_version = extract_fhir_version(doc, FhirVersion::default()); + + Ok(ParsedHistoryRow { + resource_type, + id, + version_id, + content, + last_updated, + is_deleted, + deleted_at, + fhir_version, + }) +} + +async fn begin_best_effort_multi_write_session( + db: &mongodb::Database, +) -> (Option, bool) { + let mut session = db.client().start_session().await.ok(); + let mut transaction_active = false; + + if let Some(active_session) = session.as_mut() { + // Transactions are only supported on replica sets and sharded deployments. + // Fall back to non-transactional writes for standalone servers. + let hello = db.run_command(doc! { "hello": 1_i32 }).await.ok(); + let supports_transactions = hello.as_ref().is_some_and(|doc| { + doc.contains_key("setName") + || doc + .get_str("msg") + .map(|value| value == "isdbgrid") + .unwrap_or(false) + }); + + if supports_transactions && active_session.start_transaction().await.is_ok() { + transaction_active = true; + } else { + // Fallback to non-transactional writes on deployments that don't support transactions. + session = None; + } + } + + (session, transaction_active) +} + +async fn commit_best_effort_multi_write_session( + session: &mut Option, + transaction_active: bool, + operation: &str, +) -> StorageResult<()> { + if !transaction_active { + return Ok(()); + } + + if let Some(active_session) = session.as_mut() { + active_session.commit_transaction().await.map_err(|e| { + internal_error(format!( + "Failed to commit MongoDB transaction after {}: {}", + operation, e + )) + })?; + } + + Ok(()) +} + #[async_trait] impl ResourceStorage for MongoBackend { fn backend_name(&self) -> &'static str { @@ -115,6 +364,7 @@ impl ResourceStorage for MongoBackend { let db = self.get_database().await?; let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let (mut session, transaction_active) = begin_best_effort_multi_write_session(&db).await; let tenant_id = tenant.tenant_id().as_str(); // Extract or generate ID @@ -125,14 +375,26 @@ impl ResourceStorage for MongoBackend { .unwrap_or_else(|| uuid::Uuid::new_v4().to_string()); // Check if resource already exists (including deleted resources). - let existing = resources - .find_one(doc! { - "tenant_id": tenant_id, - "resource_type": resource_type, - "id": &id, - }) - .await - .map_err(|e| internal_error(format!("Failed to check existence: {}", e)))?; + let identity_filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": &id, + }; + + let existing = if let Some(active_session) = session.as_mut() { + resources + .find_one(identity_filter.clone()) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!("Failed to check existence (session): {}", e)) + })? + } else { + resources + .find_one(identity_filter) + .await + .map_err(|e| internal_error(format!("Failed to check existence: {}", e)))? + }; if existing.is_some() { return Err(StorageError::Resource(ResourceError::AlreadyExists { @@ -164,16 +426,33 @@ impl ResourceStorage for MongoBackend { "fhir_version": &fhir_version_str, }; - resources.insert_one(resource_doc).await.map_err(|e| { - if is_duplicate_key_error(&e) { - StorageError::Resource(ResourceError::AlreadyExists { - resource_type: resource_type.to_string(), - id: id.clone(), - }) - } else { - internal_error(format!("Failed to insert resource: {}", e)) - } - })?; + if let Some(active_session) = session.as_mut() { + resources + .insert_one(resource_doc.clone()) + .session(active_session) + .await + .map_err(|e| { + if is_duplicate_key_error(&e) { + StorageError::Resource(ResourceError::AlreadyExists { + resource_type: resource_type.to_string(), + id: id.clone(), + }) + } else { + internal_error(format!("Failed to insert resource (session): {}", e)) + } + })?; + } else { + resources.insert_one(resource_doc).await.map_err(|e| { + if is_duplicate_key_error(&e) { + StorageError::Resource(ResourceError::AlreadyExists { + resource_type: resource_type.to_string(), + id: id.clone(), + }) + } else { + internal_error(format!("Failed to insert resource: {}", e)) + } + })?; + } let history_doc = doc! { "tenant_id": tenant_id, @@ -188,10 +467,25 @@ impl ResourceStorage for MongoBackend { "fhir_version": fhir_version_str, }; - history - .insert_one(history_doc) - .await - .map_err(|e| internal_error(format!("Failed to insert resource history: {}", e)))?; + if let Some(active_session) = session.as_mut() { + history + .insert_one(history_doc) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!( + "Failed to insert resource history (session): {}", + e + )) + })?; + } else { + history + .insert_one(history_doc) + .await + .map_err(|e| internal_error(format!("Failed to insert resource history: {}", e)))?; + } + + commit_best_effort_multi_write_session(&mut session, transaction_active, "create").await?; Ok(StoredResource::from_storage( resource_type, @@ -300,19 +594,32 @@ impl ResourceStorage for MongoBackend { let db = self.get_database().await?; let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let (mut session, transaction_active) = begin_best_effort_multi_write_session(&db).await; let tenant_id = tenant.tenant_id().as_str(); let resource_type = current.resource_type(); let id = current.id(); - let maybe_existing = resources - .find_one(doc! { - "tenant_id": tenant_id, - "resource_type": resource_type, - "id": id, - "is_deleted": false, - }) - .await - .map_err(|e| internal_error(format!("Failed to load current resource: {}", e)))?; + let current_filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }; + + let maybe_existing = if let Some(active_session) = session.as_mut() { + resources + .find_one(current_filter.clone()) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!("Failed to load current resource (session): {}", e)) + })? + } else { + resources + .find_one(current_filter) + .await + .map_err(|e| internal_error(format!("Failed to load current resource: {}", e)))? + }; let Some(existing_doc) = maybe_existing else { return Err(StorageError::Resource(ResourceError::NotFound { @@ -348,28 +655,38 @@ impl ResourceStorage for MongoBackend { let fhir_version = current.fhir_version(); let fhir_version_str = fhir_version.as_mime_param().to_string(); - let update_result = resources - .update_one( - doc! { - "tenant_id": tenant_id, - "resource_type": resource_type, - "id": id, - "version_id": current.version_id(), - "is_deleted": false, - }, - doc! { - "$set": { - "version_id": &new_version, - "data": Bson::Document(payload.clone()), - "last_updated": now_bson, - "is_deleted": false, - "deleted_at": Bson::Null, - "fhir_version": &fhir_version_str, - } - }, - ) - .await - .map_err(|e| internal_error(format!("Failed to update resource: {}", e)))?; + let update_filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": current.version_id(), + "is_deleted": false, + }; + let update_doc = doc! { + "$set": { + "version_id": &new_version, + "data": Bson::Document(payload.clone()), + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": &fhir_version_str, + } + }; + + let update_result = if let Some(active_session) = session.as_mut() { + resources + .update_one(update_filter.clone(), update_doc.clone()) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!("Failed to update resource (session): {}", e)) + })? + } else { + resources + .update_one(update_filter, update_doc) + .await + .map_err(|e| internal_error(format!("Failed to update resource: {}", e)))? + }; if update_result.matched_count == 0 { let latest = resources @@ -414,10 +731,24 @@ impl ResourceStorage for MongoBackend { "fhir_version": fhir_version_str, }; - history - .insert_one(history_doc) - .await - .map_err(|e| internal_error(format!("Failed to insert updated history row: {}", e)))?; + if let Some(active_session) = session.as_mut() { + history + .insert_one(history_doc) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!( + "Failed to insert updated history row (session): {}", + e + )) + })?; + } else { + history.insert_one(history_doc).await.map_err(|e| { + internal_error(format!("Failed to insert updated history row: {}", e)) + })?; + } + + commit_best_effort_multi_write_session(&mut session, transaction_active, "update").await?; Ok(StoredResource::from_storage( resource_type, @@ -441,19 +772,35 @@ impl ResourceStorage for MongoBackend { let db = self.get_database().await?; let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let (mut session, transaction_active) = begin_best_effort_multi_write_session(&db).await; let tenant_id = tenant.tenant_id().as_str(); - let maybe_existing = resources - .find_one(doc! { - "tenant_id": tenant_id, - "resource_type": resource_type, - "id": id, - "is_deleted": false, - }) - .await - .map_err(|e| { - internal_error(format!("Failed to check resource before delete: {}", e)) - })?; + let delete_lookup_filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }; + + let maybe_existing = if let Some(active_session) = session.as_mut() { + resources + .find_one(delete_lookup_filter.clone()) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!( + "Failed to check resource before delete (session): {}", + e + )) + })? + } else { + resources + .find_one(delete_lookup_filter) + .await + .map_err(|e| { + internal_error(format!("Failed to check resource before delete: {}", e)) + })? + }; let Some(existing_doc) = maybe_existing else { return Err(StorageError::Resource(ResourceError::NotFound { @@ -481,26 +828,36 @@ impl ResourceStorage for MongoBackend { let now = Utc::now(); let now_bson = chrono_to_bson(now); - let update_result = resources - .update_one( - doc! { - "tenant_id": tenant_id, - "resource_type": resource_type, - "id": id, - "version_id": ¤t_version, - "is_deleted": false, - }, - doc! { - "$set": { - "version_id": &new_version, - "is_deleted": true, - "deleted_at": now_bson, - "last_updated": now_bson, - } - }, - ) - .await - .map_err(|e| internal_error(format!("Failed to soft-delete resource: {}", e)))?; + let delete_update_filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": ¤t_version, + "is_deleted": false, + }; + let delete_update_doc = doc! { + "$set": { + "version_id": &new_version, + "is_deleted": true, + "deleted_at": now_bson, + "last_updated": now_bson, + } + }; + + let update_result = if let Some(active_session) = session.as_mut() { + resources + .update_one(delete_update_filter.clone(), delete_update_doc.clone()) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!("Failed to soft-delete resource (session): {}", e)) + })? + } else { + resources + .update_one(delete_update_filter, delete_update_doc) + .await + .map_err(|e| internal_error(format!("Failed to soft-delete resource: {}", e)))? + }; if update_result.matched_count == 0 { return Err(StorageError::Resource(ResourceError::NotFound { @@ -522,10 +879,24 @@ impl ResourceStorage for MongoBackend { "fhir_version": fhir_version, }; - history - .insert_one(history_doc) - .await - .map_err(|e| internal_error(format!("Failed to insert deletion history row: {}", e)))?; + if let Some(active_session) = session.as_mut() { + history + .insert_one(history_doc) + .session(active_session) + .await + .map_err(|e| { + internal_error(format!( + "Failed to insert deletion history row (session): {}", + e + )) + })?; + } else { + history.insert_one(history_doc).await.map_err(|e| { + internal_error(format!("Failed to insert deletion history row: {}", e)) + })?; + } + + commit_best_effort_multi_write_session(&mut session, transaction_active, "delete").await?; Ok(()) } @@ -594,3 +965,430 @@ impl ResourceStorage for MongoBackend { .map_err(|e| internal_error(format!("Failed to count resources: {}", e))) } } + +#[async_trait] +impl VersionedStorage for MongoBackend { + async fn vread( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + version_id: &str, + ) -> StorageResult> { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let maybe_doc = history + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": version_id, + }) + .await + .map_err(|e| internal_error(format!("Failed to read historical version: {}", e)))?; + + let Some(doc) = maybe_doc else { + return Ok(None); + }; + + let row = parse_history_row(&doc, Some(resource_type), Some(id))?; + Ok(Some(row.into_stored_resource(tenant))) + } + + async fn update_with_match( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + expected_version: &str, + resource: Value, + ) -> StorageResult { + let current = self.read(tenant, resource_type, id).await?.ok_or_else(|| { + StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + }) + })?; + + let expected = normalize_etag(expected_version); + let actual = normalize_etag(current.version_id()); + + if expected != actual { + return Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: expected.to_string(), + actual_version: actual.to_string(), + }, + )); + } + + self.update(tenant, ¤t, resource).await + } + + async fn delete_with_match( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + expected_version: &str, + ) -> StorageResult<()> { + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let maybe_doc = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }) + .await + .map_err(|e| { + internal_error(format!( + "Failed to load resource for delete_with_match: {}", + e + )) + })?; + + let Some(doc) = maybe_doc else { + return Err(StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + })); + }; + + let actual = doc.get_str("version_id").map_err(|e| { + internal_error(format!( + "Missing current version for delete_with_match: {}", + e + )) + })?; + + let expected = normalize_etag(expected_version); + let actual = normalize_etag(actual); + + if expected != actual { + return Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: expected.to_string(), + actual_version: actual.to_string(), + }, + )); + } + + self.delete(tenant, resource_type, id).await + } + + async fn list_versions( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + ) -> StorageResult> { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let cursor = history + .find(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + }) + .await + .map_err(|e| internal_error(format!("Failed to query version history: {}", e)))?; + + let docs = collect_documents(cursor).await?; + let mut versions = docs + .iter() + .filter_map(|doc| doc.get_str("version_id").ok().map(str::to_string)) + .collect::>(); + + versions.sort_by(|a, b| { + parse_version_id(a) + .cmp(&parse_version_id(b)) + .then_with(|| a.cmp(b)) + }); + + Ok(versions) + } +} + +#[async_trait] +impl InstanceHistoryProvider for MongoBackend { + async fn history_instance( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + params: &HistoryParams, + ) -> StorageResult { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let mut filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + }; + apply_history_params_filter(&mut filter, params); + + let cursor = history + .find(filter) + .await + .map_err(|e| internal_error(format!("Failed to query instance history: {}", e)))?; + + let docs = collect_documents(cursor).await?; + let mut rows = docs + .iter() + .map(|doc| parse_history_row(doc, Some(resource_type), Some(id))) + .collect::>>()?; + + rows.sort_by(|a, b| { + parse_version_id(&b.version_id) + .cmp(&parse_version_id(&a.version_id)) + .then_with(|| b.last_updated.cmp(&a.last_updated)) + }); + + if let Some(cursor_version) = parse_cursor_version(params) { + rows.retain(|row| parse_version_id(&row.version_id) < cursor_version); + } + + let page_len = params.pagination.count as usize; + let has_more = rows.len() > page_len; + if has_more { + rows.truncate(page_len); + } + + let page_info = if has_more { + if let Some(last) = rows.last() { + PageInfo::with_next(PageCursor::new( + vec![CursorValue::String(last.version_id.clone())], + id.to_string(), + )) + } else { + PageInfo::end() + } + } else { + PageInfo::end() + }; + + let entries = rows + .into_iter() + .map(|row| row.into_history_entry(tenant)) + .collect::>(); + + Ok(Page::new(entries, page_info)) + } + + async fn history_instance_count( + &self, + tenant: &TenantContext, + resource_type: &str, + id: &str, + ) -> StorageResult { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + history + .count_documents(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + }) + .await + .map_err(|e| internal_error(format!("Failed to count instance history: {}", e))) + } +} + +#[async_trait] +impl TypeHistoryProvider for MongoBackend { + async fn history_type( + &self, + tenant: &TenantContext, + resource_type: &str, + params: &HistoryParams, + ) -> StorageResult { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let mut filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + }; + apply_history_params_filter(&mut filter, params); + + let cursor = history + .find(filter) + .await + .map_err(|e| internal_error(format!("Failed to query type history: {}", e)))?; + + let docs = collect_documents(cursor).await?; + let mut rows = docs + .iter() + .map(|doc| parse_history_row(doc, Some(resource_type), None)) + .collect::>>()?; + + rows.sort_by(|a, b| { + b.last_updated + .cmp(&a.last_updated) + .then_with(|| b.id.cmp(&a.id)) + .then_with(|| parse_version_id(&b.version_id).cmp(&parse_version_id(&a.version_id))) + }); + + if let Some((cursor_timestamp, cursor_id)) = parse_type_history_cursor(params) { + rows.retain(|row| { + row.last_updated < cursor_timestamp + || (row.last_updated == cursor_timestamp && row.id < cursor_id) + }); + } + + let page_len = params.pagination.count as usize; + let has_more = rows.len() > page_len; + if has_more { + rows.truncate(page_len); + } + + let page_info = if has_more { + if let Some(last) = rows.last() { + PageInfo::with_next(PageCursor::new( + vec![ + CursorValue::String(last.last_updated.to_rfc3339()), + CursorValue::String(last.id.clone()), + ], + resource_type.to_string(), + )) + } else { + PageInfo::end() + } + } else { + PageInfo::end() + }; + + let entries = rows + .into_iter() + .map(|row| row.into_history_entry(tenant)) + .collect::>(); + + Ok(Page::new(entries, page_info)) + } + + async fn history_type_count( + &self, + tenant: &TenantContext, + resource_type: &str, + ) -> StorageResult { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + history + .count_documents(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + }) + .await + .map_err(|e| internal_error(format!("Failed to count type history: {}", e))) + } +} + +#[async_trait] +impl SystemHistoryProvider for MongoBackend { + async fn history_system( + &self, + tenant: &TenantContext, + params: &HistoryParams, + ) -> StorageResult { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let mut filter = doc! { + "tenant_id": tenant_id, + }; + apply_history_params_filter(&mut filter, params); + + let cursor = history + .find(filter) + .await + .map_err(|e| internal_error(format!("Failed to query system history: {}", e)))?; + + let docs = collect_documents(cursor).await?; + let mut rows = docs + .iter() + .map(|doc| parse_history_row(doc, None, None)) + .collect::>>()?; + + rows.sort_by(|a, b| { + b.last_updated + .cmp(&a.last_updated) + .then_with(|| b.resource_type.cmp(&a.resource_type)) + .then_with(|| b.id.cmp(&a.id)) + .then_with(|| parse_version_id(&b.version_id).cmp(&parse_version_id(&a.version_id))) + }); + + if let Some((cursor_timestamp, cursor_type, cursor_id)) = + parse_system_history_cursor(params) + { + rows.retain(|row| { + row.last_updated < cursor_timestamp + || (row.last_updated == cursor_timestamp + && (row.resource_type < cursor_type + || (row.resource_type == cursor_type && row.id < cursor_id))) + }); + } + + let page_len = params.pagination.count as usize; + let has_more = rows.len() > page_len; + if has_more { + rows.truncate(page_len); + } + + let page_info = if has_more { + if let Some(last) = rows.last() { + PageInfo::with_next(PageCursor::new( + vec![ + CursorValue::String(last.last_updated.to_rfc3339()), + CursorValue::String(last.resource_type.clone()), + CursorValue::String(last.id.clone()), + ], + "system".to_string(), + )) + } else { + PageInfo::end() + } + } else { + PageInfo::end() + }; + + let entries = rows + .into_iter() + .map(|row| row.into_history_entry(tenant)) + .collect::>(); + + Ok(Page::new(entries, page_info)) + } + + async fn history_system_count(&self, tenant: &TenantContext) -> StorageResult { + let db = self.get_database().await?; + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + history + .count_documents(doc! { + "tenant_id": tenant_id, + }) + .await + .map_err(|e| internal_error(format!("Failed to count system history: {}", e))) + } +} diff --git a/crates/persistence/tests/common/capabilities.rs b/crates/persistence/tests/common/capabilities.rs index e37c77fe..e09d7df8 100644 --- a/crates/persistence/tests/common/capabilities.rs +++ b/crates/persistence/tests/common/capabilities.rs @@ -159,10 +159,10 @@ impl CapabilityMatrix { BackendKind::MongoDB, vec![ (BackendCapability::Crud, SupportLevel::Implemented), - (BackendCapability::Versioning, SupportLevel::Planned), - (BackendCapability::InstanceHistory, SupportLevel::Planned), - (BackendCapability::TypeHistory, SupportLevel::Planned), - (BackendCapability::SystemHistory, SupportLevel::Planned), + (BackendCapability::Versioning, SupportLevel::Implemented), + (BackendCapability::InstanceHistory, SupportLevel::Implemented), + (BackendCapability::TypeHistory, SupportLevel::Implemented), + (BackendCapability::SystemHistory, SupportLevel::Implemented), (BackendCapability::BasicSearch, SupportLevel::Planned), (BackendCapability::DateSearch, SupportLevel::Planned), (BackendCapability::ReferenceSearch, SupportLevel::Planned), @@ -173,7 +173,7 @@ impl CapabilityMatrix { (BackendCapability::FullTextSearch, SupportLevel::Planned), (BackendCapability::TerminologySearch, SupportLevel::Planned), (BackendCapability::Transactions, SupportLevel::Planned), - (BackendCapability::OptimisticLocking, SupportLevel::Planned), + (BackendCapability::OptimisticLocking, SupportLevel::Implemented), (BackendCapability::CursorPagination, SupportLevel::Planned), (BackendCapability::OffsetPagination, SupportLevel::Planned), (BackendCapability::Sorting, SupportLevel::Planned), diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index beb1afdf..4a491612 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -10,8 +10,11 @@ use helios_fhir::FhirVersion; use helios_persistence::backends::mongodb::{MongoBackend, MongoBackendConfig}; -use helios_persistence::core::{Backend, BackendCapability, BackendKind, ResourceStorage}; -use helios_persistence::error::{ResourceError, StorageError}; +use helios_persistence::core::{ + Backend, BackendCapability, BackendKind, HistoryParams, InstanceHistoryProvider, + ResourceStorage, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, +}; +use helios_persistence::error::{BackendError, ConcurrencyError, ResourceError, StorageError}; use helios_persistence::tenant::{TenantContext, TenantId, TenantPermissions}; use serde_json::json; @@ -74,16 +77,20 @@ fn test_mongodb_integration_database_name_within_limit() { } #[test] -fn test_mongodb_phase2_capabilities() { +fn test_mongodb_phase3_capabilities() { let backend = MongoBackend::new(MongoBackendConfig::default()).unwrap(); assert_eq!(backend.kind(), BackendKind::MongoDB); assert_eq!(backend.name(), "mongodb"); assert!(backend.supports(BackendCapability::Crud)); + assert!(backend.supports(BackendCapability::Versioning)); + assert!(backend.supports(BackendCapability::InstanceHistory)); + assert!(backend.supports(BackendCapability::TypeHistory)); + assert!(backend.supports(BackendCapability::SystemHistory)); + assert!(backend.supports(BackendCapability::OptimisticLocking)); assert!(backend.supports(BackendCapability::SharedSchema)); - assert!(!backend.supports(BackendCapability::Versioning)); assert!(!backend.supports(BackendCapability::BasicSearch)); assert!(!backend.supports(BackendCapability::Transactions)); } @@ -301,3 +308,290 @@ async fn mongodb_integration_create_or_update() { assert!(!was_created_again); assert_eq!(updated.version_id(), "2"); } + +#[tokio::test] +async fn mongodb_integration_versioned_storage_vread_and_list_versions() { + let Some(backend) = create_backend("versioned_vread").await else { + eprintln!( + "Skipping mongodb_integration_versioned_storage_vread_and_list_versions (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-versioned"); + + let v1 = backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "patient-v", + "name": [{"family": "Version1"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let v2 = backend + .update( + &tenant, + &v1, + json!({ + "resourceType": "Patient", + "id": "patient-v", + "name": [{"family": "Version2"}] + }), + ) + .await + .unwrap(); + + backend.delete(&tenant, "Patient", v2.id()).await.unwrap(); + + let read_v1 = backend + .vread(&tenant, "Patient", v1.id(), "1") + .await + .unwrap() + .unwrap(); + let read_v2 = backend + .vread(&tenant, "Patient", v1.id(), "2") + .await + .unwrap() + .unwrap(); + let read_v3 = backend + .vread(&tenant, "Patient", v1.id(), "3") + .await + .unwrap() + .unwrap(); + + assert_eq!(read_v1.version_id(), "1"); + assert_eq!(read_v1.content()["name"][0]["family"], "Version1"); + assert_eq!(read_v2.version_id(), "2"); + assert_eq!(read_v2.content()["name"][0]["family"], "Version2"); + assert_eq!(read_v3.version_id(), "3"); + assert!(read_v3.deleted_at().is_some()); + + let versions = backend + .list_versions(&tenant, "Patient", v1.id()) + .await + .unwrap(); + assert_eq!(versions, vec!["1", "2", "3"]); +} + +#[tokio::test] +async fn mongodb_integration_update_with_match_and_delete_with_match() { + let Some(backend) = create_backend("if_match").await else { + eprintln!( + "Skipping mongodb_integration_update_with_match_and_delete_with_match (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-if-match"); + + let created = backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "patient-if-match", + "name": [{"family": "Original"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let updated = backend + .update_with_match( + &tenant, + "Patient", + created.id(), + "W/\"1\"", + json!({ + "resourceType": "Patient", + "id": created.id(), + "name": [{"family": "Updated"}] + }), + ) + .await + .unwrap(); + + assert_eq!(updated.version_id(), "2"); + assert_eq!(updated.content()["name"][0]["family"], "Updated"); + + let stale_update = backend + .update_with_match( + &tenant, + "Patient", + created.id(), + "1", + json!({ + "resourceType": "Patient", + "id": created.id(), + "name": [{"family": "ShouldFail"}] + }), + ) + .await; + + assert!(matches!( + stale_update, + Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { .. } + )) + )); + + let stale_delete = backend + .delete_with_match(&tenant, "Patient", created.id(), "1") + .await; + assert!(matches!( + stale_delete, + Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { .. } + )) + )); + + backend + .delete_with_match(&tenant, "Patient", created.id(), "2") + .await + .unwrap(); +} + +#[tokio::test] +async fn mongodb_integration_history_providers() { + let Some(backend) = create_backend("history_providers").await else { + eprintln!("Skipping mongodb_integration_history_providers (set HFS_TEST_MONGODB_URL)"); + return; + }; + + let tenant = create_tenant("tenant-history"); + + let patient_v1 = backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "patient-history", + "name": [{"family": "One"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let patient_v2 = backend + .update( + &tenant, + &patient_v1, + json!({ + "resourceType": "Patient", + "id": "patient-history", + "name": [{"family": "Two"}] + }), + ) + .await + .unwrap(); + + backend + .delete(&tenant, "Patient", patient_v2.id()) + .await + .unwrap(); + + backend + .create( + &tenant, + "Observation", + json!({ + "resourceType": "Observation", + "id": "obs-history", + "status": "final" + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let params = HistoryParams::new().count(20).include_deleted(true); + + let instance_history = backend + .history_instance(&tenant, "Patient", patient_v1.id(), ¶ms) + .await + .unwrap(); + assert_eq!(instance_history.items.len(), 3); + assert_eq!(instance_history.items[0].resource.version_id(), "3"); + + let type_history = backend + .history_type(&tenant, "Patient", ¶ms) + .await + .unwrap(); + assert!(type_history.items.len() >= 3); + + let system_history = backend.history_system(&tenant, ¶ms).await.unwrap(); + assert!(system_history.items.len() >= 4); + + let instance_count = backend + .history_instance_count(&tenant, "Patient", patient_v1.id()) + .await + .unwrap(); + assert_eq!(instance_count, 3); + + let type_count = backend + .history_type_count(&tenant, "Patient") + .await + .unwrap(); + assert_eq!(type_count, 3); + + let system_count = backend.history_system_count(&tenant).await.unwrap(); + assert!(system_count >= 4); +} + +#[tokio::test] +async fn mongodb_integration_history_delete_trial_use_not_supported() { + let Some(backend) = create_backend("history_delete_not_supported").await else { + eprintln!( + "Skipping mongodb_integration_history_delete_trial_use_not_supported (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-history-not-supported"); + + let created = backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "patient-history-delete", + "name": [{"family": "TrialUse"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let delete_all_history = backend + .delete_instance_history(&tenant, "Patient", created.id()) + .await; + + assert!(matches!( + delete_all_history, + Err(StorageError::Backend( + BackendError::UnsupportedCapability { .. } + )) + )); + + let delete_single_version = backend + .delete_version(&tenant, "Patient", created.id(), "1") + .await; + + assert!(matches!( + delete_single_version, + Err(StorageError::Backend( + BackendError::UnsupportedCapability { .. } + )) + )); +} diff --git a/phase3_roadmap.xml b/phase3_roadmap.xml new file mode 100644 index 00000000..111b6cd5 --- /dev/null +++ b/phase3_roadmap.xml @@ -0,0 +1,374 @@ + + + + HeliosSoftware/hfs + planned + TBD + 3 + + + + 2026-03-04 + Detailed Phase 3 plan drafted: versioning/history parity is in scope; conditional + operations are deferred to Phase 4; history delete Trial Use methods remain NotSupported in + this phase. + + + + + + Mongo ResourceStorage CRUD/count/read_batch/create_or_update parity delivered. + Tenant isolation and soft-delete/Gone semantics validated for Phase 2 scope. + Schema bootstrap and migration foundations are implemented and idempotent. + + + Version-aware reads and optimistic locking beyond base update behavior were not + completed in Phase 2. + History provider traits were not implemented in Phase 2. + Conditional operations were not implemented in Phase 2. + + + + + + Deliver MongoDB parity for VersionedStorage and history providers while documenting and + implementing session-based behavior where it improves consistency, without pulling + search-dependent conditional semantics into this phase. + + Implement VersionedStorage methods for vread, update_with_match, + delete_with_match, and list_versions with tenant-safe filters and deterministic error + mapping. + Implement InstanceHistoryProvider, TypeHistoryProvider, and + SystemHistoryProvider with pagination, time filters, and include_deleted handling. + Keep FHIR Trial Use history delete features + (delete_instance_history/delete_version) as NotSupported in this phase and document the + support level explicitly. + Implement Mongo session-based execution where beneficial for multi-document + write paths and document deviations from PostgreSQL transaction semantics. + Align tests, capability declarations, and documentation with actual + post-Phase-3 support levels. + + + Conditional create/update/delete implementation (deferred to Phase 4 due to search + dependency). + Conditional patch support. + Advanced search execution, chained/reverse chaining, include/revinclude behavior. + DifferentialHistoryProvider implementation. + Composite MongoDB plus Elasticsearch runtime routing changes. + + + + + + + + + + + + + + + + + + + + + + + + + + + + Provide tenant-safe versioned reads and If-Match behavior with parity-focused error + semantics. + + Implement vread against resource_history using tenant_id + resource_type + + resource_id + version_id filters. + Ensure vread returns historical versions regardless of current deleted + state when the requested version exists. + Implement update_with_match with strict expected-version comparison and + ConcurrencyError::VersionConflict mapping. + Implement delete_with_match with strict expected-version comparison and + consistent not-found/version-conflict behavior. + Implement list_versions using stable ordering (oldest to newest) under + tenant scope. + Normalize If-Match/ETag inputs before comparison and keep behavior + consistent with core helper semantics. + Guarantee history snapshots are persisted for each successful update/delete + path before returning success. + + + Mongo VersionedStorage implementation with behavior-parity tests. + Deterministic concurrency error behavior for stale version writes/deletes. + + + + + Deliver paginated history interactions at all three scopes with explicit Phase 3 support + boundaries. + + Implement history_instance with since/before/include_deleted/pagination + handling. + Implement history_instance_count. + Implement history_type with tenant + type scoped filters and + reverse-chronological ordering. + Implement history_type_count. + Implement history_system with tenant-scoped cross-type ordering. + Implement history_system_count. + Map history entries to HistoryMethod values consistently (POST/PUT/DELETE + and PATCH where applicable). + Keep delete_instance_history and delete_version as default NotSupported + behavior for this phase. + Document NotSupported status for Trial Use history-delete methods in + roadmap/docs/capability notes. + + + Instance/type/system history trait implementations validated by integration + tests. + Explicitly documented Phase 3 support boundary for history-delete Trial Use + operations. + + + + + Ensure predictable query performance and deterministic ordering for version/history + operations. + + Add/verify history indexes for tenant_id + resource_type + resource_id + + version_id lookup paths. + Add/verify indexes for tenant_id + resource_type + last_updated + history_type queries. + Add/verify indexes for tenant_id + last_updated history_system queries. + Document any version sorting assumptions (string versus numeric) and + enforce deterministic behavior. + Validate schema bootstrap/migration idempotency after index additions. + + + Index strategy supporting Phase 3 history and version query patterns. + Migration-safe schema behavior for repeated startup runs. + + + + + Implement Mongo sessions where they provide practical consistency value and document + behavior differences versus PostgreSQL. + + Identify multi-document operations that benefit from session/transaction + wrapping (for example: current resource write plus history append). + Implement session-backed execution paths for selected operations with safe + fallback behavior where full transaction support is not feasible. + Handle transient transaction/session errors with consistent StorageError + mapping and clear retry boundaries. + Document explicit deviations from PostgreSQL transactional guarantees and + isolation expectations. + Add focused tests for session-backed paths and rollback expectations where + behavior is claimed. + + + Session-backed operations implemented where beneficial and test-covered. + Clear transaction behavior documentation with no implicit parity claims. + + + + + Keep scope boundaries explicit by deferring conditional operations to Phase 4 and + reflecting that consistently in roadmap/docs/tests. + + Record that conditional create/update/delete remain deferred to Phase 4 due + to search dependency. + Ensure Mongo backend capability declarations do not claim conditional + support in Phase 3. + Do not add conditional-operation implementation in Mongo Phase 3 code + paths. + Define Phase 4 enablement gates for conditional support (search matching + fidelity, test coverage, and docs). + + + Roadmap/capability/docs alignment showing conditional support deferred to Phase + 4. + + + + + Ship Phase 3 with parity-focused tests and truthful support reporting. + + Add unit tests for version conflict checks and ETag normalization edge + cases. + Add integration tests for + vread/update_with_match/delete_with_match/list_versions behavior. + Add integration tests for history_instance/history_type/history_system + filters and paging. + Add integration tests proving history-delete Trial Use methods remain + NotSupported. + Add targeted tests for session-backed operations where introduced. + Update README capability matrix and backend role/status text for + post-Phase-3 truthfulness. + Update tests/common/capabilities.rs to match implemented support exactly. + + + Phase 3 test coverage for versioning/history/session behavior in Mongo feature + mode. + Capability and roadmap documentation aligned with implementation reality. + + + + + + + Version conflict checks for matching/non-matching expected versions. + ETag normalization coverage for W/quoted/unquoted forms. + History entry method mapping tests for create/update/delete transitions. + History parameter filter builder tests (since/before/include_deleted/pagination + cursors). + Session/transaction error mapping tests for Mongo driver errors translated to + StorageError. + Schema index bootstrap idempotency tests for Phase 3 index additions. + + + + vread returns expected historical version, including versions of resources + currently deleted. + update_with_match succeeds on exact version and fails with VersionConflict on + stale version. + delete_with_match succeeds on exact version and fails with + VersionConflict/NotFound as appropriate. + list_versions returns complete ascending version sequence under tenant scope. + history_instance supports since/before/include_deleted/pagination semantics. + history_type and history_system return tenant-safe reverse-chronological pages + and correct counts. + delete_instance_history and delete_version remain NotSupported in Mongo Phase + 3. + Session-backed write paths preserve current plus history consistency for + selected operations. + Cross-tenant negative tests for all newly added version/history operations. + + + + Compare Mongo outcomes against SQLite/PostgreSQL contract expectations for methods in + Phase 3 scope. + Do not mark conditional operations as implemented in tests or capabilities for this + phase. + Document all intentional deviations before phase completion sign-off. + + + + + cargo check -p helios-persistence --features mongodb + cargo check -p helios-rest --features mongodb + cargo check -p helios-hfs --features mongodb + cargo check -p helios-persistence --features + "sqlite,postgres,elasticsearch,mongodb" + cargo test -p helios-persistence --features mongodb --test mongodb_tests + cargo test -p helios-persistence --features mongodb mongodb:: + cargo fmt --all -- --check + + + + + WS1.1-WS1.7, WS6.1-WS6.2 + VersionedStorage behavior is implemented and version-concurrency tests pass. + + + + WS2.1-WS2.9, WS6.3-WS6.4 + Instance/type/system history tests pass and Trial Use history-delete methods are + explicitly NotSupported. + + + + WS3.1-WS3.5 + Phase 3 indexes are present, migrations are idempotent, and history query ordering + is deterministic. + + + + WS4.1-WS4.5, WS6.5 + Session-backed operations are implemented where beneficial and transaction + deviations are documented. + + + + WS5.1-WS5.4, WS6.6-WS6.7 + Conditional operations are clearly deferred to Phase 4 and all capability/docs + entries are aligned. + + + + + + + + + + + + + + + + + + Add explicit Phase 3 note in roadmap_mongo.xml that conditional operations are deferred to + Phase 4. + Add explicit Phase 3 note in roadmap_mongo.xml that delete_instance_history/delete_version + remain NotSupported. + Update persistence README capability matrix and backend role text for post-Phase-3 status. + Update test capability matrix to avoid enabling conditional-operation tests for MongoDB in + Phase 3. + + + + + Version ordering inconsistencies (string vs numeric) can break list_versions and + history pagination determinism. + Define explicit ordering strategy and enforce it in both query logic and tests. + + + Session/transaction behavior may diverge from PostgreSQL assumptions in higher + layers. + Implement session wrapping only where beneficial and document deviation boundaries + explicitly. + + + History queries across type/system scope may regress without proper indexes. + Add targeted indexes and validate query behavior with integration tests on + realistic data volumes. + + + Scope creep from conditional semantics can delay Phase 3 completion. + Hard-scope conditional support to Phase 4 with explicit roadmap and capability + gating. + + + Trial Use history-delete expectations may be misinterpreted as implemented + support. + Keep default NotSupported behavior and document support level in + roadmap/docs/tests. + + + + + VersionedStorage methods in Phase 3 scope are implemented and + validated by Mongo integration tests. + Instance/type/system history provider tests pass with tenant + isolation preserved. + delete_instance_history and delete_version remain NotSupported and + are explicitly documented. + Conditional operations are explicitly deferred to Phase 4 in + roadmap/docs/capability artifacts. + Session-based operations are implemented where beneficial, with + documented transaction deviations. + Validation commands pass for mongodb-only and mixed-feature builds. + Documentation and capability matrix remain truthful with no + aspirational mismatch. + + \ No newline at end of file diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml index 647f1696..8186084a 100644 --- a/roadmap_mongo.xml +++ b/roadmap_mongo.xml @@ -26,7 +26,8 @@ MongoDB as a primary persistence backend in helios-persistence. - Tenant-aware CRUD, versioning, history, and conditional operation semantics compatible with existing backends. + Tenant-aware CRUD, versioning, history, and conditional operation semantics compatible + with existing backends. Search strategy for MongoDB (native indexing and/or search offloading to Elasticsearch). Composite integration for MongoDB + Elasticsearch. Server runtime wiring via HFS_STORAGE_BACKEND options for MongoDB modes. @@ -64,13 +65,16 @@ Introduce MongoDB backend module scaffolding with compile-time and runtime hooks. Enable mongodb module export in crates/persistence/src/backends/mod.rs. - Create crates/persistence/src/backends/mongodb with backend/config/schema skeleton. + Create crates/persistence/src/backends/mongodb with backend/config/schema + skeleton. Define MongoDB backend config with defaults and serde support. Implement Backend trait basics (kind/name/capabilities/health checks). - Wire feature-gated compile paths for mongo in helios-persistence and helios-hfs. + Wire feature-gated compile paths for mongo in helios-persistence and + helios-hfs. - Existing Cargo feature 'mongodb' and optional dependency in crates/persistence/Cargo.toml. + Existing Cargo feature 'mongodb' and optional dependency in + crates/persistence/Cargo.toml. cargo check -p helios-persistence --features mongodb passes. @@ -84,7 +88,8 @@ Implement create/read/update/delete/exists/count/read_batch/create_or_update. Enforce tenant isolation in all collection queries and indexes. Implement soft-delete semantics aligned with existing Gone behavior. - Create core collection/index strategy: resources, resource_history, search indexes (if native). + Create core collection/index strategy: resources, resource_history, search + indexes (if native). Phase 1 backend skeleton. @@ -100,11 +105,20 @@ Implement VersionedStorage (vread + update_with_match semantics). Implement instance/type/system history providers. - Implement conditional create/update/delete and If-Match handling. - Define transaction behavior using Mongo sessions where feasible. + Defer conditional create/update/delete and If-Match handling to Phase 4 + search/indexing work (search dependency). + Define and implement session-based transaction behavior where beneficial; + document deviations from PostgreSQL semantics. + + FHIR Trial Use history delete features (DELETE [type]/[id]/_history and DELETE + [type]/[id]/_history/[vid]) remain NotSupported for MongoDB in Phase 3. + Conditional operations remain a target capability overall but are intentionally + phase-shifted to Phase 4 where search matching behavior is implemented. + Phase 2 core contract implementation. + Phase 4 search/indexing implementation for conditional matching semantics. History and versioning test suites pass for MongoDB feature mode. @@ -116,10 +130,13 @@ Support FHIR search behavior with clear native/offloaded boundaries. Implement SearchParameter extraction + indexing path for Mongo resources. - Implement basic parameter types (string/token/date/number/quantity/reference/uri/composite) per priority. + Implement basic parameter types + (string/token/date/number/quantity/reference/uri/composite) per priority. Support paging and sorting; define practical limits. - Implement full-text path via Mongo text indexes OR formalize Elasticsearch offload-first strategy. - Define support levels (implemented/partial/planned) in capability matrix and docs. + Implement full-text path via Mongo text indexes OR formalize Elasticsearch + offload-first strategy. + Define support levels (implemented/partial/planned) in capability matrix and + docs. Phase 2 and 3 collections + version model. @@ -131,12 +148,16 @@ - Provide robust primary-secondary mode mirroring sqlite-elasticsearch and postgres-elasticsearch. + Provide robust primary-secondary mode mirroring sqlite-elasticsearch and + postgres-elasticsearch. - Implement Mongo search_offloaded mode to avoid duplicate indexing when ES secondary is configured. + Implement Mongo search_offloaded mode to avoid duplicate indexing when ES + secondary is configured. Create composite wiring with Mongo primary and Elasticsearch search backend. - Ensure search registry sharing between Mongo backend and ES backend initialization. - Validate write-primary/read-primary/search-secondary routing and sync behavior. + Ensure search registry sharing between Mongo backend and ES backend + initialization. + Validate write-primary/read-primary/search-secondary routing and sync + behavior. Phase 4 search model clarity. @@ -150,9 +171,12 @@ Expose Mongo modes to HFS runtime and document operational guidance. - Add StorageBackendMode values for mongodb and mongodb-elasticsearch in crates/rest/src/config.rs. - Add start_mongodb and start_mongodb_elasticsearch flows in crates/hfs/src/main.rs. - Update persistence README capability matrix and role matrix to reflect implemented Mongo status. + Add StorageBackendMode values for mongodb and mongodb-elasticsearch in + crates/rest/src/config.rs. + Add start_mongodb and start_mongodb_elasticsearch flows in + crates/hfs/src/main.rs. + Update persistence README capability matrix and role matrix to reflect + implemented Mongo status. Update top-level ROADMAP.md persistence section when milestones ship. Document deployment examples, environment variables, and feature flags. @@ -168,7 +192,8 @@ - Prefer parity with existing SQLite/PostgreSQL behavioral contracts over backend-specific shortcuts. + Prefer parity with existing SQLite/PostgreSQL behavioral contracts over + backend-specific shortcuts. Use capability-driven tests to skip only what is explicitly planned/not planned. Run fast unit coverage first, then containerized integration, then full regression. @@ -188,9 +213,12 @@ - Create crates/persistence/tests/mongodb_tests.rs analogous to postgres_tests.rs and elasticsearch_tests.rs. - Use shared Mongo container lifecycle for speed and isolation via unique tenant IDs per test. - CRUD, tenant isolation, version increments, delete semantics (Gone/not found expectations). + Create crates/persistence/tests/mongodb_tests.rs analogous to postgres_tests.rs and + elasticsearch_tests.rs. + Use shared Mongo container lifecycle for speed and isolation via unique tenant IDs per + test. + CRUD, tenant isolation, version increments, delete semantics (Gone/not found + expectations). History and conditional operation behavior for implemented levels. @@ -201,11 +229,14 @@ - Reuse tests/common harness, fixtures, assertions, and capability matrix for Mongo runs. - Validate parity with SQLite/PostgreSQL for tenant isolation, versioning, and error semantics. + Reuse tests/common harness, fixtures, assertions, and capability matrix for Mongo + runs. + Validate parity with SQLite/PostgreSQL for tenant isolation, versioning, and error + semantics. - All contract tests for implemented Mongo capabilities pass without Mongo-only exceptions. + All contract tests for implemented Mongo capabilities pass without Mongo-only + exceptions. @@ -241,7 +272,8 @@ Migration/index evolution tests for backward-compatible rollout behavior. - Benchmarks show no critical regressions versus declared targets for initial release. + Benchmarks show no critical regressions versus declared targets for initial + release. No data corruption or tenant leakage under concurrent load tests. @@ -249,29 +281,41 @@ - Mongo transaction semantics may diverge from ACID expectations used by PostgreSQL/SQLite flows. - Document support level clearly; enforce behavior with dedicated transaction and rollback tests. + Mongo transaction semantics may diverge from ACID expectations used by + PostgreSQL/SQLite flows. + Document support level clearly; enforce behavior with dedicated transaction and + rollback tests. - FHIR search parity gaps due to complex chained/reverse chained parameter translation. - Ship basic search parity first; mark advanced capabilities as partial/planned with explicit tests. + FHIR search parity gaps due to complex chained/reverse chained parameter + translation. + Ship basic search parity first; mark advanced capabilities as partial/planned with + explicit tests. - Dual indexing complexity when Mongo native search and Elasticsearch offloading coexist. - Use explicit search_offloaded controls and test for duplicated/stale index behavior. + Dual indexing complexity when Mongo native search and Elasticsearch offloading + coexist. + Use explicit search_offloaded controls and test for duplicated/stale index + behavior. - CI instability from container startup/resource constraints on self-hosted runners. - Shared container lifecycle, timeouts, and explicit cleanup steps aligned with current CI patterns. + CI instability from container startup/resource constraints on self-hosted + runners. + Shared container lifecycle, timeouts, and explicit cleanup steps aligned with + current CI patterns. - MongoDB backend module is enabled and feature-gated with stable compile and runtime startup paths. - MongoDB mode is selectable via HFS_STORAGE_BACKEND and validated in configuration parsing tests. - Core CRUD/version/history/tenant behavior passes defined contract tests for implemented capabilities. + MongoDB backend module is enabled and feature-gated with stable compile and runtime + startup paths. + MongoDB mode is selectable via HFS_STORAGE_BACKEND and validated in configuration parsing + tests. + Core CRUD/version/history/tenant behavior passes defined contract tests for implemented + capabilities. MongoDB + Elasticsearch composite path is implemented with routing and sync tests passing. - Capability matrix and roadmap documentation reflect actual support levels (no aspirational mismatch). + Capability matrix and roadmap documentation reflect actual support levels (no aspirational + mismatch). CI includes Mongo-targeted stages and remains green for required feature sets. - + \ No newline at end of file From e81b8c85fe46ccc923900450089249194d496f36 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Sun, 8 Mar 2026 17:49:29 -0400 Subject: [PATCH 08/17] feat: checkpoint phase 4 --- .../src/backends/mongodb/backend.rs | 18 +- .../persistence/src/backends/mongodb/mod.rs | 11 +- .../src/backends/mongodb/storage.rs | 27 +- phase4_roadmap.xml | 305 ++++++++++++++++++ roadmap_mongo.xml | 19 +- 5 files changed, 368 insertions(+), 12 deletions(-) create mode 100644 phase4_roadmap.xml diff --git a/crates/persistence/src/backends/mongodb/backend.rs b/crates/persistence/src/backends/mongodb/backend.rs index 1abe0a8a..bcc62e19 100644 --- a/crates/persistence/src/backends/mongodb/backend.rs +++ b/crates/persistence/src/backends/mongodb/backend.rs @@ -20,11 +20,12 @@ use super::schema; /// MongoDB backend for FHIR resource storage. /// -/// The Phase 3 implementation provides backend wiring, schema bootstrap, +/// The Phase 4 implementation provides backend wiring, schema bootstrap, /// core ResourceStorage behavior for CRUD/count + tenant isolation, /// [`crate::core::VersionedStorage`] support, and history providers. /// -/// Search/composite behavior and conditional operations remain in later phases. +/// Basic search and conditional create/update/delete are available. +/// Advanced search/composite behavior remains in later phases. pub struct MongoBackend { config: MongoBackendConfig, /// Search parameter registry (in-memory cache of active parameters). @@ -107,6 +108,7 @@ impl Default for MongoBackendConfig { impl MongoBackend { pub(crate) const RESOURCES_COLLECTION: &'static str = "resources"; pub(crate) const RESOURCE_HISTORY_COLLECTION: &'static str = "resource_history"; + pub(crate) const SEARCH_INDEX_COLLECTION: &'static str = "search_index"; /// Creates a new MongoDB backend from the provided configuration. pub fn new(config: MongoBackendConfig) -> StorageResult { @@ -387,6 +389,12 @@ impl Backend for MongoBackend { | BackendCapability::InstanceHistory | BackendCapability::TypeHistory | BackendCapability::SystemHistory + | BackendCapability::BasicSearch + | BackendCapability::DateSearch + | BackendCapability::ReferenceSearch + | BackendCapability::Sorting + | BackendCapability::OffsetPagination + | BackendCapability::CursorPagination | BackendCapability::OptimisticLocking | BackendCapability::SharedSchema ) @@ -399,6 +407,12 @@ impl Backend for MongoBackend { BackendCapability::InstanceHistory, BackendCapability::TypeHistory, BackendCapability::SystemHistory, + BackendCapability::BasicSearch, + BackendCapability::DateSearch, + BackendCapability::ReferenceSearch, + BackendCapability::Sorting, + BackendCapability::OffsetPagination, + BackendCapability::CursorPagination, BackendCapability::OptimisticLocking, BackendCapability::SharedSchema, ] diff --git a/crates/persistence/src/backends/mongodb/mod.rs b/crates/persistence/src/backends/mongodb/mod.rs index b7dd0555..96596492 100644 --- a/crates/persistence/src/backends/mongodb/mod.rs +++ b/crates/persistence/src/backends/mongodb/mod.rs @@ -1,19 +1,22 @@ //! MongoDB backend implementation. //! //! This module provides MongoDB backend wiring, schema bootstrap helpers, -//! and storage contract support through Phase 3. +//! and storage contract support through Phase 4. //! -//! Phase 3 scope currently includes: +//! Phase 4 scope currently includes: //! - backend/config wiring and health checks //! - core [`crate::core::ResourceStorage`] contract parity for CRUD/count //! - [`crate::core::VersionedStorage`] for vread and If-Match update/delete //! - history providers for instance/type/system history retrieval //! - tenant isolation and soft-delete semantics -//! - schema/index bootstrap foundations +//! - schema/index bootstrap foundations (including search index collection) +//! - basic [`crate::core::SearchProvider`] support for first-wave parameter types +//! - [`crate::core::ConditionalStorage`] support for create/update/delete //! -//! Search/composite behavior and conditional operations remain part of later phases. +//! Advanced search/composite behavior remains part of later phases. mod backend; +mod search_impl; pub(crate) mod schema; mod storage; diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs index 62ff1648..af875f79 100644 --- a/crates/persistence/src/backends/mongodb/storage.rs +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -4,7 +4,7 @@ use async_trait::async_trait; use chrono::{DateTime, Utc}; use helios_fhir::FhirVersion; use mongodb::{ - ClientSession, Cursor, + ClientSession, Collection, Cursor, bson::{self, Bson, DateTime as BsonDateTime, Document, doc}, error::Error as MongoError, }; @@ -15,6 +15,9 @@ use crate::core::{ ResourceStorage, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, normalize_etag, }; use crate::error::{BackendError, ConcurrencyError, ResourceError, StorageError, StorageResult}; +use crate::search::converters::IndexValue; +use crate::search::extractor::ExtractedValue; +use crate::search::{SearchParameterLoader, SearchParameterStatus}; use crate::tenant::TenantContext; use crate::types::{CursorValue, Page, PageCursor, PageInfo, StoredResource}; @@ -70,6 +73,28 @@ fn chrono_to_bson(dt: DateTime) -> BsonDateTime { BsonDateTime::from_millis(dt.timestamp_millis()) } +fn normalize_date_for_mongo(value: &str) -> Option> { + let normalized = if value.contains('T') { + if value.contains('Z') || value.contains('+') || value.matches('-').count() > 2 { + value.to_string() + } else { + format!("{}+00:00", value) + } + } else if value.len() == 10 { + format!("{}T00:00:00+00:00", value) + } else if value.len() == 7 { + format!("{}-01T00:00:00+00:00", value) + } else if value.len() == 4 { + format!("{}-01-01T00:00:00+00:00", value) + } else { + value.to_string() + }; + + DateTime::parse_from_rfc3339(&normalized) + .ok() + .map(|dt| dt.with_timezone(&Utc)) +} + fn next_version(version: &str) -> StorageResult { let parsed = version .parse::() diff --git a/phase4_roadmap.xml b/phase4_roadmap.xml new file mode 100644 index 00000000..794f6491 --- /dev/null +++ b/phase4_roadmap.xml @@ -0,0 +1,305 @@ + + + + HeliosSoftware/hfs + planned + TBD + 4 + + + + 2026-03-05 + Detailed Phase 4 plan drafted: MongoDB search/indexing parity and conditional create/update/delete are in scope; bundle semantics and conditional patch remain out of scope for this phase. + + + + + + Mongo CRUD, vread, optimistic locking, and instance/type/system history behavior are available. + Tenant isolation and soft-delete/Gone semantics remain aligned with earlier phases. + Best-effort session-backed consistency for selected multi-write flows is implemented where deployment topology permits. + History-delete Trial Use methods remain explicitly NotSupported for MongoDB. + + + Mongo SearchProvider execution is not implemented yet, so search-dependent FHIR semantics remain unavailable. + Conditional create/update/delete were intentionally deferred from Phase 3 because they depend on deterministic search matching. + Full transaction and batch bundle semantics remain planned beyond this phase scope. + Conditional patch remains separate from the minimum conditional-operation slice required for Phase 4. + + + + + + Deliver a truthful first wave of MongoDB FHIR search support and use that foundation to enable conditional create/update/delete, while keeping advanced search and bundle semantics explicitly out of scope. + + Implement Mongo-native search indexing and query execution for a practical first wave of parameter types needed for meaningful FHIR search parity. + Implement deterministic search_count, sorting, and paging behavior for supported Mongo search paths under strict tenant isolation. + Enable conditional_create, conditional_update, and conditional_delete using reliable zero/one/multiple-match semantics derived from Mongo search execution. + Define and document the native-versus-offloaded search boundary clearly, especially for full-text and other advanced search features that remain partial or planned. + Align roadmap, capability declarations, README status text, and tests so Mongo support is described truthfully with no aspirational mismatch. + + + Conditional patch support in this phase. + Full BundleProvider parity for batch or transaction bundle semantics. + Broad chained search, reverse chaining, _include, and _revinclude support beyond explicit future-phase planning. + Terminology-backed modifiers such as :above, :below, :in, and :not-in. + Composite MongoDB plus Elasticsearch runtime routing changes beyond documenting search_offloaded and feature boundaries. + Differential history or Trial Use history-delete implementation changes. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Define a stable Mongo-native search data model and integrate search value persistence into existing write paths without regressing Phase 2 or Phase 3 guarantees. + + Choose and document the Phase 4 search storage shape for MongoDB (embedded resource-side search values, dedicated search collection, or hybrid model). + Reuse the existing SearchParameter registry and extractor flow so supported search values are derived at write time for create/update/delete paths. + Ensure live resource writes keep search indexes synchronized without changing history storage behavior or soft-delete semantics. + Add schema bootstrap and migration steps for search-related collections and indexes with startup-safe idempotency. + Keep search_offloaded behavior explicit so Mongo native search indexing can be bypassed cleanly when secondary search ownership is configured. + + + Documented Mongo search storage model suitable for Phase 4 query execution and later expansion. + Search indexing hooks integrated into Mongo write paths with schema/bootstrap support. + + + + + Implement SearchProvider behavior for an initial Mongo parameter set with deterministic paging, sorting, and count behavior. + + Implement search and search_count for single-resource-type Mongo queries via SearchProvider. + Support first-wave search parameter classes: string, token, reference, date, number, and uri. + Define explicit support posture for quantity and composite parameters; only mark them implemented if Phase 4 tests and query semantics are complete. + Enforce tenant filters and is_deleted visibility rules consistently across all Mongo search queries. + Implement deterministic sorting and paging behavior with clearly documented practical limits. + Ensure search_count returns results consistent with the primary search path for supported parameter classes. + Return clear unsupported capability or validation errors for search features that remain outside the Phase 4 support boundary. + + + Mongo SearchProvider implementation for the first supported parameter wave. + Deterministic count, paging, and sorting behavior for implemented search paths. + + + + + Use Phase 4 search matching to enable the minimum conditional-operation slice needed for parity-focused Mongo semantics. + + Implement conditional_create using tenant-scoped Mongo search matching with exact zero/one/multiple-match behavior. + Implement conditional_update with deterministic no-match, create-on-upsert, single-match update, and multiple-match outcomes. + Implement conditional_delete with deterministic no-match, single-match delete, and multiple-match outcomes. + Keep conditional operations aligned with existing versioning, soft-delete, and optimistic-locking behavior where those semantics intersect. + Ensure conditional matching remains tenant-safe and never leaks cross-tenant search results or write decisions. + Keep conditional_patch out of scope and explicitly documented as deferred after Phase 4. + + + Mongo ConditionalStorage parity for create, update, and delete. + Explicitly documented deferral of conditional_patch beyond Phase 4. + + + + + Draw hard boundaries around what Mongo search does and does not support after Phase 4 so capabilities, docs, and tests stay aligned. + + Keep _include and _revinclude as planned unless a narrowly testable subset is implemented and documented during the phase. + Keep chained search and reverse chaining as planned unless a narrowly scoped implementation is fully validated. + Decide and document the full-text posture for MongoDB in Phase 4: native text indexes, Elasticsearch offload, or planned-only. + Document unsupported advanced modifiers and search combinations with explicit error or capability behavior. + Keep bundle semantics out of the Phase 4 capability claim set even if related transaction infrastructure exists from earlier work. + + + Clear post-Phase-4 boundary definition for advanced Mongo search capabilities. + Capability claims that match tested behavior exactly. + + + + + Validate Mongo search and conditional behavior against existing contract expectations for every capability claimed after Phase 4. + + Add unit tests for Mongo query translation and filter construction for each supported parameter class. + Add unit tests for search index extraction, persistence hooks, and search-related schema/bootstrap idempotency. + Add integration tests for supported search parameter classes, search_count parity, and deterministic paging/sorting. + Add integration tests for conditional_create, conditional_update, and conditional_delete across no-match, single-match, and multiple-match scenarios. + Add cross-tenant negative tests for all search and conditional paths introduced in this phase. + Add negative tests proving unsupported advanced search features and conditional_patch remain outside the implemented support set. + Update tests/common/capabilities.rs to reflect implemented, partial, and deferred Mongo search/conditional support exactly. + + + Phase 4 unit and integration coverage for Mongo search and conditional semantics. + Capability declarations synchronized with tested behavior. + + + + + Keep the dedicated Phase 4 roadmap, umbrella Mongo roadmap, README, and implementation-facing status notes aligned. + + Create and maintain phase4_roadmap.xml as the detailed execution artifact for Mongo Phase 4. + Update roadmap_mongo.xml progress and phase status text to reflect that Phase 4 is the active detailed planning target after Phase 3. + Update persistence README MongoDB status text and capability matrix rows to reflect actual post-Phase-4 search and conditional support. + Document that full bundle semantics remain planned outside Phase 4 scope. + Document that conditional_patch remains deferred even if conditional create/update/delete ship in this phase. + + + Roadmap and README artifacts that accurately describe the real Mongo feature set after Phase 4 work lands. + + + + + + + Query translation coverage for each supported first-wave parameter type. + Search index extraction and persistence-hook tests for create/update/delete write paths. + Search schema/bootstrap and migration idempotency tests for Phase 4 index additions. + Cursor and/or offset paging token parsing tests with deterministic ordering assumptions. + Conditional match classification tests for no-match, single-match, and multiple-match outcomes. + Unsupported advanced search and conditional_patch error-mapping tests. + + + + Search round-trip tests for supported string, token, reference, date, number, and uri parameters. + search_count returns values consistent with search results for supported Mongo queries. + Paging and sorting remain deterministic across repeated test runs and realistic datasets. + Cross-tenant negative tests prove search results and conditional decisions never leak across tenant boundaries. + conditional_create creates on no match, returns existing on single match, and errors on multiple matches. + conditional_update covers single-match update, no-match without upsert, no-match with upsert, and multiple-match failure paths. + conditional_delete covers no-match, single-match delete, and multiple-match failure paths. + Unsupported advanced search features and conditional_patch remain explicitly unimplemented with expected behavior. + search_offloaded behavior remains coherent with Phase 4 native search boundaries when enabled. + + + + Reuse SQLite and PostgreSQL search and conditional expectations where semantics overlap with the Mongo Phase 4 slice. + Do not mark Mongo advanced search features as implemented without corresponding tests. + Document every remaining planned or partial feature explicitly instead of silently omitting it. + + + + + cargo check -p helios-persistence --features mongodb + cargo check -p helios-rest --features mongodb + cargo check -p helios-hfs --features mongodb + cargo check -p helios-persistence --features "sqlite,postgres,elasticsearch,mongodb" + cargo test -p helios-persistence --features mongodb --test mongodb_tests + cargo test -p helios-persistence --features mongodb mongodb:: + cargo fmt --all -- --check + + + + + WS1.1-WS1.5, WS5.1-WS5.2 + Mongo search storage shape, indexing hooks, and schema/bootstrap behavior are implemented with unit coverage. + + + + WS2.1-WS2.7, WS5.3 + Supported first-wave parameter classes pass search, count, paging, and sorting tests. + + + + WS3.1-WS3.6, WS5.4-WS5.5 + Mongo conditional create/update/delete semantics are implemented and validated under tenant-safe matching. + + + + WS4.1-WS4.5, WS5.6-WS5.7, WS6.3-WS6.5 + Every unsupported or partial advanced search feature is explicitly documented and capability-aligned. + + + + WS6.1-WS6.2 and validation command execution + phase4_roadmap.xml, roadmap_mongo.xml, and README all reflect the same post-Phase-4 truth. + + + + + + + + + + + + + + + + + + + + + + + + Create and maintain phase4_roadmap.xml as the dedicated detailed artifact for Mongo Phase 4 work. + Update roadmap_mongo.xml progress text so it no longer implies Phase 2 is next. + Update persistence README capability matrix rows for Mongo search and conditional support after Phase 4 ships. + Keep conditional_patch and full bundle semantics explicitly documented as outside the Phase 4 support set. + Update test capability declarations so Mongo only enables search and conditional tests that truly pass. + + + + + Mongo search storage design may create excessive write amplification or index maintenance complexity. + Choose a minimal Phase 4 search model, validate write-path overhead early, and keep schema/index scope intentionally narrow. + + + Conditional operations may produce incorrect no-match/single-match/multiple-match behavior if search semantics are incomplete or inconsistent. + Gate conditional support on deterministic search tests and mirror SQLite/PostgreSQL contract expectations for overlapping behavior. + + + Sorting and paging can become nondeterministic without explicit tie-break ordering and stable cursor assumptions. + Define canonical ordering rules and enforce them with unit and integration tests before claiming support. + + + Advanced search features may be perceived as supported because infrastructure exists but end-to-end semantics are incomplete. + Keep advanced capabilities explicitly planned or partial in docs, capabilities, and tests until fully validated. + + + Native Mongo search and offloaded Elasticsearch search boundaries may become ambiguous for operators and developers. + Document search_offloaded behavior clearly and keep full-text posture explicit in both roadmap and README artifacts. + + + + + Mongo SearchProvider support for the first-wave parameter classes is implemented and covered by integration tests. + search_count, paging, and sorting behavior are deterministic and tenant-safe for implemented Mongo search paths. + conditional_create, conditional_update, and conditional_delete are implemented and validated for zero, one, and multiple matches. + conditional_patch remains explicitly deferred and is not misrepresented as implemented. + Advanced search features that remain out of scope are clearly documented as planned or partial with matching capability declarations. + Validation commands pass for mongodb-only and mixed-feature builds. + phase4_roadmap.xml, roadmap_mongo.xml, README, and capability artifacts describe the same real Mongo support level. + + diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml index 8186084a..e9ea8083 100644 --- a/roadmap_mongo.xml +++ b/roadmap_mongo.xml @@ -5,8 +5,9 @@ in-progress TBD date-agnostic - 2026-03-01 - Phase 1 completed; Phase 2 is next. + 2026-03-05 + Dedicated detailed roadmaps now exist through Phase 4; Phase 4 search/indexing plus + conditional semantics are captured in phase4_roadmap.xml. SQLite primary, PostgreSQL primary, Elasticsearch secondary @@ -18,6 +19,9 @@ + + + @@ -126,8 +130,9 @@ - - Support FHIR search behavior with clear native/offloaded boundaries. + + Support FHIR search behavior and enable conditional create/update/delete with clear + native/offloaded boundaries. Implement SearchParameter extraction + indexing path for Mongo resources. Implement basic parameter types @@ -135,7 +140,10 @@ Support paging and sorting; define practical limits. Implement full-text path via Mongo text indexes OR formalize Elasticsearch offload-first strategy. - Define support levels (implemented/partial/planned) in capability matrix and + Implement conditional create/update/delete using deterministic search + matching + semantics; keep conditional patch out of scope. + Define support levels (implemented/partial/planned) in capability matrix and docs. @@ -143,6 +151,7 @@ Search contract tests pass for implemented parameter classes. + Conditional create/update/delete tests pass for implemented matching semantics. Capability matrix updated with truthful MongoDB support levels. From 5417c428829cae3edcf4638fde9755029f73409b Mon Sep 17 00:00:00 2001 From: dougc95 Date: Mon, 9 Mar 2026 15:47:56 -0400 Subject: [PATCH 09/17] feat(persistence): MongoDB Phase 4 - search index schema and conditional operations Add search_index collection with 11 specialized indexes for FHIR search parameters (string, token, date, number, quantity, reference, uri, composite, resource, token_display, identifier_type). Implement SearchProvider with cursor-based pagination, sort support, and count queries. Add ConditionalStorage for conditional create/update/delete with search parameter matching. Bump schema version to 4. --- .../src/backends/mongodb/schema.rs | 98 +- .../src/backends/mongodb/search_impl.rs | 1035 +++++++++++++++++ .../src/backends/mongodb/storage.rs | 326 +++++- .../persistence/tests/common/capabilities.rs | 12 +- crates/persistence/tests/mongodb_tests.rs | 288 ++++- 5 files changed, 1747 insertions(+), 12 deletions(-) create mode 100644 crates/persistence/src/backends/mongodb/search_impl.rs diff --git a/crates/persistence/src/backends/mongodb/schema.rs b/crates/persistence/src/backends/mongodb/schema.rs index d0f953a6..230448b0 100644 --- a/crates/persistence/src/backends/mongodb/schema.rs +++ b/crates/persistence/src/backends/mongodb/schema.rs @@ -12,7 +12,7 @@ use crate::error::{BackendError, StorageError, StorageResult}; use super::backend::MongoBackendConfig; /// Current MongoDB schema version. -pub const SCHEMA_VERSION: i32 = 3; +pub const SCHEMA_VERSION: i32 = 4; /// Initialize MongoDB collections/indexes required by the backend. /// @@ -42,6 +42,7 @@ pub fn migrate_schema(config: &MongoBackendConfig) -> StorageResult<()> { pub async fn initialize_schema_async(database: &Database) -> StorageResult<()> { ensure_resources_indexes(database).await?; ensure_history_indexes(database).await?; + ensure_search_indexes(database).await?; set_schema_version(database, SCHEMA_VERSION).await?; Ok(()) } @@ -52,6 +53,7 @@ pub async fn migrate_schema_async(database: &Database) -> StorageResult<()> { if current < SCHEMA_VERSION { ensure_resources_indexes(database).await?; ensure_history_indexes(database).await?; + ensure_search_indexes(database).await?; set_schema_version(database, SCHEMA_VERSION).await?; } Ok(()) @@ -154,6 +156,100 @@ async fn ensure_history_indexes(database: &Database) -> StorageResult<()> { Ok(()) } +async fn ensure_search_indexes(database: &Database) -> StorageResult<()> { + let search_index = database.collection::("search_index"); + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_string": 1_i32 }, + "idx_search_string", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_token_system": 1_i32, "value_token_code": 1_i32 }, + "idx_search_token", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_date": 1_i32 }, + "idx_search_date", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_number": 1_i32 }, + "idx_search_number", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_quantity_value": 1_i32, "value_quantity_unit": 1_i32 }, + "idx_search_quantity", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_reference": 1_i32 }, + "idx_search_reference", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_uri": 1_i32 }, + "idx_search_uri", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "resource_id": 1_i32, "param_name": 1_i32, "composite_group": 1_i32 }, + "idx_search_composite", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "resource_id": 1_i32 }, + "idx_search_resource", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_token_display": 1_i32 }, + "idx_search_token_display", + false, + ) + .await?; + + create_index( + &search_index, + doc! { "tenant_id": 1_i32, "resource_type": 1_i32, "param_name": 1_i32, "value_identifier_type_system": 1_i32, "value_identifier_type_code": 1_i32 }, + "idx_search_identifier_type", + false, + ) + .await?; + + Ok(()) +} + async fn create_index( collection: &Collection, keys: Document, diff --git a/crates/persistence/src/backends/mongodb/search_impl.rs b/crates/persistence/src/backends/mongodb/search_impl.rs new file mode 100644 index 00000000..3400d182 --- /dev/null +++ b/crates/persistence/src/backends/mongodb/search_impl.rs @@ -0,0 +1,1035 @@ +//! Search and conditional-operation implementation for MongoDB backend. + +use std::collections::HashSet; + +use async_trait::async_trait; +use chrono::{DateTime, Utc}; +use helios_fhir::FhirVersion; +use mongodb::{ + Cursor, + bson::{self, Bson, DateTime as BsonDateTime, Document, doc}, +}; +use regex::escape as regex_escape; +use serde_json::Value; + +use crate::core::{ + ConditionalCreateResult, ConditionalDeleteResult, ConditionalPatchResult, ConditionalStorage, + ConditionalUpdateResult, PatchFormat, ResourceStorage, SearchProvider, SearchResult, +}; +use crate::error::{BackendError, SearchError, StorageError, StorageResult}; +use crate::tenant::TenantContext; +use crate::types::{ + CursorDirection, CursorValue, Page, PageCursor, PageInfo, SearchModifier, SearchParamType, + SearchParameter, SearchPrefix, SearchQuery, SearchValue, StoredResource, +}; + +use super::MongoBackend; + +fn internal_error(message: String) -> StorageError { + StorageError::Backend(BackendError::Internal { + backend_name: "mongodb".to_string(), + message, + source: None, + }) +} + +fn serialization_error(message: String) -> StorageError { + StorageError::Backend(BackendError::SerializationError { message }) +} + +fn bson_to_chrono(dt: &BsonDateTime) -> DateTime { + DateTime::::from_timestamp_millis(dt.timestamp_millis()).unwrap_or_else(Utc::now) +} + +fn chrono_to_bson(dt: DateTime) -> BsonDateTime { + BsonDateTime::from_millis(dt.timestamp_millis()) +} + +fn parse_date_for_query(value: &str) -> Option> { + let normalized = if value.contains('T') { + if value.contains('Z') || value.contains('+') || value.matches('-').count() > 2 { + value.to_string() + } else { + format!("{}+00:00", value) + } + } else if value.len() == 10 { + format!("{}T00:00:00+00:00", value) + } else if value.len() == 7 { + format!("{}-01T00:00:00+00:00", value) + } else if value.len() == 4 { + format!("{}-01-01T00:00:00+00:00", value) + } else { + value.to_string() + }; + + DateTime::parse_from_rfc3339(&normalized) + .ok() + .map(|dt| dt.with_timezone(&Utc)) +} + +async fn collect_documents(mut cursor: Cursor) -> StorageResult> { + let mut docs = Vec::new(); + while cursor + .advance() + .await + .map_err(|e| internal_error(format!("Failed to advance MongoDB cursor: {}", e)))? + { + let doc = cursor.deserialize_current().map_err(|e| { + internal_error(format!("Failed to deserialize MongoDB document: {}", e)) + })?; + docs.push(doc); + } + Ok(docs) +} + +fn parse_simple_search_params(params: &str) -> Vec<(String, String)> { + params + .split('&') + .filter_map(|pair| { + let (name, value) = pair.split_once('=')?; + Some((name.to_string(), value.to_string())) + }) + .collect() +} + +#[async_trait] +impl SearchProvider for MongoBackend { + async fn search( + &self, + tenant: &TenantContext, + query: &SearchQuery, + ) -> StorageResult { + self.validate_query_support(query)?; + + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let cursor = if let Some(cursor_str) = &query.cursor { + Some(PageCursor::decode(cursor_str).map_err(|_| { + StorageError::Search(SearchError::InvalidCursor { + cursor: cursor_str.clone(), + }) + })?) + } else { + None + }; + + if cursor.is_some() && !query.sort.is_empty() { + return Err(StorageError::Search(SearchError::QueryParseError { + message: + "MongoDB cursor pagination currently supports only default _lastUpdated sort" + .to_string(), + })); + } + + let previous_mode = cursor + .as_ref() + .is_some_and(|c| c.direction() == CursorDirection::Previous); + + let matched_ids = self + .matching_resource_ids(&db, tenant_id, &query.resource_type, query) + .await?; + + let filter = self.build_resource_filter( + tenant_id, + &query.resource_type, + query, + matched_ids.as_ref(), + cursor.as_ref(), + )?; + + let sort = self.build_sort_document(query, previous_mode)?; + let page_size = query.count.unwrap_or(100).max(1) as usize; + + let mut find_action = resources.find(filter).sort(sort).limit((page_size + 1) as i64); + + if cursor.is_none() { + if let Some(offset) = query.offset { + find_action = find_action.skip(offset as u64); + } + } + + let docs = collect_documents( + find_action + .await + .map_err(|e| internal_error(format!("Failed to execute MongoDB search: {}", e)))?, + ) + .await?; + + let mut resources = docs + .into_iter() + .map(|doc| self.document_to_stored_resource(tenant, &query.resource_type, doc)) + .collect::>>()?; + + if previous_mode { + resources.reverse(); + } + + let has_next = resources.len() > page_size; + if has_next { + let _ = resources.pop(); + } + + let has_previous = cursor.is_some() || query.offset.unwrap_or(0) > 0; + + let next_cursor = if has_next { + resources + .last() + .map(|resource| { + PageCursor::new( + vec![CursorValue::String(resource.last_modified().to_rfc3339())], + resource.id(), + ) + .encode() + }) + } else { + None + }; + + let previous_cursor = if has_previous { + resources + .first() + .map(|resource| { + PageCursor::previous( + vec![CursorValue::String(resource.last_modified().to_rfc3339())], + resource.id(), + ) + .encode() + }) + } else { + None + }; + + let total = if query.total.is_some() { + Some(self.search_count(tenant, query).await?) + } else { + None + }; + + let page_info = PageInfo { + next_cursor, + previous_cursor, + total, + has_next, + has_previous, + }; + + Ok(SearchResult { + resources: Page::new(resources, page_info), + included: Vec::new(), + total, + }) + } + + async fn search_count(&self, tenant: &TenantContext, query: &SearchQuery) -> StorageResult { + self.validate_query_support(query)?; + + let db = self.get_database().await?; + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let matched_ids = self + .matching_resource_ids(&db, tenant_id, &query.resource_type, query) + .await?; + + let filter = self.build_resource_filter( + tenant_id, + &query.resource_type, + query, + matched_ids.as_ref(), + None, + )?; + + resources + .count_documents(filter) + .await + .map_err(|e| internal_error(format!("Failed to count MongoDB search results: {}", e))) + } +} + +#[async_trait] +impl ConditionalStorage for MongoBackend { + async fn conditional_create( + &self, + tenant: &TenantContext, + resource_type: &str, + resource: Value, + search_params: &str, + fhir_version: FhirVersion, + ) -> StorageResult { + let matches = self + .find_matching_resources(tenant, resource_type, search_params) + .await?; + + match matches.len() { + 0 => { + let created = self + .create(tenant, resource_type, resource, fhir_version) + .await?; + Ok(ConditionalCreateResult::Created(created)) + } + 1 => Ok(ConditionalCreateResult::Exists( + matches.into_iter().next().expect("single match must exist"), + )), + n => Ok(ConditionalCreateResult::MultipleMatches(n)), + } + } + + async fn conditional_update( + &self, + tenant: &TenantContext, + resource_type: &str, + resource: Value, + search_params: &str, + upsert: bool, + fhir_version: FhirVersion, + ) -> StorageResult { + let matches = self + .find_matching_resources(tenant, resource_type, search_params) + .await?; + + match matches.len() { + 0 => { + if upsert { + let created = self + .create(tenant, resource_type, resource, fhir_version) + .await?; + Ok(ConditionalUpdateResult::Created(created)) + } else { + Ok(ConditionalUpdateResult::NoMatch) + } + } + 1 => { + let current = matches.into_iter().next().expect("single match must exist"); + let updated = self.update(tenant, ¤t, resource).await?; + Ok(ConditionalUpdateResult::Updated(updated)) + } + n => Ok(ConditionalUpdateResult::MultipleMatches(n)), + } + } + + async fn conditional_delete( + &self, + tenant: &TenantContext, + resource_type: &str, + search_params: &str, + ) -> StorageResult { + let matches = self + .find_matching_resources(tenant, resource_type, search_params) + .await?; + + match matches.len() { + 0 => Ok(ConditionalDeleteResult::NoMatch), + 1 => { + let current = matches.into_iter().next().expect("single match must exist"); + self.delete(tenant, resource_type, current.id()).await?; + Ok(ConditionalDeleteResult::Deleted) + } + n => Ok(ConditionalDeleteResult::MultipleMatches(n)), + } + } + + async fn conditional_patch( + &self, + tenant: &TenantContext, + resource_type: &str, + search_params: &str, + patch: &PatchFormat, + ) -> StorageResult { + let _ = (tenant, resource_type, search_params, patch); + Err(StorageError::Backend(BackendError::UnsupportedCapability { + backend_name: "mongodb".to_string(), + capability: "conditional_patch".to_string(), + })) + } +} + +impl MongoBackend { + fn validate_query_support(&self, query: &SearchQuery) -> StorageResult<()> { + if query.parameters.iter().any(|param| !param.chain.is_empty()) { + return Err(StorageError::Search(SearchError::ChainedSearchNotSupported { + chain: "forward chain".to_string(), + })); + } + + if !query.reverse_chains.is_empty() { + return Err(StorageError::Search(SearchError::ReverseChainNotSupported)); + } + + if !query.includes.is_empty() { + return Err(StorageError::Search(SearchError::IncludeNotSupported { + operation: "_include/_revinclude".to_string(), + })); + } + + for param in &query.parameters { + if matches!( + param.modifier, + Some(SearchModifier::Above) + | Some(SearchModifier::Below) + | Some(SearchModifier::In) + | Some(SearchModifier::NotIn) + ) { + return Err(StorageError::Search(SearchError::UnsupportedModifier { + modifier: param + .modifier + .as_ref() + .map(ToString::to_string) + .unwrap_or_default(), + param_type: param.param_type.to_string(), + })); + } + } + + Ok(()) + } + + async fn matching_resource_ids( + &self, + db: &mongodb::Database, + tenant_id: &str, + resource_type: &str, + query: &SearchQuery, + ) -> StorageResult>> { + let search_index = db.collection::(MongoBackend::SEARCH_INDEX_COLLECTION); + let mut matched: Option> = None; + + for param in &query.parameters { + if matches!(param.name.as_str(), "_id" | "_lastUpdated") { + continue; + } + + let filter = self.build_search_index_filter(tenant_id, resource_type, param)?; + + let ids = search_index + .distinct("resource_id", filter) + .await + .map_err(|e| internal_error(format!("Failed to query search_index: {}", e)))? + .into_iter() + .filter_map(|value| value.as_str().map(ToString::to_string)) + .collect::>(); + + if ids.is_empty() { + return Ok(Some(HashSet::new())); + } + + matched = Some(match matched { + Some(current) => current + .intersection(&ids) + .cloned() + .collect::>(), + None => ids, + }); + + if matched.as_ref().is_some_and(|set| set.is_empty()) { + return Ok(matched); + } + } + + Ok(matched) + } + + fn build_search_index_filter( + &self, + tenant_id: &str, + resource_type: &str, + param: &SearchParameter, + ) -> StorageResult { + if param.values.is_empty() { + return Err(StorageError::Search(SearchError::QueryParseError { + message: format!("Search parameter '{}' has no values", param.name), + })); + } + + let mut filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "param_name": ¶m.name, + }; + + let value_filters = param + .values + .iter() + .map(|value| self.build_index_value_filter(param, value)) + .collect::>>()?; + + if value_filters.len() == 1 { + if let Some(single) = value_filters.into_iter().next() { + for (key, value) in single { + filter.insert(key, value); + } + } + return Ok(filter); + } + + let combine_with_and = matches!(param.param_type, SearchParamType::Date | SearchParamType::Number); + let operator = if combine_with_and { "$and" } else { "$or" }; + filter.insert( + operator, + Bson::Array(value_filters.into_iter().map(Bson::Document).collect()), + ); + + Ok(filter) + } + + fn build_index_value_filter( + &self, + param: &SearchParameter, + value: &SearchValue, + ) -> StorageResult { + match param.name.as_str() { + "_text" | "_content" => { + return Err(StorageError::Search(SearchError::TextSearchNotAvailable)); + } + "_id" | "_lastUpdated" => { + return Err(StorageError::Search(SearchError::QueryParseError { + message: format!( + "Special parameter '{}' should be resolved against resources, not search_index", + param.name + ), + })); + } + _ => {} + } + + match param.param_type { + SearchParamType::String => self.build_string_filter(param, value), + SearchParamType::Token => self.build_token_filter(param, value), + SearchParamType::Date => self.build_date_filter(value, "value_date"), + SearchParamType::Number => self.build_number_filter(value), + SearchParamType::Reference => self.build_reference_filter(param, value), + SearchParamType::Uri => self.build_uri_filter(param, value), + SearchParamType::Quantity => Err(StorageError::Search(SearchError::UnsupportedParameterType { + param_type: "quantity".to_string(), + })), + SearchParamType::Composite => Err(StorageError::Search(SearchError::InvalidComposite { + message: "Composite search is not supported in MongoDB Phase 4".to_string(), + })), + SearchParamType::Special => Err(StorageError::Search(SearchError::UnsupportedParameterType { + param_type: format!("special parameter {}", param.name), + })), + } + } + + fn build_string_filter( + &self, + param: &SearchParameter, + value: &SearchValue, + ) -> StorageResult { + if value.prefix != SearchPrefix::Eq { + return Err(StorageError::Search(SearchError::QueryParseError { + message: format!( + "Unsupported prefix '{}' for string parameter '{}'", + value.prefix, + param.name + ), + })); + } + + let lowered = value.value.to_lowercase(); + match param.modifier.as_ref() { + None => Ok(doc! { + "value_string": { + "$regex": format!("^{}", regex_escape(&lowered)) + } + }), + Some(SearchModifier::Exact) => Ok(doc! { "value_string": lowered }), + Some(SearchModifier::Contains) => Ok(doc! { + "value_string": { + "$regex": regex_escape(&lowered) + } + }), + Some(other) => Err(StorageError::Search(SearchError::UnsupportedModifier { + modifier: other.to_string(), + param_type: "string".to_string(), + })), + } + } + + fn build_token_filter( + &self, + param: &SearchParameter, + value: &SearchValue, + ) -> StorageResult { + if value.prefix != SearchPrefix::Eq { + return Err(StorageError::Search(SearchError::QueryParseError { + message: format!( + "Unsupported prefix '{}' for token parameter '{}'", + value.prefix, + param.name + ), + })); + } + + match param.modifier.as_ref() { + None | Some(SearchModifier::CodeOnly) => {} + Some(other) => { + return Err(StorageError::Search(SearchError::UnsupportedModifier { + modifier: other.to_string(), + param_type: "token".to_string(), + })); + } + } + + if let Some((system, code)) = value.value.split_once('|') { + if system.is_empty() { + Ok(doc! { "value_token_code": code }) + } else if code.is_empty() { + Ok(doc! { "value_token_system": system }) + } else { + Ok(doc! { + "value_token_system": system, + "value_token_code": code, + }) + } + } else { + Ok(doc! { "value_token_code": &value.value }) + } + } + + fn build_reference_filter( + &self, + param: &SearchParameter, + value: &SearchValue, + ) -> StorageResult { + if value.prefix != SearchPrefix::Eq { + return Err(StorageError::Search(SearchError::QueryParseError { + message: format!( + "Unsupported prefix '{}' for reference parameter '{}'", + value.prefix, + param.name + ), + })); + } + + if let Some(modifier) = ¶m.modifier { + return Err(StorageError::Search(SearchError::UnsupportedModifier { + modifier: modifier.to_string(), + param_type: "reference".to_string(), + })); + } + + if value.value.contains('/') { + return Ok(doc! { "value_reference": &value.value }); + } + + Ok(doc! { + "$or": [ + { "value_reference": &value.value }, + { + "value_reference": { + "$regex": format!("/{}$", regex_escape(&value.value)) + } + } + ] + }) + } + + fn build_uri_filter( + &self, + param: &SearchParameter, + value: &SearchValue, + ) -> StorageResult { + if value.prefix != SearchPrefix::Eq { + return Err(StorageError::Search(SearchError::QueryParseError { + message: format!( + "Unsupported prefix '{}' for uri parameter '{}'", + value.prefix, + param.name + ), + })); + } + + match param.modifier.as_ref() { + None | Some(SearchModifier::Exact) => Ok(doc! { "value_uri": &value.value }), + Some(SearchModifier::Contains) => Ok(doc! { + "value_uri": { + "$regex": regex_escape(&value.value) + } + }), + Some(other) => Err(StorageError::Search(SearchError::UnsupportedModifier { + modifier: other.to_string(), + param_type: "uri".to_string(), + })), + } + } + + fn build_date_filter(&self, value: &SearchValue, field: &str) -> StorageResult { + let parsed = parse_date_for_query(&value.value).ok_or_else(|| { + StorageError::Search(SearchError::QueryParseError { + message: format!("Invalid date value '{}'", value.value), + }) + })?; + + let bson_date = chrono_to_bson(parsed); + + match value.prefix { + SearchPrefix::Ap => { + let lower = chrono_to_bson(parsed - chrono::Duration::hours(12)); + let upper = chrono_to_bson(parsed + chrono::Duration::hours(12)); + Ok(doc! { + field: { + "$gte": lower, + "$lte": upper, + } + }) + } + _ => { + let op = Self::prefix_to_mongo_operator(value.prefix)?; + Ok(doc! { + field: { + op: bson_date, + } + }) + } + } + } + + fn build_number_filter(&self, value: &SearchValue) -> StorageResult { + let parsed = value + .value + .parse::() + .map_err(|e| StorageError::Search(SearchError::QueryParseError { + message: format!("Invalid number value '{}': {}", value.value, e), + }))?; + + match value.prefix { + SearchPrefix::Ap => { + let delta = (parsed.abs() * 0.1).max(0.1); + Ok(doc! { + "value_number": { + "$gte": parsed - delta, + "$lte": parsed + delta, + } + }) + } + _ => { + let op = Self::prefix_to_mongo_operator(value.prefix)?; + Ok(doc! { + "value_number": { + op: parsed, + } + }) + } + } + } + + fn prefix_to_mongo_operator(prefix: SearchPrefix) -> StorageResult<&'static str> { + match prefix { + SearchPrefix::Eq => Ok("$eq"), + SearchPrefix::Ne => Ok("$ne"), + SearchPrefix::Gt | SearchPrefix::Sa => Ok("$gt"), + SearchPrefix::Lt | SearchPrefix::Eb => Ok("$lt"), + SearchPrefix::Ge => Ok("$gte"), + SearchPrefix::Le => Ok("$lte"), + SearchPrefix::Ap => Ok("$eq"), + } + } + + fn build_resource_filter( + &self, + tenant_id: &str, + resource_type: &str, + query: &SearchQuery, + matched_ids: Option<&HashSet>, + cursor: Option<&PageCursor>, + ) -> StorageResult { + let mut conditions = vec![doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "is_deleted": false, + }]; + + if let Some(ids) = matched_ids { + let id_values = ids + .iter() + .cloned() + .map(Bson::String) + .collect::>(); + conditions.push(doc! { + "id": { "$in": Bson::Array(id_values) } + }); + } + + for param in &query.parameters { + match param.name.as_str() { + "_id" => { + conditions.push(self.build_resource_id_condition(param)?); + } + "_lastUpdated" => { + conditions.extend(self.build_resource_last_updated_conditions(param)?); + } + _ => {} + } + } + + if let Some(cursor) = cursor { + conditions.push(self.build_cursor_condition(cursor)?); + } + + if conditions.len() == 1 { + return Ok(conditions.remove(0)); + } + + Ok(doc! { + "$and": Bson::Array(conditions.into_iter().map(Bson::Document).collect()) + }) + } + + fn build_resource_id_condition(&self, param: &SearchParameter) -> StorageResult { + let mut ids = Vec::new(); + + for value in ¶m.values { + if value.prefix != SearchPrefix::Eq { + return Err(StorageError::Search(SearchError::QueryParseError { + message: format!( + "Unsupported prefix '{}' for _id parameter", + value.prefix + ), + })); + } + ids.push(value.value.clone()); + } + + if ids.len() == 1 { + return Ok(doc! { "id": ids.remove(0) }); + } + + Ok(doc! { + "id": { "$in": Bson::Array(ids.into_iter().map(Bson::String).collect()) } + }) + } + + fn build_resource_last_updated_conditions( + &self, + param: &SearchParameter, + ) -> StorageResult> { + param + .values + .iter() + .map(|value| self.build_date_filter(value, "last_updated")) + .collect() + } + + fn build_cursor_condition(&self, cursor: &PageCursor) -> StorageResult { + let timestamp = match cursor.sort_values().first() { + Some(CursorValue::String(value)) => { + DateTime::parse_from_rfc3339(value) + .map_err(|_| { + StorageError::Search(SearchError::InvalidCursor { + cursor: cursor.encode(), + }) + })? + .with_timezone(&Utc) + } + _ => { + return Err(StorageError::Search(SearchError::InvalidCursor { + cursor: cursor.encode(), + })); + } + }; + + let ts = chrono_to_bson(timestamp); + let id = cursor.resource_id().to_string(); + + if cursor.direction() == CursorDirection::Previous { + Ok(doc! { + "$or": [ + { "last_updated": { "$gt": ts } }, + { "last_updated": ts, "id": { "$gt": id } } + ] + }) + } else { + Ok(doc! { + "$or": [ + { "last_updated": { "$lt": ts } }, + { "last_updated": ts, "id": { "$lt": id } } + ] + }) + } + } + + fn build_sort_document(&self, query: &SearchQuery, previous_mode: bool) -> StorageResult { + if query.sort.is_empty() { + return Ok(if previous_mode { + doc! { "last_updated": 1_i32, "id": 1_i32 } + } else { + doc! { "last_updated": -1_i32, "id": -1_i32 } + }); + } + + let mut sort = Document::new(); + + for directive in &query.sort { + let field = match directive.parameter.as_str() { + "_lastUpdated" => "last_updated", + "_id" | "id" => "id", + other => { + return Err(StorageError::Search(SearchError::UnsupportedParameterType { + param_type: format!("sort parameter '{}'", other), + })); + } + }; + + let mut dir = if directive.direction == crate::types::SortDirection::Descending { + -1_i32 + } else { + 1_i32 + }; + + if previous_mode { + dir = -dir; + } + + sort.insert(field, dir); + } + + if !sort.contains_key("id") { + sort.insert("id", if previous_mode { 1_i32 } else { -1_i32 }); + } + + Ok(sort) + } + + fn document_to_stored_resource( + &self, + tenant: &TenantContext, + fallback_resource_type: &str, + doc: Document, + ) -> StorageResult { + let resource_type = doc + .get_str("resource_type") + .ok() + .unwrap_or(fallback_resource_type) + .to_string(); + + let id = doc + .get_str("id") + .map_err(|e| internal_error(format!("Missing resource id in search result: {}", e)))? + .to_string(); + + let version_id = doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing version_id in search result: {}", e)))? + .to_string(); + + let payload = doc + .get_document("data") + .map_err(|e| internal_error(format!("Missing resource payload in search result: {}", e)))?; + + let content = bson::from_bson::(Bson::Document(payload.clone())) + .map_err(|e| serialization_error(format!("Failed to deserialize resource payload: {}", e)))?; + + let now = Utc::now(); + let created_at = doc + .get_datetime("created_at") + .map(bson_to_chrono) + .unwrap_or(now); + + let last_updated = doc + .get_datetime("last_updated") + .map(bson_to_chrono) + .unwrap_or(created_at); + + let deleted_at = match doc.get("deleted_at") { + Some(Bson::DateTime(value)) => Some(bson_to_chrono(value)), + _ => None, + }; + + let fhir_version = doc + .get_str("fhir_version") + .ok() + .and_then(FhirVersion::from_storage) + .unwrap_or_default(); + + Ok(StoredResource::from_storage( + resource_type, + id, + version_id, + tenant.tenant_id().clone(), + content, + created_at, + last_updated, + deleted_at, + fhir_version, + )) + } + + async fn find_matching_resources( + &self, + tenant: &TenantContext, + resource_type: &str, + search_params_str: &str, + ) -> StorageResult> { + let parsed_params = parse_simple_search_params(search_params_str); + + if parsed_params.is_empty() { + return Ok(Vec::new()); + } + + let search_params = self.build_search_parameters(resource_type, &parsed_params); + + let query = SearchQuery { + resource_type: resource_type.to_string(), + parameters: search_params, + count: Some(1000), + ..Default::default() + }; + + let result = ::search(self, tenant, &query).await?; + Ok(result.resources.items) + } + + fn build_search_parameters( + &self, + resource_type: &str, + params: &[(String, String)], + ) -> Vec { + let registry = self.search_registry().read(); + + params + .iter() + .map(|(name, value)| { + let param_type = self + .lookup_param_type(®istry, resource_type, name) + .unwrap_or_else(|| match name.as_str() { + "_id" => SearchParamType::Token, + "_lastUpdated" => SearchParamType::Date, + "_tag" | "_profile" | "_security" => SearchParamType::Token, + "identifier" => SearchParamType::Token, + "patient" | "subject" | "encounter" | "performer" | "author" + | "requester" | "recorder" | "asserter" | "practitioner" + | "organization" | "location" | "device" => SearchParamType::Reference, + _ => SearchParamType::String, + }); + + SearchParameter { + name: name.clone(), + param_type, + modifier: None, + values: vec![SearchValue::parse(value)], + chain: vec![], + components: vec![], + } + }) + .collect() + } + + fn lookup_param_type( + &self, + registry: &crate::search::SearchParameterRegistry, + resource_type: &str, + param_name: &str, + ) -> Option { + if let Some(def) = registry.get_param(resource_type, param_name) { + return Some(def.param_type); + } + + if let Some(def) = registry.get_param("Resource", param_name) { + return Some(def.param_type); + } + + None + } +} diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs index af875f79..fe173daf 100644 --- a/crates/persistence/src/backends/mongodb/storage.rs +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -4,7 +4,7 @@ use async_trait::async_trait; use chrono::{DateTime, Utc}; use helios_fhir::FhirVersion; use mongodb::{ - ClientSession, Collection, Cursor, + ClientSession, Cursor, bson::{self, Bson, DateTime as BsonDateTime, Document, doc}, error::Error as MongoError, }; @@ -510,6 +510,13 @@ impl ResourceStorage for MongoBackend { .map_err(|e| internal_error(format!("Failed to insert resource history: {}", e)))?; } + self.index_resource(&db, tenant_id, resource_type, &id, &resource, &mut session) + .await?; + + if resource_type == "SearchParameter" { + self.handle_search_parameter_create(&resource)?; + } + commit_best_effort_multi_write_session(&mut session, transaction_active, "create").await?; Ok(StoredResource::from_storage( @@ -773,6 +780,13 @@ impl ResourceStorage for MongoBackend { })?; } + self.index_resource(&db, tenant_id, resource_type, id, &resource, &mut session) + .await?; + + if resource_type == "SearchParameter" { + self.handle_search_parameter_update(current.content(), &resource)?; + } + commit_best_effort_multi_write_session(&mut session, transaction_active, "update").await?; Ok(StoredResource::from_storage( @@ -844,6 +858,7 @@ impl ResourceStorage for MongoBackend { .get_document("data") .map_err(|e| internal_error(format!("Missing resource payload: {}", e)))? .clone(); + let resource_value = document_to_value(&payload)?; let fhir_version = existing_doc .get_str("fhir_version") .unwrap_or("4.0") @@ -921,6 +936,13 @@ impl ResourceStorage for MongoBackend { })?; } + self.delete_search_index(&db, tenant_id, resource_type, id, &mut session) + .await?; + + if resource_type == "SearchParameter" { + self.handle_search_parameter_delete(&resource_value)?; + } + commit_best_effort_multi_write_session(&mut session, transaction_active, "delete").await?; Ok(()) @@ -991,6 +1013,308 @@ impl ResourceStorage for MongoBackend { } } +impl MongoBackend { + pub(crate) async fn index_resource( + &self, + db: &mongodb::Database, + tenant_id: &str, + resource_type: &str, + resource_id: &str, + resource: &Value, + session: &mut Option, + ) -> StorageResult<()> { + if self.is_search_offloaded() { + return Ok(()); + } + + self.delete_search_index(db, tenant_id, resource_type, resource_id, session) + .await?; + + let index_docs = match self.search_extractor().extract(resource, resource_type) { + Ok(values) => values + .iter() + .filter_map(|value| { + self.build_search_index_document(tenant_id, resource_type, resource_id, value) + }) + .collect::>(), + Err(e) => { + tracing::warn!( + "Search extraction failed for {}/{}: {}. Using minimal fallback index values.", + resource_type, + resource_id, + e + ); + self.index_minimal_fallback_documents(tenant_id, resource_type, resource_id, resource) + } + }; + + if index_docs.is_empty() { + return Ok(()); + } + + let collection = db.collection::(MongoBackend::SEARCH_INDEX_COLLECTION); + + if let Some(active_session) = session.as_mut() { + collection + .insert_many(index_docs) + .session(active_session) + .await + .map_err(|e| internal_error(format!("Failed to insert search index entries: {}", e)))?; + } else { + collection + .insert_many(index_docs) + .await + .map_err(|e| internal_error(format!("Failed to insert search index entries: {}", e)))?; + } + + Ok(()) + } + + pub(crate) async fn delete_search_index( + &self, + db: &mongodb::Database, + tenant_id: &str, + resource_type: &str, + resource_id: &str, + session: &mut Option, + ) -> StorageResult<()> { + if self.is_search_offloaded() { + return Ok(()); + } + + let collection = db.collection::(MongoBackend::SEARCH_INDEX_COLLECTION); + let filter = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "resource_id": resource_id, + }; + + if let Some(active_session) = session.as_mut() { + collection + .delete_many(filter) + .session(active_session) + .await + .map_err(|e| internal_error(format!("Failed to delete search index entries: {}", e)))?; + } else { + collection + .delete_many(filter) + .await + .map_err(|e| internal_error(format!("Failed to delete search index entries: {}", e)))?; + } + + Ok(()) + } + + fn build_search_index_document( + &self, + tenant_id: &str, + resource_type: &str, + resource_id: &str, + value: &ExtractedValue, + ) -> Option { + let mut doc = doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "resource_id": resource_id, + "param_name": &value.param_name, + "param_url": &value.param_url, + }; + + match &value.value { + IndexValue::String(v) => { + doc.insert("value_string", v.to_lowercase()); + } + IndexValue::Token { + system, + code, + display, + identifier_type_system, + identifier_type_code, + } => { + if let Some(system) = system { + doc.insert("value_token_system", system.clone()); + } + doc.insert("value_token_code", code.clone()); + if let Some(display) = display { + doc.insert("value_token_display", display.clone()); + } + if let Some(type_system) = identifier_type_system { + doc.insert("value_identifier_type_system", type_system.clone()); + } + if let Some(type_code) = identifier_type_code { + doc.insert("value_identifier_type_code", type_code.clone()); + } + } + IndexValue::Date { + value: date, + precision, + } => { + let normalized = match normalize_date_for_mongo(date) { + Some(v) => v, + None => { + tracing::warn!( + "Skipping invalid date index value '{}' for parameter '{}'", + date, + value.param_name + ); + return None; + } + }; + doc.insert("value_date", chrono_to_bson(normalized)); + doc.insert("value_date_precision", precision.to_string()); + } + IndexValue::Number(v) => { + doc.insert("value_number", *v); + } + IndexValue::Quantity { + value, + unit, + system, + .. + } => { + doc.insert("value_quantity_value", *value); + if let Some(unit) = unit { + doc.insert("value_quantity_unit", unit.clone()); + } + if let Some(system) = system { + doc.insert("value_quantity_system", system.clone()); + } + } + IndexValue::Reference { reference, .. } => { + doc.insert("value_reference", reference.clone()); + } + IndexValue::Uri(uri) => { + doc.insert("value_uri", uri.clone()); + } + } + + if let Some(group) = value.composite_group { + doc.insert("composite_group", group as i32); + } + + Some(doc) + } + + fn index_minimal_fallback_documents( + &self, + tenant_id: &str, + resource_type: &str, + resource_id: &str, + resource: &Value, + ) -> Vec { + let mut docs = Vec::new(); + + let resource_id_value = resource + .get("id") + .and_then(|v| v.as_str()) + .unwrap_or(resource_id); + + docs.push(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "resource_id": resource_id, + "param_name": "_id", + "param_url": "http://hl7.org/fhir/SearchParameter/Resource-id", + "value_token_code": resource_id_value, + }); + + if let Some(last_updated) = resource + .get("meta") + .and_then(|meta| meta.get("lastUpdated")) + .and_then(|v| v.as_str()) + .and_then(normalize_date_for_mongo) + { + docs.push(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "resource_id": resource_id, + "param_name": "_lastUpdated", + "param_url": "http://hl7.org/fhir/SearchParameter/Resource-lastUpdated", + "value_date": chrono_to_bson(last_updated), + }); + } + + docs + } + + fn handle_search_parameter_create(&self, resource: &Value) -> StorageResult<()> { + let loader = SearchParameterLoader::new(self.config().fhir_version); + + match loader.parse_resource(resource) { + Ok(def) => { + if def.status == SearchParameterStatus::Active { + let mut registry = self.search_registry().write(); + if let Err(e) = registry.register(def) { + tracing::debug!("SearchParameter registration skipped: {}", e); + } + } + } + Err(e) => { + tracing::warn!("Failed to parse SearchParameter for registry update: {}", e); + } + } + + Ok(()) + } + + fn handle_search_parameter_update( + &self, + old_resource: &Value, + new_resource: &Value, + ) -> StorageResult<()> { + let loader = SearchParameterLoader::new(self.config().fhir_version); + + let old_def = loader.parse_resource(old_resource).ok(); + let new_def = loader.parse_resource(new_resource).ok(); + + match (old_def, new_def) { + (Some(old), Some(new)) => { + let mut registry = self.search_registry().write(); + + if old.url != new.url { + let _ = registry.unregister(&old.url); + if new.status == SearchParameterStatus::Active { + let _ = registry.register(new); + } + } else if old.status != new.status { + if let Err(e) = registry.update_status(&new.url, new.status) { + tracing::debug!("SearchParameter status update skipped: {}", e); + } + } else { + let _ = registry.unregister(&old.url); + if new.status == SearchParameterStatus::Active { + let _ = registry.register(new); + } + } + } + (None, Some(new)) => { + if new.status == SearchParameterStatus::Active { + let mut registry = self.search_registry().write(); + let _ = registry.register(new); + } + } + (Some(old), None) => { + let mut registry = self.search_registry().write(); + let _ = registry.unregister(&old.url); + } + (None, None) => {} + } + + Ok(()) + } + + fn handle_search_parameter_delete(&self, resource: &Value) -> StorageResult<()> { + if let Some(url) = resource.get("url").and_then(|v| v.as_str()) { + let mut registry = self.search_registry().write(); + if let Err(e) = registry.unregister(url) { + tracing::debug!("SearchParameter unregistration skipped: {}", e); + } + } + + Ok(()) + } +} + #[async_trait] impl VersionedStorage for MongoBackend { async fn vread( diff --git a/crates/persistence/tests/common/capabilities.rs b/crates/persistence/tests/common/capabilities.rs index e09d7df8..20d59ebd 100644 --- a/crates/persistence/tests/common/capabilities.rs +++ b/crates/persistence/tests/common/capabilities.rs @@ -163,9 +163,9 @@ impl CapabilityMatrix { (BackendCapability::InstanceHistory, SupportLevel::Implemented), (BackendCapability::TypeHistory, SupportLevel::Implemented), (BackendCapability::SystemHistory, SupportLevel::Implemented), - (BackendCapability::BasicSearch, SupportLevel::Planned), - (BackendCapability::DateSearch, SupportLevel::Planned), - (BackendCapability::ReferenceSearch, SupportLevel::Planned), + (BackendCapability::BasicSearch, SupportLevel::Implemented), + (BackendCapability::DateSearch, SupportLevel::Implemented), + (BackendCapability::ReferenceSearch, SupportLevel::Implemented), (BackendCapability::ChainedSearch, SupportLevel::Planned), (BackendCapability::ReverseChaining, SupportLevel::Planned), (BackendCapability::Include, SupportLevel::Planned), @@ -174,9 +174,9 @@ impl CapabilityMatrix { (BackendCapability::TerminologySearch, SupportLevel::Planned), (BackendCapability::Transactions, SupportLevel::Planned), (BackendCapability::OptimisticLocking, SupportLevel::Implemented), - (BackendCapability::CursorPagination, SupportLevel::Planned), - (BackendCapability::OffsetPagination, SupportLevel::Planned), - (BackendCapability::Sorting, SupportLevel::Planned), + (BackendCapability::CursorPagination, SupportLevel::Implemented), + (BackendCapability::OffsetPagination, SupportLevel::Implemented), + (BackendCapability::Sorting, SupportLevel::Implemented), (BackendCapability::BulkExport, SupportLevel::Planned), (BackendCapability::SharedSchema, SupportLevel::Implemented), (BackendCapability::SchemaPerTenant, SupportLevel::NotPlanned), diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index 4a491612..d20b5d8b 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -11,11 +11,14 @@ use helios_fhir::FhirVersion; use helios_persistence::backends::mongodb::{MongoBackend, MongoBackendConfig}; use helios_persistence::core::{ - Backend, BackendCapability, BackendKind, HistoryParams, InstanceHistoryProvider, - ResourceStorage, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, + Backend, BackendCapability, BackendKind, ConditionalCreateResult, ConditionalDeleteResult, + ConditionalStorage, ConditionalUpdateResult, HistoryParams, InstanceHistoryProvider, + PatchFormat, ResourceStorage, SearchProvider, SystemHistoryProvider, TypeHistoryProvider, + VersionedStorage, }; use helios_persistence::error::{BackendError, ConcurrencyError, ResourceError, StorageError}; use helios_persistence::tenant::{TenantContext, TenantId, TenantPermissions}; +use helios_persistence::types::{SearchParamType, SearchParameter, SearchQuery, SearchValue, SortDirective}; use serde_json::json; const MONGODB_MAX_DATABASE_NAME_LEN: usize = 63; @@ -77,7 +80,7 @@ fn test_mongodb_integration_database_name_within_limit() { } #[test] -fn test_mongodb_phase3_capabilities() { +fn test_mongodb_phase4_capabilities() { let backend = MongoBackend::new(MongoBackendConfig::default()).unwrap(); assert_eq!(backend.kind(), BackendKind::MongoDB); @@ -88,10 +91,15 @@ fn test_mongodb_phase3_capabilities() { assert!(backend.supports(BackendCapability::InstanceHistory)); assert!(backend.supports(BackendCapability::TypeHistory)); assert!(backend.supports(BackendCapability::SystemHistory)); + assert!(backend.supports(BackendCapability::BasicSearch)); + assert!(backend.supports(BackendCapability::DateSearch)); + assert!(backend.supports(BackendCapability::ReferenceSearch)); + assert!(backend.supports(BackendCapability::Sorting)); + assert!(backend.supports(BackendCapability::OffsetPagination)); + assert!(backend.supports(BackendCapability::CursorPagination)); assert!(backend.supports(BackendCapability::OptimisticLocking)); assert!(backend.supports(BackendCapability::SharedSchema)); - assert!(!backend.supports(BackendCapability::BasicSearch)); assert!(!backend.supports(BackendCapability::Transactions)); } @@ -595,3 +603,275 @@ async fn mongodb_integration_history_delete_trial_use_not_supported() { )) )); } + +#[tokio::test] +async fn mongodb_integration_search_token_string_and_offset_pagination() { + let Some(backend) = create_backend("search_token_string").await else { + eprintln!( + "Skipping mongodb_integration_search_token_string_and_offset_pagination (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search"); + + for (id, mrn, family) in [ + ("patient-search-1", "MRN-SEARCH-1", "Smith"), + ("patient-search-2", "MRN-SEARCH-2", "Smiley"), + ("patient-search-3", "MRN-SEARCH-3", "Jones"), + ] { + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": id, + "identifier": [{"system": "http://hospital.org/mrn", "value": mrn}], + "name": [{"family": family}], + }), + FhirVersion::default(), + ) + .await + .unwrap(); + } + + let token_query = SearchQuery::new("Patient").with_parameter(SearchParameter { + name: "identifier".to_string(), + param_type: SearchParamType::Token, + modifier: None, + values: vec![SearchValue::eq("http://hospital.org/mrn|MRN-SEARCH-1")], + chain: vec![], + components: vec![], + }); + + let token_result = backend.search(&tenant, &token_query).await.unwrap(); + assert_eq!(token_result.resources.items.len(), 1); + assert_eq!(token_result.resources.items[0].id(), "patient-search-1"); + + let mut string_query = SearchQuery::new("Patient") + .with_parameter(SearchParameter { + name: "name".to_string(), + param_type: SearchParamType::String, + modifier: None, + values: vec![SearchValue::eq("Smi")], + chain: vec![], + components: vec![], + }) + .with_sort(SortDirective::parse("_id")) + .with_count(1); + + let first_page = backend.search(&tenant, &string_query).await.unwrap(); + assert_eq!(first_page.resources.items.len(), 1); + assert!(first_page.resources.page_info.has_next); + let first_id = first_page.resources.items[0].id().to_string(); + + string_query.offset = Some(1); + let second_page = backend.search(&tenant, &string_query).await.unwrap(); + assert_eq!(second_page.resources.items.len(), 1); + assert_ne!(second_page.resources.items[0].id(), first_id); +} + +#[tokio::test] +async fn mongodb_integration_conditional_create_exists() { + let Some(backend) = create_backend("conditional_create").await else { + eprintln!("Skipping mongodb_integration_conditional_create_exists (set HFS_TEST_MONGODB_URL)"); + return; + }; + + let tenant = create_tenant("tenant-conditional-create"); + + let created = backend + .conditional_create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "identifier": [{"system": "http://hospital.org/mrn", "value": "MRN-COND-1"}], + "name": [{"family": "Original"}], + }), + "identifier=http://hospital.org/mrn|MRN-COND-1", + FhirVersion::default(), + ) + .await + .unwrap(); + + let created_id = match created { + ConditionalCreateResult::Created(resource) => resource.id().to_string(), + other => panic!("expected Created result, got {:?}", other), + }; + + let second = backend + .conditional_create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "identifier": [{"system": "http://hospital.org/mrn", "value": "MRN-COND-1"}], + "name": [{"family": "Duplicate"}], + }), + "identifier=http://hospital.org/mrn|MRN-COND-1", + FhirVersion::default(), + ) + .await + .unwrap(); + + match second { + ConditionalCreateResult::Exists(existing) => assert_eq!(existing.id(), created_id), + other => panic!("expected Exists result, got {:?}", other), + } +} + +#[tokio::test] +async fn mongodb_integration_conditional_update_delete_and_no_match() { + let Some(backend) = create_backend("conditional_update_delete").await else { + eprintln!( + "Skipping mongodb_integration_conditional_update_delete_and_no_match (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-conditional-update-delete"); + + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "patient-cond-update", + "identifier": [{"system": "http://hospital.org/mrn", "value": "MRN-COND-UPDATE"}], + "name": [{"family": "Before"}], + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let updated = backend + .conditional_update( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "patient-cond-update", + "identifier": [{"system": "http://hospital.org/mrn", "value": "MRN-COND-UPDATE"}], + "name": [{"family": "After"}], + }), + "identifier=http://hospital.org/mrn|MRN-COND-UPDATE", + false, + FhirVersion::default(), + ) + .await + .unwrap(); + + let updated_id = match updated { + ConditionalUpdateResult::Updated(resource) => { + assert_eq!(resource.content()["name"][0]["family"], "After"); + resource.id().to_string() + } + other => panic!("expected Updated result, got {:?}", other), + }; + + let deleted = backend + .conditional_delete( + &tenant, + "Patient", + "identifier=http://hospital.org/mrn|MRN-COND-UPDATE", + ) + .await + .unwrap(); + assert!(matches!(deleted, ConditionalDeleteResult::Deleted)); + + let no_match = backend + .conditional_delete( + &tenant, + "Patient", + "identifier=http://hospital.org/mrn|MRN-COND-UPDATE", + ) + .await + .unwrap(); + assert!(matches!(no_match, ConditionalDeleteResult::NoMatch)); + + let read_after_delete = backend.read(&tenant, "Patient", &updated_id).await; + assert!(matches!( + read_after_delete, + Err(StorageError::Resource(ResourceError::Gone { .. })) + )); +} + +#[tokio::test] +async fn mongodb_integration_conditional_create_multiple_matches() { + let Some(backend) = create_backend("conditional_multiple_matches").await else { + eprintln!( + "Skipping mongodb_integration_conditional_create_multiple_matches (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-conditional-multi"); + + for (id, system) in [ + ("patient-cond-multi-1", "http://system-a.org"), + ("patient-cond-multi-2", "http://system-b.org"), + ] { + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": id, + "identifier": [{"system": system, "value": "SHARED-VALUE"}], + }), + FhirVersion::default(), + ) + .await + .unwrap(); + } + + let result = backend + .conditional_create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "identifier": [{"value": "SHARED-VALUE"}], + }), + "identifier=SHARED-VALUE", + FhirVersion::default(), + ) + .await + .unwrap(); + + match result { + ConditionalCreateResult::MultipleMatches(count) => assert_eq!(count, 2), + other => panic!("expected MultipleMatches result, got {:?}", other), + } +} + +#[tokio::test] +async fn mongodb_integration_conditional_patch_not_supported() { + let Some(backend) = create_backend("conditional_patch_not_supported").await else { + eprintln!("Skipping mongodb_integration_conditional_patch_not_supported (set HFS_TEST_MONGODB_URL)"); + return; + }; + + let tenant = create_tenant("tenant-conditional-patch"); + + let result = backend + .conditional_patch( + &tenant, + "Patient", + "identifier=http://hospital.org/mrn|MRN-COND-PATCH", + &PatchFormat::MergePatch(json!({ "active": true })), + ) + .await; + + assert!(matches!( + result, + Err(StorageError::Backend( + BackendError::UnsupportedCapability { .. } + )) + )); +} From 6a3a866ba1ba95ad161e6c489326b20268bc3ab3 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Mon, 9 Mar 2026 16:01:44 -0400 Subject: [PATCH 10/17] test(persistence): add MongoDB search parameter lifecycle and cursor pagination tests Add 6 new integration tests covering search parameter registration/unregistration on create/update/delete (active/draft/retired status handling) and bidirectional cursor-based pagination with has_next/has_previous flags. Tests verify registry state changes and cursor roundtrip navigation. --- crates/persistence/tests/mongodb_tests.rs | 272 ++++++++++++++++++++++ 1 file changed, 272 insertions(+) diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index d20b5d8b..fa518532 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -17,6 +17,7 @@ use helios_persistence::core::{ VersionedStorage, }; use helios_persistence::error::{BackendError, ConcurrencyError, ResourceError, StorageError}; +use helios_persistence::search::SearchParameterStatus; use helios_persistence::tenant::{TenantContext, TenantId, TenantPermissions}; use helios_persistence::types::{SearchParamType, SearchParameter, SearchQuery, SearchValue, SortDirective}; use serde_json::json; @@ -672,6 +673,76 @@ async fn mongodb_integration_search_token_string_and_offset_pagination() { assert_ne!(second_page.resources.items[0].id(), first_id); } +#[tokio::test] +async fn mongodb_integration_search_cursor_pagination_roundtrip() { + let Some(backend) = create_backend("search_cursor_roundtrip").await else { + eprintln!( + "Skipping mongodb_integration_search_cursor_pagination_roundtrip (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-cursor"); + + for id in ["patient-cursor-1", "patient-cursor-2", "patient-cursor-3"] { + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": id, + "name": [{"family": format!("Cursor-{}", id)}], + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + tokio::time::sleep(std::time::Duration::from_millis(2)).await; + } + + let query = SearchQuery::new("Patient").with_count(1); + + let page1 = backend.search(&tenant, &query).await.unwrap(); + assert_eq!(page1.resources.items.len(), 1); + assert!(page1.resources.page_info.has_next); + assert!(!page1.resources.page_info.has_previous); + + let first_id = page1.resources.items[0].id().to_string(); + let next_cursor = page1 + .resources + .page_info + .next_cursor + .clone() + .expect("first page should include next cursor"); + + let page2 = backend + .search(&tenant, &query.clone().with_cursor(next_cursor)) + .await + .unwrap(); + + assert_eq!(page2.resources.items.len(), 1); + assert!(page2.resources.page_info.has_previous); + let second_id = page2.resources.items[0].id().to_string(); + assert_ne!(second_id, first_id); + + let previous_cursor = page2 + .resources + .page_info + .previous_cursor + .clone() + .expect("second page should include previous cursor"); + + let page_back = backend + .search(&tenant, &query.with_cursor(previous_cursor)) + .await + .unwrap(); + + assert_eq!(page_back.resources.items.len(), 1); + assert_eq!(page_back.resources.items[0].id(), first_id.as_str()); +} + #[tokio::test] async fn mongodb_integration_conditional_create_exists() { let Some(backend) = create_backend("conditional_create").await else { @@ -875,3 +946,204 @@ async fn mongodb_integration_conditional_patch_not_supported() { )) )); } + +#[tokio::test] +async fn mongodb_integration_search_parameter_create_registers_active() { + let Some(backend) = create_backend("search_param_create_active").await else { + eprintln!( + "Skipping mongodb_integration_search_parameter_create_registers_active (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-param-create-active"); + + backend + .create( + &tenant, + "SearchParameter", + json!({ + "resourceType": "SearchParameter", + "id": "mongo-custom-patient-nickname", + "url": "http://example.org/fhir/SearchParameter/mongo-custom-patient-nickname", + "name": "MongoPatientNickname", + "status": "active", + "code": "mongo-nickname", + "base": ["Patient"], + "type": "string", + "expression": "Patient.name.where(use='nickname').given" + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let registry = backend.search_registry().read(); + let param = registry.get_param("Patient", "mongo-nickname"); + assert!(param.is_some(), "Active SearchParameter should be registered"); + + let param = param.unwrap(); + assert_eq!( + param.url, + "http://example.org/fhir/SearchParameter/mongo-custom-patient-nickname" + ); + assert_eq!(param.status, SearchParameterStatus::Active); +} + +#[tokio::test] +async fn mongodb_integration_search_parameter_create_draft_not_registered() { + let Some(backend) = create_backend("search_param_create_draft").await else { + eprintln!( + "Skipping mongodb_integration_search_parameter_create_draft_not_registered (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-param-create-draft"); + + backend + .create( + &tenant, + "SearchParameter", + json!({ + "resourceType": "SearchParameter", + "id": "mongo-custom-draft-param", + "url": "http://example.org/fhir/SearchParameter/mongo-custom-draft-param", + "name": "MongoDraftParam", + "status": "draft", + "code": "mongo-draft", + "base": ["Patient"], + "type": "string", + "expression": "Patient.extension('draft')" + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let registry = backend.search_registry().read(); + let param = registry.get_param("Patient", "mongo-draft"); + assert!( + param.is_none(), + "Draft SearchParameter should not be registered" + ); +} + +#[tokio::test] +async fn mongodb_integration_search_parameter_update_status_change() { + let Some(backend) = create_backend("search_param_update_status").await else { + eprintln!( + "Skipping mongodb_integration_search_parameter_update_status_change (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-param-update-status"); + + let created = backend + .create( + &tenant, + "SearchParameter", + json!({ + "resourceType": "SearchParameter", + "id": "mongo-custom-status-change", + "url": "http://example.org/fhir/SearchParameter/mongo-custom-status-change", + "name": "MongoStatusChange", + "status": "active", + "code": "mongo-statuschange", + "base": ["Condition"], + "type": "token", + "expression": "Condition.code" + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + { + let registry = backend.search_registry().read(); + let param = registry.get_param("Condition", "mongo-statuschange"); + assert!(param.is_some(), "Parameter should be registered after create"); + assert_eq!( + param.unwrap().status, + SearchParameterStatus::Active, + "Initial status should be active" + ); + } + + backend + .update( + &tenant, + &created, + json!({ + "resourceType": "SearchParameter", + "id": "mongo-custom-status-change", + "url": "http://example.org/fhir/SearchParameter/mongo-custom-status-change", + "name": "MongoStatusChange", + "status": "retired", + "code": "mongo-statuschange", + "base": ["Condition"], + "type": "token", + "expression": "Condition.code" + }), + ) + .await + .unwrap(); + + let registry = backend.search_registry().read(); + let param = registry.get_param("Condition", "mongo-statuschange"); + assert!(param.is_some(), "Parameter should still exist in registry"); + assert_eq!( + param.unwrap().status, + SearchParameterStatus::Retired, + "Status should be updated to retired" + ); +} + +#[tokio::test] +async fn mongodb_integration_search_parameter_delete_unregisters() { + let Some(backend) = create_backend("search_param_delete_unregister").await else { + eprintln!( + "Skipping mongodb_integration_search_parameter_delete_unregisters (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-param-delete"); + + backend + .create( + &tenant, + "SearchParameter", + json!({ + "resourceType": "SearchParameter", + "id": "mongo-custom-to-delete", + "url": "http://example.org/fhir/SearchParameter/mongo-custom-to-delete", + "name": "MongoToDelete", + "status": "active", + "code": "mongo-todelete", + "base": ["Observation"], + "type": "token", + "expression": "Observation.code" + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + { + let registry = backend.search_registry().read(); + assert!(registry.get_param("Observation", "mongo-todelete").is_some()); + } + + backend + .delete(&tenant, "SearchParameter", "mongo-custom-to-delete") + .await + .unwrap(); + + let registry = backend.search_registry().read(); + assert!( + registry.get_param("Observation", "mongo-todelete").is_none(), + "Deleted SearchParameter should be unregistered" + ); +} From 90a43e247f71a26636e5637084e5d007298851d3 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Mon, 9 Mar 2026 16:20:32 -0400 Subject: [PATCH 11/17] feat(roadmap): mark MongoDB phases 2-4 as completed with integration test coverage Update roadmap status from planned to completed for phases 2-4 (core storage parity, versioning/history/conditional semantics, search/indexing/conditional operations). Update last_updated to 2026-03-09 and progress summary to reflect full implementation and validation through MongoDB integration tests. --- roadmap_mongo.xml | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml index e9ea8083..96ee44b2 100644 --- a/roadmap_mongo.xml +++ b/roadmap_mongo.xml @@ -5,9 +5,10 @@ in-progress TBD date-agnostic - 2026-03-05 - Dedicated detailed roadmaps now exist through Phase 4; Phase 4 search/indexing plus - conditional semantics are captured in phase4_roadmap.xml. + 2026-03-09 + Phases 1 through 4 are completed: backend wiring, core storage parity, version/history + semantics, and Phase 4 search/indexing plus conditional create/update/delete coverage are now + implemented and validated in MongoDB integration tests. SQLite primary, PostgreSQL primary, Elasticsearch secondary @@ -86,7 +87,7 @@ - + Reach minimum parity with SQLite/PostgreSQL ResourceStorage behavior. Implement create/read/update/delete/exists/count/read_batch/create_or_update. @@ -104,7 +105,7 @@ - + Implement FHIR version/history semantics and concurrency expectations. Implement VersionedStorage (vread + update_with_match semantics). @@ -130,7 +131,7 @@ - + Support FHIR search behavior and enable conditional create/update/delete with clear native/offloaded boundaries. From 6b2d4573b5b371f6705ade8ebcd2a9e3fe5816e6 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Mon, 9 Mar 2026 19:09:58 -0400 Subject: [PATCH 12/17] feat(hfs): add MongoDB and MongoDB+Elasticsearch backend runtime support Add MongoDB standalone and MongoDB+Elasticsearch composite modes to HFS server. Implement start_mongodb() and start_mongodb_elasticsearch() functions with connection string/env detection, schema initialization, and search offloading. Add empty Elasticsearch node validation for all composite modes. Update documentation to reflect MongoDB Phase 5 completion (CRUD, versioning, search, conditional ops, composite integration). Add El --- crates/hfs/src/main.rs | 188 +++++- crates/persistence/README.md | 139 ++--- .../src/backends/elasticsearch/backend.rs | 46 ++ .../src/backends/mongodb/storage.rs | 33 +- crates/persistence/src/composite/storage.rs | 570 +++++++++++++++++- .../tests/composite_routing_tests.rs | 64 ++ crates/persistence/tests/mongodb_tests.rs | 225 ++++++- crates/rest/src/config.rs | 52 +- phase5_roadmap.xml | 401 ++++++++++++ roadmap_mongo.xml | 24 +- 10 files changed, 1622 insertions(+), 120 deletions(-) create mode 100644 phase5_roadmap.xml diff --git a/crates/hfs/src/main.rs b/crates/hfs/src/main.rs index c5f3ee0c..e402eb76 100644 --- a/crates/hfs/src/main.rs +++ b/crates/hfs/src/main.rs @@ -10,8 +10,11 @@ //! | SQLite + Elasticsearch | `sqlite,elasticsearch` | SQLite for CRUD, Elasticsearch for search | //! | PostgreSQL | `postgres` | Full-featured RDBMS with JSONB storage and tsvector search | //! | PostgreSQL + Elasticsearch | `postgres,elasticsearch` | PostgreSQL for CRUD, Elasticsearch for search | +//! | MongoDB | `mongodb` | Document database with native JSON resource storage | +//! | MongoDB + Elasticsearch | `mongodb,elasticsearch` | MongoDB for CRUD, Elasticsearch for search | //! -//! Set `HFS_STORAGE_BACKEND` to `sqlite`, `sqlite-elasticsearch`, `postgres`, or `postgres-elasticsearch`. +//! Set `HFS_STORAGE_BACKEND` to `sqlite`, `sqlite-elasticsearch`, `postgres`, +//! `postgres-elasticsearch`, `mongodb`, or `mongodb-elasticsearch`. use clap::Parser; use helios_rest::{ServerConfig, StorageBackendMode, create_app_with_config, init_logging}; @@ -20,6 +23,9 @@ use tracing::info; #[cfg(feature = "sqlite")] use helios_persistence::backends::sqlite::{SqliteBackend, SqliteBackendConfig}; +#[cfg(feature = "mongodb")] +use helios_persistence::backends::mongodb::MongoBackend; + /// Creates and initializes a SQLite backend from the server configuration. #[cfg(feature = "sqlite")] fn create_sqlite_backend(config: &ServerConfig) -> anyhow::Result { @@ -42,6 +48,39 @@ fn create_sqlite_backend(config: &ServerConfig) -> anyhow::Result Ok(backend) } +/// Starts the server with MongoDB backend. +#[cfg(feature = "mongodb")] +async fn start_mongodb(config: ServerConfig) -> anyhow::Result<()> { + let backend = if let Some(ref url) = config.database_url { + if url.starts_with("mongodb://") || url.starts_with("mongodb+srv://") { + info!(url = %url, "Initializing MongoDB backend from connection string"); + MongoBackend::from_connection_string(url)? + } else { + info!( + "Initializing MongoDB backend from environment variables (database_url is not MongoDB URI)" + ); + MongoBackend::from_env()? + } + } else { + info!("Initializing MongoDB backend from environment variables"); + MongoBackend::from_env()? + }; + + backend.init_schema().await?; + + let app = create_app_with_config(backend, config.clone()); + serve(app, &config).await +} + +/// Fallback when mongodb feature is not enabled. +#[cfg(not(feature = "mongodb"))] +async fn start_mongodb(_config: ServerConfig) -> anyhow::Result<()> { + anyhow::bail!( + "The mongodb backend requires the 'mongodb' feature. \ + Build with: cargo build -p helios-hfs --features mongodb" + ) +} + /// Starts the Axum HTTP server. async fn serve(app: axum::Router, config: &ServerConfig) -> anyhow::Result<()> { let addr = config.socket_addr(); @@ -88,6 +127,12 @@ async fn main() -> anyhow::Result<()> { StorageBackendMode::PostgresElasticsearch => { start_postgres_elasticsearch(config).await?; } + StorageBackendMode::MongoDB => { + start_mongodb(config).await?; + } + StorageBackendMode::MongoDBElasticsearch => { + start_mongodb_elasticsearch(config).await?; + } } Ok(()) @@ -136,6 +181,12 @@ async fn start_sqlite_elasticsearch(config: ServerConfig) -> anyhow::Result<()> .filter(|s| !s.is_empty()) .collect(); + if es_nodes.is_empty() { + anyhow::bail!( + "sqlite-elasticsearch mode requires at least one Elasticsearch node in HFS_ELASTICSEARCH_NODES" + ); + } + let es_auth = match ( &config.elasticsearch_username, &config.elasticsearch_password, @@ -291,6 +342,12 @@ async fn start_postgres_elasticsearch(config: ServerConfig) -> anyhow::Result<() .filter(|s| !s.is_empty()) .collect(); + if es_nodes.is_empty() { + anyhow::bail!( + "postgres-elasticsearch mode requires at least one Elasticsearch node in HFS_ELASTICSEARCH_NODES" + ); + } + let es_auth = match ( &config.elasticsearch_username, &config.elasticsearch_password, @@ -370,5 +427,134 @@ async fn start_postgres_elasticsearch(_config: ServerConfig) -> anyhow::Result<( ) } +/// Starts the server with MongoDB + Elasticsearch composite backend. +#[cfg(all(feature = "mongodb", feature = "elasticsearch"))] +async fn start_mongodb_elasticsearch(config: ServerConfig) -> anyhow::Result<()> { + use std::collections::HashMap; + use std::sync::Arc; + + use helios_persistence::backends::elasticsearch::{ + ElasticsearchAuth, ElasticsearchBackend, ElasticsearchConfig, + }; + use helios_persistence::composite::{CompositeConfig, CompositeStorage}; + use helios_persistence::core::BackendKind; + + // Create MongoDB backend + let backend = if let Some(ref url) = config.database_url { + if url.starts_with("mongodb://") || url.starts_with("mongodb+srv://") { + info!(url = %url, "Initializing MongoDB backend from connection string"); + MongoBackend::from_connection_string(url)? + } else { + info!( + "Initializing MongoDB backend from environment variables (database_url is not MongoDB URI)" + ); + MongoBackend::from_env()? + } + } else { + info!("Initializing MongoDB backend from environment variables"); + MongoBackend::from_env()? + }; + + backend.init_schema().await?; + + // Offload search to Elasticsearch + let mut backend = backend; + backend.set_search_offloaded(true); + let mongo = Arc::new(backend); + info!("MongoDB search indexing disabled (offloaded to Elasticsearch)"); + + // Build Elasticsearch configuration from server config + let es_nodes: Vec = config + .elasticsearch_nodes + .split(',') + .map(|s| s.trim().to_string()) + .filter(|s| !s.is_empty()) + .collect(); + + if es_nodes.is_empty() { + anyhow::bail!( + "mongodb-elasticsearch mode requires at least one Elasticsearch node in HFS_ELASTICSEARCH_NODES" + ); + } + + let es_auth = match ( + &config.elasticsearch_username, + &config.elasticsearch_password, + ) { + (Some(username), Some(password)) => Some(ElasticsearchAuth::Basic { + username: username.clone(), + password: password.clone(), + }), + _ => None, + }; + + let es_config = ElasticsearchConfig { + nodes: es_nodes.clone(), + index_prefix: config.elasticsearch_index_prefix.clone(), + auth: es_auth, + fhir_version: config.default_fhir_version, + ..Default::default() + }; + + info!( + nodes = ?es_nodes, + index_prefix = %config.elasticsearch_index_prefix, + "Initializing Elasticsearch backend" + ); + + // Create ES backend sharing MongoDB's search parameter registry + let es = Arc::new(ElasticsearchBackend::with_shared_registry( + es_config, + mongo.search_registry().clone(), + )?); + + // Build composite configuration + let composite_config = CompositeConfig::builder() + .primary("mongodb", BackendKind::MongoDB) + .search_backend("es", BackendKind::Elasticsearch) + .build()?; + + // Build backends map for CompositeStorage + let mut backends = HashMap::new(); + backends.insert( + "mongodb".to_string(), + mongo.clone() as helios_persistence::composite::DynStorage, + ); + backends.insert( + "es".to_string(), + es.clone() as helios_persistence::composite::DynStorage, + ); + + // Build search providers map + let mut search_providers = HashMap::new(); + search_providers.insert( + "mongodb".to_string(), + mongo.clone() as helios_persistence::composite::DynSearchProvider, + ); + search_providers.insert( + "es".to_string(), + es.clone() as helios_persistence::composite::DynSearchProvider, + ); + + // Create composite storage with full primary capabilities + let composite = CompositeStorage::new(composite_config, backends)? + .with_search_providers(search_providers) + .with_full_primary(mongo); + + info!("Composite storage initialized: MongoDB (primary) + Elasticsearch (search)"); + + let app = create_app_with_config(composite, config.clone()); + serve(app, &config).await +} + +/// Fallback when mongodb+elasticsearch features are not both enabled. +#[cfg(not(all(feature = "mongodb", feature = "elasticsearch")))] +async fn start_mongodb_elasticsearch(_config: ServerConfig) -> anyhow::Result<()> { + anyhow::bail!( + "The mongodb-elasticsearch backend requires both 'mongodb' and 'elasticsearch' features. \ + Build with: cargo build -p helios-hfs --features mongodb,elasticsearch" + ) +} + #[cfg(not(any(feature = "sqlite", feature = "postgres", feature = "mongodb")))] compile_error!("At least one database backend feature must be enabled"); diff --git a/crates/persistence/README.md b/crates/persistence/README.md index 777ff3f1..dc5becae 100644 --- a/crates/persistence/README.md +++ b/crates/persistence/README.md @@ -303,7 +303,7 @@ The matrix below shows which FHIR operations each backend supports. This reflect **Legend:** ✓ Implemented | ◐ Partial | ○ Planned | ✗ Not planned | † Requires external service -> **MongoDB Status:** Phase 3 core storage semantics are implemented (CRUD, vread/history, optimistic locking, tenant isolation), including best-effort session-backed consistency for multi-write flows where deployment topology permits. Search and conditional operations remain planned, and full transaction bundle semantics remain planned. +> **MongoDB Status:** MongoDB primary support is implemented through Phase 5: CRUD, vread/history, optimistic locking, tenant isolation, Phase 4 native search for the implemented parameter surface, conditional create/update/delete, and best-effort session-backed consistency for multi-write flows where deployment topology permits. MongoDB + Elasticsearch composite mode is implemented with write-primary/read-primary/search-secondary routing. Conditional patch, full transaction bundle semantics, delete-history Trial Use operations, and advanced search features outside the implemented Phase 4 surface remain partial or planned. | Feature | SQLite | PostgreSQL | MongoDB | Cassandra | Neo4j | Elasticsearch | S3 | | --------------------------------------------------------------------------- | ------ | ---------- | ------- | --------- | ----- | ------------- | --- | @@ -311,12 +311,12 @@ The matrix below shows which FHIR operations each backend supports. This reflect | [CRUD](https://build.fhir.org/http.html#crud) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | | [Versioning (vread)](https://build.fhir.org/http.html#vread) | ✓ | ✓ | ✓ | ○ | ○ | ○ | ○ | | [Optimistic Locking](https://build.fhir.org/http.html#concurrency) | ✓ | ✓ | ✓ | ○ | ○ | ✗ | ✗ | -| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ○ | ○ | ✗ | ○ | +| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | | [Type History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | | [System History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | | [Batch Bundles](https://build.fhir.org/http.html#batch) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | | [Transaction Bundles](https://build.fhir.org/http.html#transaction) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | -| [Conditional Operations](https://build.fhir.org/http.html#cond-update) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | +| [Conditional Operations](https://build.fhir.org/http.html#cond-update) | ✓ | ✓ | ✓ | ✗ | ○ | ○ | ✗ | | [Conditional Patch](https://build.fhir.org/http.html#patch) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | | [Delete History](https://build.fhir.org/http.html#delete) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | | **Multitenancy** | @@ -325,13 +325,13 @@ The matrix below shows which FHIR operations each backend supports. This reflect | Database-per-Tenant | ✓ | ○ | ○ | ○ | ○ | ○ | ○ | | Row-Level Security | ✗ | ○ | ✗ | ✗ | ✗ | ✗ | ✗ | | **[Search Parameters](https://build.fhir.org/search.html#ptypes)** | -| [String](https://build.fhir.org/search.html#string) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| [Token](https://build.fhir.org/search.html#token) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ✗ | -| [Reference](https://build.fhir.org/search.html#reference) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| [Date](https://build.fhir.org/search.html#date) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | -| [Number](https://build.fhir.org/search.html#number) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ○ | +| [String](https://build.fhir.org/search.html#string) | ✓ | ✓ | ✓ | ✗ | ○ | ✓ | ✗ | +| [Token](https://build.fhir.org/search.html#token) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ✗ | +| [Reference](https://build.fhir.org/search.html#reference) | ✓ | ✓ | ✓ | ✗ | ○ | ✓ | ✗ | +| [Date](https://build.fhir.org/search.html#date) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | +| [Number](https://build.fhir.org/search.html#number) | ✓ | ✓ | ✓ | ✗ | ○ | ✓ | ○ | | [Quantity](https://build.fhir.org/search.html#quantity) | ✓ | ✓ | ○ | ✗ | ✗ | ✓ | ○ | -| [URI](https://build.fhir.org/search.html#uri) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| [URI](https://build.fhir.org/search.html#uri) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | | [Composite](https://build.fhir.org/search.html#composite) | ✓ | ○ | ○ | ✗ | ○ | ✓ | ✗ | | **[Search Modifiers](https://build.fhir.org/search.html#modifiers)** | | [:exact](https://build.fhir.org/search.html#modifiers) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | @@ -353,11 +353,11 @@ The matrix below shows which FHIR operations each backend supports. This reflect | [\_include](https://build.fhir.org/search.html#include) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | | [\_revinclude](https://build.fhir.org/search.html#revinclude) | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | | **[Pagination](https://build.fhir.org/http.html#paging)** | -| Offset | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| Cursor (keyset) | ✓ | ✓ | ○ | ○ | ○ | ✓ | ○ | +| Offset | ✓ | ✓ | ✓ | ✗ | ○ | ✓ | ✗ | +| Cursor (keyset) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | | **[Sorting](https://build.fhir.org/search.html#sort)** | -| Single field | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | -| Multiple fields | ✓ | ✓ | ○ | ✗ | ○ | ✓ | ✗ | +| Single field | ✓ | ✓ | ✓ | ✗ | ○ | ✓ | ✗ | +| Multiple fields | ✓ | ✓ | ✓ | ✗ | ○ | ✓ | ✗ | | **[Bulk Operations](https://hl7.org/fhir/uv/bulkdata/)** | | [Bulk Export](https://hl7.org/fhir/uv/bulkdata/export.html) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | | [Bulk Submit](https://hackmd.io/@argonaut/rJoqHZrPle) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | @@ -366,18 +366,19 @@ The matrix below shows which FHIR operations each backend supports. This reflect Backends can serve as primary (CRUD, versioning, transactions) or secondary (optimized for specific query patterns). When a secondary search backend is configured, the primary backend's search indexing is automatically disabled to avoid data duplication. -| Configuration | Primary | Secondary | Status | Use Case | -| -------------------------- | ---------- | ---------------------- | ------------------------------------ | --------------------------------------- | -| SQLite alone | SQLite | — | ✓ Implemented | Development, testing, small deployments | -| SQLite + Elasticsearch | SQLite | Elasticsearch (search) | ✓ Implemented | Small prod with robust search | -| PostgreSQL alone | PostgreSQL | — | ✓ Implemented | Production OLTP | -| PostgreSQL + Elasticsearch | PostgreSQL | Elasticsearch (search) | ✓ Implemented | OLTP + advanced search | -| PostgreSQL + Neo4j | PostgreSQL | Neo4j (graph) | Planned | Graph-heavy queries | -| Cassandra alone | Cassandra | — | Planned | High write throughput | -| Cassandra + Elasticsearch | Cassandra | Elasticsearch (search) | Planned | Write-heavy + search | -| MongoDB alone | MongoDB | — | ◐ In progress (Phase 3 core storage) | Document-centric | -| S3 alone | S3 | — | Planned | Archival/bulk storage | -| S3 + Elasticsearch | S3 | Elasticsearch (search) | Planned | Large-scale + search | +| Configuration | Primary | Secondary | Status | Use Case | +| -------------------------- | ---------- | ---------------------- | ------------- | --------------------------------------- | +| SQLite alone | SQLite | — | ✓ Implemented | Development, testing, small deployments | +| SQLite + Elasticsearch | SQLite | Elasticsearch (search) | ✓ Implemented | Small prod with robust search | +| PostgreSQL alone | PostgreSQL | — | ✓ Implemented | Production OLTP | +| PostgreSQL + Elasticsearch | PostgreSQL | Elasticsearch (search) | ✓ Implemented | OLTP + advanced search | +| PostgreSQL + Neo4j | PostgreSQL | Neo4j (graph) | Planned | Graph-heavy queries | +| Cassandra alone | Cassandra | — | Planned | High write throughput | +| Cassandra + Elasticsearch | Cassandra | Elasticsearch (search) | Planned | Write-heavy + search | +| MongoDB alone | MongoDB | — | ✓ Implemented | Document-centric | +| MongoDB + Elasticsearch | MongoDB | Elasticsearch (search) | ✓ Implemented | Document-centric + offloaded search | +| S3 alone | S3 | — | Planned | Archival/bulk storage | +| S3 + Elasticsearch | S3 | Elasticsearch (search) | Planned | Large-scale + search | ### Backend Selection Guide @@ -516,66 +517,7 @@ HFS_ELASTICSEARCH_NODES=http://localhost:9200 \ ### How Search Offloading Works -When `HFS_STORAGE_BACKEND` is set to `sqlite-elasticsearch` or `postgres-elasticsearch`, the server: - -1. Creates the primary backend (SQLite or PostgreSQL) with search indexing **disabled** -2. Creates an Elasticsearch backend sharing the primary backend's search parameter registry -3. Wraps both in a `CompositeStorage` that routes: - - All **writes** (create, update, delete, conditional ops, transactions) → primary backend, then syncs to ES - - All **reads** (read, vread, history) → primary backend - - All **search** operations → Elasticsearch - -This avoids data duplication in the primary backend's search tables while providing Elasticsearch's superior search capabilities. - -## Elasticsearch Backend - -The Elasticsearch backend serves as a search-optimized secondary in the composite storage layer. It handles all search parameter indexing, full-text search, and query execution when configured alongside a primary backend. - -### Configuration - -```rust -use helios_persistence::backends::elasticsearch::ElasticsearchConfig; - -let config = ElasticsearchConfig { - nodes: vec!["http://localhost:9200".to_string()], - index_prefix: "hfs".to_string(), - username: None, - password: None, - timeout: std::time::Duration::from_secs(30), - number_of_shards: 1, - number_of_replicas: 1, - max_result_window: 10000, - refresh_interval: "1s".to_string(), -}; -``` - -| Option | Default | Description | -| ----------------------- | --------------------------- | ------------------------------------------- | -| `nodes` | `["http://localhost:9200"]` | Elasticsearch node URLs | -| `index_prefix` | `"hfs"` | Prefix for all index names | -| `username` / `password` | `None` | Basic authentication credentials | -| `timeout` | `30s` | Request timeout | -| `number_of_shards` | `1` | Number of primary shards per index | -| `number_of_replicas` | `1` | Number of replica shards per index | -| `max_result_window` | `10000` | Maximum `from + size` for offset pagination | -| `refresh_interval` | `"1s"` | How often new documents become searchable | - -### Index Structure - -Each tenant + resource type combination gets its own index: `{prefix}_{tenant_id}_{resource_type}` (e.g., `hfs_acme_patient`). - -Documents contain: - -- **Metadata**: `resource_type`, `resource_id`, `tenant_id`, `version_id`, `last_updated`, `is_deleted` -- **Content**: Raw FHIR JSON (stored but not indexed) -- **Full-text fields**: `narrative_text` (from `text.div`), `content_text` (all string values) -- **Search parameters**: Nested objects for each parameter type (`string`, `token`, `date`, `number`, `quantity`, `reference`, `uri`, `composite`) - -All search parameter fields use `"type": "nested"` to ensure correct multi-value matching (e.g., system and code must co-occur in the same token object). - -### Search Offloading - -When Elasticsearch is configured as a search secondary, the primary backend automatically disables its own search index population. For a SQLite + Elasticsearch configuration: +When Elasticsearch is configured as a search secondary, the primary backend automatically disables its own search index population. This applies to both SQLite + Elasticsearch and MongoDB + Elasticsearch composite configurations. For a SQLite + Elasticsearch configuration: - SQLite stores only the FHIR resource (the `resources` and `resource_history` tables) - SQLite does **not** populate `search_index` or `resource_fts` tables @@ -800,7 +742,9 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [ ] Cassandra backend (wide-column, partition keys) - [x] MongoDB Phase 1 scaffold (module wiring, config, Backend trait baseline) - [x] MongoDB Phase 2 core storage parity (CRUD/count/read_batch/create_or_update, tenant isolation, soft-delete, schema bootstrap) -- [ ] MongoDB Phase 3+ advanced semantics (versioning/history/conditional/transactions/search execution) +- [x] MongoDB Phase 3 versioning/history plus best-effort session-backed consistency +- [x] MongoDB Phase 4 native search, pagination/sorting, and conditional create/update/delete +- [x] MongoDB Phase 5 composite MongoDB + Elasticsearch integration and runtime wiring - [ ] Neo4j backend (graph queries, Cypher) - [ ] S3 backend (bulk export, object storage) @@ -831,16 +775,19 @@ The composite storage layer enables polyglot persistence by coordinating multipl ### Valid Backend Configurations -| Configuration | Primary | Secondary(s) | Status | Use Case | -| ------------------ | ---------- | ------------- | ------------- | ------------------------------ | -| SQLite-only | SQLite | None | ✓ Implemented | Development, small deployments | -| SQLite + ES | SQLite | Elasticsearch | ✓ Implemented | Small prod with robust search | -| PostgreSQL-only | PostgreSQL | None | ✓ Implemented | Production OLTP | -| PostgreSQL + ES | PostgreSQL | Elasticsearch | ✓ Implemented | OLTP + advanced search | -| PostgreSQL + Neo4j | PostgreSQL | Neo4j | Planned | Graph-heavy queries | -| S3 + ES | S3 | Elasticsearch | Planned | Large-scale, cheap storage | - -> **MongoDB Note:** Phase 2 core storage is implemented under `src/backends/mongodb`, but runtime `HFS_STORAGE_BACKEND` modes for MongoDB are not yet enabled (planned in a later phase). +| Configuration | Primary | Secondary(s) | Status | Use Case | +| ------------------ | ---------- | ------------- | ------------- | --------------------------------------- | +| SQLite-only | SQLite | None | ✓ Implemented | Development, testing, small deployments | +| SQLite + ES | SQLite | Elasticsearch | ✓ Implemented | Small prod with robust search | +| PostgreSQL-only | PostgreSQL | None | ✓ Implemented | Production OLTP | +| PostgreSQL + ES | PostgreSQL | Elasticsearch | ✓ Implemented | OLTP + advanced search | +| PostgreSQL + Neo4j | PostgreSQL | Neo4j | Planned | Graph-heavy queries | +| MongoDB-only | MongoDB | None | ✓ Implemented | Document-centric primary | +| MongoDB + ES | MongoDB | Elasticsearch | ✓ Implemented | Document-centric + search | +| S3 alone | S3 | — | Planned | Archival/bulk storage | +| S3 + ES | S3 | Elasticsearch | Planned | Large-scale + search | + +> **MongoDB Note:** Runtime `HFS_STORAGE_BACKEND` now supports both `mongodb` and `mongodb-elasticsearch`. In composite mode, MongoDB remains the canonical write/read store while Elasticsearch owns delegated search execution. ### Quick Start diff --git a/crates/persistence/src/backends/elasticsearch/backend.rs b/crates/persistence/src/backends/elasticsearch/backend.rs index ad0689bd..4caf4dc9 100644 --- a/crates/persistence/src/backends/elasticsearch/backend.rs +++ b/crates/persistence/src/backends/elasticsearch/backend.rs @@ -577,6 +577,7 @@ impl ElasticsearchBackend { #[cfg(test)] mod tests { use super::*; + use serde_json::json; #[test] fn test_config_defaults() { @@ -626,4 +627,49 @@ mod tests { assert_eq!(backend.kind(), BackendKind::Elasticsearch); assert_eq!(backend.name(), "elasticsearch"); } + + #[test] + fn test_with_shared_registry_reuses_arc() { + let config = ElasticsearchConfig::default(); + let shared_registry = Arc::new(RwLock::new(SearchParameterRegistry::new())); + + let backend = + ElasticsearchBackend::with_shared_registry(config, shared_registry.clone()).unwrap(); + + assert!(Arc::ptr_eq(backend.search_registry(), &shared_registry)); + } + + #[test] + fn test_with_shared_registry_reflects_runtime_updates() { + let config = ElasticsearchConfig::default(); + let shared_registry = Arc::new(RwLock::new(SearchParameterRegistry::new())); + let backend = + ElasticsearchBackend::with_shared_registry(config, shared_registry.clone()).unwrap(); + + let loader = SearchParameterLoader::new(FhirVersion::default()); + let definition = loader + .parse_resource(&json!({ + "resourceType": "SearchParameter", + "id": "mongo-shared-param", + "url": "http://example.org/fhir/SearchParameter/mongo-shared-param", + "name": "MongoSharedParam", + "status": "active", + "code": "mongo-shared-code", + "base": ["Patient"], + "type": "token", + "expression": "Patient.identifier" + })) + .expect("parse shared SearchParameter definition"); + + shared_registry + .write() + .register(definition) + .expect("register shared SearchParameter"); + + let registry = backend.search_registry().read(); + assert!( + registry.get_param("Patient", "mongo-shared-code").is_some(), + "shared registry updates should be visible to Elasticsearch backend" + ); + } } diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs index fe173daf..5c0e531b 100644 --- a/crates/persistence/src/backends/mongodb/storage.rs +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -11,10 +11,13 @@ use mongodb::{ use serde_json::Value; use crate::core::{ - HistoryEntry, HistoryMethod, HistoryPage, HistoryParams, InstanceHistoryProvider, - ResourceStorage, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, normalize_etag, + BundleEntry, BundleProvider, BundleResult, HistoryEntry, HistoryMethod, HistoryPage, + HistoryParams, InstanceHistoryProvider, ResourceStorage, SystemHistoryProvider, + TypeHistoryProvider, VersionedStorage, normalize_etag, +}; +use crate::error::{ + BackendError, ConcurrencyError, ResourceError, StorageError, StorageResult, TransactionError, }; -use crate::error::{BackendError, ConcurrencyError, ResourceError, StorageError, StorageResult}; use crate::search::converters::IndexValue; use crate::search::extractor::ExtractedValue; use crate::search::{SearchParameterLoader, SearchParameterStatus}; @@ -1741,3 +1744,27 @@ impl SystemHistoryProvider for MongoBackend { .map_err(|e| internal_error(format!("Failed to count system history: {}", e))) } } + +#[async_trait] +impl BundleProvider for MongoBackend { + async fn process_transaction( + &self, + _tenant: &TenantContext, + _entries: Vec, + ) -> Result { + Err(TransactionError::UnsupportedIsolationLevel { + level: "transaction bundles for mongodb".to_string(), + }) + } + + async fn process_batch( + &self, + _tenant: &TenantContext, + _entries: Vec, + ) -> StorageResult { + Err(StorageError::Backend(BackendError::UnsupportedCapability { + backend_name: "mongodb".to_string(), + capability: "BundleProvider".to_string(), + })) + } +} diff --git a/crates/persistence/src/composite/storage.rs b/crates/persistence/src/composite/storage.rs index 778f3753..7cb7e471 100644 --- a/crates/persistence/src/composite/storage.rs +++ b/crates/persistence/src/composite/storage.rs @@ -53,7 +53,8 @@ use crate::core::{ use crate::error::{BackendError, StorageError, StorageResult, TransactionError}; use crate::tenant::TenantContext; use crate::types::{ - IncludeDirective, Pagination, ReverseChainedParameter, SearchQuery, StoredResource, + IncludeDirective, Pagination, ReverseChainedParameter, SearchParamType, SearchParameter, + SearchQuery, SearchValue, StoredResource, }; use super::config::CompositeConfig; @@ -157,6 +158,67 @@ impl Default for BackendHealth { } impl CompositeStorage { + fn has_dedicated_search_backend(&self) -> bool { + self.config + .backends_with_role(super::config::BackendRole::Search) + .next() + .is_some() + } + + fn parse_simple_search_params(params: &str) -> Vec<(String, String)> { + params + .split('&') + .filter_map(|pair| { + let parts: Vec<&str> = pair.splitn(2, '=').collect(); + if parts.len() == 2 { + Some((parts[0].to_string(), parts[1].to_string())) + } else { + None + } + }) + .collect() + } + + fn infer_conditional_param_type(name: &str) -> SearchParamType { + match name { + "_id" => SearchParamType::Token, + "_lastUpdated" => SearchParamType::Date, + "_tag" | "_profile" | "_security" | "identifier" => SearchParamType::Token, + "patient" | "subject" | "encounter" | "performer" | "author" | "requester" + | "recorder" | "asserter" | "practitioner" | "organization" | "location" + | "device" => SearchParamType::Reference, + _ => SearchParamType::String, + } + } + + async fn find_conditional_matches( + &self, + tenant: &TenantContext, + resource_type: &str, + search_params: &str, + ) -> StorageResult> { + let parsed_params = Self::parse_simple_search_params(search_params); + if parsed_params.is_empty() { + return Ok(Vec::new()); + } + + let mut query = SearchQuery::new(resource_type); + query.count = Some(1000); + for (name, value) in parsed_params { + query = query.with_parameter(SearchParameter { + name: name.clone(), + param_type: Self::infer_conditional_param_type(&name), + modifier: None, + values: vec![SearchValue::parse(&value)], + chain: vec![], + components: vec![], + }); + } + + let result = self.search(tenant, &query).await?; + Ok(result.resources.items) + } + /// Creates a new composite storage with the given configuration and backends. /// /// # Arguments @@ -792,8 +854,18 @@ impl SearchProvider for CompositeStorage { tenant: &TenantContext, query: &SearchQuery, ) -> StorageResult { - // For count, we can just use primary - // A more sophisticated implementation might route based on features + // Prefer dedicated Search backend when configured, matching `search` routing. + if let Some(search_backend) = self + .config + .backends_with_role(super::config::BackendRole::Search) + .next() + { + if let Some(provider) = self.search_providers.get(&search_backend.id) { + return provider.search_count(tenant, query).await; + } + } + + // Fall back to primary provider. if let Some(provider) = self .search_providers .get(self.config.primary_id().unwrap_or("primary")) @@ -818,6 +890,40 @@ impl ConditionalStorage for CompositeStorage { search_params: &str, fhir_version: FhirVersion, ) -> StorageResult { + if self.has_dedicated_search_backend() { + let matches = self + .find_conditional_matches(tenant, resource_type, search_params) + .await?; + + return match matches.len() { + 0 => { + let created = self + .primary + .create(tenant, resource_type, resource, fhir_version) + .await?; + + if let Err(e) = self + .sync_to_secondaries(SyncEvent::Create { + resource_type: resource_type.to_string(), + resource_id: created.id().to_string(), + content: created.content().clone(), + tenant_id: tenant.tenant_id().clone(), + fhir_version, + }) + .await + { + warn!(error = %e, "Failed to sync conditional_create to secondaries"); + } + + Ok(ConditionalCreateResult::Created(created)) + } + 1 => Ok(ConditionalCreateResult::Exists( + matches.into_iter().next().expect("single match must exist"), + )), + n => Ok(ConditionalCreateResult::MultipleMatches(n)), + }; + } + let storage = self.conditional_storage.as_ref().ok_or_else(|| { StorageError::Backend(BackendError::UnsupportedCapability { backend_name: "composite".to_string(), @@ -857,6 +963,64 @@ impl ConditionalStorage for CompositeStorage { upsert: bool, fhir_version: FhirVersion, ) -> StorageResult { + if self.has_dedicated_search_backend() { + let matches = self + .find_conditional_matches(tenant, resource_type, search_params) + .await?; + + return match matches.len() { + 0 => { + if upsert { + let created = self + .primary + .create(tenant, resource_type, resource, fhir_version) + .await?; + + if let Err(e) = self + .sync_to_secondaries(SyncEvent::Create { + resource_type: resource_type.to_string(), + resource_id: created.id().to_string(), + content: created.content().clone(), + tenant_id: tenant.tenant_id().clone(), + fhir_version, + }) + .await + { + warn!( + error = %e, + "Failed to sync conditional_update create to secondaries" + ); + } + + Ok(ConditionalUpdateResult::Created(created)) + } else { + Ok(ConditionalUpdateResult::NoMatch) + } + } + 1 => { + let current = matches.into_iter().next().expect("single match must exist"); + let updated = self.primary.update(tenant, ¤t, resource).await?; + + if let Err(e) = self + .sync_to_secondaries(SyncEvent::Update { + resource_type: resource_type.to_string(), + resource_id: updated.id().to_string(), + content: updated.content().clone(), + tenant_id: tenant.tenant_id().clone(), + version: updated.version_id().to_string(), + fhir_version: updated.fhir_version(), + }) + .await + { + warn!(error = %e, "Failed to sync conditional_update to secondaries"); + } + + Ok(ConditionalUpdateResult::Updated(updated)) + } + n => Ok(ConditionalUpdateResult::MultipleMatches(n)), + }; + } + let storage = self.conditional_storage.as_ref().ok_or_else(|| { StorageError::Backend(BackendError::UnsupportedCapability { backend_name: "composite".to_string(), @@ -918,6 +1082,34 @@ impl ConditionalStorage for CompositeStorage { resource_type: &str, search_params: &str, ) -> StorageResult { + if self.has_dedicated_search_backend() { + let matches = self + .find_conditional_matches(tenant, resource_type, search_params) + .await?; + + return match matches.len() { + 0 => Ok(ConditionalDeleteResult::NoMatch), + 1 => { + let current = matches.into_iter().next().expect("single match must exist"); + self.primary.delete(tenant, resource_type, current.id()).await?; + + if let Err(e) = self + .sync_to_secondaries(SyncEvent::Delete { + resource_type: resource_type.to_string(), + resource_id: current.id().to_string(), + tenant_id: tenant.tenant_id().clone(), + }) + .await + { + warn!(error = %e, "Failed to sync conditional_delete to secondaries"); + } + + Ok(ConditionalDeleteResult::Deleted) + } + n => Ok(ConditionalDeleteResult::MultipleMatches(n)), + }; + } + let storage = self.conditional_storage.as_ref().ok_or_else(|| { StorageError::Backend(BackendError::UnsupportedCapability { backend_name: "composite".to_string(), @@ -1581,7 +1773,119 @@ impl CapabilityProvider for CompositeStorage { #[cfg(test)] mod tests { use super::*; + use async_trait::async_trait; + use helios_fhir::FhirVersion; + use serde_json::{Value, json}; use crate::core::BackendKind; + use crate::error::{BackendError, StorageError, StorageResult}; + use crate::tenant::{TenantContext, TenantId, TenantPermissions}; + use crate::types::{SearchParamType, SearchParameter, SearchQuery, SearchValue, StoredResource}; + + #[derive(Debug)] + struct FailingSearchBackend { + backend_name: &'static str, + error_message: &'static str, + } + + #[async_trait] + impl ResourceStorage for FailingSearchBackend { + fn backend_name(&self) -> &'static str { + self.backend_name + } + + async fn create( + &self, + _tenant: &TenantContext, + _resource_type: &str, + _resource: Value, + _fhir_version: FhirVersion, + ) -> StorageResult { + Err(StorageError::Backend(BackendError::UnsupportedCapability { + backend_name: self.backend_name.to_string(), + capability: "create".to_string(), + })) + } + + async fn create_or_update( + &self, + _tenant: &TenantContext, + _resource_type: &str, + _id: &str, + _resource: Value, + _fhir_version: FhirVersion, + ) -> StorageResult<(StoredResource, bool)> { + Err(StorageError::Backend(BackendError::UnsupportedCapability { + backend_name: self.backend_name.to_string(), + capability: "create_or_update".to_string(), + })) + } + + async fn read( + &self, + _tenant: &TenantContext, + _resource_type: &str, + _id: &str, + ) -> StorageResult> { + Ok(None) + } + + async fn update( + &self, + _tenant: &TenantContext, + _current: &StoredResource, + _resource: Value, + ) -> StorageResult { + Err(StorageError::Backend(BackendError::UnsupportedCapability { + backend_name: self.backend_name.to_string(), + capability: "update".to_string(), + })) + } + + async fn delete( + &self, + _tenant: &TenantContext, + _resource_type: &str, + _id: &str, + ) -> StorageResult<()> { + Err(StorageError::Backend(BackendError::UnsupportedCapability { + backend_name: self.backend_name.to_string(), + capability: "delete".to_string(), + })) + } + + async fn count( + &self, + _tenant: &TenantContext, + _resource_type: Option<&str>, + ) -> StorageResult { + Ok(0) + } + } + + #[async_trait] + impl SearchProvider for FailingSearchBackend { + async fn search( + &self, + _tenant: &TenantContext, + _query: &SearchQuery, + ) -> StorageResult { + Err(StorageError::Backend(BackendError::ConnectionFailed { + backend_name: self.backend_name.to_string(), + message: self.error_message.to_string(), + })) + } + + async fn search_count( + &self, + _tenant: &TenantContext, + _query: &SearchQuery, + ) -> StorageResult { + Err(StorageError::Backend(BackendError::ConnectionFailed { + backend_name: self.backend_name.to_string(), + message: self.error_message.to_string(), + })) + } + } fn test_config() -> CompositeConfig { CompositeConfig::builder() @@ -1605,4 +1909,264 @@ mod tests { assert_eq!(config.primary_id(), Some("sqlite")); assert_eq!(config.secondaries().count(), 1); } + + #[cfg(feature = "sqlite")] + #[tokio::test] + async fn test_search_prefers_configured_search_backend() { + use std::collections::HashMap; + use std::sync::Arc; + + use crate::backends::sqlite::SqliteBackend; + use crate::core::{ResourceStorage, SearchProvider}; + use crate::tenant::{TenantContext, TenantId, TenantPermissions}; + use crate::types::{SearchParamType, SearchParameter, SearchQuery, SearchValue}; + + let primary = Arc::new(SqliteBackend::in_memory().expect("create primary sqlite backend")); + primary.init_schema().expect("init primary sqlite schema"); + + let search = Arc::new(SqliteBackend::in_memory().expect("create search sqlite backend")); + search.init_schema().expect("init search sqlite schema"); + + let tenant = TenantContext::new(TenantId::new("composite-test"), TenantPermissions::full_access()); + + // Seed distinct data so we can tell which provider answered the query. + primary + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "primary-only-patient", + }), + FhirVersion::default(), + ) + .await + .expect("seed primary patient"); + + search + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "search-only-patient", + }), + FhirVersion::default(), + ) + .await + .expect("seed search patient"); + + let composite_config = CompositeConfig::builder() + .primary("primary", BackendKind::Sqlite) + .search_backend("search", BackendKind::Sqlite) + .build() + .expect("build composite config"); + + let mut backends = HashMap::new(); + backends.insert("primary".to_string(), primary.clone() as DynStorage); + backends.insert("search".to_string(), search.clone() as DynStorage); + + let mut search_providers = HashMap::new(); + search_providers.insert("primary".to_string(), primary.clone() as DynSearchProvider); + search_providers.insert("search".to_string(), search.clone() as DynSearchProvider); + + let composite = CompositeStorage::new(composite_config, backends) + .expect("create composite storage") + .with_search_providers(search_providers) + .with_full_primary(primary.clone()); + + let read_result = composite + .read(&tenant, "Patient", "primary-only-patient") + .await + .expect("composite read should succeed"); + assert!( + read_result.is_some(), + "Read path should use primary backend data" + ); + + let query = SearchQuery::new("Patient").with_parameter(SearchParameter { + name: "_id".to_string(), + param_type: SearchParamType::Token, + modifier: None, + values: vec![SearchValue::eq("search-only-patient")], + chain: vec![], + components: vec![], + }); + + let result = composite + .search(&tenant, &query) + .await + .expect("composite search should succeed"); + + assert_eq!(result.resources.len(), 1); + assert_eq!(result.resources.items[0].id(), "search-only-patient"); + + let count = composite + .search_count(&tenant, &query) + .await + .expect("composite search_count should succeed"); + assert_eq!(count, 1); + } + + #[cfg(feature = "sqlite")] + #[tokio::test] + async fn test_search_backend_preserves_tenant_isolation() { + use std::collections::HashMap; + use std::sync::Arc; + + use crate::backends::sqlite::SqliteBackend; + use crate::core::{ResourceStorage, SearchProvider}; + use crate::tenant::{TenantContext, TenantId, TenantPermissions}; + + let primary = Arc::new(SqliteBackend::in_memory().expect("create primary sqlite backend")); + primary.init_schema().expect("init primary sqlite schema"); + + let search = Arc::new(SqliteBackend::in_memory().expect("create search sqlite backend")); + search.init_schema().expect("init search sqlite schema"); + + let tenant_a = TenantContext::new(TenantId::new("tenant-a"), TenantPermissions::full_access()); + let tenant_b = TenantContext::new(TenantId::new("tenant-b"), TenantPermissions::full_access()); + + search + .create( + &tenant_a, + "Patient", + json!({ + "resourceType": "Patient", + "id": "tenant-a-patient", + }), + FhirVersion::default(), + ) + .await + .expect("seed tenant A search patient"); + + search + .create( + &tenant_b, + "Patient", + json!({ + "resourceType": "Patient", + "id": "tenant-b-patient", + }), + FhirVersion::default(), + ) + .await + .expect("seed tenant B search patient"); + + let composite_config = CompositeConfig::builder() + .primary("primary", BackendKind::Sqlite) + .search_backend("search", BackendKind::Sqlite) + .build() + .expect("build composite config"); + + let mut backends = HashMap::new(); + backends.insert("primary".to_string(), primary.clone() as DynStorage); + backends.insert("search".to_string(), search.clone() as DynStorage); + + let mut search_providers = HashMap::new(); + search_providers.insert("primary".to_string(), primary.clone() as DynSearchProvider); + search_providers.insert("search".to_string(), search.clone() as DynSearchProvider); + + let composite = CompositeStorage::new(composite_config, backends) + .expect("create composite storage") + .with_search_providers(search_providers) + .with_full_primary(primary.clone()); + + let query = SearchQuery::new("Patient").with_parameter(SearchParameter { + name: "_id".to_string(), + param_type: SearchParamType::Token, + modifier: None, + values: vec![SearchValue::eq("tenant-a-patient")], + chain: vec![], + components: vec![], + }); + + let tenant_a_result = composite + .search(&tenant_a, &query) + .await + .expect("tenant A composite search should succeed"); + assert_eq!(tenant_a_result.resources.len(), 1); + assert_eq!(tenant_a_result.resources.items[0].id(), "tenant-a-patient"); + + let tenant_b_result = composite + .search(&tenant_b, &query) + .await + .expect("tenant B composite search should succeed"); + assert!( + tenant_b_result.resources.is_empty(), + "delegated search must not leak tenant A data to tenant B" + ); + } + + #[test] + fn test_search_backend_failure_marks_backend_unhealthy() { + use std::collections::HashMap; + use std::sync::Arc; + + use crate::composite::config::HealthConfig; + + let primary = Arc::new(FailingSearchBackend { + backend_name: "primary", + error_message: "primary should not be used", + }); + let search = Arc::new(FailingSearchBackend { + backend_name: "search", + error_message: "simulated search outage", + }); + + let composite_config = CompositeConfig::builder() + .primary("primary", BackendKind::MongoDB) + .search_backend("search", BackendKind::Elasticsearch) + .with_health_config(HealthConfig { + failure_threshold: 1, + ..HealthConfig::default() + }) + .build() + .expect("build composite config"); + + let mut backends = HashMap::new(); + backends.insert("primary".to_string(), primary.clone() as DynStorage); + backends.insert("search".to_string(), search.clone() as DynStorage); + + let mut search_providers = HashMap::new(); + search_providers.insert("primary".to_string(), primary.clone() as DynSearchProvider); + search_providers.insert("search".to_string(), search.clone() as DynSearchProvider); + + let composite = CompositeStorage::new(composite_config, backends) + .expect("create composite storage") + .with_search_providers(search_providers); + + let tenant = TenantContext::new(TenantId::new("tenant-failure"), TenantPermissions::full_access()); + let query = SearchQuery::new("Patient").with_parameter(SearchParameter { + name: "_id".to_string(), + param_type: SearchParamType::Token, + modifier: None, + values: vec![SearchValue::eq("failure-patient")], + chain: vec![], + components: vec![], + }); + + let runtime = tokio::runtime::Builder::new_current_thread() + .enable_all() + .build() + .expect("build tokio runtime"); + let err = runtime + .block_on(composite.search(&tenant, &query)) + .expect_err("delegated search should fail when search backend is down"); + + assert!(matches!( + err, + StorageError::Backend(BackendError::ConnectionFailed { + backend_name, + message, + }) if backend_name == "search" && message.contains("simulated search outage") + )); + + let health = composite + .backend_health("search") + .expect("search backend health should exist"); + assert!(!health.healthy, "search backend should be marked unhealthy after failure"); + assert_eq!(health.failure_count, 1); + assert_eq!(health.last_error.as_deref(), Some("connection failed to search: simulated search outage")); + } } diff --git a/crates/persistence/tests/composite_routing_tests.rs b/crates/persistence/tests/composite_routing_tests.rs index 1b0f11a8..26f8e0bc 100644 --- a/crates/persistence/tests/composite_routing_tests.rs +++ b/crates/persistence/tests/composite_routing_tests.rs @@ -231,6 +231,70 @@ fn test_route_simple_query_to_primary() { assert!(routing.auxiliary_targets.is_empty()); } +/// Test routing a basic query in mongodb-elasticsearch mode. +#[test] +fn test_route_basic_query_mongodb_primary() { + let config = CompositeConfigBuilder::new() + .primary("mongodb", BackendKind::MongoDB) + .search_backend("elasticsearch", BackendKind::Elasticsearch) + .build() + .unwrap(); + + let router = QueryRouter::new(config); + + let query = SearchQuery::new("Patient").with_parameter(SearchParameter { + name: "identifier".to_string(), + param_type: SearchParamType::Token, + modifier: None, + values: vec![SearchValue::token( + Some("http://hospital.org/mrn"), + "MONGO-123", + )], + chain: vec![], + components: vec![], + }); + + let routing = router.route(&query).unwrap(); + + assert_eq!(routing.primary_target, "mongodb"); + assert!( + routing.auxiliary_targets.is_empty(), + "Basic token search should remain on mongodb primary" + ); +} + +/// Test routing full-text query to Elasticsearch in mongodb-elasticsearch mode. +#[test] +fn test_route_fulltext_query_mongodb_elasticsearch_split() { + let config = CompositeConfigBuilder::new() + .primary("mongodb", BackendKind::MongoDB) + .search_backend("elasticsearch", BackendKind::Elasticsearch) + .build() + .unwrap(); + + let router = QueryRouter::new(config); + + let query = SearchQuery::new("Patient").with_parameter(SearchParameter { + name: "_text".to_string(), + param_type: SearchParamType::String, + modifier: None, + values: vec![SearchValue::eq("congestive heart failure")], + chain: vec![], + components: vec![], + }); + + let routing = router.route(&query).unwrap(); + + assert_eq!(routing.primary_target, "mongodb"); + assert!( + routing + .auxiliary_targets + .values() + .any(|backend_id| backend_id == "elasticsearch"), + "Full-text search should route to Elasticsearch secondary" + ); +} + /// Test routing chained search to graph backend. #[test] fn test_route_chained_search_to_graph() { diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index fa518532..50fe15ee 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -11,15 +11,19 @@ use helios_fhir::FhirVersion; use helios_persistence::backends::mongodb::{MongoBackend, MongoBackendConfig}; use helios_persistence::core::{ - Backend, BackendCapability, BackendKind, ConditionalCreateResult, ConditionalDeleteResult, - ConditionalStorage, ConditionalUpdateResult, HistoryParams, InstanceHistoryProvider, - PatchFormat, ResourceStorage, SearchProvider, SystemHistoryProvider, TypeHistoryProvider, - VersionedStorage, + Backend, BackendCapability, BackendKind, BundleProvider, ConditionalCreateResult, + ConditionalDeleteResult, ConditionalStorage, ConditionalUpdateResult, HistoryParams, + InstanceHistoryProvider, PatchFormat, ResourceStorage, SearchProvider, SystemHistoryProvider, + TypeHistoryProvider, VersionedStorage, +}; +use helios_persistence::error::{ + BackendError, ConcurrencyError, ResourceError, StorageError, TransactionError, }; -use helios_persistence::error::{BackendError, ConcurrencyError, ResourceError, StorageError}; use helios_persistence::search::SearchParameterStatus; use helios_persistence::tenant::{TenantContext, TenantId, TenantPermissions}; use helios_persistence::types::{SearchParamType, SearchParameter, SearchQuery, SearchValue, SortDirective}; +use mongodb::bson::{Document, doc}; +use mongodb::Client; use serde_json::json; const MONGODB_MAX_DATABASE_NAME_LEN: usize = 63; @@ -104,6 +108,30 @@ fn test_mongodb_phase4_capabilities() { assert!(!backend.supports(BackendCapability::Transactions)); } +#[tokio::test] +async fn test_mongodb_bundle_provider_transaction_not_supported() { + let backend = MongoBackend::new(MongoBackendConfig::default()).unwrap(); + let tenant = create_tenant("tenant-bundle-transaction"); + + let result = backend.process_transaction(&tenant, vec![]).await; + assert!(matches!( + result, + Err(TransactionError::UnsupportedIsolationLevel { .. }) + )); +} + +#[tokio::test] +async fn test_mongodb_bundle_provider_batch_not_supported() { + let backend = MongoBackend::new(MongoBackendConfig::default()).unwrap(); + let tenant = create_tenant("tenant-bundle-batch"); + + let result = backend.process_batch(&tenant, vec![]).await; + assert!(matches!( + result, + Err(StorageError::Backend(BackendError::UnsupportedCapability { .. })) + )); +} + fn test_mongo_url() -> Option { std::env::var("HFS_TEST_MONGODB_URL").ok() } @@ -113,11 +141,19 @@ fn create_tenant(tenant_id: &str) -> TenantContext { } async fn create_backend(test_name: &str) -> Option { + create_backend_with_search_offloaded(test_name, false).await +} + +async fn create_backend_with_search_offloaded( + test_name: &str, + search_offloaded: bool, +) -> Option { let connection_string = test_mongo_url()?; let config = MongoBackendConfig { connection_string, database_name: build_test_database_name(test_name), + search_offloaded, ..Default::default() }; @@ -131,6 +167,28 @@ async fn create_backend(test_name: &str) -> Option { Some(backend) } +async fn search_index_entry_count( + backend: &MongoBackend, + tenant: &TenantContext, + resource_type: &str, + resource_id: &str, +) -> u64 { + let client = Client::with_uri_str(&backend.config().connection_string) + .await + .expect("failed to connect MongoDB client for search_index assertions"); + let database = client.database(&backend.config().database_name); + let search_index = database.collection::("search_index"); + + search_index + .count_documents(doc! { + "tenant_id": tenant.tenant_id().as_str(), + "resource_type": resource_type, + "resource_id": resource_id, + }) + .await + .expect("failed to count search_index entries") +} + #[tokio::test] async fn mongodb_integration_create_read_update_delete() { let Some(backend) = create_backend("crud").await else { @@ -1147,3 +1205,160 @@ async fn mongodb_integration_search_parameter_delete_unregisters() { "Deleted SearchParameter should be unregistered" ); } + +#[tokio::test] +async fn mongodb_integration_search_offloaded_prevents_search_index_writes() { + let Some(backend) = create_backend_with_search_offloaded("search_offloaded_no_index", true).await else { + eprintln!( + "Skipping mongodb_integration_search_offloaded_prevents_search_index_writes (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-offloaded"); + + let created = backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "mongo-offloaded-patient", + "name": [{"family": "Offloaded"}], + "identifier": [{"system": "http://hospital.org/mrn", "value": "OFFLOADED-1"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let resource_id = created.id().to_string(); + + let after_create = search_index_entry_count(&backend, &tenant, "Patient", &resource_id).await; + assert_eq!( + after_create, 0, + "search_index should remain empty when search_offloaded=true (create)" + ); + + let updated = backend + .update( + &tenant, + &created, + json!({ + "resourceType": "Patient", + "id": "mongo-offloaded-patient", + "name": [{"family": "StillOffloaded"}], + "identifier": [{"system": "http://hospital.org/mrn", "value": "OFFLOADED-1"}] + }), + ) + .await + .unwrap(); + + let after_update = search_index_entry_count(&backend, &tenant, "Patient", &resource_id).await; + assert_eq!( + after_update, 0, + "search_index should remain empty when search_offloaded=true (update)" + ); + + backend + .delete(&tenant, "Patient", updated.id()) + .await + .unwrap(); + + let after_delete = search_index_entry_count(&backend, &tenant, "Patient", &resource_id).await; + assert_eq!( + after_delete, 0, + "search_index should remain empty when search_offloaded=true (delete)" + ); +} + +#[tokio::test] +async fn mongodb_integration_standalone_search_writes_search_index() { + let Some(backend) = create_backend("search_index_written_standalone").await else { + eprintln!( + "Skipping mongodb_integration_standalone_search_writes_search_index (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-standalone"); + + let created = backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "mongo-standalone-patient", + "name": [{"family": "Indexed"}], + "identifier": [{"system": "http://hospital.org/mrn", "value": "INDEXED-1"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let count = search_index_entry_count(&backend, &tenant, "Patient", created.id()).await; + assert!( + count > 0, + "search_index should contain entries in standalone mode" + ); +} + +#[tokio::test] +async fn mongodb_integration_search_parameter_registry_updates_when_offloaded() { + let Some(backend) = create_backend_with_search_offloaded("search_param_offloaded_registry", true).await else { + eprintln!( + "Skipping mongodb_integration_search_parameter_registry_updates_when_offloaded (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-search-param-offloaded"); + + let created = backend + .create( + &tenant, + "SearchParameter", + json!({ + "resourceType": "SearchParameter", + "id": "mongo-offloaded-search-param", + "url": "http://example.org/fhir/SearchParameter/mongo-offloaded-search-param", + "name": "MongoOffloadedSearchParam", + "status": "active", + "code": "mongo-offloaded-code", + "base": ["Patient"], + "type": "token", + "expression": "Patient.identifier" + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let registry = backend.search_registry().read(); + let param = registry.get_param("Patient", "mongo-offloaded-code"); + assert!(param.is_some(), "Active SearchParameter should register when offloaded"); + assert_eq!(param.unwrap().status, SearchParameterStatus::Active); + drop(registry); + + let search_index_count = + search_index_entry_count(&backend, &tenant, "SearchParameter", created.id()).await; + assert_eq!( + search_index_count, 0, + "SearchParameter resources should not write Mongo search_index when offloaded" + ); + + backend + .delete(&tenant, "SearchParameter", created.id()) + .await + .unwrap(); + + let registry = backend.search_registry().read(); + assert!( + registry + .get_param("Patient", "mongo-offloaded-code") + .is_none(), + "Deleted SearchParameter should unregister when offloaded" + ); +} diff --git a/crates/rest/src/config.rs b/crates/rest/src/config.rs index 928f9c04..f55b265f 100644 --- a/crates/rest/src/config.rs +++ b/crates/rest/src/config.rs @@ -63,6 +63,11 @@ pub enum StorageBackendMode { /// PostgreSQL for CRUD + Elasticsearch for search. /// Requires running PostgreSQL and Elasticsearch instances. PostgresElasticsearch, + /// MongoDB only. Requires a running MongoDB instance. + MongoDB, + /// MongoDB for CRUD + Elasticsearch for search. + /// Requires running MongoDB and Elasticsearch instances. + MongoDBElasticsearch, } impl fmt::Display for StorageBackendMode { @@ -74,6 +79,8 @@ impl fmt::Display for StorageBackendMode { StorageBackendMode::PostgresElasticsearch => { write!(f, "postgres-elasticsearch") } + StorageBackendMode::MongoDB => write!(f, "mongodb"), + StorageBackendMode::MongoDBElasticsearch => write!(f, "mongodb-elasticsearch"), } } } @@ -89,8 +96,12 @@ impl FromStr for StorageBackendMode { "postgres-elasticsearch" | "postgres-es" | "pg-elasticsearch" | "pg-es" => { Ok(StorageBackendMode::PostgresElasticsearch) } + "mongodb" | "mongo" => Ok(StorageBackendMode::MongoDB), + "mongodb-elasticsearch" | "mongodb-es" | "mongo-elasticsearch" | "mongo-es" => { + Ok(StorageBackendMode::MongoDBElasticsearch) + } _ => Err(format!( - "Invalid storage backend '{}'. Valid values: sqlite, sqlite-elasticsearch, postgres, postgres-elasticsearch", + "Invalid storage backend '{}'. Valid values: sqlite, sqlite-elasticsearch, postgres, postgres-elasticsearch, mongodb, mongodb-elasticsearch", s )), } @@ -299,12 +310,14 @@ pub struct ServerConfig { #[arg(long, env = "HFS_MAX_PAGE_SIZE", default_value = "1000")] pub max_page_size: usize, - /// Storage backend mode: sqlite (default), sqlite-elasticsearch, postgres, or postgres-elasticsearch. + /// Storage backend mode: sqlite (default), sqlite-elasticsearch, postgres, + /// postgres-elasticsearch, mongodb, or mongodb-elasticsearch. #[arg(long, env = "HFS_STORAGE_BACKEND", default_value = "sqlite")] pub storage_backend: String, /// Elasticsearch node URLs (comma-separated). - /// Used when storage_backend is sqlite-elasticsearch or postgres-elasticsearch. + /// Used when storage_backend is sqlite-elasticsearch, postgres-elasticsearch, + /// or mongodb-elasticsearch. #[arg( long, env = "HFS_ELASTICSEARCH_NODES", @@ -624,6 +637,34 @@ mod tests { .unwrap(), StorageBackendMode::PostgresElasticsearch ); + assert_eq!( + "mongodb".parse::().unwrap(), + StorageBackendMode::MongoDB + ); + assert_eq!( + "mongo".parse::().unwrap(), + StorageBackendMode::MongoDB + ); + assert_eq!( + "MONGODB".parse::().unwrap(), + StorageBackendMode::MongoDB + ); + assert_eq!( + "mongodb-elasticsearch" + .parse::() + .unwrap(), + StorageBackendMode::MongoDBElasticsearch + ); + assert_eq!( + "mongo-es".parse::().unwrap(), + StorageBackendMode::MongoDBElasticsearch + ); + assert_eq!( + "mongodb_elasticsearch" + .parse::() + .unwrap(), + StorageBackendMode::MongoDBElasticsearch + ); assert!("invalid".parse::().is_err()); } @@ -639,6 +680,11 @@ mod tests { StorageBackendMode::PostgresElasticsearch.to_string(), "postgres-elasticsearch" ); + assert_eq!(StorageBackendMode::MongoDB.to_string(), "mongodb"); + assert_eq!( + StorageBackendMode::MongoDBElasticsearch.to_string(), + "mongodb-elasticsearch" + ); } #[test] diff --git a/phase5_roadmap.xml b/phase5_roadmap.xml new file mode 100644 index 00000000..b2f37418 --- /dev/null +++ b/phase5_roadmap.xml @@ -0,0 +1,401 @@ + + + + HeliosSoftware/hfs + completed + TBD + 5 + + + + 2026-03-09 + Phase 5 is completed: composite MongoDB primary plus Elasticsearch secondary + routing, search_offloaded duplicate-index prevention, shared SearchParameter registry + wiring, mongodb-elasticsearch runtime startup, and composite tenant/failure coverage are + implemented and verified. Feature-gated cargo check commands and focused composite tests + passed; workspace-wide cargo fmt --all -- --check still reports unrelated pre-existing + formatting drift outside the Phase 5 files. + + + + + + MongoDB native search indexing and SearchProvider execution are implemented for the + Phase 4 supported query surface. + Conditional create, conditional update, and conditional delete semantics are + implemented and covered in Mongo integration tests. + SearchParameter lifecycle hooks update registry state on create, update, and delete + operations. + Mongo capabilities and tests were updated to truthfully represent implemented search + and pagination support. + + + Composite MongoDB plus Elasticsearch ownership boundaries are not yet + implemented for production routing paths. + search_offloaded behavior exists but requires strict validation to prevent + duplicate or stale indexing in composite mode. + Shared SearchParameterRegistry and extractor initialization between primary and + secondary backends must be deterministic. + Composite runtime startup and configuration validation must fail fast on invalid + mixed-backend setups. + + + + + + Deliver a robust composite mode where MongoDB owns canonical writes and reads while + Elasticsearch owns search execution, mirroring established sqlite-elasticsearch and + postgres-elasticsearch operating patterns. + + Define and implement explicit ownership boundaries for write-primary, + read-primary, and search-secondary flows in MongoDB plus Elasticsearch composition. + Harden search_offloaded behavior so Mongo does not maintain duplicate + native search indexes when Elasticsearch is configured as secondary. + Ensure shared SearchParameter registry and extraction setup is consistent + across MongoDB and Elasticsearch initialization and runtime updates. + Implement startup and provider wiring for mongodb-elasticsearch mode with + feature-gated, deterministic initialization and clear failure behavior. + Align tests, capability declarations, and docs with actual composite + behavior so support claims stay truthful. + + + Implementing new advanced search features beyond the Phase 4 supported search surface. + Changing MongoDB standalone search behavior outside composite-offload requirements. + Implementing full bundle transaction semantics beyond current backend capabilities. + Expanding deployment guidance beyond composite operational requirements needed for this + phase. + Introducing new database-per-tenant architecture changes. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Define and enforce which backend owns each operation path so composite behavior is + deterministic and parity-aligned with existing primary-secondary patterns. + + Document operation ownership matrix for + create/update/delete/read/history/search/count/conditional operations in MongoDB plus + Elasticsearch mode. + Implement provider delegation rules in composite backend construction and + dispatch paths. + Ensure conditional operation matching semantics delegate to the designated + search owner without changing write ownership. + Define and document expected consistency model for Mongo canonical storage + versus Elasticsearch search visibility. + + + Composite ownership matrix with deterministic routing semantics. + Implemented delegation paths for Mongo primary and Elasticsearch + search-secondary behavior. + + + + + Guarantee that enabling offloaded search prevents duplicate Mongo native indexing while + preserving correct write and SearchParameter lifecycle behavior. + + Enforce search_offloaded checks across Mongo create/update/delete indexing + hooks with no silent bypasses. + Validate SearchParameter create/update/delete handling remains correct when + search indexing ownership is offloaded. + Add assertions and tests that Mongo search_index collection is not written + in composite offload mode. + Preserve standalone Mongo mode behavior to avoid regressions in native + search scenarios. + + + Offload-safe Mongo indexing lifecycle with explicit duplicate-prevention + guarantees. + Regression coverage for standalone and offloaded mode behavior split. + + + + + Provide feature-gated startup and configuration wiring for mongodb-elasticsearch mode + that is consistent with existing composite backend startup patterns. + + Implement composite backend factory path for Mongo primary plus + Elasticsearch secondary in persistence wiring. + If phase scope includes runtime mode exposure, add mongodb-elasticsearch + mode parsing and startup path in REST/HFS configuration layers. + Define configuration validation and startup error behavior for missing or + incompatible Mongo/Elasticsearch settings. + Ensure feature-gated compile behavior remains deterministic for mongodb, + elasticsearch, and combined feature sets. + + + Composite backend startup path for mongodb-elasticsearch mode. + Clear startup validation behavior and errors for mixed backend + misconfiguration. + + + + + Keep SearchParameter registry and extraction semantics synchronized between MongoDB and + Elasticsearch components in composite mode. + + Share or synchronize SearchParameterRegistry initialization so both + providers use the same active parameter definitions. + Ensure extractor behavior used for indexing is consistent with registry + state across both backend components. + Define ordering guarantees for SearchParameter lifecycle updates relative + to indexing operations in composite mode. + Add tests covering active/draft/retired/delete SearchParameter transitions + and composite search visibility implications. + + + Deterministic registry/extractor synchronization between Mongo and + Elasticsearch components. + SearchParameter lifecycle parity tests for composite mode. + + + + + Prove composite behavior through integration and parity tests focused on routing + correctness, consistency expectations, and failure surfaces. + + Add integration tests for write-primary/read-primary/search-secondary + routing correctness. + Add result consistency tests verifying search output aligns with canonical + Mongo resources under expected refresh semantics. + Add negative tests for secondary outage and startup failure behavior, + including expected error propagation. + Add tenant-isolation tests ensuring composite routing never leaks + cross-tenant data. + Extend shared harness/capability tests so mongodb-elasticsearch mode is + exercised similarly to existing composite modes. + + + Composite integration suite covering routing, consistency boundaries, and + failure behavior. + Parity-oriented harness coverage for mongodb-elasticsearch mode. + + + + + Ensure capability declarations and documentation reflect the delivered composite + behavior with no aspirational mismatch. + + Update capability matrix entries for MongoDB and composite mode support + levels after composite behavior is validated. + Update persistence README role matrix and operational notes for Mongo + primary plus Elasticsearch secondary mode. + Update roadmap_mongo.xml progress/status text to mark Phase 5 complete when + exit criteria are met. + Record known limitations and deferred capabilities clearly in roadmap and + docs. + + + Capability and documentation artifacts synchronized with tested composite + behavior. + + + + + + + Composite provider-construction tests verify Mongo primary and Elasticsearch + secondary wiring selection. + Delegation tests verify CRUD/read/history routes remain on Mongo while search + routes delegate to Elasticsearch. + search_offloaded guard tests verify Mongo indexing hooks are bypassed only in + offloaded mode. + SearchParameter registry initialization tests verify shared state and + deterministic update ordering. + Configuration validation tests verify startup errors for invalid mixed-backend + settings. + + + + Composite create/read/search round-trip validates + write-primary/read-primary/search-secondary routing. + Conditional operation tests in composite mode validate deterministic matching + behavior and correct ownership split. + Mongo search_index duplicate-prevention tests assert no local indexing when + search is offloaded. + SearchParameter lifecycle tests verify active/draft/retired/delete behavior + remains coherent across composite components. + Tenant-isolation composite tests validate no cross-tenant leakage in delegated + search paths. + Secondary failure tests validate expected startup/runtime error handling + behavior. + + + + Mirror sqlite-elasticsearch and postgres-elasticsearch routing expectations for + equivalent operations. + Do not claim mongodb-elasticsearch support as implemented until routing and consistency + tests pass. + Keep unsupported advanced search capabilities explicitly documented as planned or + partial. + + + + + cargo check -p helios-persistence --features "mongodb,elasticsearch" + cargo check -p helios-rest --features "mongodb,elasticsearch" + cargo check -p helios-hfs --features "mongodb,elasticsearch" + cargo check -p helios-persistence --features + "sqlite,postgres,mongodb,elasticsearch" + cargo test -p helios-persistence --features "mongodb,elasticsearch" --test + mongodb_tests + cargo test -p helios-persistence --features "mongodb,elasticsearch" --test + elasticsearch_tests + cargo test -p helios-persistence --features "mongodb,elasticsearch" + composite:: + cargo fmt --all -- --check + + + + + WS1.1-WS1.4, WS3.1 + Composite delegation rules are implemented and ownership matrix is documented. + + + + WS2.1-WS2.4, UT3, IT3 + Mongo duplicate indexing is prevented in offloaded mode without standalone-mode + regressions. + + + + WS4.1-WS4.4, UT4, IT4 + SearchParameter registry and extraction behavior are synchronized across composite + components. + + + + WS5.1-WS5.5, IT1, IT2, IT5, IT6 + Composite routing, tenant isolation, consistency boundaries, and failure behavior + are validated. + + + + WS6.1-WS6.4 and validation command execution + Capability matrix, README, and roadmap_mongo.xml are synchronized with delivered + Phase 5 behavior. + + + + + + + + + + + + + + + + + + + + + Create and maintain phase5_roadmap.xml as the detailed execution artifact for MongoDB + Phase 5. + Update roadmap_mongo.xml Phase 5 status and progress text when exit criteria are met. + Update crates/persistence/README.md capability and role matrix entries for + mongodb-elasticsearch mode. + Document expected consistency boundaries and search visibility timing for composite mode + operations. + Keep deferred advanced search features explicitly marked as partial or planned. + + + + + Incorrect operation ownership routing could cause stale reads, wrong provider + execution, or semantic drift from established composite modes. + Define explicit ownership matrix first and enforce via unit plus integration + delegation tests. + + + Duplicate indexing between Mongo native search paths and Elasticsearch offloaded + search could increase write cost and return inconsistent results. + Harden search_offloaded guards and add explicit duplicate-prevention tests. + + + SearchParameter registry divergence between composite components could produce + extraction/query mismatches. + Share or synchronize registry initialization and validate lifecycle transitions + with composite tests. + + + Runtime mode wiring can fail due to incomplete feature-gate combinations or + configuration mismatches. + Add startup validation and mixed-feature compile checks in validation commands. + + + Capability/docs drift may overstate support before composite behavior is + validated. + Gate capability and roadmap updates on passing composite routing and consistency + tests. + + + + + Composite routing enforces Mongo primary ownership for writes/reads and + Elasticsearch ownership for search in mongodb-elasticsearch mode. + search_offloaded behavior prevents duplicate Mongo indexing when + Elasticsearch secondary is configured. + Shared SearchParameter registry and extractor initialization are + deterministic and validated by lifecycle tests. + Composite tests pass for routing correctness, result consistency + boundaries, tenant isolation, and failure behavior. + Startup path for mongodb-elasticsearch mode is implemented and + feature-gated for relevant crates. + Capability matrix, README, and roadmap_mongo.xml reflect the same tested + post-Phase-5 support state. + Feature-gated mongodb-elasticsearch and mixed-feature cargo check + commands passed, and focused composite routing/tenant/failure tests passed. Workspace-wide + cargo fmt --all -- --check remains blocked by unrelated pre-existing formatting drift outside + the Phase 5 files. + + \ No newline at end of file diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml index 96ee44b2..9b30eada 100644 --- a/roadmap_mongo.xml +++ b/roadmap_mongo.xml @@ -2,15 +2,19 @@ HeliosSoftware/hfs - in-progress + completed TBD date-agnostic 2026-03-09 - Phases 1 through 4 are completed: backend wiring, core storage parity, version/history - semantics, and Phase 4 search/indexing plus conditional create/update/delete coverage are now - implemented and validated in MongoDB integration tests. + Phases 1 through 6 are completed: backend wiring, core storage parity, + version/history semantics, native Mongo search plus conditional operations, composite + MongoDB + Elasticsearch integration, and runtime HFS_STORAGE_BACKEND wiring/documentation + are implemented. Feature-gated cargo check commands and focused composite tests passed; + workspace-wide cargo fmt --all -- --check still reports unrelated pre-existing formatting + drift outside the Mongo Phase 5 files. - SQLite primary, PostgreSQL primary, Elasticsearch secondary + SQLite primary, PostgreSQL primary, MongoDB primary, Elasticsearch + secondary, MongoDB + Elasticsearch composite @@ -157,7 +161,7 @@ - + Provide robust primary-secondary mode mirroring sqlite-elasticsearch and postgres-elasticsearch. @@ -173,12 +177,13 @@ Phase 4 search model clarity. - Composite tests verify routing and result consistency for Mongo + Elasticsearch. + Composite tests verify routing, delegated tenant isolation, and failure behavior + for Mongo + Elasticsearch. Startup path for mongo-elasticsearch mode is implemented and feature-gated. - + Expose Mongo modes to HFS runtime and document operational guidance. Add StorageBackendMode values for mongodb and mongodb-elasticsearch in @@ -195,7 +200,8 @@ HFS_STORAGE_BACKEND accepts mongodb and mongodb-elasticsearch values. - All relevant docs and examples are consistent with actual implementation. + All relevant docs and examples are consistent with actual implementation, aside + from unrelated repo-wide formatting drift outside the Mongo phase files. From b07e2da89b467a2f530c77af521266c96385b080 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Mon, 9 Mar 2026 22:26:13 -0400 Subject: [PATCH 13/17] docs(roadmap): mark MongoDB phases 5-6 as completed and update documentation Update ROADMAP.md to reflect MongoDB standalone and MongoDB+Elasticsearch as shipped persistence options. Update persistence README with MongoDB runtime configuration examples, architecture tree, and search offloading documentation. Add phase6_roadmap.xml as detailed closure artifact. Update roadmap_mongo.xml to reference completed Phase 5/6 artifacts and synchronize progress statements across all roadmap documents. --- ROADMAP.md | 38 ++++--- crates/persistence/README.md | 81 +++++++++++++- phase6_roadmap.xml | 200 +++++++++++++++++++++++++++++++++++ roadmap_mongo.xml | 26 +++-- 4 files changed, 318 insertions(+), 27 deletions(-) create mode 100644 phase6_roadmap.xml diff --git a/ROADMAP.md b/ROADMAP.md index fb211a9d..4ce1a7a6 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1,6 +1,5 @@ # Helios FHIR Server — Roadmap -> > This document outlines the development direction for the Helios FHIR Server. It is organized into three horizons — **Now**, **Next**, and **Later** — to set expectations without overpromising timelines. Items may shift between horizons as priorities evolve based on community feedback, production needs, and contributor availability. > > Want to influence the roadmap? Join our [weekly developer meeting](#community) or comment on a [GitHub Discussion](https://github.com/HeliosSoftware/hfs/discussions). @@ -16,12 +15,16 @@ These capabilities are available today in the current release. - [FHIR REST API server](crates/hfs/README.md) with CRUD operations, search, history, and batch/transaction support **Persistence** + - [SQLite as a primary store](crates/persistence/README.md#sqlite-default) - [SQLite as a primary store with Elasticsearch as a query secondary](crates/persistence/README.md#sqlite--elasticsearch) - [PostgreSQL as a primary store](crates/persistence/README.md#postgresql) - [PostgreSQL as a primary store with Elasticsearch as a query secondary](crates/persistence/README.md#postgresql--elasticsearch) +- [MongoDB as a primary store](crates/persistence/README.md#mongodb) +- [MongoDB as a primary store with Elasticsearch as a query secondary](crates/persistence/README.md#mongodb--elasticsearch) **Analytics & Tooling** + - [SQL on FHIR](crates/sof/README.md) — CLI and HTTP server - [FHIRPath expression engine](crates/fhirpath/README.md) — CLI and HTTP server - [Python bindings (pysof)](crates/pysof/README.md) @@ -32,14 +35,13 @@ These capabilities are available today in the current release. Work that is currently underway or planned for the near term. -| Area | Item | Status | -|------|------|--------| -| **Compliance** | Audit logging (AuditEvent resource support) | 🔵 Design | -| **Standards** | FHIR Validation engine | 🔵 Design | -| **Standards** | Authentication & Authorization | 🔵 Design | -| **Documentation** | Project documentation website | 🔵 Design | -| **Persistence** | MongoDB as a primary store | 🟡 In progress | -| **Persistence** | S3 as a primary store | 🟡 In progress | +| Area | Item | Status | +| ----------------- | ------------------------------------------- | -------------- | +| **Compliance** | Audit logging (AuditEvent resource support) | 🔵 Design | +| **Standards** | FHIR Validation engine | 🔵 Design | +| **Standards** | Authentication & Authorization | 🔵 Design | +| **Documentation** | Project documentation website | 🔵 Design | +| **Persistence** | S3 as a primary store | 🟡 In progress | ### Discussion Documents @@ -57,19 +59,22 @@ We are actively developing community discussion documents on the following topic These items are well-understood and will be picked up once current work completes. ### FHIR Server Capabilities + - **Bulk Data API** — Import and export (`$export` / `$import` operations) - **FHIR Subscriptions** — Topic-based notification support - **Terminology Server** — CodeSystem `$lookup`, ValueSet `$expand`, ConceptMap `$translate` - **SMART on FHIR** — Full launch framework and scoped access -- **SQL on FHIR** — [SQL on FHIR operations](https://sql-on-fhir.org/ig/latest/operations.html) - using read-only database connections +- **SQL on FHIR** — [SQL on FHIR operations](https://sql-on-fhir.org/ig/latest/operations.html) - using read-only database connections ### Persistence Backends + - Cassandra as a primary store - ClickHouse as a primary store - S3 with Elasticsearch as a query secondary - Cassandra with Elasticsearch as a query secondary ### Developer Experience + - **Administrative UI** — Web-based management console for server configuration and monitoring - **MCP Server for FHIR API** — Model Context Protocol integration for the FHIR REST API - **MCP Server for SQL on FHIR** — Model Context Protocol integration for analytics workflows @@ -82,11 +87,14 @@ These items are well-understood and will be picked up once current work complete Longer-term ideas we are exploring. These are not yet committed and may evolve significantly based on community input. ### Advanced Persistence + - Neo4j as a primary store - PostgreSQL with Neo4j as a graph query secondary ### Persistence Advisor + An intelligent recommendation engine for storage configuration: + - Analyze a FHIR query and recommend an optimal persistence configuration - Leverage historical benchmark data to inform recommendations - Web UI for interactive configuration guidance @@ -95,10 +103,10 @@ An intelligent recommendation engine for storage configuration: ## Status Legend -| Icon | Meaning | -|------|---------| -| 🟡 | In progress — actively being developed | -| 🔵 | Design — in planning or community discussion phase | +| Icon | Meaning | +| ---- | -------------------------------------------------- | +| 🟡 | In progress — actively being developed | +| 🔵 | Design — in planning or community discussion phase | --- @@ -119,4 +127,4 @@ We welcome contributors and feedback at every level — from opening issues to j --- -*This roadmap is a living document. It does not represent a commitment or guarantee to deliver any feature by any particular date. Items may be reprioritized based on community needs, production feedback, and resource availability.* +_This roadmap is a living document. It does not represent a commitment or guarantee to deliver any feature by any particular date. Items may be reprioritized based on community needs, production feedback, and resource availability._ diff --git a/crates/persistence/README.md b/crates/persistence/README.md index dc5becae..d45df2ff 100644 --- a/crates/persistence/README.md +++ b/crates/persistence/README.md @@ -129,9 +129,11 @@ helios-persistence/ │ │ │ └── search/ # Search query building │ │ │ ├── query_builder.rs # SQL with $N params, ILIKE, TIMESTAMPTZ │ │ │ └── writer.rs # Search index writer -│ │ ├── mongodb/ # MongoDB backend (phase 1 scaffold) +│ │ ├── mongodb/ # MongoDB primary backend │ │ │ ├── backend.rs # MongoBackend + MongoBackendConfig -│ │ │ ├── schema.rs # Schema/index bootstrap placeholders +│ │ │ ├── schema.rs # Schema/index bootstrap helpers +│ │ │ ├── search_impl.rs # SearchProvider implementation +│ │ │ ├── storage.rs # ResourceStorage/history/versioning implementation │ │ │ └── mod.rs # Module wiring and re-exports │ │ └── elasticsearch/ # Search-optimized secondary backend │ │ ├── backend.rs # ElasticsearchBackend with config @@ -515,6 +517,73 @@ HFS_ELASTICSEARCH_NODES=http://localhost:9200 \ ./target/release/hfs ``` +### MongoDB + +MongoDB provides document-centric primary storage with tenant-aware CRUD, version/history support, optimistic locking, the Phase 4 supported native search surface, and conditional create/update/delete. + +- Full CRUD operations with document-native resource storage +- Versioning and history providers (`vread`, instance/type/system history) +- Conditional create, update, and delete for the implemented search surface +- Offset and cursor pagination plus single- and multi-field sorting +- Shared-schema multitenancy with strict tenant filtering + +**Prerequisites:** A running MongoDB instance (standalone for basic deployments, replica set/sharded topology if you want Mongo transactions where topology permits). + +```bash +# Build with MongoDB support +cargo build --bin hfs --features mongodb --release + +# Start MongoDB (example using Docker) +docker run -d --name mongo -p 27017:27017 \ + mongo:8.0 + +# Start the server +HFS_STORAGE_BACKEND=mongodb \ +HFS_DATABASE_URL="mongodb://localhost:27017" \ +HFS_MONGODB_DATABASE=helios \ + ./target/release/hfs +``` + +MongoDB runtime configuration also supports: + +- `HFS_MONGODB_URL` or `HFS_MONGODB_URI` as preferred connection-string inputs +- `HFS_MONGODB_DATABASE` to select the database name (default: `helios`) +- `HFS_MONGODB_MAX_CONNECTIONS` to control the driver pool size (default: `10`) +- `HFS_MONGODB_CONNECT_TIMEOUT_MS` to control the connection timeout (default: `5000`) + +### MongoDB + Elasticsearch + +MongoDB remains the canonical write/read store while Elasticsearch owns delegated search execution. This mode mirrors the existing SQLite + Elasticsearch and PostgreSQL + Elasticsearch composite patterns. + +- MongoDB handles CRUD, versioning, history, and conditional write behavior +- Elasticsearch handles delegated search queries, including full-text search +- MongoDB search index population is automatically disabled via `search_offloaded` +- Composite routing preserves MongoDB as the source of truth for reads and writes + +**Prerequisites:** Running MongoDB and Elasticsearch 8.x instances. + +```bash +# Build with MongoDB and Elasticsearch support +cargo build --bin hfs --features mongodb,elasticsearch --release + +# Start MongoDB (example using Docker) +docker run -d --name mongo -p 27017:27017 \ + mongo:8.0 + +# Start Elasticsearch (example using Docker) +docker run -d --name es -p 9200:9200 \ + -e "discovery.type=single-node" \ + -e "xpack.security.enabled=false" \ + elasticsearch:8.15.0 + +# Start the server +HFS_STORAGE_BACKEND=mongodb-elasticsearch \ +HFS_DATABASE_URL="mongodb://localhost:27017" \ +HFS_MONGODB_DATABASE=helios \ +HFS_ELASTICSEARCH_NODES=http://localhost:9200 \ + ./target/release/hfs +``` + ### How Search Offloading Works When Elasticsearch is configured as a search secondary, the primary backend automatically disables its own search index population. This applies to both SQLite + Elasticsearch and MongoDB + Elasticsearch composite configurations. For a SQLite + Elasticsearch configuration: @@ -524,6 +593,13 @@ When Elasticsearch is configured as a search secondary, the primary backend auto - Elasticsearch handles all search indexing and query execution - The composite storage layer routes search operations to Elasticsearch +For a MongoDB + Elasticsearch configuration: + +- MongoDB stores only canonical FHIR resources and version/history documents +- MongoDB does **not** maintain native search index documents while search is offloaded +- Elasticsearch handles delegated search indexing and query execution +- The composite storage layer routes search operations to Elasticsearch while MongoDB remains the write/read primary + This is controlled by the `search_offloaded` flag on the primary backend, which the composite layer sets automatically when a search secondary is configured. ### Composite Usage @@ -745,6 +821,7 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] MongoDB Phase 3 versioning/history plus best-effort session-backed consistency - [x] MongoDB Phase 4 native search, pagination/sorting, and conditional create/update/delete - [x] MongoDB Phase 5 composite MongoDB + Elasticsearch integration and runtime wiring +- [x] MongoDB Phase 6 runtime wiring verification, documentation sync, and release-readiness validation - [ ] Neo4j backend (graph queries, Cypher) - [ ] S3 backend (bulk export, object storage) diff --git a/phase6_roadmap.xml b/phase6_roadmap.xml new file mode 100644 index 00000000..6c506077 --- /dev/null +++ b/phase6_roadmap.xml @@ -0,0 +1,200 @@ + + + + HeliosSoftware/hfs + completed + TBD + 6 + + + + 2026-03-09 + Phase 6 is completed: MongoDB runtime mode exposure in REST/HFS is present, operator-facing MongoDB standalone and mongodb-elasticsearch documentation is synchronized, the top-level roadmap now reflects shipped Mongo support, and focused runtime/config validation is defined for release-readiness closure. + + + + + + MongoDB primary plus Elasticsearch secondary composite routing is implemented. + search_offloaded duplicate-index prevention and shared SearchParameter registry wiring are implemented. + mongodb-elasticsearch runtime startup exists in `crates/hfs/src/main.rs`. + Composite tests and capability updates established truthful post-Phase-5 support boundaries. + + + Top-level operator docs still needed to reflect the shipped Mongo runtime surface accurately. + A dedicated detailed Phase 6 artifact did not yet exist in the repository. + Release-readiness needed to be framed as runtime/config validation and documentation truthfulness, not new Mongo capability work. + + + + + + Close out MongoDB delivery by validating the already-implemented runtime modes, synchronizing documentation and roadmap artifacts, and defining a focused validation bar for Mongo standalone and MongoDB plus Elasticsearch operation. + + Confirm `HFS_STORAGE_BACKEND` supports `mongodb` and `mongodb-elasticsearch` through the REST configuration layer and HFS startup dispatch. + Document MongoDB standalone and MongoDB plus Elasticsearch operator flows with accurate environment variables, feature flags, and runtime examples. + Align `ROADMAP.md`, `crates/persistence/README.md`, and `roadmap_mongo.xml` so they describe the same delivered Mongo state. + Record a dedicated completed `phase6_roadmap.xml` artifact for future reference and phase traceability. + Define and execute focused validation for Mongo runtime/config surfaces as the Phase 6 release-readiness bar. + + + New Mongo CRUD, history, search, or conditional capability implementation. + Conditional patch support. + Full transaction bundle parity for MongoDB. + Advanced search features beyond the Phase 4/5 supported surface. + New composite backend combinations outside MongoDB plus Elasticsearch. + + + + + + + + + + + + + + + + + + + + + Verify that the MongoDB runtime surface already present in code is accurately represented and remains feature-gated with clear operator expectations. + + Confirm `StorageBackendMode` includes `mongodb` and `mongodb-elasticsearch` parsing/display behavior in `crates/rest/src/config.rs`. + Confirm `crates/hfs/src/main.rs` dispatches to `start_mongodb` and `start_mongodb_elasticsearch`. + Confirm composite startup requires Elasticsearch node configuration and preserves MongoDB as the primary store. + Treat feature-gated fallback messages as part of the Phase 6 runtime contract. + + + Verified runtime-mode surface for standalone and composite MongoDB operation. + + + + + Make the repository docs truthful and usable for MongoDB standalone and composite deployment modes. + + Update the persistence architecture tree to reflect the implemented Mongo backend file layout. + Add MongoDB standalone build/run guidance with the real environment variables used by `MongoBackend::from_env`. + Add MongoDB plus Elasticsearch composite build/run guidance and explain search offloading ownership. + Update implementation-status notes to mark Phase 6 closure work complete. + + + README guidance that matches actual Mongo runtime behavior and supported operating modes. + + + + + Ensure roadmap artifacts consistently describe MongoDB as shipped and Phase 6 as completed. + + Update `ROADMAP.md` shipped persistence items to include MongoDB standalone and MongoDB plus Elasticsearch. + Remove stale top-level roadmap wording that still marks MongoDB primary support as in progress. + Update `roadmap_mongo.xml` progress text and reference files for completed Phase 5/6 artifacts. + Create `phase6_roadmap.xml` as the dedicated detailed Phase 6 record. + + + Top-level roadmap, umbrella roadmap, and detailed phase roadmap all reflect the same completed Mongo state. + + + + + Define the minimal validation matrix needed to treat Phase 6 as closed without claiming unrelated repo-wide cleanliness. + + Run focused tests covering `StorageBackendMode` parsing/display behavior in the REST configuration crate. + Run feature-gated `cargo check` for `helios-hfs` with `mongodb` and with `mongodb,elasticsearch`. + Keep unrelated repo-wide formatting drift explicitly out of the Phase 6 success claim. + + + Focused validation evidence for Mongo runtime/config readiness. + + + + + + + `helios-rest` storage backend mode parsing tests for `mongodb` and `mongodb-elasticsearch`. + `helios-rest` storage backend mode display tests for Mongo modes. + + + + `cargo check -p helios-hfs --features mongodb` validates Mongo standalone runtime wiring. + `cargo check -p helios-hfs --features mongodb,elasticsearch` validates composite runtime wiring. + Documentation examples and roadmap references are reviewed for consistency with the implemented runtime surface. + + + + Do not claim new Mongo capabilities in Phase 6; only claim runtime/doc/release-readiness closure work. + Keep unsupported advanced search, conditional patch, and full transaction bundle semantics explicitly outside the Phase 6 completion statement. + + + + + cargo test -p helios-rest storage_backend_mode + cargo check -p helios-hfs --features mongodb + cargo check -p helios-hfs --features mongodb,elasticsearch + + + + + WS2.1-WS2.4 + Persistence README reflects the implemented Mongo standalone and composite runtime surface. + + + + WS3.1-WS3.4 + `ROADMAP.md`, `roadmap_mongo.xml`, and `phase6_roadmap.xml` agree on completed Mongo delivery. + + + + WS1.1-WS1.4, WS4.1-WS4.3 + Focused Mongo runtime/config validation commands are executed successfully. + + + + + + + + + + + + + + + + + + Update the persistence README architecture tree and Mongo runtime guidance. + Update `ROADMAP.md` shipped persistence items to include MongoDB standalone and composite support. + Update `roadmap_mongo.xml` to reference the completed Phase 5 and Phase 6 artifacts. + Create and maintain `phase6_roadmap.xml` as the detailed closure artifact for MongoDB Phase 6. + + + + + Docs and roadmap artifacts can lag behind runtime reality and understate shipped Mongo support. + Synchronize README, top-level roadmap, and umbrella Mongo roadmap in the same phase. + + + Phase 6 could over-claim completion if release-readiness is treated as broader than the actual validation performed. + Keep the validation scope focused on runtime/config surfaces and explicitly exclude unrelated repo-wide formatting drift. + + + Operators may use the wrong MongoDB environment variable names if examples do not match the backend loader. + Document `HFS_MONGODB_URL`, `HFS_MONGODB_URI`, `HFS_DATABASE_URL`, and `HFS_MONGODB_DATABASE` consistently. + + + + + `HFS_STORAGE_BACKEND` supports `mongodb` and `mongodb-elasticsearch` through the implemented REST/HFS runtime surface. + Persistence README examples and operator guidance reflect the real Mongo standalone and composite configuration paths. + `ROADMAP.md`, `roadmap_mongo.xml`, and `phase6_roadmap.xml` all describe MongoDB support as shipped/completed. + Focused validation commands are defined and executed for Mongo runtime/config readiness. + Unrelated repo-wide formatting drift remains explicitly outside the Phase 6 completion claim. + + diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml index 9b30eada..48f7c2d4 100644 --- a/roadmap_mongo.xml +++ b/roadmap_mongo.xml @@ -8,10 +8,11 @@ 2026-03-09 Phases 1 through 6 are completed: backend wiring, core storage parity, version/history semantics, native Mongo search plus conditional operations, composite - MongoDB + Elasticsearch integration, and runtime HFS_STORAGE_BACKEND wiring/documentation - are implemented. Feature-gated cargo check commands and focused composite tests passed; - workspace-wide cargo fmt --all -- --check still reports unrelated pre-existing formatting - drift outside the Mongo Phase 5 files. + MongoDB + Elasticsearch integration, and runtime HFS_STORAGE_BACKEND wiring, + documentation synchronization, and release-readiness validation are implemented. + Feature-gated cargo validation for Mongo runtime surfaces passed; workspace-wide cargo + fmt --all -- --check still reports unrelated pre-existing formatting drift outside the + Mongo Phase 5/6 files. SQLite primary, PostgreSQL primary, MongoDB primary, Elasticsearch secondary, MongoDB + Elasticsearch composite @@ -27,6 +28,8 @@ + + @@ -190,18 +193,21 @@ crates/rest/src/config.rs. Add start_mongodb and start_mongodb_elasticsearch flows in crates/hfs/src/main.rs. - Update persistence README capability matrix and role matrix to reflect - implemented Mongo status. - Update top-level ROADMAP.md persistence section when milestones ship. - Document deployment examples, environment variables, and feature flags. + Verify the implemented runtime paths and feature-gated failure behavior for + mongodb and mongodb-elasticsearch modes. + Update persistence README capability matrix, role matrix, and operator + examples to reflect implemented Mongo status. + Update top-level ROADMAP.md persistence section and document deployment + examples, environment variables, and feature flags. Phases 1 through 5 complete or explicitly scoped. HFS_STORAGE_BACKEND accepts mongodb and mongodb-elasticsearch values. - All relevant docs and examples are consistent with actual implementation, aside - from unrelated repo-wide formatting drift outside the Mongo phase files. + All relevant docs, roadmap artifacts, and examples are consistent with actual + implementation, aside from unrelated repo-wide formatting drift outside the Mongo phase + files. From c864b1c90e24e3bde59f3a3d1e9277fa5fec74c2 Mon Sep 17 00:00:00 2001 From: dougc95 Date: Tue, 10 Mar 2026 17:13:55 -0400 Subject: [PATCH 14/17] feat(ci): add MongoDB backend to Inferno test suite with transaction bundle support Add MongoDB to Inferno workflow matrix alongside sqlite/postgres backends. Implement MongoDB replica set initialization with 60s primary election timeout and container lifecycle management. Add transaction bundle support to MongoDB backend with ClientSession-based ACID semantics, reference resolution, SearchParameter registry integration, and rollback on entry failures. Declare Transactions capability in backend --- .github/workflows/inferno.yml | 69 +- .../src/backends/mongodb/backend.rs | 2 + .../src/backends/mongodb/storage.rs | 1137 ++++++++++++++++- .../persistence/tests/common/capabilities.rs | 2 +- crates/persistence/tests/mongodb_tests.rs | 472 ++++++- final_roadmap.xml | 228 ++++ 6 files changed, 1887 insertions(+), 23 deletions(-) create mode 100644 final_roadmap.xml diff --git a/.github/workflows/inferno.yml b/.github/workflows/inferno.yml index de691d0c..5c5c7c7a 100644 --- a/.github/workflows/inferno.yml +++ b/.github/workflows/inferno.yml @@ -1,7 +1,7 @@ name: Inferno US Core Test Suite on: - workflow_dispatch: # Manual trigger only for initial implementation + workflow_dispatch: # Manual trigger only for initial implementation env: CARGO_TERM_COLOR: always @@ -35,7 +35,7 @@ jobs: echo 'rustflags = ["-C", "link-arg=-fuse-ld=lld", "-C", "link-arg=-Wl,-zstack-size=8388608"]' >> ~/.cargo/config.toml - name: Build HFS - run: cargo build -p helios-hfs --features R4,sqlite,elasticsearch,postgres + run: cargo build -p helios-hfs --features R4,sqlite,elasticsearch,postgres,mongodb - name: Upload HFS binary uses: actions/upload-artifact@v4 @@ -52,8 +52,16 @@ jobs: fail-fast: false max-parallel: 3 matrix: - suite_id: [us_core_v311, us_core_v400, us_core_v501, us_core_v610, us_core_v700, us_core_v800] - backend: [sqlite, sqlite-elasticsearch, postgres] + suite_id: + [ + us_core_v311, + us_core_v400, + us_core_v501, + us_core_v610, + us_core_v700, + us_core_v800, + ] + backend: [sqlite, sqlite-elasticsearch, postgres, mongodb] include: - { suite_id: us_core_v311, version_label: "v3.1.1" } - { suite_id: us_core_v400, version_label: "v4.0.0" } @@ -129,6 +137,50 @@ jobs: echo "OMITTED_TESTS=[${OMITTED}]" >> $GITHUB_ENV + - name: Start MongoDB replica set + if: matrix.backend == 'mongodb' + run: | + MONGO_CONTAINER="mongo-${{ matrix.suite_id }}-${{ matrix.backend }}-${{ github.run_id }}-${{ github.run_attempt }}" + docker rm -f "$MONGO_CONTAINER" 2>/dev/null || true + docker run -d --name "$MONGO_CONTAINER" -p 0:27017 mongo:8.0 --replSet rs0 --bind_ip_all + + echo "MONGO_CONTAINER=$MONGO_CONTAINER" >> $GITHUB_ENV + + READY=0 + for i in {1..30}; do + if docker exec "$MONGO_CONTAINER" mongosh --quiet --eval 'db.adminCommand({ ping: 1 }).ok' > /dev/null 2>&1; then + READY=1 + break + fi + echo "Attempt $i/30: MongoDB not ready yet..." + sleep 2 + done + + if [ "$READY" -ne 1 ]; then + echo "MongoDB failed to start" + docker logs "$MONGO_CONTAINER" + exit 1 + fi + + docker exec "$MONGO_CONTAINER" mongosh --quiet --eval 'try { rs.status(); } catch (e) { rs.initiate({ _id: "rs0", members: [{ _id: 0, host: "localhost:27017" }] }); }' > /dev/null 2>&1 + + echo "Waiting for MongoDB replica set primary..." + for i in {1..60}; do + PRIMARY_STATE=$(docker exec "$MONGO_CONTAINER" mongosh --quiet --eval 'try { rs.status().myState } catch (e) { 0 }' | tr -d '\r\n ') + if [ "$PRIMARY_STATE" = "1" ]; then + MONGO_PORT=$(docker port "$MONGO_CONTAINER" 27017 | head -1 | sed 's/.*://') + echo "MongoDB replica set is ready on port $MONGO_PORT" + echo "MONGO_PORT=$MONGO_PORT" >> $GITHUB_ENV + exit 0 + fi + echo "Attempt $i/60: MongoDB replica set not primary yet..." + sleep 2 + done + + echo "MongoDB replica set failed to become primary" + docker logs "$MONGO_CONTAINER" + exit 1 + - name: Start Elasticsearch if: matrix.backend == 'sqlite-elasticsearch' run: | @@ -204,6 +256,12 @@ jobs: HFS_PG_USER=helios \ HFS_PG_PASSWORD=helios \ ./target/debug/hfs --log-level info --port $HFS_PORT --host 0.0.0.0 > "$HFS_LOG" 2>&1 & + elif [ "${{ matrix.backend }}" = "mongodb" ]; then + MONGO_HOST="${DOCKER_HOST_IP:-127.0.0.1}" + HFS_STORAGE_BACKEND=mongodb \ + HFS_DATABASE_URL="mongodb://$MONGO_HOST:$MONGO_PORT/?replicaSet=rs0&directConnection=true" \ + HFS_MONGODB_DATABASE="helios_inferno_${{ matrix.suite_id }}_${{ github.run_id }}_${{ github.run_attempt }}" \ + ./target/debug/hfs --log-level info --port $HFS_PORT --host 0.0.0.0 > "$HFS_LOG" 2>&1 & else ./target/debug/hfs --database-url :memory: --log-level info --port $HFS_PORT --host 0.0.0.0 > "$HFS_LOG" 2>&1 & fi @@ -472,4 +530,7 @@ jobs: echo "Stopping PostgreSQL..." docker rm -f "${PG_CONTAINER:-none}" 2>/dev/null || true + echo "Stopping MongoDB..." + docker rm -f "${MONGO_CONTAINER:-none}" 2>/dev/null || true + echo "Cleanup complete" diff --git a/crates/persistence/src/backends/mongodb/backend.rs b/crates/persistence/src/backends/mongodb/backend.rs index bcc62e19..a718148d 100644 --- a/crates/persistence/src/backends/mongodb/backend.rs +++ b/crates/persistence/src/backends/mongodb/backend.rs @@ -395,6 +395,7 @@ impl Backend for MongoBackend { | BackendCapability::Sorting | BackendCapability::OffsetPagination | BackendCapability::CursorPagination + | BackendCapability::Transactions | BackendCapability::OptimisticLocking | BackendCapability::SharedSchema ) @@ -413,6 +414,7 @@ impl Backend for MongoBackend { BackendCapability::Sorting, BackendCapability::OffsetPagination, BackendCapability::CursorPagination, + BackendCapability::Transactions, BackendCapability::OptimisticLocking, BackendCapability::SharedSchema, ] diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs index 5c0e531b..52ee47e2 100644 --- a/crates/persistence/src/backends/mongodb/storage.rs +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -1,19 +1,21 @@ //! ResourceStorage implementation for MongoDB. +use std::collections::HashMap; + use async_trait::async_trait; use chrono::{DateTime, Utc}; use helios_fhir::FhirVersion; use mongodb::{ - ClientSession, Cursor, + ClientSession, Cursor, SessionCursor, bson::{self, Bson, DateTime as BsonDateTime, Document, doc}, error::Error as MongoError, }; use serde_json::Value; use crate::core::{ - BundleEntry, BundleProvider, BundleResult, HistoryEntry, HistoryMethod, HistoryPage, - HistoryParams, InstanceHistoryProvider, ResourceStorage, SystemHistoryProvider, - TypeHistoryProvider, VersionedStorage, normalize_etag, + BundleEntry, BundleEntryResult, BundleMethod, BundleProvider, BundleResult, BundleType, + HistoryEntry, HistoryMethod, HistoryPage, HistoryParams, InstanceHistoryProvider, + ResourceStorage, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, normalize_etag, }; use crate::error::{ BackendError, ConcurrencyError, ResourceError, StorageError, StorageResult, TransactionError, @@ -34,6 +36,13 @@ fn internal_error(message: String) -> StorageError { }) } +#[derive(Debug, Clone)] +enum PendingSearchParameterChange { + Create(Value), + Update { old: Value, new: Value }, + Delete(Value), +} + fn serialization_error(message: String) -> StorageError { StorageError::Backend(BackendError::SerializationError { message }) } @@ -178,6 +187,27 @@ async fn collect_documents(mut cursor: Cursor) -> StorageResult, + session: &mut ClientSession, +) -> StorageResult> { + let mut docs = Vec::new(); + while cursor + .advance(session) + .await + .map_err(|e| internal_error(format!("Failed to advance MongoDB session cursor: {}", e)))? + { + let doc = cursor.deserialize_current().map_err(|e| { + internal_error(format!( + "Failed to deserialize MongoDB session document: {}", + e + )) + })?; + docs.push(doc); + } + Ok(docs) +} + fn parse_cursor_version(params: &HistoryParams) -> Option { let cursor = params.pagination.cursor_value()?; let value = cursor.sort_values().first()?; @@ -326,6 +356,109 @@ fn parse_history_row( }) } +fn parse_simple_bundle_search_params(params: &str) -> Vec<(String, String)> { + params + .split('&') + .filter_map(|pair| { + let mut iter = pair.splitn(2, '='); + let key = iter.next()?.trim(); + let value = iter.next()?.trim(); + + if key.is_empty() || value.is_empty() { + return None; + } + + Some((key.to_string(), value.to_string())) + }) + .collect() +} + +fn document_to_stored_resource( + doc: &Document, + tenant: &TenantContext, + fallback_resource_type: &str, +) -> StorageResult { + let resource_type = doc + .get_str("resource_type") + .ok() + .unwrap_or(fallback_resource_type) + .to_string(); + + let id = doc + .get_str("id") + .map_err(|e| internal_error(format!("Missing resource id in MongoDB document: {}", e)))? + .to_string(); + + let version_id = doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing version_id in MongoDB document: {}", e)))? + .to_string(); + + let payload = doc + .get_document("data") + .map_err(|e| internal_error(format!("Missing resource payload in MongoDB document: {}", e)))?; + let content = document_to_value(payload)?; + + let now = Utc::now(); + let created_at = extract_created_at(doc, now); + let last_updated = extract_last_updated(doc, now); + let deleted_at = extract_deleted_at(doc); + let fhir_version = extract_fhir_version(doc, FhirVersion::default()); + + Ok(StoredResource::from_storage( + resource_type, + id, + version_id, + tenant.tenant_id().clone(), + content, + created_at, + last_updated, + deleted_at, + fhir_version, + )) +} + +async fn begin_required_bundle_transaction_session( + db: &mongodb::Database, +) -> Result { + let mut session = db + .client() + .start_session() + .await + .map_err(|e| TransactionError::RolledBack { + reason: format!("Failed to start MongoDB session: {}", e), + })?; + + let hello = db + .run_command(doc! { "hello": 1_i32 }) + .await + .map_err(|e| TransactionError::RolledBack { + reason: format!("Failed to inspect MongoDB topology: {}", e), + })?; + + let supports_transactions = hello.contains_key("setName") + || hello + .get_str("msg") + .map(|value| value == "isdbgrid") + .unwrap_or(false); + + if !supports_transactions { + return Err(TransactionError::UnsupportedIsolationLevel { + level: "transaction bundles for mongodb require replica-set or sharded topology" + .to_string(), + }); + } + + session + .start_transaction() + .await + .map_err(|e| TransactionError::RolledBack { + reason: format!("Failed to start MongoDB transaction: {}", e), + })?; + + Ok(session) +} + async fn begin_best_effort_multi_write_session( db: &mongodb::Database, ) -> (Option, bool) { @@ -1749,11 +1882,107 @@ impl SystemHistoryProvider for MongoBackend { impl BundleProvider for MongoBackend { async fn process_transaction( &self, - _tenant: &TenantContext, - _entries: Vec, + tenant: &TenantContext, + entries: Vec, ) -> Result { - Err(TransactionError::UnsupportedIsolationLevel { - level: "transaction bundles for mongodb".to_string(), + let db = self + .get_database() + .await + .map_err(|e| TransactionError::RolledBack { + reason: format!("Failed to acquire MongoDB database: {}", e), + })?; + + let mut session = begin_required_bundle_transaction_session(&db).await?; + + let mut results = Vec::with_capacity(entries.len()); + let mut error_info: Option<(usize, String)> = None; + let mut reference_map: HashMap = HashMap::new(); + let mut pending_search_parameter_changes: Vec = Vec::new(); + let mut entries = entries; + + for (idx, entry) in entries.iter_mut().enumerate() { + if let Some(resource) = entry.resource.as_mut() { + resolve_bundle_references(resource, &reference_map); + } + + let result = self + .process_bundle_entry_transaction( + &db, + &mut session, + tenant, + entry, + &mut pending_search_parameter_changes, + ) + .await; + + match result { + Ok(entry_result) => { + if entry_result.status >= 400 { + error_info = Some(( + idx, + format!("Entry failed with status {}", entry_result.status), + )); + break; + } + + if entry.method == BundleMethod::Post { + if let Some(full_url) = entry.full_url.as_ref() { + if let Some(location) = entry_result.location.as_ref() { + let reference = location + .split("/_history") + .next() + .unwrap_or(location) + .to_string(); + reference_map.insert(full_url.clone(), reference); + } + } + } + + results.push(entry_result); + } + Err(e) => { + error_info = Some((idx, format!("Entry processing failed: {}", e))); + break; + } + } + } + + if let Some((index, message)) = error_info { + let _ = session.abort_transaction().await; + return Err(TransactionError::BundleError { index, message }); + } + + session + .commit_transaction() + .await + .map_err(|e| TransactionError::RolledBack { + reason: format!("Commit failed: {}", e), + })?; + + for change in pending_search_parameter_changes { + let result = match change { + PendingSearchParameterChange::Create(resource) => { + self.handle_search_parameter_create(&resource) + } + PendingSearchParameterChange::Update { old, new } => { + self.handle_search_parameter_update(&old, &new) + } + PendingSearchParameterChange::Delete(resource) => { + self.handle_search_parameter_delete(&resource) + } + }; + + if let Err(e) = result { + tracing::warn!( + "Transaction committed but failed to apply SearchParameter registry update: {}", + e + ); + } + } + + Ok(BundleResult { + bundle_type: BundleType::Transaction, + entries: results, }) } @@ -1768,3 +1997,895 @@ impl BundleProvider for MongoBackend { })) } } + +impl MongoBackend { + async fn process_bundle_entry_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant: &TenantContext, + entry: &BundleEntry, + pending_search_parameter_changes: &mut Vec, + ) -> StorageResult { + match entry.method { + BundleMethod::Get => { + let (resource_type, id) = self.parse_url(&entry.url)?; + match self + .read_resource_in_bundle_transaction(db, session, tenant, &resource_type, &id) + .await? + { + Some(resource) => Ok(BundleEntryResult::ok(resource)), + None => Ok(BundleEntryResult::error( + 404, + serde_json::json!({ + "resourceType": "OperationOutcome", + "issue": [{"severity": "error", "code": "not-found"}] + }), + )), + } + } + BundleMethod::Post => { + let resource = entry.resource.clone().ok_or_else(|| { + StorageError::Validation(crate::error::ValidationError::MissingRequiredField { + field: "resource".to_string(), + }) + })?; + + let resource_type = resource + .get("resourceType") + .and_then(|v| v.as_str()) + .map(str::to_string) + .ok_or_else(|| { + StorageError::Validation( + crate::error::ValidationError::MissingRequiredField { + field: "resourceType".to_string(), + }, + ) + })?; + + if let Some(search_params) = entry.if_none_exist.as_ref() { + let matches = self + .find_matching_resources_in_bundle_transaction( + db, + session, + tenant, + &resource_type, + search_params, + ) + .await?; + + match matches.len() { + 0 => {} + 1 => { + return Ok(BundleEntryResult::ok( + matches.into_iter().next().expect("single match must exist"), + )); + } + n => { + return Ok(BundleEntryResult::error( + 412, + serde_json::json!({ + "resourceType": "OperationOutcome", + "issue": [{ + "severity": "error", + "code": "multiple-matches", + "diagnostics": format!( + "Conditional create matched {} resources", + n + ) + }] + }), + )); + } + } + } + + let created = self + .create_resource_in_bundle_transaction( + db, + session, + tenant, + &resource_type, + resource, + pending_search_parameter_changes, + ) + .await?; + Ok(BundleEntryResult::created(created)) + } + BundleMethod::Put => { + let resource = entry.resource.clone().ok_or_else(|| { + StorageError::Validation(crate::error::ValidationError::MissingRequiredField { + field: "resource".to_string(), + }) + })?; + + let (resource_type, id) = self.parse_url(&entry.url)?; + + match self + .read_resource_in_bundle_transaction(db, session, tenant, &resource_type, &id) + .await? + { + Some(existing) => { + if let Some(if_match) = entry.if_match.as_ref() { + let expected = normalize_etag(if_match); + let actual = normalize_etag(existing.version_id()); + if expected != actual { + return Ok(BundleEntryResult::error( + 412, + serde_json::json!({ + "resourceType": "OperationOutcome", + "issue": [{"severity": "error", "code": "conflict", "diagnostics": "ETag mismatch"}] + }), + )); + } + } + + let updated = self + .update_resource_in_bundle_transaction( + db, + session, + tenant, + &existing, + resource, + pending_search_parameter_changes, + ) + .await?; + Ok(BundleEntryResult::ok(updated)) + } + None => { + let mut resource_with_id = resource; + resource_with_id["id"] = serde_json::json!(id); + + let created = self + .create_resource_in_bundle_transaction( + db, + session, + tenant, + &resource_type, + resource_with_id, + pending_search_parameter_changes, + ) + .await?; + Ok(BundleEntryResult::created(created)) + } + } + } + BundleMethod::Delete => { + let (resource_type, id) = self.parse_url(&entry.url)?; + + if let Some(if_match) = entry.if_match.as_ref() { + match self + .delete_with_match_resource_in_bundle_transaction( + db, + session, + tenant, + &resource_type, + &id, + if_match, + pending_search_parameter_changes, + ) + .await + { + Ok(()) => Ok(BundleEntryResult::deleted()), + Err(StorageError::Resource(ResourceError::NotFound { .. })) => { + Ok(BundleEntryResult::error( + 404, + serde_json::json!({ + "resourceType": "OperationOutcome", + "issue": [{"severity": "error", "code": "not-found"}] + }), + )) + } + Err(e) => Err(e), + } + } else { + match self + .delete_resource_in_bundle_transaction( + db, + session, + tenant, + &resource_type, + &id, + pending_search_parameter_changes, + ) + .await + { + Ok(()) => Ok(BundleEntryResult::deleted()), + Err(StorageError::Resource(ResourceError::NotFound { .. })) => { + Ok(BundleEntryResult::deleted()) + } + Err(e) => Err(e), + } + } + } + BundleMethod::Patch => Ok(BundleEntryResult::error( + 501, + serde_json::json!({ + "resourceType": "OperationOutcome", + "issue": [{"severity": "error", "code": "not-supported", "diagnostics": "PATCH not implemented in transaction bundles"}] + }), + )), + } + } + + async fn create_resource_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant: &TenantContext, + resource_type: &str, + resource: Value, + pending_search_parameter_changes: &mut Vec, + ) -> StorageResult { + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let id = resource + .get("id") + .and_then(|v| v.as_str()) + .map(str::to_string) + .unwrap_or_else(|| uuid::Uuid::new_v4().to_string()); + + let existing = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": &id, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to check resource existence in transaction: {}", e)))?; + + if existing.is_some() { + return Err(StorageError::Resource(ResourceError::AlreadyExists { + resource_type: resource_type.to_string(), + id, + })); + } + + let mut resource = resource; + ensure_resource_identity(resource_type, &id, &mut resource); + let payload = value_to_document(&resource)?; + + let now = Utc::now(); + let now_bson = chrono_to_bson(now); + let version_id = "1".to_string(); + let fhir_version = FhirVersion::default(); + let fhir_version_str = fhir_version.as_mime_param().to_string(); + + resources + .insert_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": &id, + "version_id": &version_id, + "data": Bson::Document(payload.clone()), + "created_at": now_bson, + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": &fhir_version_str, + }) + .session(&mut *session) + .await + .map_err(|e| { + if is_duplicate_key_error(&e) { + StorageError::Resource(ResourceError::AlreadyExists { + resource_type: resource_type.to_string(), + id: id.clone(), + }) + } else { + internal_error(format!("Failed to insert resource in transaction: {}", e)) + } + })?; + + history + .insert_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": &id, + "version_id": &version_id, + "data": Bson::Document(payload), + "created_at": now_bson, + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": &fhir_version_str, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to insert history in transaction: {}", e)))?; + + self.index_resource_in_bundle_transaction(db, session, tenant_id, resource_type, &id, &resource) + .await?; + + if resource_type == "SearchParameter" { + pending_search_parameter_changes.push(PendingSearchParameterChange::Create( + resource.clone(), + )); + } + + Ok(StoredResource::from_storage( + resource_type, + &id, + version_id, + tenant.tenant_id().clone(), + resource, + now, + now, + None, + fhir_version, + )) + } + + async fn update_resource_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant: &TenantContext, + current: &StoredResource, + resource: Value, + pending_search_parameter_changes: &mut Vec, + ) -> StorageResult { + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + let resource_type = current.resource_type(); + let id = current.id(); + + let existing_doc = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to load current resource in transaction: {}", e)))? + .ok_or_else(|| { + StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + }) + })?; + + let actual_version = existing_doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing current version in transaction: {}", e)))? + .to_string(); + + if actual_version != current.version_id() { + return Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: current.version_id().to_string(), + actual_version, + }, + )); + } + + let new_version = next_version(current.version_id())?; + let mut resource = resource; + ensure_resource_identity(resource_type, id, &mut resource); + let payload = value_to_document(&resource)?; + + let now = Utc::now(); + let now_bson = chrono_to_bson(now); + let fhir_version = current.fhir_version(); + let fhir_version_str = fhir_version.as_mime_param().to_string(); + + let update_result = resources + .update_one( + doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": current.version_id(), + "is_deleted": false, + }, + doc! { + "$set": { + "version_id": &new_version, + "data": Bson::Document(payload.clone()), + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": &fhir_version_str, + } + }, + ) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to update resource in transaction: {}", e)))?; + + if update_result.matched_count == 0 { + return Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: current.version_id().to_string(), + actual_version: "unknown".to_string(), + }, + )); + } + + let created_at = extract_created_at(&existing_doc, now); + + history + .insert_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": &new_version, + "data": Bson::Document(payload), + "created_at": chrono_to_bson(created_at), + "last_updated": now_bson, + "is_deleted": false, + "deleted_at": Bson::Null, + "fhir_version": &fhir_version_str, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to insert history in transaction: {}", e)))?; + + self.index_resource_in_bundle_transaction(db, session, tenant_id, resource_type, id, &resource) + .await?; + + if resource_type == "SearchParameter" { + pending_search_parameter_changes.push(PendingSearchParameterChange::Update { + old: current.content().clone(), + new: resource.clone(), + }); + } + + Ok(StoredResource::from_storage( + resource_type, + id, + new_version, + tenant.tenant_id().clone(), + resource, + created_at, + now, + None, + fhir_version, + )) + } + + async fn delete_resource_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant: &TenantContext, + resource_type: &str, + id: &str, + pending_search_parameter_changes: &mut Vec, + ) -> StorageResult<()> { + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let history = db.collection::(MongoBackend::RESOURCE_HISTORY_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let existing_doc = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to load resource for delete in transaction: {}", e)))? + .ok_or_else(|| { + StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + }) + })?; + + let current_version = existing_doc + .get_str("version_id") + .map_err(|e| internal_error(format!("Missing current version in transaction delete: {}", e)))? + .to_string(); + let new_version = next_version(¤t_version)?; + + let payload = existing_doc + .get_document("data") + .map_err(|e| internal_error(format!("Missing resource payload in transaction delete: {}", e)))? + .clone(); + let resource_value = document_to_value(&payload)?; + let fhir_version = existing_doc + .get_str("fhir_version") + .unwrap_or("4.0") + .to_string(); + let created_at = extract_created_at(&existing_doc, Utc::now()); + + let now = Utc::now(); + let now_bson = chrono_to_bson(now); + + let update_result = resources + .update_one( + doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": ¤t_version, + "is_deleted": false, + }, + doc! { + "$set": { + "version_id": &new_version, + "is_deleted": true, + "deleted_at": now_bson, + "last_updated": now_bson, + } + }, + ) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to soft-delete resource in transaction: {}", e)))?; + + if update_result.matched_count == 0 { + return Err(StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + })); + } + + history + .insert_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "version_id": &new_version, + "data": Bson::Document(payload), + "created_at": chrono_to_bson(created_at), + "last_updated": now_bson, + "is_deleted": true, + "deleted_at": now_bson, + "fhir_version": fhir_version, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to insert delete history in transaction: {}", e)))?; + + self.delete_search_index_in_bundle_transaction(db, session, tenant_id, resource_type, id) + .await?; + + if resource_type == "SearchParameter" { + pending_search_parameter_changes.push(PendingSearchParameterChange::Delete(resource_value)); + } + + Ok(()) + } + + async fn delete_with_match_resource_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant: &TenantContext, + resource_type: &str, + id: &str, + expected_version: &str, + pending_search_parameter_changes: &mut Vec, + ) -> StorageResult<()> { + let existing = self + .read_resource_in_bundle_transaction(db, session, tenant, resource_type, id) + .await? + .ok_or_else(|| { + StorageError::Resource(ResourceError::NotFound { + resource_type: resource_type.to_string(), + id: id.to_string(), + }) + })?; + + let expected = normalize_etag(expected_version); + let actual = normalize_etag(existing.version_id()); + if expected != actual { + return Err(StorageError::Concurrency( + ConcurrencyError::VersionConflict { + resource_type: resource_type.to_string(), + id: id.to_string(), + expected_version: expected.to_string(), + actual_version: actual.to_string(), + }, + )); + } + + self.delete_resource_in_bundle_transaction( + db, + session, + tenant, + resource_type, + id, + pending_search_parameter_changes, + ) + .await + } + + async fn read_resource_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant: &TenantContext, + resource_type: &str, + id: &str, + ) -> StorageResult> { + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let maybe_doc = resources + .find_one(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "id": id, + "is_deleted": false, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to read resource in transaction: {}", e)))?; + + maybe_doc + .as_ref() + .map(|doc| document_to_stored_resource(doc, tenant, resource_type)) + .transpose() + } + + async fn find_matching_resources_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant: &TenantContext, + resource_type: &str, + search_params: &str, + ) -> StorageResult> { + let parsed_params = parse_simple_bundle_search_params(search_params); + if parsed_params.is_empty() { + return Ok(Vec::new()); + } + + let resources = db.collection::(MongoBackend::RESOURCES_COLLECTION); + let tenant_id = tenant.tenant_id().as_str(); + + let cursor = resources + .find(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "is_deleted": false, + }) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to query conditional matches in transaction: {}", e)))?; + + let docs = collect_session_documents(cursor, session).await?; + let mut matches = Vec::new(); + + for doc in docs { + let payload = doc + .get_document("data") + .map_err(|e| internal_error(format!("Missing payload while matching conditionals: {}", e)))?; + let resource = document_to_value(payload)?; + + if resource_matches_bundle_search_params(&resource, &parsed_params) + && doc + .get_str("resource_type") + .map(|rt| rt == resource_type) + .unwrap_or(true) + { + matches.push(document_to_stored_resource(&doc, tenant, resource_type)?); + } + } + + Ok(matches) + } + + async fn index_resource_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant_id: &str, + resource_type: &str, + resource_id: &str, + resource: &Value, + ) -> StorageResult<()> { + if self.is_search_offloaded() { + return Ok(()); + } + + self.delete_search_index_in_bundle_transaction(db, session, tenant_id, resource_type, resource_id) + .await?; + + let index_docs = match self.search_extractor().extract(resource, resource_type) { + Ok(values) => values + .iter() + .filter_map(|value| { + self.build_search_index_document(tenant_id, resource_type, resource_id, value) + }) + .collect::>(), + Err(e) => { + tracing::warn!( + "Search extraction failed for {}/{} in transaction: {}. Using minimal fallback index values.", + resource_type, + resource_id, + e + ); + self.index_minimal_fallback_documents(tenant_id, resource_type, resource_id, resource) + } + }; + + if index_docs.is_empty() { + return Ok(()); + } + + db.collection::(MongoBackend::SEARCH_INDEX_COLLECTION) + .insert_many(index_docs) + .session(&mut *session) + .await + .map_err(|e| internal_error(format!("Failed to insert search_index entries in transaction: {}", e)))?; + + Ok(()) + } + + async fn delete_search_index_in_bundle_transaction( + &self, + db: &mongodb::Database, + session: &mut ClientSession, + tenant_id: &str, + resource_type: &str, + resource_id: &str, + ) -> StorageResult<()> { + if self.is_search_offloaded() { + return Ok(()); + } + + db.collection::(MongoBackend::SEARCH_INDEX_COLLECTION) + .delete_many(doc! { + "tenant_id": tenant_id, + "resource_type": resource_type, + "resource_id": resource_id, + }) + .session(&mut *session) + .await + .map_err(|e| { + internal_error(format!( + "Failed to delete search_index entries in transaction: {}", + e + )) + })?; + + Ok(()) + } + + fn parse_url(&self, url: &str) -> StorageResult<(String, String)> { + let path = url + .strip_prefix("http://") + .or_else(|| url.strip_prefix("https://")) + .map(|s| s.find('/').map(|i| &s[i..]).unwrap_or(s)) + .unwrap_or(url); + + let path = path.trim_start_matches('/'); + let parts: Vec<&str> = path.split('/').filter(|segment| !segment.is_empty()).collect(); + + if parts.len() >= 2 { + let len = parts.len(); + Ok((parts[len - 2].to_string(), parts[len - 1].to_string())) + } else { + Err(StorageError::Validation( + crate::error::ValidationError::InvalidReference { + reference: url.to_string(), + message: "URL must be in format ResourceType/id".to_string(), + }, + )) + } + } +} + +fn resource_matches_bundle_search_params(resource: &Value, params: &[(String, String)]) -> bool { + params.iter().all(|(name, expected)| match name.as_str() { + "_id" => resource + .get("id") + .and_then(Value::as_str) + .is_some_and(|id| id == expected), + "identifier" => resource_identifier_matches(resource, expected), + _ => resource_field_matches(resource.get(name), expected), + }) +} + +fn resource_identifier_matches(resource: &Value, expected: &str) -> bool { + let Some(identifier_value) = resource.get("identifier") else { + return false; + }; + + let (system, value, has_separator) = if let Some((system, value)) = expected.split_once('|') { + (system, value, true) + } else { + ("", expected, false) + }; + + match identifier_value { + Value::Array(items) => items.iter().any(|item| match_identifier_item(item, system, value, has_separator)), + Value::Object(_) => match_identifier_item(identifier_value, system, value, has_separator), + _ => false, + } +} + +fn match_identifier_item(item: &Value, system: &str, value: &str, has_separator: bool) -> bool { + let item_system = item.get("system").and_then(Value::as_str); + let item_value = item.get("value").and_then(Value::as_str); + + if has_separator { + let system_matches = if system.is_empty() { + true + } else { + item_system == Some(system) + }; + let value_matches = if value.is_empty() { + true + } else { + item_value == Some(value) + }; + + system_matches && value_matches + } else { + item_value == Some(value) + } +} + +fn resource_field_matches(value: Option<&Value>, expected: &str) -> bool { + let Some(value) = value else { + return false; + }; + + match value { + Value::String(s) => s == expected, + Value::Array(items) => items + .iter() + .any(|item| resource_field_matches(Some(item), expected)), + Value::Object(map) => { + if map + .get("reference") + .and_then(Value::as_str) + .is_some_and(|reference| reference == expected) + { + return true; + } + + if map + .get("value") + .and_then(Value::as_str) + .is_some_and(|value| value == expected) + { + return true; + } + + map.values() + .any(|nested| resource_field_matches(Some(nested), expected)) + } + _ => false, + } +} + +fn resolve_bundle_references(value: &mut Value, reference_map: &HashMap) { + match value { + Value::Object(map) => { + if let Some(Value::String(reference)) = map.get("reference") { + if reference.starts_with("urn:uuid:") { + if let Some(resolved) = reference_map.get(reference) { + map.insert("reference".to_string(), Value::String(resolved.clone())); + } + } + } + + for nested in map.values_mut() { + resolve_bundle_references(nested, reference_map); + } + } + Value::Array(items) => { + for item in items { + resolve_bundle_references(item, reference_map); + } + } + _ => {} + } +} diff --git a/crates/persistence/tests/common/capabilities.rs b/crates/persistence/tests/common/capabilities.rs index 20d59ebd..4e9e2bb1 100644 --- a/crates/persistence/tests/common/capabilities.rs +++ b/crates/persistence/tests/common/capabilities.rs @@ -172,7 +172,7 @@ impl CapabilityMatrix { (BackendCapability::Revinclude, SupportLevel::Planned), (BackendCapability::FullTextSearch, SupportLevel::Planned), (BackendCapability::TerminologySearch, SupportLevel::Planned), - (BackendCapability::Transactions, SupportLevel::Planned), + (BackendCapability::Transactions, SupportLevel::Implemented), (BackendCapability::OptimisticLocking, SupportLevel::Implemented), (BackendCapability::CursorPagination, SupportLevel::Implemented), (BackendCapability::OffsetPagination, SupportLevel::Implemented), diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index 50fe15ee..61a68971 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -11,7 +11,8 @@ use helios_fhir::FhirVersion; use helios_persistence::backends::mongodb::{MongoBackend, MongoBackendConfig}; use helios_persistence::core::{ - Backend, BackendCapability, BackendKind, BundleProvider, ConditionalCreateResult, + Backend, BackendCapability, BackendKind, BundleEntry, BundleMethod, BundleProvider, + BundleResult, ConditionalCreateResult, ConditionalDeleteResult, ConditionalStorage, ConditionalUpdateResult, HistoryParams, InstanceHistoryProvider, PatchFormat, ResourceStorage, SearchProvider, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, @@ -38,6 +39,11 @@ fn build_test_database_name(test_name: &str) -> String { format!("{MONGODB_TEST_DB_PREFIX}{truncated_test_name}_{suffix}") } +fn extract_resource_id_from_location(location: &str) -> String { + let resource_path = location.split("/_history").next().unwrap_or(location); + resource_path.rsplit('/').next().unwrap_or(resource_path).to_string() +} + #[test] fn test_mongodb_config_defaults() { let config = MongoBackendConfig::default(); @@ -105,19 +111,29 @@ fn test_mongodb_phase4_capabilities() { assert!(backend.supports(BackendCapability::OptimisticLocking)); assert!(backend.supports(BackendCapability::SharedSchema)); - assert!(!backend.supports(BackendCapability::Transactions)); + assert!(backend.supports(BackendCapability::Transactions)); } #[tokio::test] -async fn test_mongodb_bundle_provider_transaction_not_supported() { - let backend = MongoBackend::new(MongoBackendConfig::default()).unwrap(); - let tenant = create_tenant("tenant-bundle-transaction"); +async fn mongodb_integration_transaction_bundle_topology_behavior() { + let Some(backend) = create_backend("bundle_topology_behavior").await else { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_topology_behavior (set HFS_TEST_MONGODB_URL)" + ); + return; + }; - let result = backend.process_transaction(&tenant, vec![]).await; - assert!(matches!( - result, - Err(TransactionError::UnsupportedIsolationLevel { .. }) - )); + let tenant = create_tenant("tenant-bundle-topology"); + + match backend.process_transaction(&tenant, vec![]).await { + Ok(bundle_result) => assert!(bundle_result.entries.is_empty()), + Err(TransactionError::UnsupportedIsolationLevel { .. }) => { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_topology_behavior (MongoDB topology does not support transactions)" + ); + } + Err(other) => panic!("Unexpected transaction result: {}", other), + } } #[tokio::test] @@ -132,6 +148,25 @@ async fn test_mongodb_bundle_provider_batch_not_supported() { )); } +async fn process_transaction_or_skip( + backend: &MongoBackend, + tenant: &TenantContext, + entries: Vec, + test_name: &str, +) -> Option { + match backend.process_transaction(tenant, entries).await { + Ok(result) => Some(result), + Err(TransactionError::UnsupportedIsolationLevel { .. }) => { + eprintln!( + "Skipping {} (MongoDB topology does not support transactions)", + test_name + ); + None + } + Err(e) => panic!("{} failed: {}", test_name, e), + } +} + fn test_mongo_url() -> Option { std::env::var("HFS_TEST_MONGODB_URL").ok() } @@ -246,6 +281,423 @@ async fn mongodb_integration_create_read_update_delete() { )); } +#[tokio::test] +async fn mongodb_integration_transaction_bundle_create_and_resolve_references() { + let Some(backend) = create_backend("bundle_create_resolve_references").await else { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_create_and_resolve_references (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-bundle-resolve"); + + let entries = vec![ + BundleEntry { + method: BundleMethod::Post, + url: "Patient".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "name": [{"family": "BundleRefPatient"}] + })), + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: Some("urn:uuid:new-patient".to_string()), + }, + BundleEntry { + method: BundleMethod::Post, + url: "Observation".to_string(), + resource: Some(json!({ + "resourceType": "Observation", + "status": "final", + "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]}, + "subject": {"reference": "urn:uuid:new-patient"} + })), + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: Some("urn:uuid:new-observation".to_string()), + }, + ]; + + let Some(result) = process_transaction_or_skip( + &backend, + &tenant, + entries, + "mongodb_integration_transaction_bundle_create_and_resolve_references", + ) + .await + else { + return; + }; + + assert_eq!(result.entries.len(), 2); + assert_eq!(result.entries[0].status, 201); + assert_eq!(result.entries[1].status, 201); + + let patient_location = result.entries[0] + .location + .as_deref() + .expect("patient location should be present"); + let expected_patient_reference = patient_location + .split("/_history") + .next() + .unwrap_or(patient_location) + .to_string(); + + let observation_location = result.entries[1] + .location + .as_deref() + .expect("observation location should be present"); + let observation_id = extract_resource_id_from_location(observation_location); + + let observation = backend + .read(&tenant, "Observation", &observation_id) + .await + .unwrap() + .unwrap(); + + let resolved_reference = observation.content()["subject"]["reference"] + .as_str() + .expect("resolved subject reference should be present"); + + assert_eq!(resolved_reference, expected_patient_reference); +} + +#[tokio::test] +async fn mongodb_integration_transaction_bundle_mixed_operations_and_idempotent_delete() { + let Some(backend) = create_backend("bundle_mixed_operations").await else { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_mixed_operations_and_idempotent_delete (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-bundle-mixed"); + + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "update-me", + "name": [{"family": "BeforeUpdate"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "delete-me", + "name": [{"family": "BeforeDelete"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let entries = vec![ + BundleEntry { + method: BundleMethod::Delete, + url: "Patient/delete-me".to_string(), + resource: None, + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: None, + }, + BundleEntry { + method: BundleMethod::Post, + url: "Patient".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "id": "new-from-transaction", + "name": [{"family": "CreatedInTransaction"}] + })), + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: Some("urn:uuid:new-created".to_string()), + }, + BundleEntry { + method: BundleMethod::Put, + url: "Patient/update-me".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "id": "update-me", + "name": [{"family": "AfterUpdate"}] + })), + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: None, + }, + ]; + + let Some(result) = process_transaction_or_skip( + &backend, + &tenant, + entries, + "mongodb_integration_transaction_bundle_mixed_operations_and_idempotent_delete", + ) + .await + else { + return; + }; + + assert_eq!(result.entries.len(), 3); + assert_eq!(result.entries[0].status, 204); + assert_eq!(result.entries[1].status, 201); + assert_eq!(result.entries[2].status, 200); + + let updated = backend + .read(&tenant, "Patient", "update-me") + .await + .unwrap() + .unwrap(); + assert_eq!(updated.content()["name"][0]["family"], "AfterUpdate"); + + let deleted = backend.read(&tenant, "Patient", "delete-me").await; + assert!(matches!( + deleted, + Err(StorageError::Resource(ResourceError::Gone { .. })) + )); + + let created = backend + .read(&tenant, "Patient", "new-from-transaction") + .await + .unwrap(); + assert!(created.is_some()); + + let idempotent_delete = vec![BundleEntry { + method: BundleMethod::Delete, + url: "Patient/non-existent-delete".to_string(), + resource: None, + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: None, + }]; + + let Some(idempotent_result) = process_transaction_or_skip( + &backend, + &tenant, + idempotent_delete, + "mongodb_integration_transaction_bundle_mixed_operations_and_idempotent_delete/idempotent", + ) + .await + else { + return; + }; + + assert_eq!(idempotent_result.entries.len(), 1); + assert_eq!(idempotent_result.entries[0].status, 204); +} + +#[tokio::test] +async fn mongodb_integration_transaction_bundle_conditional_headers() { + let Some(backend) = create_backend("bundle_conditional_headers").await else { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_conditional_headers (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-bundle-conditional"); + + let conditional_create = vec![BundleEntry { + method: BundleMethod::Post, + url: "Patient".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "identifier": [{"system": "http://example.org/mrn", "value": "MRN-TX-COND-1"}], + "name": [{"family": "ConditionalCreate"}] + })), + if_match: None, + if_none_match: None, + if_none_exist: Some("identifier=http://example.org/mrn|MRN-TX-COND-1".to_string()), + full_url: Some("urn:uuid:conditional-create".to_string()), + }]; + + let Some(first_create) = process_transaction_or_skip( + &backend, + &tenant, + conditional_create.clone(), + "mongodb_integration_transaction_bundle_conditional_headers/create-first", + ) + .await + else { + return; + }; + assert_eq!(first_create.entries[0].status, 201); + + let Some(second_create) = process_transaction_or_skip( + &backend, + &tenant, + conditional_create, + "mongodb_integration_transaction_bundle_conditional_headers/create-second", + ) + .await + else { + return; + }; + assert_eq!(second_create.entries[0].status, 200); + + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "if-match-target", + "name": [{"family": "BeforeIfMatch"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let good_if_match = vec![BundleEntry { + method: BundleMethod::Put, + url: "Patient/if-match-target".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "id": "if-match-target", + "name": [{"family": "AfterIfMatch"}] + })), + if_match: Some("W/\"1\"".to_string()), + if_none_match: None, + if_none_exist: None, + full_url: None, + }]; + + let Some(good_if_match_result) = process_transaction_or_skip( + &backend, + &tenant, + good_if_match, + "mongodb_integration_transaction_bundle_conditional_headers/if-match-good", + ) + .await + else { + return; + }; + assert_eq!(good_if_match_result.entries[0].status, 200); + + let bad_if_match = vec![BundleEntry { + method: BundleMethod::Put, + url: "Patient/if-match-target".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "id": "if-match-target", + "name": [{"family": "ShouldNotPersist"}] + })), + if_match: Some("W/\"999\"".to_string()), + if_none_match: None, + if_none_exist: None, + full_url: None, + }]; + + match backend.process_transaction(&tenant, bad_if_match).await { + Err(TransactionError::UnsupportedIsolationLevel { .. }) => { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_conditional_headers/if-match-bad (MongoDB topology does not support transactions)" + ); + return; + } + Err(TransactionError::BundleError { .. }) => {} + Err(other) => panic!("Unexpected transaction error: {}", other), + Ok(_) => panic!("Expected if-match failure transaction to return BundleError"), + } + + let read_after_bad_if_match = backend + .read(&tenant, "Patient", "if-match-target") + .await + .unwrap() + .unwrap(); + assert_eq!(read_after_bad_if_match.content()["name"][0]["family"], "AfterIfMatch"); +} + +#[tokio::test] +async fn mongodb_integration_transaction_bundle_rolls_back_on_failure() { + let Some(backend) = create_backend("bundle_rollback_failure").await else { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_rolls_back_on_failure (set HFS_TEST_MONGODB_URL)" + ); + return; + }; + + let tenant = create_tenant("tenant-bundle-rollback"); + + backend + .create( + &tenant, + "Patient", + json!({ + "resourceType": "Patient", + "id": "already-exists", + "name": [{"family": "PreExisting"}] + }), + FhirVersion::default(), + ) + .await + .unwrap(); + + let entries = vec![ + BundleEntry { + method: BundleMethod::Post, + url: "Patient".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "id": "should-rollback", + "name": [{"family": "ShouldRollback"}] + })), + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: Some("urn:uuid:rollback-created".to_string()), + }, + BundleEntry { + method: BundleMethod::Post, + url: "Patient".to_string(), + resource: Some(json!({ + "resourceType": "Patient", + "id": "already-exists", + "name": [{"family": "Duplicate"}] + })), + if_match: None, + if_none_match: None, + if_none_exist: None, + full_url: Some("urn:uuid:rollback-fail".to_string()), + }, + ]; + + match backend.process_transaction(&tenant, entries).await { + Err(TransactionError::UnsupportedIsolationLevel { .. }) => { + eprintln!( + "Skipping mongodb_integration_transaction_bundle_rolls_back_on_failure (MongoDB topology does not support transactions)" + ); + return; + } + Err(TransactionError::BundleError { .. }) => {} + Err(other) => panic!("Unexpected transaction error: {}", other), + Ok(_) => panic!("Expected rollback scenario to fail transaction"), + } + + let rolled_back = backend + .read(&tenant, "Patient", "should-rollback") + .await + .unwrap(); + assert!(rolled_back.is_none()); +} + #[tokio::test] async fn mongodb_integration_tenant_isolation() { let Some(backend) = create_backend("tenant").await else { diff --git a/final_roadmap.xml b/final_roadmap.xml new file mode 100644 index 00000000..a9ebb30f --- /dev/null +++ b/final_roadmap.xml @@ -0,0 +1,228 @@ + + + + HeliosSoftware/hfs + in-progress + TBD + + 2026-03-10 + MongoDB transaction-bundle parity and full Inferno matrix enablement + Completed: MongoDB transaction-bundle support, topology gating, capability updates, + and transaction integration tests. In progress: Inferno workflow promotion to full MongoDB + matrix coverage on replica-set topology and end-to-end validation. + + + + Implement MongoDB transaction-bundle parity and promote MongoDB into the full Inferno matrix by + adding real bundle transaction support, validating it against a transaction-capable Mongo + deployment, and then updating the Inferno workflow. + + + + MongoBackend process_transaction + now executes real atomic transaction bundles with session boundaries, topology checks, + conditional handling, and urn:uuid reference resolution. + MongoDB integration tests now include + positive transaction-bundle coverage (topology gating, reference resolution, conditionals, and + rollback) and capability assertions expect transaction support. + Inferno workflow now starts MongoDB in replica-set + mode within the full backend matrix, and the separate Mongo smoke job has been removed. + Inferno data loading posts FHIR transaction + bundles to the server root. + Inferno fixtures rely heavily on transaction + bundles, PUT, POST, fullUrl, and urn:uuid cross-entry references. + + + + Full FHIR transaction semantics require a transaction-capable MongoDB deployment; + standalone MongoDB is not sufficient. + Do not claim full Mongo Inferno parity until transaction bundle loading is validated + end-to-end. + Keep behavior aligned with existing SQLite and PostgreSQL bundle handling where + practical. + Preserve truthful capability reporting and CI claims throughout the rollout. + + + + + + + + + + + + + + + + + + + + + + + + Implement real transaction-bundle execution for MongoBackend. + + Implement MongoBackend process_transaction. + Process entries in FHIR transaction order compatible with the REST handler + assumptions. + Support POST, PUT, DELETE, and GET bundle entry types. + Add fullUrl to assigned-reference mapping and resolve urn:uuid references + before executing dependent entries. + Rollback the whole bundle on entry failure and commit only on full success. + Mirror existing SQLite and PostgreSQL bundle response semantics as closely + as practical. + + + MongoBackend transaction bundles execute atomically on supported Mongo + topologies. + Inferno fixture bundle patterns are supported at the persistence boundary. + + + + + Make Mongo transaction-bundle semantics honest and topology-aware. + + Use a real Mongo session and transaction boundary for bundle execution. + Require a replica set or otherwise transaction-capable deployment for true + transaction bundle support. + If the topology is not transaction-capable, fail transaction-bundle + execution with a clear error rather than silently degrading atomicity. + + + Mongo bundle transaction behavior matches actual deployment guarantees. + + + + + Bring Mongo bundle behavior close enough to existing backends for reliable parity. + + Support bundle ifMatch behavior on PUT entries. + Support bundle ifNoneExist behavior on POST entries. + Preserve idempotent delete behavior expectations where existing bundle + tests rely on it. + Keep PATCH behavior explicitly not-supported unless implemented + consistently across backends. + + + Mongo bundle entry semantics are consistent with the current backend contract + used by HFS and Inferno fixture loading. + + + + + Update tests and capability declarations to match delivered Mongo behavior. + + Replace the current unsupported Mongo transaction-bundle assertions with + positive coverage. + Add Mongo tests for create-only bundles, mixed-operation bundles, and + urn:uuid reference resolution. + Add rollback and failure tests for Mongo bundle transactions. + Update Mongo capability expectations if transactions are now supported. + + + Mongo capability tests and backend declarations truthfully represent delivered + transaction support. + + + + + Promote Mongo from smoke coverage to full Inferno matrix coverage once parity is + validated. + + Change Mongo startup in CI from standalone to single-node replica-set mode. + Add replica-set initialization and readiness checks. + Verify HFS connects successfully using the transaction-capable MongoDB URI. + Add mongodb to the full Inferno backend matrix. + Keep or remove the separate Mongo smoke job based on whether it still adds + value after full matrix promotion. + + + MongoDB participates truthfully in the full Inferno matrix under a + transaction-capable topology. + + + + + + + cargo test -p helios-persistence --features mongodb --test mongodb_tests + Focused Mongo bundle transaction coverage under + crates/persistence/tests/transactions. + Rollback and reference-resolution scenarios for Mongo bundle execution. + + + cargo check -p helios-hfs --features R4,mongodb + cargo check -p helios-hfs --features R4,sqlite,elasticsearch,postgres,mongodb + Run Inferno fixture loading against replica-set MongoDB before promoting Mongo + into the full matrix. + + + + + cargo test -p helios-persistence --features mongodb --test mongodb_tests + cargo check -p helios-hfs --features R4,mongodb + cargo check -p helios-hfs --features R4,sqlite,elasticsearch,postgres,mongodb + + + + + Standalone MongoDB cannot provide true transaction semantics; the CI topology + must change. + Use a single-node replica set in CI and fail transaction bundles clearly on + unsupported topologies. + + + Conditional bundle semantics may need session-aware matching logic if current + Mongo conditional helpers are not transaction-aware. + Refactor matching helpers or bundle-path logic so transaction-bound reads observe + the correct state. + + + Search index updates and SearchParameter registry changes must remain + rollback-safe inside the bundle transaction path. + Keep search-index and registry mutations inside the same transaction-aware + execution flow or defer non-transaction-safe side effects until commit success. + + + Workflow changes could over-claim parity before end-to-end Inferno fixture + loading is validated. + Promote Mongo to the full matrix only after backend validation and fixture-loading + success. + + + + + + WS1.1-WS1.6, WS2.1-WS2.3 + MongoBackend transaction bundles execute atomically on transaction-capable MongoDB + deployments. + + + WS3.1-WS3.4, WS4.1-WS4.4 + Mongo transaction support is covered by tests and reflected truthfully in + capability expectations. + + + WS5.1-WS5.5 + MongoDB runs in a replica-set topology in CI and is part of the full Inferno + matrix. + + + + + Mongo transaction bundles no longer return unsupported errors. + Inferno fixture transaction bundles load successfully against MongoDB. + MongoDB CI uses a transaction-capable topology. + MongoDB is added truthfully to the full Inferno matrix. + Mongo capability, tests, and workflow reflect the same delivered behavior. + + \ No newline at end of file From 7801bcfa802e97417bb1cde37bc41ecb0c585a1d Mon Sep 17 00:00:00 2001 From: dougc95 Date: Tue, 10 Mar 2026 19:49:12 -0400 Subject: [PATCH 15/17] fmt: apply linting --- .../persistence/src/backends/mongodb/mod.rs | 2 +- .../src/backends/mongodb/search_impl.rs | 153 ++++++------- .../src/backends/mongodb/storage.rs | 205 +++++++++++++----- crates/persistence/src/composite/storage.rs | 45 ++-- crates/persistence/tests/mongodb_tests.rs | 69 ++++-- 5 files changed, 314 insertions(+), 160 deletions(-) diff --git a/crates/persistence/src/backends/mongodb/mod.rs b/crates/persistence/src/backends/mongodb/mod.rs index 96596492..1267f465 100644 --- a/crates/persistence/src/backends/mongodb/mod.rs +++ b/crates/persistence/src/backends/mongodb/mod.rs @@ -16,8 +16,8 @@ //! Advanced search/composite behavior remains part of later phases. mod backend; -mod search_impl; pub(crate) mod schema; +mod search_impl; mod storage; pub use backend::{MongoBackend, MongoBackendConfig}; diff --git a/crates/persistence/src/backends/mongodb/search_impl.rs b/crates/persistence/src/backends/mongodb/search_impl.rs index 3400d182..02707f51 100644 --- a/crates/persistence/src/backends/mongodb/search_impl.rs +++ b/crates/persistence/src/backends/mongodb/search_impl.rs @@ -142,7 +142,10 @@ impl SearchProvider for MongoBackend { let sort = self.build_sort_document(query, previous_mode)?; let page_size = query.count.unwrap_or(100).max(1) as usize; - let mut find_action = resources.find(filter).sort(sort).limit((page_size + 1) as i64); + let mut find_action = resources + .find(filter) + .sort(sort) + .limit((page_size + 1) as i64); if cursor.is_none() { if let Some(offset) = query.offset { @@ -174,29 +177,25 @@ impl SearchProvider for MongoBackend { let has_previous = cursor.is_some() || query.offset.unwrap_or(0) > 0; let next_cursor = if has_next { - resources - .last() - .map(|resource| { - PageCursor::new( - vec![CursorValue::String(resource.last_modified().to_rfc3339())], - resource.id(), - ) - .encode() - }) + resources.last().map(|resource| { + PageCursor::new( + vec![CursorValue::String(resource.last_modified().to_rfc3339())], + resource.id(), + ) + .encode() + }) } else { None }; let previous_cursor = if has_previous { - resources - .first() - .map(|resource| { - PageCursor::previous( - vec![CursorValue::String(resource.last_modified().to_rfc3339())], - resource.id(), - ) - .encode() - }) + resources.first().map(|resource| { + PageCursor::previous( + vec![CursorValue::String(resource.last_modified().to_rfc3339())], + resource.id(), + ) + .encode() + }) } else { None }; @@ -222,7 +221,11 @@ impl SearchProvider for MongoBackend { }) } - async fn search_count(&self, tenant: &TenantContext, query: &SearchQuery) -> StorageResult { + async fn search_count( + &self, + tenant: &TenantContext, + query: &SearchQuery, + ) -> StorageResult { self.validate_query_support(query)?; let db = self.get_database().await?; @@ -348,9 +351,11 @@ impl ConditionalStorage for MongoBackend { impl MongoBackend { fn validate_query_support(&self, query: &SearchQuery) -> StorageResult<()> { if query.parameters.iter().any(|param| !param.chain.is_empty()) { - return Err(StorageError::Search(SearchError::ChainedSearchNotSupported { - chain: "forward chain".to_string(), - })); + return Err(StorageError::Search( + SearchError::ChainedSearchNotSupported { + chain: "forward chain".to_string(), + }, + )); } if !query.reverse_chains.is_empty() { @@ -463,7 +468,10 @@ impl MongoBackend { return Ok(filter); } - let combine_with_and = matches!(param.param_type, SearchParamType::Date | SearchParamType::Number); + let combine_with_and = matches!( + param.param_type, + SearchParamType::Date | SearchParamType::Number + ); let operator = if combine_with_and { "$and" } else { "$or" }; filter.insert( operator, @@ -500,15 +508,21 @@ impl MongoBackend { SearchParamType::Number => self.build_number_filter(value), SearchParamType::Reference => self.build_reference_filter(param, value), SearchParamType::Uri => self.build_uri_filter(param, value), - SearchParamType::Quantity => Err(StorageError::Search(SearchError::UnsupportedParameterType { - param_type: "quantity".to_string(), - })), - SearchParamType::Composite => Err(StorageError::Search(SearchError::InvalidComposite { - message: "Composite search is not supported in MongoDB Phase 4".to_string(), - })), - SearchParamType::Special => Err(StorageError::Search(SearchError::UnsupportedParameterType { - param_type: format!("special parameter {}", param.name), - })), + SearchParamType::Quantity => Err(StorageError::Search( + SearchError::UnsupportedParameterType { + param_type: "quantity".to_string(), + }, + )), + SearchParamType::Composite => { + Err(StorageError::Search(SearchError::InvalidComposite { + message: "Composite search is not supported in MongoDB Phase 4".to_string(), + })) + } + SearchParamType::Special => Err(StorageError::Search( + SearchError::UnsupportedParameterType { + param_type: format!("special parameter {}", param.name), + }, + )), } } @@ -521,8 +535,7 @@ impl MongoBackend { return Err(StorageError::Search(SearchError::QueryParseError { message: format!( "Unsupported prefix '{}' for string parameter '{}'", - value.prefix, - param.name + value.prefix, param.name ), })); } @@ -556,8 +569,7 @@ impl MongoBackend { return Err(StorageError::Search(SearchError::QueryParseError { message: format!( "Unsupported prefix '{}' for token parameter '{}'", - value.prefix, - param.name + value.prefix, param.name ), })); } @@ -597,8 +609,7 @@ impl MongoBackend { return Err(StorageError::Search(SearchError::QueryParseError { message: format!( "Unsupported prefix '{}' for reference parameter '{}'", - value.prefix, - param.name + value.prefix, param.name ), })); } @@ -635,8 +646,7 @@ impl MongoBackend { return Err(StorageError::Search(SearchError::QueryParseError { message: format!( "Unsupported prefix '{}' for uri parameter '{}'", - value.prefix, - param.name + value.prefix, param.name ), })); } @@ -687,12 +697,11 @@ impl MongoBackend { } fn build_number_filter(&self, value: &SearchValue) -> StorageResult { - let parsed = value - .value - .parse::() - .map_err(|e| StorageError::Search(SearchError::QueryParseError { + let parsed = value.value.parse::().map_err(|e| { + StorageError::Search(SearchError::QueryParseError { message: format!("Invalid number value '{}': {}", value.value, e), - }))?; + }) + })?; match value.prefix { SearchPrefix::Ap => { @@ -742,11 +751,7 @@ impl MongoBackend { }]; if let Some(ids) = matched_ids { - let id_values = ids - .iter() - .cloned() - .map(Bson::String) - .collect::>(); + let id_values = ids.iter().cloned().map(Bson::String).collect::>(); conditions.push(doc! { "id": { "$in": Bson::Array(id_values) } }); @@ -783,10 +788,7 @@ impl MongoBackend { for value in ¶m.values { if value.prefix != SearchPrefix::Eq { return Err(StorageError::Search(SearchError::QueryParseError { - message: format!( - "Unsupported prefix '{}' for _id parameter", - value.prefix - ), + message: format!("Unsupported prefix '{}' for _id parameter", value.prefix), })); } ids.push(value.value.clone()); @@ -814,15 +816,13 @@ impl MongoBackend { fn build_cursor_condition(&self, cursor: &PageCursor) -> StorageResult { let timestamp = match cursor.sort_values().first() { - Some(CursorValue::String(value)) => { - DateTime::parse_from_rfc3339(value) - .map_err(|_| { - StorageError::Search(SearchError::InvalidCursor { - cursor: cursor.encode(), - }) - })? - .with_timezone(&Utc) - } + Some(CursorValue::String(value)) => DateTime::parse_from_rfc3339(value) + .map_err(|_| { + StorageError::Search(SearchError::InvalidCursor { + cursor: cursor.encode(), + }) + })? + .with_timezone(&Utc), _ => { return Err(StorageError::Search(SearchError::InvalidCursor { cursor: cursor.encode(), @@ -850,7 +850,11 @@ impl MongoBackend { } } - fn build_sort_document(&self, query: &SearchQuery, previous_mode: bool) -> StorageResult { + fn build_sort_document( + &self, + query: &SearchQuery, + previous_mode: bool, + ) -> StorageResult { if query.sort.is_empty() { return Ok(if previous_mode { doc! { "last_updated": 1_i32, "id": 1_i32 } @@ -866,9 +870,11 @@ impl MongoBackend { "_lastUpdated" => "last_updated", "_id" | "id" => "id", other => { - return Err(StorageError::Search(SearchError::UnsupportedParameterType { - param_type: format!("sort parameter '{}'", other), - })); + return Err(StorageError::Search( + SearchError::UnsupportedParameterType { + param_type: format!("sort parameter '{}'", other), + }, + )); } }; @@ -914,12 +920,13 @@ impl MongoBackend { .map_err(|e| internal_error(format!("Missing version_id in search result: {}", e)))? .to_string(); - let payload = doc - .get_document("data") - .map_err(|e| internal_error(format!("Missing resource payload in search result: {}", e)))?; + let payload = doc.get_document("data").map_err(|e| { + internal_error(format!("Missing resource payload in search result: {}", e)) + })?; - let content = bson::from_bson::(Bson::Document(payload.clone())) - .map_err(|e| serialization_error(format!("Failed to deserialize resource payload: {}", e)))?; + let content = bson::from_bson::(Bson::Document(payload.clone())).map_err(|e| { + serialization_error(format!("Failed to deserialize resource payload: {}", e)) + })?; let now = Utc::now(); let created_at = doc diff --git a/crates/persistence/src/backends/mongodb/storage.rs b/crates/persistence/src/backends/mongodb/storage.rs index 52ee47e2..e0d4ad3b 100644 --- a/crates/persistence/src/backends/mongodb/storage.rs +++ b/crates/persistence/src/backends/mongodb/storage.rs @@ -394,9 +394,12 @@ fn document_to_stored_resource( .map_err(|e| internal_error(format!("Missing version_id in MongoDB document: {}", e)))? .to_string(); - let payload = doc - .get_document("data") - .map_err(|e| internal_error(format!("Missing resource payload in MongoDB document: {}", e)))?; + let payload = doc.get_document("data").map_err(|e| { + internal_error(format!( + "Missing resource payload in MongoDB document: {}", + e + )) + })?; let content = document_to_value(payload)?; let now = Utc::now(); @@ -421,20 +424,19 @@ fn document_to_stored_resource( async fn begin_required_bundle_transaction_session( db: &mongodb::Database, ) -> Result { - let mut session = db - .client() - .start_session() - .await - .map_err(|e| TransactionError::RolledBack { - reason: format!("Failed to start MongoDB session: {}", e), - })?; + let mut session = + db.client() + .start_session() + .await + .map_err(|e| TransactionError::RolledBack { + reason: format!("Failed to start MongoDB session: {}", e), + })?; - let hello = db - .run_command(doc! { "hello": 1_i32 }) - .await - .map_err(|e| TransactionError::RolledBack { + let hello = db.run_command(doc! { "hello": 1_i32 }).await.map_err(|e| { + TransactionError::RolledBack { reason: format!("Failed to inspect MongoDB topology: {}", e), - })?; + } + })?; let supports_transactions = hello.contains_key("setName") || hello @@ -1180,7 +1182,12 @@ impl MongoBackend { resource_id, e ); - self.index_minimal_fallback_documents(tenant_id, resource_type, resource_id, resource) + self.index_minimal_fallback_documents( + tenant_id, + resource_type, + resource_id, + resource, + ) } }; @@ -1195,12 +1202,13 @@ impl MongoBackend { .insert_many(index_docs) .session(active_session) .await - .map_err(|e| internal_error(format!("Failed to insert search index entries: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to insert search index entries: {}", e)) + })?; } else { - collection - .insert_many(index_docs) - .await - .map_err(|e| internal_error(format!("Failed to insert search index entries: {}", e)))?; + collection.insert_many(index_docs).await.map_err(|e| { + internal_error(format!("Failed to insert search index entries: {}", e)) + })?; } Ok(()) @@ -1230,12 +1238,13 @@ impl MongoBackend { .delete_many(filter) .session(active_session) .await - .map_err(|e| internal_error(format!("Failed to delete search index entries: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to delete search index entries: {}", e)) + })?; } else { - collection - .delete_many(filter) - .await - .map_err(|e| internal_error(format!("Failed to delete search index entries: {}", e)))?; + collection.delete_many(filter).await.map_err(|e| { + internal_error(format!("Failed to delete search index entries: {}", e)) + })?; } Ok(()) @@ -2235,7 +2244,12 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to check resource existence in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!( + "Failed to check resource existence in transaction: {}", + e + )) + })?; if existing.is_some() { return Err(StorageError::Resource(ResourceError::AlreadyExists { @@ -2295,15 +2309,23 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to insert history in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to insert history in transaction: {}", e)) + })?; - self.index_resource_in_bundle_transaction(db, session, tenant_id, resource_type, &id, &resource) - .await?; + self.index_resource_in_bundle_transaction( + db, + session, + tenant_id, + resource_type, + &id, + &resource, + ) + .await?; if resource_type == "SearchParameter" { - pending_search_parameter_changes.push(PendingSearchParameterChange::Create( - resource.clone(), - )); + pending_search_parameter_changes + .push(PendingSearchParameterChange::Create(resource.clone())); } Ok(StoredResource::from_storage( @@ -2343,7 +2365,12 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to load current resource in transaction: {}", e)))? + .map_err(|e| { + internal_error(format!( + "Failed to load current resource in transaction: {}", + e + )) + })? .ok_or_else(|| { StorageError::Resource(ResourceError::NotFound { resource_type: resource_type.to_string(), @@ -2399,7 +2426,9 @@ impl MongoBackend { ) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to update resource in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to update resource in transaction: {}", e)) + })?; if update_result.matched_count == 0 { return Err(StorageError::Concurrency( @@ -2429,10 +2458,19 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to insert history in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to insert history in transaction: {}", e)) + })?; - self.index_resource_in_bundle_transaction(db, session, tenant_id, resource_type, id, &resource) - .await?; + self.index_resource_in_bundle_transaction( + db, + session, + tenant_id, + resource_type, + id, + &resource, + ) + .await?; if resource_type == "SearchParameter" { pending_search_parameter_changes.push(PendingSearchParameterChange::Update { @@ -2476,7 +2514,12 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to load resource for delete in transaction: {}", e)))? + .map_err(|e| { + internal_error(format!( + "Failed to load resource for delete in transaction: {}", + e + )) + })? .ok_or_else(|| { StorageError::Resource(ResourceError::NotFound { resource_type: resource_type.to_string(), @@ -2486,13 +2529,23 @@ impl MongoBackend { let current_version = existing_doc .get_str("version_id") - .map_err(|e| internal_error(format!("Missing current version in transaction delete: {}", e)))? + .map_err(|e| { + internal_error(format!( + "Missing current version in transaction delete: {}", + e + )) + })? .to_string(); let new_version = next_version(¤t_version)?; let payload = existing_doc .get_document("data") - .map_err(|e| internal_error(format!("Missing resource payload in transaction delete: {}", e)))? + .map_err(|e| { + internal_error(format!( + "Missing resource payload in transaction delete: {}", + e + )) + })? .clone(); let resource_value = document_to_value(&payload)?; let fhir_version = existing_doc @@ -2524,7 +2577,12 @@ impl MongoBackend { ) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to soft-delete resource in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!( + "Failed to soft-delete resource in transaction: {}", + e + )) + })?; if update_result.matched_count == 0 { return Err(StorageError::Resource(ResourceError::NotFound { @@ -2548,13 +2606,19 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to insert delete history in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!( + "Failed to insert delete history in transaction: {}", + e + )) + })?; self.delete_search_index_in_bundle_transaction(db, session, tenant_id, resource_type, id) .await?; if resource_type == "SearchParameter" { - pending_search_parameter_changes.push(PendingSearchParameterChange::Delete(resource_value)); + pending_search_parameter_changes + .push(PendingSearchParameterChange::Delete(resource_value)); } Ok(()) @@ -2624,7 +2688,9 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to read resource in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!("Failed to read resource in transaction: {}", e)) + })?; maybe_doc .as_ref() @@ -2656,15 +2722,23 @@ impl MongoBackend { }) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to query conditional matches in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!( + "Failed to query conditional matches in transaction: {}", + e + )) + })?; let docs = collect_session_documents(cursor, session).await?; let mut matches = Vec::new(); for doc in docs { - let payload = doc - .get_document("data") - .map_err(|e| internal_error(format!("Missing payload while matching conditionals: {}", e)))?; + let payload = doc.get_document("data").map_err(|e| { + internal_error(format!( + "Missing payload while matching conditionals: {}", + e + )) + })?; let resource = document_to_value(payload)?; if resource_matches_bundle_search_params(&resource, &parsed_params) @@ -2693,8 +2767,14 @@ impl MongoBackend { return Ok(()); } - self.delete_search_index_in_bundle_transaction(db, session, tenant_id, resource_type, resource_id) - .await?; + self.delete_search_index_in_bundle_transaction( + db, + session, + tenant_id, + resource_type, + resource_id, + ) + .await?; let index_docs = match self.search_extractor().extract(resource, resource_type) { Ok(values) => values @@ -2710,7 +2790,12 @@ impl MongoBackend { resource_id, e ); - self.index_minimal_fallback_documents(tenant_id, resource_type, resource_id, resource) + self.index_minimal_fallback_documents( + tenant_id, + resource_type, + resource_id, + resource, + ) } }; @@ -2722,7 +2807,12 @@ impl MongoBackend { .insert_many(index_docs) .session(&mut *session) .await - .map_err(|e| internal_error(format!("Failed to insert search_index entries in transaction: {}", e)))?; + .map_err(|e| { + internal_error(format!( + "Failed to insert search_index entries in transaction: {}", + e + )) + })?; Ok(()) } @@ -2765,7 +2855,10 @@ impl MongoBackend { .unwrap_or(url); let path = path.trim_start_matches('/'); - let parts: Vec<&str> = path.split('/').filter(|segment| !segment.is_empty()).collect(); + let parts: Vec<&str> = path + .split('/') + .filter(|segment| !segment.is_empty()) + .collect(); if parts.len() >= 2 { let len = parts.len(); @@ -2804,7 +2897,9 @@ fn resource_identifier_matches(resource: &Value, expected: &str) -> bool { }; match identifier_value { - Value::Array(items) => items.iter().any(|item| match_identifier_item(item, system, value, has_separator)), + Value::Array(items) => items + .iter() + .any(|item| match_identifier_item(item, system, value, has_separator)), Value::Object(_) => match_identifier_item(identifier_value, system, value, has_separator), _ => false, } diff --git a/crates/persistence/src/composite/storage.rs b/crates/persistence/src/composite/storage.rs index 7cb7e471..97bb715f 100644 --- a/crates/persistence/src/composite/storage.rs +++ b/crates/persistence/src/composite/storage.rs @@ -185,8 +185,9 @@ impl CompositeStorage { "_lastUpdated" => SearchParamType::Date, "_tag" | "_profile" | "_security" | "identifier" => SearchParamType::Token, "patient" | "subject" | "encounter" | "performer" | "author" | "requester" - | "recorder" | "asserter" | "practitioner" | "organization" | "location" - | "device" => SearchParamType::Reference, + | "recorder" | "asserter" | "practitioner" | "organization" | "location" | "device" => { + SearchParamType::Reference + } _ => SearchParamType::String, } } @@ -1091,7 +1092,9 @@ impl ConditionalStorage for CompositeStorage { 0 => Ok(ConditionalDeleteResult::NoMatch), 1 => { let current = matches.into_iter().next().expect("single match must exist"); - self.primary.delete(tenant, resource_type, current.id()).await?; + self.primary + .delete(tenant, resource_type, current.id()) + .await?; if let Err(e) = self .sync_to_secondaries(SyncEvent::Delete { @@ -1773,13 +1776,15 @@ impl CapabilityProvider for CompositeStorage { #[cfg(test)] mod tests { use super::*; - use async_trait::async_trait; - use helios_fhir::FhirVersion; - use serde_json::{Value, json}; use crate::core::BackendKind; use crate::error::{BackendError, StorageError, StorageResult}; use crate::tenant::{TenantContext, TenantId, TenantPermissions}; - use crate::types::{SearchParamType, SearchParameter, SearchQuery, SearchValue, StoredResource}; + use crate::types::{ + SearchParamType, SearchParameter, SearchQuery, SearchValue, StoredResource, + }; + use async_trait::async_trait; + use helios_fhir::FhirVersion; + use serde_json::{Value, json}; #[derive(Debug)] struct FailingSearchBackend { @@ -1927,7 +1932,10 @@ mod tests { let search = Arc::new(SqliteBackend::in_memory().expect("create search sqlite backend")); search.init_schema().expect("init search sqlite schema"); - let tenant = TenantContext::new(TenantId::new("composite-test"), TenantPermissions::full_access()); + let tenant = TenantContext::new( + TenantId::new("composite-test"), + TenantPermissions::full_access(), + ); // Seed distinct data so we can tell which provider answered the query. primary @@ -2024,8 +2032,10 @@ mod tests { let search = Arc::new(SqliteBackend::in_memory().expect("create search sqlite backend")); search.init_schema().expect("init search sqlite schema"); - let tenant_a = TenantContext::new(TenantId::new("tenant-a"), TenantPermissions::full_access()); - let tenant_b = TenantContext::new(TenantId::new("tenant-b"), TenantPermissions::full_access()); + let tenant_a = + TenantContext::new(TenantId::new("tenant-a"), TenantPermissions::full_access()); + let tenant_b = + TenantContext::new(TenantId::new("tenant-b"), TenantPermissions::full_access()); search .create( @@ -2136,7 +2146,10 @@ mod tests { .expect("create composite storage") .with_search_providers(search_providers); - let tenant = TenantContext::new(TenantId::new("tenant-failure"), TenantPermissions::full_access()); + let tenant = TenantContext::new( + TenantId::new("tenant-failure"), + TenantPermissions::full_access(), + ); let query = SearchQuery::new("Patient").with_parameter(SearchParameter { name: "_id".to_string(), param_type: SearchParamType::Token, @@ -2165,8 +2178,14 @@ mod tests { let health = composite .backend_health("search") .expect("search backend health should exist"); - assert!(!health.healthy, "search backend should be marked unhealthy after failure"); + assert!( + !health.healthy, + "search backend should be marked unhealthy after failure" + ); assert_eq!(health.failure_count, 1); - assert_eq!(health.last_error.as_deref(), Some("connection failed to search: simulated search outage")); + assert_eq!( + health.last_error.as_deref(), + Some("connection failed to search: simulated search outage") + ); } } diff --git a/crates/persistence/tests/mongodb_tests.rs b/crates/persistence/tests/mongodb_tests.rs index 61a68971..1a112cd9 100644 --- a/crates/persistence/tests/mongodb_tests.rs +++ b/crates/persistence/tests/mongodb_tests.rs @@ -12,19 +12,20 @@ use helios_fhir::FhirVersion; use helios_persistence::backends::mongodb::{MongoBackend, MongoBackendConfig}; use helios_persistence::core::{ Backend, BackendCapability, BackendKind, BundleEntry, BundleMethod, BundleProvider, - BundleResult, ConditionalCreateResult, - ConditionalDeleteResult, ConditionalStorage, ConditionalUpdateResult, HistoryParams, - InstanceHistoryProvider, PatchFormat, ResourceStorage, SearchProvider, SystemHistoryProvider, - TypeHistoryProvider, VersionedStorage, + BundleResult, ConditionalCreateResult, ConditionalDeleteResult, ConditionalStorage, + ConditionalUpdateResult, HistoryParams, InstanceHistoryProvider, PatchFormat, ResourceStorage, + SearchProvider, SystemHistoryProvider, TypeHistoryProvider, VersionedStorage, }; use helios_persistence::error::{ BackendError, ConcurrencyError, ResourceError, StorageError, TransactionError, }; use helios_persistence::search::SearchParameterStatus; use helios_persistence::tenant::{TenantContext, TenantId, TenantPermissions}; -use helios_persistence::types::{SearchParamType, SearchParameter, SearchQuery, SearchValue, SortDirective}; -use mongodb::bson::{Document, doc}; +use helios_persistence::types::{ + SearchParamType, SearchParameter, SearchQuery, SearchValue, SortDirective, +}; use mongodb::Client; +use mongodb::bson::{Document, doc}; use serde_json::json; const MONGODB_MAX_DATABASE_NAME_LEN: usize = 63; @@ -41,7 +42,11 @@ fn build_test_database_name(test_name: &str) -> String { fn extract_resource_id_from_location(location: &str) -> String { let resource_path = location.split("/_history").next().unwrap_or(location); - resource_path.rsplit('/').next().unwrap_or(resource_path).to_string() + resource_path + .rsplit('/') + .next() + .unwrap_or(resource_path) + .to_string() } #[test] @@ -144,7 +149,9 @@ async fn test_mongodb_bundle_provider_batch_not_supported() { let result = backend.process_batch(&tenant, vec![]).await; assert!(matches!( result, - Err(StorageError::Backend(BackendError::UnsupportedCapability { .. })) + Err(StorageError::Backend( + BackendError::UnsupportedCapability { .. } + )) )); } @@ -622,7 +629,10 @@ async fn mongodb_integration_transaction_bundle_conditional_headers() { .await .unwrap() .unwrap(); - assert_eq!(read_after_bad_if_match.content()["name"][0]["family"], "AfterIfMatch"); + assert_eq!( + read_after_bad_if_match.content()["name"][0]["family"], + "AfterIfMatch" + ); } #[tokio::test] @@ -1256,7 +1266,9 @@ async fn mongodb_integration_search_cursor_pagination_roundtrip() { #[tokio::test] async fn mongodb_integration_conditional_create_exists() { let Some(backend) = create_backend("conditional_create").await else { - eprintln!("Skipping mongodb_integration_conditional_create_exists (set HFS_TEST_MONGODB_URL)"); + eprintln!( + "Skipping mongodb_integration_conditional_create_exists (set HFS_TEST_MONGODB_URL)" + ); return; }; @@ -1434,7 +1446,9 @@ async fn mongodb_integration_conditional_create_multiple_matches() { #[tokio::test] async fn mongodb_integration_conditional_patch_not_supported() { let Some(backend) = create_backend("conditional_patch_not_supported").await else { - eprintln!("Skipping mongodb_integration_conditional_patch_not_supported (set HFS_TEST_MONGODB_URL)"); + eprintln!( + "Skipping mongodb_integration_conditional_patch_not_supported (set HFS_TEST_MONGODB_URL)" + ); return; }; @@ -1490,7 +1504,10 @@ async fn mongodb_integration_search_parameter_create_registers_active() { let registry = backend.search_registry().read(); let param = registry.get_param("Patient", "mongo-nickname"); - assert!(param.is_some(), "Active SearchParameter should be registered"); + assert!( + param.is_some(), + "Active SearchParameter should be registered" + ); let param = param.unwrap(); assert_eq!( @@ -1573,7 +1590,10 @@ async fn mongodb_integration_search_parameter_update_status_change() { { let registry = backend.search_registry().read(); let param = registry.get_param("Condition", "mongo-statuschange"); - assert!(param.is_some(), "Parameter should be registered after create"); + assert!( + param.is_some(), + "Parameter should be registered after create" + ); assert_eq!( param.unwrap().status, SearchParameterStatus::Active, @@ -1643,7 +1663,11 @@ async fn mongodb_integration_search_parameter_delete_unregisters() { { let registry = backend.search_registry().read(); - assert!(registry.get_param("Observation", "mongo-todelete").is_some()); + assert!( + registry + .get_param("Observation", "mongo-todelete") + .is_some() + ); } backend @@ -1653,14 +1677,18 @@ async fn mongodb_integration_search_parameter_delete_unregisters() { let registry = backend.search_registry().read(); assert!( - registry.get_param("Observation", "mongo-todelete").is_none(), + registry + .get_param("Observation", "mongo-todelete") + .is_none(), "Deleted SearchParameter should be unregistered" ); } #[tokio::test] async fn mongodb_integration_search_offloaded_prevents_search_index_writes() { - let Some(backend) = create_backend_with_search_offloaded("search_offloaded_no_index", true).await else { + let Some(backend) = + create_backend_with_search_offloaded("search_offloaded_no_index", true).await + else { eprintln!( "Skipping mongodb_integration_search_offloaded_prevents_search_index_writes (set HFS_TEST_MONGODB_URL)" ); @@ -1759,7 +1787,9 @@ async fn mongodb_integration_standalone_search_writes_search_index() { #[tokio::test] async fn mongodb_integration_search_parameter_registry_updates_when_offloaded() { - let Some(backend) = create_backend_with_search_offloaded("search_param_offloaded_registry", true).await else { + let Some(backend) = + create_backend_with_search_offloaded("search_param_offloaded_registry", true).await + else { eprintln!( "Skipping mongodb_integration_search_parameter_registry_updates_when_offloaded (set HFS_TEST_MONGODB_URL)" ); @@ -1790,7 +1820,10 @@ async fn mongodb_integration_search_parameter_registry_updates_when_offloaded() let registry = backend.search_registry().read(); let param = registry.get_param("Patient", "mongo-offloaded-code"); - assert!(param.is_some(), "Active SearchParameter should register when offloaded"); + assert!( + param.is_some(), + "Active SearchParameter should register when offloaded" + ); assert_eq!(param.unwrap().status, SearchParameterStatus::Active); drop(registry); From fce8ac42e9aa1e465bb14b0d77c26b278255ecea Mon Sep 17 00:00:00 2001 From: dougc95 Date: Tue, 10 Mar 2026 20:08:54 -0400 Subject: [PATCH 16/17] chore: remove obsolete MongoDB phase roadmap artifacts --- final_roadmap.xml | 228 -------------------------- phase2_roadmap.xml | 292 --------------------------------- phase3_roadmap.xml | 374 ------------------------------------------ phase4_roadmap.xml | 305 ---------------------------------- phase5_roadmap.xml | 401 --------------------------------------------- phase6_roadmap.xml | 200 ---------------------- roadmap_mongo.xml | 343 -------------------------------------- 7 files changed, 2143 deletions(-) delete mode 100644 final_roadmap.xml delete mode 100644 phase2_roadmap.xml delete mode 100644 phase3_roadmap.xml delete mode 100644 phase4_roadmap.xml delete mode 100644 phase5_roadmap.xml delete mode 100644 phase6_roadmap.xml delete mode 100644 roadmap_mongo.xml diff --git a/final_roadmap.xml b/final_roadmap.xml deleted file mode 100644 index a9ebb30f..00000000 --- a/final_roadmap.xml +++ /dev/null @@ -1,228 +0,0 @@ - - - - HeliosSoftware/hfs - in-progress - TBD - - 2026-03-10 - MongoDB transaction-bundle parity and full Inferno matrix enablement - Completed: MongoDB transaction-bundle support, topology gating, capability updates, - and transaction integration tests. In progress: Inferno workflow promotion to full MongoDB - matrix coverage on replica-set topology and end-to-end validation. - - - - Implement MongoDB transaction-bundle parity and promote MongoDB into the full Inferno matrix by - adding real bundle transaction support, validating it against a transaction-capable Mongo - deployment, and then updating the Inferno workflow. - - - - MongoBackend process_transaction - now executes real atomic transaction bundles with session boundaries, topology checks, - conditional handling, and urn:uuid reference resolution. - MongoDB integration tests now include - positive transaction-bundle coverage (topology gating, reference resolution, conditionals, and - rollback) and capability assertions expect transaction support. - Inferno workflow now starts MongoDB in replica-set - mode within the full backend matrix, and the separate Mongo smoke job has been removed. - Inferno data loading posts FHIR transaction - bundles to the server root. - Inferno fixtures rely heavily on transaction - bundles, PUT, POST, fullUrl, and urn:uuid cross-entry references. - - - - Full FHIR transaction semantics require a transaction-capable MongoDB deployment; - standalone MongoDB is not sufficient. - Do not claim full Mongo Inferno parity until transaction bundle loading is validated - end-to-end. - Keep behavior aligned with existing SQLite and PostgreSQL bundle handling where - practical. - Preserve truthful capability reporting and CI claims throughout the rollout. - - - - - - - - - - - - - - - - - - - - - - - - Implement real transaction-bundle execution for MongoBackend. - - Implement MongoBackend process_transaction. - Process entries in FHIR transaction order compatible with the REST handler - assumptions. - Support POST, PUT, DELETE, and GET bundle entry types. - Add fullUrl to assigned-reference mapping and resolve urn:uuid references - before executing dependent entries. - Rollback the whole bundle on entry failure and commit only on full success. - Mirror existing SQLite and PostgreSQL bundle response semantics as closely - as practical. - - - MongoBackend transaction bundles execute atomically on supported Mongo - topologies. - Inferno fixture bundle patterns are supported at the persistence boundary. - - - - - Make Mongo transaction-bundle semantics honest and topology-aware. - - Use a real Mongo session and transaction boundary for bundle execution. - Require a replica set or otherwise transaction-capable deployment for true - transaction bundle support. - If the topology is not transaction-capable, fail transaction-bundle - execution with a clear error rather than silently degrading atomicity. - - - Mongo bundle transaction behavior matches actual deployment guarantees. - - - - - Bring Mongo bundle behavior close enough to existing backends for reliable parity. - - Support bundle ifMatch behavior on PUT entries. - Support bundle ifNoneExist behavior on POST entries. - Preserve idempotent delete behavior expectations where existing bundle - tests rely on it. - Keep PATCH behavior explicitly not-supported unless implemented - consistently across backends. - - - Mongo bundle entry semantics are consistent with the current backend contract - used by HFS and Inferno fixture loading. - - - - - Update tests and capability declarations to match delivered Mongo behavior. - - Replace the current unsupported Mongo transaction-bundle assertions with - positive coverage. - Add Mongo tests for create-only bundles, mixed-operation bundles, and - urn:uuid reference resolution. - Add rollback and failure tests for Mongo bundle transactions. - Update Mongo capability expectations if transactions are now supported. - - - Mongo capability tests and backend declarations truthfully represent delivered - transaction support. - - - - - Promote Mongo from smoke coverage to full Inferno matrix coverage once parity is - validated. - - Change Mongo startup in CI from standalone to single-node replica-set mode. - Add replica-set initialization and readiness checks. - Verify HFS connects successfully using the transaction-capable MongoDB URI. - Add mongodb to the full Inferno backend matrix. - Keep or remove the separate Mongo smoke job based on whether it still adds - value after full matrix promotion. - - - MongoDB participates truthfully in the full Inferno matrix under a - transaction-capable topology. - - - - - - - cargo test -p helios-persistence --features mongodb --test mongodb_tests - Focused Mongo bundle transaction coverage under - crates/persistence/tests/transactions. - Rollback and reference-resolution scenarios for Mongo bundle execution. - - - cargo check -p helios-hfs --features R4,mongodb - cargo check -p helios-hfs --features R4,sqlite,elasticsearch,postgres,mongodb - Run Inferno fixture loading against replica-set MongoDB before promoting Mongo - into the full matrix. - - - - - cargo test -p helios-persistence --features mongodb --test mongodb_tests - cargo check -p helios-hfs --features R4,mongodb - cargo check -p helios-hfs --features R4,sqlite,elasticsearch,postgres,mongodb - - - - - Standalone MongoDB cannot provide true transaction semantics; the CI topology - must change. - Use a single-node replica set in CI and fail transaction bundles clearly on - unsupported topologies. - - - Conditional bundle semantics may need session-aware matching logic if current - Mongo conditional helpers are not transaction-aware. - Refactor matching helpers or bundle-path logic so transaction-bound reads observe - the correct state. - - - Search index updates and SearchParameter registry changes must remain - rollback-safe inside the bundle transaction path. - Keep search-index and registry mutations inside the same transaction-aware - execution flow or defer non-transaction-safe side effects until commit success. - - - Workflow changes could over-claim parity before end-to-end Inferno fixture - loading is validated. - Promote Mongo to the full matrix only after backend validation and fixture-loading - success. - - - - - - WS1.1-WS1.6, WS2.1-WS2.3 - MongoBackend transaction bundles execute atomically on transaction-capable MongoDB - deployments. - - - WS3.1-WS3.4, WS4.1-WS4.4 - Mongo transaction support is covered by tests and reflected truthfully in - capability expectations. - - - WS5.1-WS5.5 - MongoDB runs in a replica-set topology in CI and is part of the full Inferno - matrix. - - - - - Mongo transaction bundles no longer return unsupported errors. - Inferno fixture transaction bundles load successfully against MongoDB. - MongoDB CI uses a transaction-capable topology. - MongoDB is added truthfully to the full Inferno matrix. - Mongo capability, tests, and workflow reflect the same delivered behavior. - - \ No newline at end of file diff --git a/phase2_roadmap.xml b/phase2_roadmap.xml deleted file mode 100644 index 28889a64..00000000 --- a/phase2_roadmap.xml +++ /dev/null @@ -1,292 +0,0 @@ - - - - HeliosSoftware/hfs - completed - TBD - 2 - - - 2026-03-01 - Phase 2 completed: core ResourceStorage parity, tenant isolation, soft-delete - semantics, schema bootstrap, tests, and docs shipped. - - - - - - Feature-gated MongoDB backend module export is enabled. - Mongo backend scaffolding exists (backend/config/schema wiring). - Core Backend trait integration is compile-safe. - Schema/bootstrap and migration paths exist as placeholders. - - - At Phase 1 completion, MongoDB runtime mode selection in HFS storage config was - not enabled yet. - At Phase 1 completion, no ResourceStorage CRUD implementation existed yet. - At Phase 1 completion, no Mongo integration test suite existed yet. - - - - - - Delivered minimum MongoDB parity for ResourceStorage behavior while preserving strict - tenant isolation and soft-delete/Gone semantics. - - Implement Mongo ResourceStorage contract methods for - create/read/update/delete/exists/count/read_batch/create_or_update. - Enforce tenant isolation in every query path and index strategy. - Implement soft-delete semantics aligned with existing backend behavior and - error contracts. - Replace schema placeholders with collection/index bootstrap logic required - for Phase 2. - - - VersionedStorage parity (vread, update_with_match). - Instance/type/system history providers. - TransactionProvider parity and session-based ACID guarantees. - Advanced search execution, chained search, reverse chaining, and include/revinclude - behavior. - Composite MongoDB + Elasticsearch runtime routing. - - - - - - - - - - - - - - - - - - - - - - - - - - Define stable Mongo persistence layout that supports ResourceStorage semantics and - future phase expansion. - - Define canonical live-resource document shape with explicit fields for - tenant_id, resource_type, resource_id, version_id, last_updated, is_deleted, and resource - payload. - Define resource_history document shape and write strategy that does not - block Phase 3 history provider implementation. - Create required indexes for tenant-scoped lookups and uniqueness (tenant_id - + resource_type + resource_id for active records). - Decide and document whether soft delete retains unique-key occupancy or - allows recreation via version bump policy. - Document collection naming conventions and migration-safe index names. - - - Concrete collection/index bootstrap in schema helpers. - Documented mapping from FHIR resource identity to Mongo keys. - - - - - Implement Phase 2 core storage methods with parity-focused behavior and error mapping. - - Introduce Mongo connection/client acquisition path suitable for storage - operations (replacing Phase 1 unavailable acquire behavior for this phase scope). - Implement create semantics with conflict detection and deterministic ID - handling. - Implement read semantics including not-found vs gone distinction. - Implement update semantics for existing resources with deterministic - metadata updates. - Implement delete semantics using soft-delete/tombstone behavior aligned to - existing backends. - Implement exists/count/read_batch/create_or_update helper methods with - tenant scope guarantees. - Map Mongo driver errors into existing StorageError/BackendError variants - consistently. - - - Mongo ResourceStorage parity for Phase 2 method set. - Consistent error behavior for contract tests and API consumers. - - - - - Guarantee strict tenant isolation in read/write operations and query helpers. - - Introduce shared tenant filter builder utilities to avoid missed tenant - predicates. - Require tenant_id in every CRUD/read_batch/count query and write path. - Ensure indexes are tenant-first where query cardinality and safety require - it. - Add cross-tenant negative tests for read, update, delete, count, and batch - reads. - - - Cross-tenant leakage prevention validated by tests. - - - - - Match existing backend behavior for deleted resource visibility and error semantics. - - Define deleted-state fields (is_deleted/deleted_at/deleted_by_version as - needed) for deterministic behavior. - Ensure normal read paths return Gone-compatible outcomes for soft-deleted - resources. - Ensure update/create_or_update behavior on deleted resources follows - existing backend contract expectations. - Add regression tests for repeated delete and delete-after-update edge - cases. - - - Soft-delete behavior parity with SQLite/PostgreSQL expectations for Phase 2 - scope. - - - - - Turn Phase 1 schema placeholders into deterministic schema/index bootstrap and migration - entry points. - - Implement initialize_schema to create required collections/indexes - idempotently. - Implement migrate_schema skeleton with migration version tracking strategy - for Mongo indexes. - Add tests for initialize/migrate idempotency and startup safety. - Document migration assumptions and rollback limitations for Mongo index - evolution. - - - Deterministic schema bootstrap/migration behavior for Phase 2 and future - phases. - - - - - - - Resource document mapping tests (metadata field population and serialization - invariants). - Tenant filter builder tests proving tenant predicate inclusion in every query - constructor. - Soft-delete state transition tests (active -> deleted -> repeated delete - handling). - Schema initialization and migration idempotency tests. - Error conversion tests from Mongo driver errors to StorageError/BackendError. - - - - Create and read round-trip under a single tenant. - Update behavior with immutable identity and mutable payload checks. - Delete and post-delete read behavior (Gone/not found contract). - exists/count/read_batch/create_or_update behavior under realistic tenant-scoped - datasets. - Cross-tenant isolation: no access to another tenant records across all - supported operations. - Bootstrap and migration execution against fresh and pre-initialized Mongo - databases. - - - - Reuse existing persistence test harness and assertions where possible. - Compare Mongo outcomes against SQLite/PostgreSQL expected behavior for methods in - scope. - Document any unavoidable deviations before marking the phase complete. - - - - - cargo check -p helios-persistence --features mongodb - cargo check -p helios-rest --features mongodb - cargo check -p helios-hfs --features mongodb - cargo check -p helios-persistence --features - "sqlite,postgres,elasticsearch,mongodb" - cargo test -p helios-persistence --features mongodb --test mongodb_tests - cargo test -p helios-persistence --features mongodb mongodb:: - - - - - WS1.1-WS1.5, WS5.1 - Schema bootstrap creates required collections/indexes idempotently. - - - - WS2.1-WS2.5 - create/read/update/delete integration tests pass for single tenant. - - - - WS2.6-WS2.7, WS3.1-WS3.4, WS4.1-WS4.4 - exists/count/read_batch/create_or_update and cross-tenant tests pass. - - - - WS5.2-WS5.4 and status alignment updates - README and capability matrix reflect truthful post-Phase-2 status. - - - - - - - - - - - - - - - Update MongoDB rows in persistence README capability matrix to reflect only capabilities - completed in this phase. - Update primary/secondary role matrix status from Phase 1 scaffold wording to Phase 2 - wording after tests pass. - Keep all non-implemented capability rows as planned/partial exactly as supported. - - - - - Tenant leakage due to missing tenant filters in one or more query paths. - Centralize tenant filter construction; enforce with negative cross-tenant tests - for every operation. - - - Soft-delete behavior diverges from existing Gone semantics. - Mirror sqlite/postgres behavior via contract tests before marking parity complete. - - - Unique index design conflicts with soft-delete and recreation scenarios. - Explicitly define active/deleted uniqueness policy and test both conflict and - recreation paths. - - - Schema bootstrap or migration logic is not idempotent across repeated startup - runs. - Require repeated initialize/migrate test passes against both fresh and - pre-initialized databases. - - - - - All Phase 2 in-scope ResourceStorage methods are implemented and covered - by Mongo integration tests. - Tenant isolation behavior matches established sqlite/postgres contract - expectations for in-scope methods. - Soft-delete and Gone semantics are validated by regression tests. - Schema bootstrap and migration routines are idempotent and safe to - execute at startup. - Validation commands run green for mongodb-only and mixed-feature builds. - Documentation and capability matrix reflect actual support levels with - no aspirational mismatch. - - \ No newline at end of file diff --git a/phase3_roadmap.xml b/phase3_roadmap.xml deleted file mode 100644 index 111b6cd5..00000000 --- a/phase3_roadmap.xml +++ /dev/null @@ -1,374 +0,0 @@ - - - - HeliosSoftware/hfs - planned - TBD - 3 - - - - 2026-03-04 - Detailed Phase 3 plan drafted: versioning/history parity is in scope; conditional - operations are deferred to Phase 4; history delete Trial Use methods remain NotSupported in - this phase. - - - - - - Mongo ResourceStorage CRUD/count/read_batch/create_or_update parity delivered. - Tenant isolation and soft-delete/Gone semantics validated for Phase 2 scope. - Schema bootstrap and migration foundations are implemented and idempotent. - - - Version-aware reads and optimistic locking beyond base update behavior were not - completed in Phase 2. - History provider traits were not implemented in Phase 2. - Conditional operations were not implemented in Phase 2. - - - - - - Deliver MongoDB parity for VersionedStorage and history providers while documenting and - implementing session-based behavior where it improves consistency, without pulling - search-dependent conditional semantics into this phase. - - Implement VersionedStorage methods for vread, update_with_match, - delete_with_match, and list_versions with tenant-safe filters and deterministic error - mapping. - Implement InstanceHistoryProvider, TypeHistoryProvider, and - SystemHistoryProvider with pagination, time filters, and include_deleted handling. - Keep FHIR Trial Use history delete features - (delete_instance_history/delete_version) as NotSupported in this phase and document the - support level explicitly. - Implement Mongo session-based execution where beneficial for multi-document - write paths and document deviations from PostgreSQL transaction semantics. - Align tests, capability declarations, and documentation with actual - post-Phase-3 support levels. - - - Conditional create/update/delete implementation (deferred to Phase 4 due to search - dependency). - Conditional patch support. - Advanced search execution, chained/reverse chaining, include/revinclude behavior. - DifferentialHistoryProvider implementation. - Composite MongoDB plus Elasticsearch runtime routing changes. - - - - - - - - - - - - - - - - - - - - - - - - - - - - Provide tenant-safe versioned reads and If-Match behavior with parity-focused error - semantics. - - Implement vread against resource_history using tenant_id + resource_type + - resource_id + version_id filters. - Ensure vread returns historical versions regardless of current deleted - state when the requested version exists. - Implement update_with_match with strict expected-version comparison and - ConcurrencyError::VersionConflict mapping. - Implement delete_with_match with strict expected-version comparison and - consistent not-found/version-conflict behavior. - Implement list_versions using stable ordering (oldest to newest) under - tenant scope. - Normalize If-Match/ETag inputs before comparison and keep behavior - consistent with core helper semantics. - Guarantee history snapshots are persisted for each successful update/delete - path before returning success. - - - Mongo VersionedStorage implementation with behavior-parity tests. - Deterministic concurrency error behavior for stale version writes/deletes. - - - - - Deliver paginated history interactions at all three scopes with explicit Phase 3 support - boundaries. - - Implement history_instance with since/before/include_deleted/pagination - handling. - Implement history_instance_count. - Implement history_type with tenant + type scoped filters and - reverse-chronological ordering. - Implement history_type_count. - Implement history_system with tenant-scoped cross-type ordering. - Implement history_system_count. - Map history entries to HistoryMethod values consistently (POST/PUT/DELETE - and PATCH where applicable). - Keep delete_instance_history and delete_version as default NotSupported - behavior for this phase. - Document NotSupported status for Trial Use history-delete methods in - roadmap/docs/capability notes. - - - Instance/type/system history trait implementations validated by integration - tests. - Explicitly documented Phase 3 support boundary for history-delete Trial Use - operations. - - - - - Ensure predictable query performance and deterministic ordering for version/history - operations. - - Add/verify history indexes for tenant_id + resource_type + resource_id + - version_id lookup paths. - Add/verify indexes for tenant_id + resource_type + last_updated - history_type queries. - Add/verify indexes for tenant_id + last_updated history_system queries. - Document any version sorting assumptions (string versus numeric) and - enforce deterministic behavior. - Validate schema bootstrap/migration idempotency after index additions. - - - Index strategy supporting Phase 3 history and version query patterns. - Migration-safe schema behavior for repeated startup runs. - - - - - Implement Mongo sessions where they provide practical consistency value and document - behavior differences versus PostgreSQL. - - Identify multi-document operations that benefit from session/transaction - wrapping (for example: current resource write plus history append). - Implement session-backed execution paths for selected operations with safe - fallback behavior where full transaction support is not feasible. - Handle transient transaction/session errors with consistent StorageError - mapping and clear retry boundaries. - Document explicit deviations from PostgreSQL transactional guarantees and - isolation expectations. - Add focused tests for session-backed paths and rollback expectations where - behavior is claimed. - - - Session-backed operations implemented where beneficial and test-covered. - Clear transaction behavior documentation with no implicit parity claims. - - - - - Keep scope boundaries explicit by deferring conditional operations to Phase 4 and - reflecting that consistently in roadmap/docs/tests. - - Record that conditional create/update/delete remain deferred to Phase 4 due - to search dependency. - Ensure Mongo backend capability declarations do not claim conditional - support in Phase 3. - Do not add conditional-operation implementation in Mongo Phase 3 code - paths. - Define Phase 4 enablement gates for conditional support (search matching - fidelity, test coverage, and docs). - - - Roadmap/capability/docs alignment showing conditional support deferred to Phase - 4. - - - - - Ship Phase 3 with parity-focused tests and truthful support reporting. - - Add unit tests for version conflict checks and ETag normalization edge - cases. - Add integration tests for - vread/update_with_match/delete_with_match/list_versions behavior. - Add integration tests for history_instance/history_type/history_system - filters and paging. - Add integration tests proving history-delete Trial Use methods remain - NotSupported. - Add targeted tests for session-backed operations where introduced. - Update README capability matrix and backend role/status text for - post-Phase-3 truthfulness. - Update tests/common/capabilities.rs to match implemented support exactly. - - - Phase 3 test coverage for versioning/history/session behavior in Mongo feature - mode. - Capability and roadmap documentation aligned with implementation reality. - - - - - - - Version conflict checks for matching/non-matching expected versions. - ETag normalization coverage for W/quoted/unquoted forms. - History entry method mapping tests for create/update/delete transitions. - History parameter filter builder tests (since/before/include_deleted/pagination - cursors). - Session/transaction error mapping tests for Mongo driver errors translated to - StorageError. - Schema index bootstrap idempotency tests for Phase 3 index additions. - - - - vread returns expected historical version, including versions of resources - currently deleted. - update_with_match succeeds on exact version and fails with VersionConflict on - stale version. - delete_with_match succeeds on exact version and fails with - VersionConflict/NotFound as appropriate. - list_versions returns complete ascending version sequence under tenant scope. - history_instance supports since/before/include_deleted/pagination semantics. - history_type and history_system return tenant-safe reverse-chronological pages - and correct counts. - delete_instance_history and delete_version remain NotSupported in Mongo Phase - 3. - Session-backed write paths preserve current plus history consistency for - selected operations. - Cross-tenant negative tests for all newly added version/history operations. - - - - Compare Mongo outcomes against SQLite/PostgreSQL contract expectations for methods in - Phase 3 scope. - Do not mark conditional operations as implemented in tests or capabilities for this - phase. - Document all intentional deviations before phase completion sign-off. - - - - - cargo check -p helios-persistence --features mongodb - cargo check -p helios-rest --features mongodb - cargo check -p helios-hfs --features mongodb - cargo check -p helios-persistence --features - "sqlite,postgres,elasticsearch,mongodb" - cargo test -p helios-persistence --features mongodb --test mongodb_tests - cargo test -p helios-persistence --features mongodb mongodb:: - cargo fmt --all -- --check - - - - - WS1.1-WS1.7, WS6.1-WS6.2 - VersionedStorage behavior is implemented and version-concurrency tests pass. - - - - WS2.1-WS2.9, WS6.3-WS6.4 - Instance/type/system history tests pass and Trial Use history-delete methods are - explicitly NotSupported. - - - - WS3.1-WS3.5 - Phase 3 indexes are present, migrations are idempotent, and history query ordering - is deterministic. - - - - WS4.1-WS4.5, WS6.5 - Session-backed operations are implemented where beneficial and transaction - deviations are documented. - - - - WS5.1-WS5.4, WS6.6-WS6.7 - Conditional operations are clearly deferred to Phase 4 and all capability/docs - entries are aligned. - - - - - - - - - - - - - - - - - - Add explicit Phase 3 note in roadmap_mongo.xml that conditional operations are deferred to - Phase 4. - Add explicit Phase 3 note in roadmap_mongo.xml that delete_instance_history/delete_version - remain NotSupported. - Update persistence README capability matrix and backend role text for post-Phase-3 status. - Update test capability matrix to avoid enabling conditional-operation tests for MongoDB in - Phase 3. - - - - - Version ordering inconsistencies (string vs numeric) can break list_versions and - history pagination determinism. - Define explicit ordering strategy and enforce it in both query logic and tests. - - - Session/transaction behavior may diverge from PostgreSQL assumptions in higher - layers. - Implement session wrapping only where beneficial and document deviation boundaries - explicitly. - - - History queries across type/system scope may regress without proper indexes. - Add targeted indexes and validate query behavior with integration tests on - realistic data volumes. - - - Scope creep from conditional semantics can delay Phase 3 completion. - Hard-scope conditional support to Phase 4 with explicit roadmap and capability - gating. - - - Trial Use history-delete expectations may be misinterpreted as implemented - support. - Keep default NotSupported behavior and document support level in - roadmap/docs/tests. - - - - - VersionedStorage methods in Phase 3 scope are implemented and - validated by Mongo integration tests. - Instance/type/system history provider tests pass with tenant - isolation preserved. - delete_instance_history and delete_version remain NotSupported and - are explicitly documented. - Conditional operations are explicitly deferred to Phase 4 in - roadmap/docs/capability artifacts. - Session-based operations are implemented where beneficial, with - documented transaction deviations. - Validation commands pass for mongodb-only and mixed-feature builds. - Documentation and capability matrix remain truthful with no - aspirational mismatch. - - \ No newline at end of file diff --git a/phase4_roadmap.xml b/phase4_roadmap.xml deleted file mode 100644 index 794f6491..00000000 --- a/phase4_roadmap.xml +++ /dev/null @@ -1,305 +0,0 @@ - - - - HeliosSoftware/hfs - planned - TBD - 4 - - - - 2026-03-05 - Detailed Phase 4 plan drafted: MongoDB search/indexing parity and conditional create/update/delete are in scope; bundle semantics and conditional patch remain out of scope for this phase. - - - - - - Mongo CRUD, vread, optimistic locking, and instance/type/system history behavior are available. - Tenant isolation and soft-delete/Gone semantics remain aligned with earlier phases. - Best-effort session-backed consistency for selected multi-write flows is implemented where deployment topology permits. - History-delete Trial Use methods remain explicitly NotSupported for MongoDB. - - - Mongo SearchProvider execution is not implemented yet, so search-dependent FHIR semantics remain unavailable. - Conditional create/update/delete were intentionally deferred from Phase 3 because they depend on deterministic search matching. - Full transaction and batch bundle semantics remain planned beyond this phase scope. - Conditional patch remains separate from the minimum conditional-operation slice required for Phase 4. - - - - - - Deliver a truthful first wave of MongoDB FHIR search support and use that foundation to enable conditional create/update/delete, while keeping advanced search and bundle semantics explicitly out of scope. - - Implement Mongo-native search indexing and query execution for a practical first wave of parameter types needed for meaningful FHIR search parity. - Implement deterministic search_count, sorting, and paging behavior for supported Mongo search paths under strict tenant isolation. - Enable conditional_create, conditional_update, and conditional_delete using reliable zero/one/multiple-match semantics derived from Mongo search execution. - Define and document the native-versus-offloaded search boundary clearly, especially for full-text and other advanced search features that remain partial or planned. - Align roadmap, capability declarations, README status text, and tests so Mongo support is described truthfully with no aspirational mismatch. - - - Conditional patch support in this phase. - Full BundleProvider parity for batch or transaction bundle semantics. - Broad chained search, reverse chaining, _include, and _revinclude support beyond explicit future-phase planning. - Terminology-backed modifiers such as :above, :below, :in, and :not-in. - Composite MongoDB plus Elasticsearch runtime routing changes beyond documenting search_offloaded and feature boundaries. - Differential history or Trial Use history-delete implementation changes. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Define a stable Mongo-native search data model and integrate search value persistence into existing write paths without regressing Phase 2 or Phase 3 guarantees. - - Choose and document the Phase 4 search storage shape for MongoDB (embedded resource-side search values, dedicated search collection, or hybrid model). - Reuse the existing SearchParameter registry and extractor flow so supported search values are derived at write time for create/update/delete paths. - Ensure live resource writes keep search indexes synchronized without changing history storage behavior or soft-delete semantics. - Add schema bootstrap and migration steps for search-related collections and indexes with startup-safe idempotency. - Keep search_offloaded behavior explicit so Mongo native search indexing can be bypassed cleanly when secondary search ownership is configured. - - - Documented Mongo search storage model suitable for Phase 4 query execution and later expansion. - Search indexing hooks integrated into Mongo write paths with schema/bootstrap support. - - - - - Implement SearchProvider behavior for an initial Mongo parameter set with deterministic paging, sorting, and count behavior. - - Implement search and search_count for single-resource-type Mongo queries via SearchProvider. - Support first-wave search parameter classes: string, token, reference, date, number, and uri. - Define explicit support posture for quantity and composite parameters; only mark them implemented if Phase 4 tests and query semantics are complete. - Enforce tenant filters and is_deleted visibility rules consistently across all Mongo search queries. - Implement deterministic sorting and paging behavior with clearly documented practical limits. - Ensure search_count returns results consistent with the primary search path for supported parameter classes. - Return clear unsupported capability or validation errors for search features that remain outside the Phase 4 support boundary. - - - Mongo SearchProvider implementation for the first supported parameter wave. - Deterministic count, paging, and sorting behavior for implemented search paths. - - - - - Use Phase 4 search matching to enable the minimum conditional-operation slice needed for parity-focused Mongo semantics. - - Implement conditional_create using tenant-scoped Mongo search matching with exact zero/one/multiple-match behavior. - Implement conditional_update with deterministic no-match, create-on-upsert, single-match update, and multiple-match outcomes. - Implement conditional_delete with deterministic no-match, single-match delete, and multiple-match outcomes. - Keep conditional operations aligned with existing versioning, soft-delete, and optimistic-locking behavior where those semantics intersect. - Ensure conditional matching remains tenant-safe and never leaks cross-tenant search results or write decisions. - Keep conditional_patch out of scope and explicitly documented as deferred after Phase 4. - - - Mongo ConditionalStorage parity for create, update, and delete. - Explicitly documented deferral of conditional_patch beyond Phase 4. - - - - - Draw hard boundaries around what Mongo search does and does not support after Phase 4 so capabilities, docs, and tests stay aligned. - - Keep _include and _revinclude as planned unless a narrowly testable subset is implemented and documented during the phase. - Keep chained search and reverse chaining as planned unless a narrowly scoped implementation is fully validated. - Decide and document the full-text posture for MongoDB in Phase 4: native text indexes, Elasticsearch offload, or planned-only. - Document unsupported advanced modifiers and search combinations with explicit error or capability behavior. - Keep bundle semantics out of the Phase 4 capability claim set even if related transaction infrastructure exists from earlier work. - - - Clear post-Phase-4 boundary definition for advanced Mongo search capabilities. - Capability claims that match tested behavior exactly. - - - - - Validate Mongo search and conditional behavior against existing contract expectations for every capability claimed after Phase 4. - - Add unit tests for Mongo query translation and filter construction for each supported parameter class. - Add unit tests for search index extraction, persistence hooks, and search-related schema/bootstrap idempotency. - Add integration tests for supported search parameter classes, search_count parity, and deterministic paging/sorting. - Add integration tests for conditional_create, conditional_update, and conditional_delete across no-match, single-match, and multiple-match scenarios. - Add cross-tenant negative tests for all search and conditional paths introduced in this phase. - Add negative tests proving unsupported advanced search features and conditional_patch remain outside the implemented support set. - Update tests/common/capabilities.rs to reflect implemented, partial, and deferred Mongo search/conditional support exactly. - - - Phase 4 unit and integration coverage for Mongo search and conditional semantics. - Capability declarations synchronized with tested behavior. - - - - - Keep the dedicated Phase 4 roadmap, umbrella Mongo roadmap, README, and implementation-facing status notes aligned. - - Create and maintain phase4_roadmap.xml as the detailed execution artifact for Mongo Phase 4. - Update roadmap_mongo.xml progress and phase status text to reflect that Phase 4 is the active detailed planning target after Phase 3. - Update persistence README MongoDB status text and capability matrix rows to reflect actual post-Phase-4 search and conditional support. - Document that full bundle semantics remain planned outside Phase 4 scope. - Document that conditional_patch remains deferred even if conditional create/update/delete ship in this phase. - - - Roadmap and README artifacts that accurately describe the real Mongo feature set after Phase 4 work lands. - - - - - - - Query translation coverage for each supported first-wave parameter type. - Search index extraction and persistence-hook tests for create/update/delete write paths. - Search schema/bootstrap and migration idempotency tests for Phase 4 index additions. - Cursor and/or offset paging token parsing tests with deterministic ordering assumptions. - Conditional match classification tests for no-match, single-match, and multiple-match outcomes. - Unsupported advanced search and conditional_patch error-mapping tests. - - - - Search round-trip tests for supported string, token, reference, date, number, and uri parameters. - search_count returns values consistent with search results for supported Mongo queries. - Paging and sorting remain deterministic across repeated test runs and realistic datasets. - Cross-tenant negative tests prove search results and conditional decisions never leak across tenant boundaries. - conditional_create creates on no match, returns existing on single match, and errors on multiple matches. - conditional_update covers single-match update, no-match without upsert, no-match with upsert, and multiple-match failure paths. - conditional_delete covers no-match, single-match delete, and multiple-match failure paths. - Unsupported advanced search features and conditional_patch remain explicitly unimplemented with expected behavior. - search_offloaded behavior remains coherent with Phase 4 native search boundaries when enabled. - - - - Reuse SQLite and PostgreSQL search and conditional expectations where semantics overlap with the Mongo Phase 4 slice. - Do not mark Mongo advanced search features as implemented without corresponding tests. - Document every remaining planned or partial feature explicitly instead of silently omitting it. - - - - - cargo check -p helios-persistence --features mongodb - cargo check -p helios-rest --features mongodb - cargo check -p helios-hfs --features mongodb - cargo check -p helios-persistence --features "sqlite,postgres,elasticsearch,mongodb" - cargo test -p helios-persistence --features mongodb --test mongodb_tests - cargo test -p helios-persistence --features mongodb mongodb:: - cargo fmt --all -- --check - - - - - WS1.1-WS1.5, WS5.1-WS5.2 - Mongo search storage shape, indexing hooks, and schema/bootstrap behavior are implemented with unit coverage. - - - - WS2.1-WS2.7, WS5.3 - Supported first-wave parameter classes pass search, count, paging, and sorting tests. - - - - WS3.1-WS3.6, WS5.4-WS5.5 - Mongo conditional create/update/delete semantics are implemented and validated under tenant-safe matching. - - - - WS4.1-WS4.5, WS5.6-WS5.7, WS6.3-WS6.5 - Every unsupported or partial advanced search feature is explicitly documented and capability-aligned. - - - - WS6.1-WS6.2 and validation command execution - phase4_roadmap.xml, roadmap_mongo.xml, and README all reflect the same post-Phase-4 truth. - - - - - - - - - - - - - - - - - - - - - - - - Create and maintain phase4_roadmap.xml as the dedicated detailed artifact for Mongo Phase 4 work. - Update roadmap_mongo.xml progress text so it no longer implies Phase 2 is next. - Update persistence README capability matrix rows for Mongo search and conditional support after Phase 4 ships. - Keep conditional_patch and full bundle semantics explicitly documented as outside the Phase 4 support set. - Update test capability declarations so Mongo only enables search and conditional tests that truly pass. - - - - - Mongo search storage design may create excessive write amplification or index maintenance complexity. - Choose a minimal Phase 4 search model, validate write-path overhead early, and keep schema/index scope intentionally narrow. - - - Conditional operations may produce incorrect no-match/single-match/multiple-match behavior if search semantics are incomplete or inconsistent. - Gate conditional support on deterministic search tests and mirror SQLite/PostgreSQL contract expectations for overlapping behavior. - - - Sorting and paging can become nondeterministic without explicit tie-break ordering and stable cursor assumptions. - Define canonical ordering rules and enforce them with unit and integration tests before claiming support. - - - Advanced search features may be perceived as supported because infrastructure exists but end-to-end semantics are incomplete. - Keep advanced capabilities explicitly planned or partial in docs, capabilities, and tests until fully validated. - - - Native Mongo search and offloaded Elasticsearch search boundaries may become ambiguous for operators and developers. - Document search_offloaded behavior clearly and keep full-text posture explicit in both roadmap and README artifacts. - - - - - Mongo SearchProvider support for the first-wave parameter classes is implemented and covered by integration tests. - search_count, paging, and sorting behavior are deterministic and tenant-safe for implemented Mongo search paths. - conditional_create, conditional_update, and conditional_delete are implemented and validated for zero, one, and multiple matches. - conditional_patch remains explicitly deferred and is not misrepresented as implemented. - Advanced search features that remain out of scope are clearly documented as planned or partial with matching capability declarations. - Validation commands pass for mongodb-only and mixed-feature builds. - phase4_roadmap.xml, roadmap_mongo.xml, README, and capability artifacts describe the same real Mongo support level. - - diff --git a/phase5_roadmap.xml b/phase5_roadmap.xml deleted file mode 100644 index b2f37418..00000000 --- a/phase5_roadmap.xml +++ /dev/null @@ -1,401 +0,0 @@ - - - - HeliosSoftware/hfs - completed - TBD - 5 - - - - 2026-03-09 - Phase 5 is completed: composite MongoDB primary plus Elasticsearch secondary - routing, search_offloaded duplicate-index prevention, shared SearchParameter registry - wiring, mongodb-elasticsearch runtime startup, and composite tenant/failure coverage are - implemented and verified. Feature-gated cargo check commands and focused composite tests - passed; workspace-wide cargo fmt --all -- --check still reports unrelated pre-existing - formatting drift outside the Phase 5 files. - - - - - - MongoDB native search indexing and SearchProvider execution are implemented for the - Phase 4 supported query surface. - Conditional create, conditional update, and conditional delete semantics are - implemented and covered in Mongo integration tests. - SearchParameter lifecycle hooks update registry state on create, update, and delete - operations. - Mongo capabilities and tests were updated to truthfully represent implemented search - and pagination support. - - - Composite MongoDB plus Elasticsearch ownership boundaries are not yet - implemented for production routing paths. - search_offloaded behavior exists but requires strict validation to prevent - duplicate or stale indexing in composite mode. - Shared SearchParameterRegistry and extractor initialization between primary and - secondary backends must be deterministic. - Composite runtime startup and configuration validation must fail fast on invalid - mixed-backend setups. - - - - - - Deliver a robust composite mode where MongoDB owns canonical writes and reads while - Elasticsearch owns search execution, mirroring established sqlite-elasticsearch and - postgres-elasticsearch operating patterns. - - Define and implement explicit ownership boundaries for write-primary, - read-primary, and search-secondary flows in MongoDB plus Elasticsearch composition. - Harden search_offloaded behavior so Mongo does not maintain duplicate - native search indexes when Elasticsearch is configured as secondary. - Ensure shared SearchParameter registry and extraction setup is consistent - across MongoDB and Elasticsearch initialization and runtime updates. - Implement startup and provider wiring for mongodb-elasticsearch mode with - feature-gated, deterministic initialization and clear failure behavior. - Align tests, capability declarations, and docs with actual composite - behavior so support claims stay truthful. - - - Implementing new advanced search features beyond the Phase 4 supported search surface. - Changing MongoDB standalone search behavior outside composite-offload requirements. - Implementing full bundle transaction semantics beyond current backend capabilities. - Expanding deployment guidance beyond composite operational requirements needed for this - phase. - Introducing new database-per-tenant architecture changes. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Define and enforce which backend owns each operation path so composite behavior is - deterministic and parity-aligned with existing primary-secondary patterns. - - Document operation ownership matrix for - create/update/delete/read/history/search/count/conditional operations in MongoDB plus - Elasticsearch mode. - Implement provider delegation rules in composite backend construction and - dispatch paths. - Ensure conditional operation matching semantics delegate to the designated - search owner without changing write ownership. - Define and document expected consistency model for Mongo canonical storage - versus Elasticsearch search visibility. - - - Composite ownership matrix with deterministic routing semantics. - Implemented delegation paths for Mongo primary and Elasticsearch - search-secondary behavior. - - - - - Guarantee that enabling offloaded search prevents duplicate Mongo native indexing while - preserving correct write and SearchParameter lifecycle behavior. - - Enforce search_offloaded checks across Mongo create/update/delete indexing - hooks with no silent bypasses. - Validate SearchParameter create/update/delete handling remains correct when - search indexing ownership is offloaded. - Add assertions and tests that Mongo search_index collection is not written - in composite offload mode. - Preserve standalone Mongo mode behavior to avoid regressions in native - search scenarios. - - - Offload-safe Mongo indexing lifecycle with explicit duplicate-prevention - guarantees. - Regression coverage for standalone and offloaded mode behavior split. - - - - - Provide feature-gated startup and configuration wiring for mongodb-elasticsearch mode - that is consistent with existing composite backend startup patterns. - - Implement composite backend factory path for Mongo primary plus - Elasticsearch secondary in persistence wiring. - If phase scope includes runtime mode exposure, add mongodb-elasticsearch - mode parsing and startup path in REST/HFS configuration layers. - Define configuration validation and startup error behavior for missing or - incompatible Mongo/Elasticsearch settings. - Ensure feature-gated compile behavior remains deterministic for mongodb, - elasticsearch, and combined feature sets. - - - Composite backend startup path for mongodb-elasticsearch mode. - Clear startup validation behavior and errors for mixed backend - misconfiguration. - - - - - Keep SearchParameter registry and extraction semantics synchronized between MongoDB and - Elasticsearch components in composite mode. - - Share or synchronize SearchParameterRegistry initialization so both - providers use the same active parameter definitions. - Ensure extractor behavior used for indexing is consistent with registry - state across both backend components. - Define ordering guarantees for SearchParameter lifecycle updates relative - to indexing operations in composite mode. - Add tests covering active/draft/retired/delete SearchParameter transitions - and composite search visibility implications. - - - Deterministic registry/extractor synchronization between Mongo and - Elasticsearch components. - SearchParameter lifecycle parity tests for composite mode. - - - - - Prove composite behavior through integration and parity tests focused on routing - correctness, consistency expectations, and failure surfaces. - - Add integration tests for write-primary/read-primary/search-secondary - routing correctness. - Add result consistency tests verifying search output aligns with canonical - Mongo resources under expected refresh semantics. - Add negative tests for secondary outage and startup failure behavior, - including expected error propagation. - Add tenant-isolation tests ensuring composite routing never leaks - cross-tenant data. - Extend shared harness/capability tests so mongodb-elasticsearch mode is - exercised similarly to existing composite modes. - - - Composite integration suite covering routing, consistency boundaries, and - failure behavior. - Parity-oriented harness coverage for mongodb-elasticsearch mode. - - - - - Ensure capability declarations and documentation reflect the delivered composite - behavior with no aspirational mismatch. - - Update capability matrix entries for MongoDB and composite mode support - levels after composite behavior is validated. - Update persistence README role matrix and operational notes for Mongo - primary plus Elasticsearch secondary mode. - Update roadmap_mongo.xml progress/status text to mark Phase 5 complete when - exit criteria are met. - Record known limitations and deferred capabilities clearly in roadmap and - docs. - - - Capability and documentation artifacts synchronized with tested composite - behavior. - - - - - - - Composite provider-construction tests verify Mongo primary and Elasticsearch - secondary wiring selection. - Delegation tests verify CRUD/read/history routes remain on Mongo while search - routes delegate to Elasticsearch. - search_offloaded guard tests verify Mongo indexing hooks are bypassed only in - offloaded mode. - SearchParameter registry initialization tests verify shared state and - deterministic update ordering. - Configuration validation tests verify startup errors for invalid mixed-backend - settings. - - - - Composite create/read/search round-trip validates - write-primary/read-primary/search-secondary routing. - Conditional operation tests in composite mode validate deterministic matching - behavior and correct ownership split. - Mongo search_index duplicate-prevention tests assert no local indexing when - search is offloaded. - SearchParameter lifecycle tests verify active/draft/retired/delete behavior - remains coherent across composite components. - Tenant-isolation composite tests validate no cross-tenant leakage in delegated - search paths. - Secondary failure tests validate expected startup/runtime error handling - behavior. - - - - Mirror sqlite-elasticsearch and postgres-elasticsearch routing expectations for - equivalent operations. - Do not claim mongodb-elasticsearch support as implemented until routing and consistency - tests pass. - Keep unsupported advanced search capabilities explicitly documented as planned or - partial. - - - - - cargo check -p helios-persistence --features "mongodb,elasticsearch" - cargo check -p helios-rest --features "mongodb,elasticsearch" - cargo check -p helios-hfs --features "mongodb,elasticsearch" - cargo check -p helios-persistence --features - "sqlite,postgres,mongodb,elasticsearch" - cargo test -p helios-persistence --features "mongodb,elasticsearch" --test - mongodb_tests - cargo test -p helios-persistence --features "mongodb,elasticsearch" --test - elasticsearch_tests - cargo test -p helios-persistence --features "mongodb,elasticsearch" - composite:: - cargo fmt --all -- --check - - - - - WS1.1-WS1.4, WS3.1 - Composite delegation rules are implemented and ownership matrix is documented. - - - - WS2.1-WS2.4, UT3, IT3 - Mongo duplicate indexing is prevented in offloaded mode without standalone-mode - regressions. - - - - WS4.1-WS4.4, UT4, IT4 - SearchParameter registry and extraction behavior are synchronized across composite - components. - - - - WS5.1-WS5.5, IT1, IT2, IT5, IT6 - Composite routing, tenant isolation, consistency boundaries, and failure behavior - are validated. - - - - WS6.1-WS6.4 and validation command execution - Capability matrix, README, and roadmap_mongo.xml are synchronized with delivered - Phase 5 behavior. - - - - - - - - - - - - - - - - - - - - - Create and maintain phase5_roadmap.xml as the detailed execution artifact for MongoDB - Phase 5. - Update roadmap_mongo.xml Phase 5 status and progress text when exit criteria are met. - Update crates/persistence/README.md capability and role matrix entries for - mongodb-elasticsearch mode. - Document expected consistency boundaries and search visibility timing for composite mode - operations. - Keep deferred advanced search features explicitly marked as partial or planned. - - - - - Incorrect operation ownership routing could cause stale reads, wrong provider - execution, or semantic drift from established composite modes. - Define explicit ownership matrix first and enforce via unit plus integration - delegation tests. - - - Duplicate indexing between Mongo native search paths and Elasticsearch offloaded - search could increase write cost and return inconsistent results. - Harden search_offloaded guards and add explicit duplicate-prevention tests. - - - SearchParameter registry divergence between composite components could produce - extraction/query mismatches. - Share or synchronize registry initialization and validate lifecycle transitions - with composite tests. - - - Runtime mode wiring can fail due to incomplete feature-gate combinations or - configuration mismatches. - Add startup validation and mixed-feature compile checks in validation commands. - - - Capability/docs drift may overstate support before composite behavior is - validated. - Gate capability and roadmap updates on passing composite routing and consistency - tests. - - - - - Composite routing enforces Mongo primary ownership for writes/reads and - Elasticsearch ownership for search in mongodb-elasticsearch mode. - search_offloaded behavior prevents duplicate Mongo indexing when - Elasticsearch secondary is configured. - Shared SearchParameter registry and extractor initialization are - deterministic and validated by lifecycle tests. - Composite tests pass for routing correctness, result consistency - boundaries, tenant isolation, and failure behavior. - Startup path for mongodb-elasticsearch mode is implemented and - feature-gated for relevant crates. - Capability matrix, README, and roadmap_mongo.xml reflect the same tested - post-Phase-5 support state. - Feature-gated mongodb-elasticsearch and mixed-feature cargo check - commands passed, and focused composite routing/tenant/failure tests passed. Workspace-wide - cargo fmt --all -- --check remains blocked by unrelated pre-existing formatting drift outside - the Phase 5 files. - - \ No newline at end of file diff --git a/phase6_roadmap.xml b/phase6_roadmap.xml deleted file mode 100644 index 6c506077..00000000 --- a/phase6_roadmap.xml +++ /dev/null @@ -1,200 +0,0 @@ - - - - HeliosSoftware/hfs - completed - TBD - 6 - - - - 2026-03-09 - Phase 6 is completed: MongoDB runtime mode exposure in REST/HFS is present, operator-facing MongoDB standalone and mongodb-elasticsearch documentation is synchronized, the top-level roadmap now reflects shipped Mongo support, and focused runtime/config validation is defined for release-readiness closure. - - - - - - MongoDB primary plus Elasticsearch secondary composite routing is implemented. - search_offloaded duplicate-index prevention and shared SearchParameter registry wiring are implemented. - mongodb-elasticsearch runtime startup exists in `crates/hfs/src/main.rs`. - Composite tests and capability updates established truthful post-Phase-5 support boundaries. - - - Top-level operator docs still needed to reflect the shipped Mongo runtime surface accurately. - A dedicated detailed Phase 6 artifact did not yet exist in the repository. - Release-readiness needed to be framed as runtime/config validation and documentation truthfulness, not new Mongo capability work. - - - - - - Close out MongoDB delivery by validating the already-implemented runtime modes, synchronizing documentation and roadmap artifacts, and defining a focused validation bar for Mongo standalone and MongoDB plus Elasticsearch operation. - - Confirm `HFS_STORAGE_BACKEND` supports `mongodb` and `mongodb-elasticsearch` through the REST configuration layer and HFS startup dispatch. - Document MongoDB standalone and MongoDB plus Elasticsearch operator flows with accurate environment variables, feature flags, and runtime examples. - Align `ROADMAP.md`, `crates/persistence/README.md`, and `roadmap_mongo.xml` so they describe the same delivered Mongo state. - Record a dedicated completed `phase6_roadmap.xml` artifact for future reference and phase traceability. - Define and execute focused validation for Mongo runtime/config surfaces as the Phase 6 release-readiness bar. - - - New Mongo CRUD, history, search, or conditional capability implementation. - Conditional patch support. - Full transaction bundle parity for MongoDB. - Advanced search features beyond the Phase 4/5 supported surface. - New composite backend combinations outside MongoDB plus Elasticsearch. - - - - - - - - - - - - - - - - - - - - - Verify that the MongoDB runtime surface already present in code is accurately represented and remains feature-gated with clear operator expectations. - - Confirm `StorageBackendMode` includes `mongodb` and `mongodb-elasticsearch` parsing/display behavior in `crates/rest/src/config.rs`. - Confirm `crates/hfs/src/main.rs` dispatches to `start_mongodb` and `start_mongodb_elasticsearch`. - Confirm composite startup requires Elasticsearch node configuration and preserves MongoDB as the primary store. - Treat feature-gated fallback messages as part of the Phase 6 runtime contract. - - - Verified runtime-mode surface for standalone and composite MongoDB operation. - - - - - Make the repository docs truthful and usable for MongoDB standalone and composite deployment modes. - - Update the persistence architecture tree to reflect the implemented Mongo backend file layout. - Add MongoDB standalone build/run guidance with the real environment variables used by `MongoBackend::from_env`. - Add MongoDB plus Elasticsearch composite build/run guidance and explain search offloading ownership. - Update implementation-status notes to mark Phase 6 closure work complete. - - - README guidance that matches actual Mongo runtime behavior and supported operating modes. - - - - - Ensure roadmap artifacts consistently describe MongoDB as shipped and Phase 6 as completed. - - Update `ROADMAP.md` shipped persistence items to include MongoDB standalone and MongoDB plus Elasticsearch. - Remove stale top-level roadmap wording that still marks MongoDB primary support as in progress. - Update `roadmap_mongo.xml` progress text and reference files for completed Phase 5/6 artifacts. - Create `phase6_roadmap.xml` as the dedicated detailed Phase 6 record. - - - Top-level roadmap, umbrella roadmap, and detailed phase roadmap all reflect the same completed Mongo state. - - - - - Define the minimal validation matrix needed to treat Phase 6 as closed without claiming unrelated repo-wide cleanliness. - - Run focused tests covering `StorageBackendMode` parsing/display behavior in the REST configuration crate. - Run feature-gated `cargo check` for `helios-hfs` with `mongodb` and with `mongodb,elasticsearch`. - Keep unrelated repo-wide formatting drift explicitly out of the Phase 6 success claim. - - - Focused validation evidence for Mongo runtime/config readiness. - - - - - - - `helios-rest` storage backend mode parsing tests for `mongodb` and `mongodb-elasticsearch`. - `helios-rest` storage backend mode display tests for Mongo modes. - - - - `cargo check -p helios-hfs --features mongodb` validates Mongo standalone runtime wiring. - `cargo check -p helios-hfs --features mongodb,elasticsearch` validates composite runtime wiring. - Documentation examples and roadmap references are reviewed for consistency with the implemented runtime surface. - - - - Do not claim new Mongo capabilities in Phase 6; only claim runtime/doc/release-readiness closure work. - Keep unsupported advanced search, conditional patch, and full transaction bundle semantics explicitly outside the Phase 6 completion statement. - - - - - cargo test -p helios-rest storage_backend_mode - cargo check -p helios-hfs --features mongodb - cargo check -p helios-hfs --features mongodb,elasticsearch - - - - - WS2.1-WS2.4 - Persistence README reflects the implemented Mongo standalone and composite runtime surface. - - - - WS3.1-WS3.4 - `ROADMAP.md`, `roadmap_mongo.xml`, and `phase6_roadmap.xml` agree on completed Mongo delivery. - - - - WS1.1-WS1.4, WS4.1-WS4.3 - Focused Mongo runtime/config validation commands are executed successfully. - - - - - - - - - - - - - - - - - - Update the persistence README architecture tree and Mongo runtime guidance. - Update `ROADMAP.md` shipped persistence items to include MongoDB standalone and composite support. - Update `roadmap_mongo.xml` to reference the completed Phase 5 and Phase 6 artifacts. - Create and maintain `phase6_roadmap.xml` as the detailed closure artifact for MongoDB Phase 6. - - - - - Docs and roadmap artifacts can lag behind runtime reality and understate shipped Mongo support. - Synchronize README, top-level roadmap, and umbrella Mongo roadmap in the same phase. - - - Phase 6 could over-claim completion if release-readiness is treated as broader than the actual validation performed. - Keep the validation scope focused on runtime/config surfaces and explicitly exclude unrelated repo-wide formatting drift. - - - Operators may use the wrong MongoDB environment variable names if examples do not match the backend loader. - Document `HFS_MONGODB_URL`, `HFS_MONGODB_URI`, `HFS_DATABASE_URL`, and `HFS_MONGODB_DATABASE` consistently. - - - - - `HFS_STORAGE_BACKEND` supports `mongodb` and `mongodb-elasticsearch` through the implemented REST/HFS runtime surface. - Persistence README examples and operator guidance reflect the real Mongo standalone and composite configuration paths. - `ROADMAP.md`, `roadmap_mongo.xml`, and `phase6_roadmap.xml` all describe MongoDB support as shipped/completed. - Focused validation commands are defined and executed for Mongo runtime/config readiness. - Unrelated repo-wide formatting drift remains explicitly outside the Phase 6 completion claim. - - diff --git a/roadmap_mongo.xml b/roadmap_mongo.xml deleted file mode 100644 index 48f7c2d4..00000000 --- a/roadmap_mongo.xml +++ /dev/null @@ -1,343 +0,0 @@ - - - - HeliosSoftware/hfs - completed - TBD - date-agnostic - 2026-03-09 - Phases 1 through 6 are completed: backend wiring, core storage parity, - version/history semantics, native Mongo search plus conditional operations, composite - MongoDB + Elasticsearch integration, and runtime HFS_STORAGE_BACKEND wiring, - documentation synchronization, and release-readiness validation are implemented. - Feature-gated cargo validation for Mongo runtime surfaces passed; workspace-wide cargo - fmt --all -- --check still reports unrelated pre-existing formatting drift outside the - Mongo Phase 5/6 files. - - SQLite primary, PostgreSQL primary, MongoDB primary, Elasticsearch - secondary, MongoDB + Elasticsearch composite - - - - - - - - - - - - - - - - - - - - - - MongoDB as a primary persistence backend in helios-persistence. - Tenant-aware CRUD, versioning, history, and conditional operation semantics compatible - with existing backends. - Search strategy for MongoDB (native indexing and/or search offloading to Elasticsearch). - Composite integration for MongoDB + Elasticsearch. - Server runtime wiring via HFS_STORAGE_BACKEND options for MongoDB modes. - Test and CI coverage comparable to SQLite/PostgreSQL/Elasticsearch patterns. - - - Neo4j graph traversal optimizations in initial MongoDB delivery. - Non-FHIR custom query DSL extensions. - Full terminology service implementation beyond current external-service expectations. - Hard production SLO guarantees before baseline performance characterization is complete. - - - - - - - - - - - - - - - - - - - - - - - - - Introduce MongoDB backend module scaffolding with compile-time and runtime hooks. - - Enable mongodb module export in crates/persistence/src/backends/mod.rs. - Create crates/persistence/src/backends/mongodb with backend/config/schema - skeleton. - Define MongoDB backend config with defaults and serde support. - Implement Backend trait basics (kind/name/capabilities/health checks). - Wire feature-gated compile paths for mongo in helios-persistence and - helios-hfs. - - - Existing Cargo feature 'mongodb' and optional dependency in - crates/persistence/Cargo.toml. - - - cargo check -p helios-persistence --features mongodb passes. - BackendKind::MongoDB compile and display behavior verified. - - - - - Reach minimum parity with SQLite/PostgreSQL ResourceStorage behavior. - - Implement create/read/update/delete/exists/count/read_batch/create_or_update. - Enforce tenant isolation in all collection queries and indexes. - Implement soft-delete semantics aligned with existing Gone behavior. - Create core collection/index strategy: resources, resource_history, search - indexes (if native). - - - Phase 1 backend skeleton. - - - Mongo integration CRUD tests pass in isolation. - Tenant isolation behavior matches sqlite_tests/postgres_tests expectations. - - - - - Implement FHIR version/history semantics and concurrency expectations. - - Implement VersionedStorage (vread + update_with_match semantics). - Implement instance/type/system history providers. - Defer conditional create/update/delete and If-Match handling to Phase 4 - search/indexing work (search dependency). - Define and implement session-based transaction behavior where beneficial; - document deviations from PostgreSQL semantics. - - - FHIR Trial Use history delete features (DELETE [type]/[id]/_history and DELETE - [type]/[id]/_history/[vid]) remain NotSupported for MongoDB in Phase 3. - Conditional operations remain a target capability overall but are intentionally - phase-shifted to Phase 4 where search matching behavior is implemented. - - - Phase 2 core contract implementation. - Phase 4 search/indexing implementation for conditional matching semantics. - - - History and versioning test suites pass for MongoDB feature mode. - Explicitly documented deviations from PostgreSQL transaction semantics (if any). - - - - - Support FHIR search behavior and enable conditional create/update/delete with clear - native/offloaded boundaries. - - Implement SearchParameter extraction + indexing path for Mongo resources. - Implement basic parameter types - (string/token/date/number/quantity/reference/uri/composite) per priority. - Support paging and sorting; define practical limits. - Implement full-text path via Mongo text indexes OR formalize Elasticsearch - offload-first strategy. - Implement conditional create/update/delete using deterministic search - matching - semantics; keep conditional patch out of scope. - Define support levels (implemented/partial/planned) in capability matrix and - docs. - - - Phase 2 and 3 collections + version model. - - - Search contract tests pass for implemented parameter classes. - Conditional create/update/delete tests pass for implemented matching semantics. - Capability matrix updated with truthful MongoDB support levels. - - - - - Provide robust primary-secondary mode mirroring sqlite-elasticsearch and - postgres-elasticsearch. - - Implement Mongo search_offloaded mode to avoid duplicate indexing when ES - secondary is configured. - Create composite wiring with Mongo primary and Elasticsearch search backend. - Ensure search registry sharing between Mongo backend and ES backend - initialization. - Validate write-primary/read-primary/search-secondary routing and sync - behavior. - - - Phase 4 search model clarity. - - - Composite tests verify routing, delegated tenant isolation, and failure behavior - for Mongo + Elasticsearch. - Startup path for mongo-elasticsearch mode is implemented and feature-gated. - - - - - Expose Mongo modes to HFS runtime and document operational guidance. - - Add StorageBackendMode values for mongodb and mongodb-elasticsearch in - crates/rest/src/config.rs. - Add start_mongodb and start_mongodb_elasticsearch flows in - crates/hfs/src/main.rs. - Verify the implemented runtime paths and feature-gated failure behavior for - mongodb and mongodb-elasticsearch modes. - Update persistence README capability matrix, role matrix, and operator - examples to reflect implemented Mongo status. - Update top-level ROADMAP.md persistence section and document deployment - examples, environment variables, and feature flags. - - - Phases 1 through 5 complete or explicitly scoped. - - - HFS_STORAGE_BACKEND accepts mongodb and mongodb-elasticsearch values. - All relevant docs, roadmap artifacts, and examples are consistent with actual - implementation, aside from unrelated repo-wide formatting drift outside the Mongo phase - files. - - - - - - - Prefer parity with existing SQLite/PostgreSQL behavioral contracts over - backend-specific shortcuts. - Use capability-driven tests to skip only what is explicitly planned/not planned. - Run fast unit coverage first, then containerized integration, then full regression. - - - - - Mongo config defaults + serde roundtrip (mirror Postgres/ES config tests). - Backend capability declarations and support checks. - Query translation tests (FHIR search params to Mongo query documents). - Error conversion tests (mongodb::error::Error to StorageError). - - - All unit tests deterministic and runnable without Docker. - Coverage includes every capability claimed as implemented/partial. - - - - - - Create crates/persistence/tests/mongodb_tests.rs analogous to postgres_tests.rs and - elasticsearch_tests.rs. - Use shared Mongo container lifecycle for speed and isolation via unique tenant IDs per - test. - CRUD, tenant isolation, version increments, delete semantics (Gone/not found - expectations). - History and conditional operation behavior for implemented levels. - - - Mongo integration suite passes on self-hosted CI with Docker/testcontainers. - No cross-test data contamination detected. - - - - - - Reuse tests/common harness, fixtures, assertions, and capability matrix for Mongo - runs. - Validate parity with SQLite/PostgreSQL for tenant isolation, versioning, and error - semantics. - - - All contract tests for implemented Mongo capabilities pass without Mongo-only - exceptions. - - - - - - Add/extend composite tests to cover Mongo primary + ES search secondary. - Verify write/read/search ownership split and provider delegation. - Verify index synchronization behavior and eventual consistency expectations. - - - Composite routing and result-merging tests pass for Mongo+ES mode. - No duplicated search indexing in Mongo when offloaded. - - - - - - Compile + unit tests for mongodb feature. - Mongo integration tests (Docker/testcontainers). - All-features/full-workspace regression. - - - Add mongodb-featured test invocations in CI jobs. - Maintain container cleanup parity with existing label-based cleanup steps. - Ensure failures in Mongo-specific stage block merge when feature is enabled. - - - - - - Baseline latency benchmarks for create/read/search operations. - Concurrency soak tests across multiple tenants and mixed read/write load. - Migration/index evolution tests for backward-compatible rollout behavior. - - - Benchmarks show no critical regressions versus declared targets for initial - release. - No data corruption or tenant leakage under concurrent load tests. - - - - - - - Mongo transaction semantics may diverge from ACID expectations used by - PostgreSQL/SQLite flows. - Document support level clearly; enforce behavior with dedicated transaction and - rollback tests. - - - FHIR search parity gaps due to complex chained/reverse chained parameter - translation. - Ship basic search parity first; mark advanced capabilities as partial/planned with - explicit tests. - - - Dual indexing complexity when Mongo native search and Elasticsearch offloading - coexist. - Use explicit search_offloaded controls and test for duplicated/stale index - behavior. - - - CI instability from container startup/resource constraints on self-hosted - runners. - Shared container lifecycle, timeouts, and explicit cleanup steps aligned with - current CI patterns. - - - - - MongoDB backend module is enabled and feature-gated with stable compile and runtime - startup paths. - MongoDB mode is selectable via HFS_STORAGE_BACKEND and validated in configuration parsing - tests. - Core CRUD/version/history/tenant behavior passes defined contract tests for implemented - capabilities. - MongoDB + Elasticsearch composite path is implemented with routing and sync tests passing. - Capability matrix and roadmap documentation reflect actual support levels (no aspirational - mismatch). - CI includes Mongo-targeted stages and remains green for required feature sets. - - \ No newline at end of file From 9f03a36bdb4aaf2fe818c1585fa135642bdeb06c Mon Sep 17 00:00:00 2001 From: dougc95 Date: Tue, 10 Mar 2026 21:13:10 -0400 Subject: [PATCH 17/17] docs(mongodb): add comprehensive manual testing guide and update feature matrix --- .gitignore | 3 + crates/persistence/README.md | 27 ++- docs/mongodb-manual-testing.md | 331 +++++++++++++++++++++++++++++++++ 3 files changed, 351 insertions(+), 10 deletions(-) create mode 100644 docs/mongodb-manual-testing.md diff --git a/.gitignore b/.gitignore index 5d89eab1..d694434b 100644 --- a/.gitignore +++ b/.gitignore @@ -17,3 +17,6 @@ AGENTS.md *.db *.db-shm *.db-wal + +# Test artifacts +/test-artifacts diff --git a/crates/persistence/README.md b/crates/persistence/README.md index d45df2ff..59f0a7e0 100644 --- a/crates/persistence/README.md +++ b/crates/persistence/README.md @@ -305,7 +305,7 @@ The matrix below shows which FHIR operations each backend supports. This reflect **Legend:** ✓ Implemented | ◐ Partial | ○ Planned | ✗ Not planned | † Requires external service -> **MongoDB Status:** MongoDB primary support is implemented through Phase 5: CRUD, vread/history, optimistic locking, tenant isolation, Phase 4 native search for the implemented parameter surface, conditional create/update/delete, and best-effort session-backed consistency for multi-write flows where deployment topology permits. MongoDB + Elasticsearch composite mode is implemented with write-primary/read-primary/search-secondary routing. Conditional patch, full transaction bundle semantics, delete-history Trial Use operations, and advanced search features outside the implemented Phase 4 surface remain partial or planned. +> **MongoDB Status:** MongoDB primary support is fully implemented and tested: CRUD operations, versioning (vread), history providers (instance/type/system), optimistic locking, tenant isolation, native search (string, token, reference, date, number, URI parameters), conditional operations (create/update/delete), cursor and offset pagination, multi-field sorting, and transaction bundles with urn:uuid reference resolution on replica sets. MongoDB + Elasticsearch composite mode is fully implemented and tested with write-primary/read-primary/search-secondary routing. See [`docs/mongodb-manual-testing.md`](../../docs/mongodb-manual-testing.md) for comprehensive test results. | Feature | SQLite | PostgreSQL | MongoDB | Cassandra | Neo4j | Elasticsearch | S3 | | --------------------------------------------------------------------------- | ------ | ---------- | ------- | --------- | ----- | ------------- | --- | @@ -313,11 +313,11 @@ The matrix below shows which FHIR operations each backend supports. This reflect | [CRUD](https://build.fhir.org/http.html#crud) | ✓ | ✓ | ✓ | ○ | ○ | ✓ | ○ | | [Versioning (vread)](https://build.fhir.org/http.html#vread) | ✓ | ✓ | ✓ | ○ | ○ | ○ | ○ | | [Optimistic Locking](https://build.fhir.org/http.html#concurrency) | ✓ | ✓ | ✓ | ○ | ○ | ✗ | ✗ | -| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | -| [Type History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | -| [System History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | -| [Batch Bundles](https://build.fhir.org/http.html#batch) | ✓ | ✓ | ○ | ○ | ○ | ○ | ○ | -| [Transaction Bundles](https://build.fhir.org/http.html#transaction) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [Instance History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [Type History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [System History](https://build.fhir.org/http.html#history) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | +| [Batch Bundles](https://build.fhir.org/http.html#batch) | ✓ | ✓ | ✓ | ○ | ○ | ○ | ○ | +| [Transaction Bundles](https://build.fhir.org/http.html#transaction) | ✓ | ✓ | ✓ | ✗ | ○ | ✗ | ✗ | | [Conditional Operations](https://build.fhir.org/http.html#cond-update) | ✓ | ✓ | ✓ | ✗ | ○ | ○ | ✗ | | [Conditional Patch](https://build.fhir.org/http.html#patch) | ✓ | ✓ | ○ | ✗ | ○ | ○ | ✗ | | [Delete History](https://build.fhir.org/http.html#delete) | ✓ | ✓ | ○ | ✗ | ○ | ✗ | ✗ | @@ -519,15 +519,18 @@ HFS_ELASTICSEARCH_NODES=http://localhost:9200 \ ### MongoDB -MongoDB provides document-centric primary storage with tenant-aware CRUD, version/history support, optimistic locking, the Phase 4 supported native search surface, and conditional create/update/delete. +MongoDB provides document-centric primary storage with full FHIR capabilities including CRUD, versioning, history, search, and transactions. - Full CRUD operations with document-native resource storage - Versioning and history providers (`vread`, instance/type/system history) -- Conditional create, update, and delete for the implemented search surface -- Offset and cursor pagination plus single- and multi-field sorting +- Transaction bundles with urn:uuid reference resolution (requires replica set) +- Native search (string, token, reference, date, number, URI parameters) +- Conditional create, update, and delete operations +- Cursor and offset pagination with multi-field sorting - Shared-schema multitenancy with strict tenant filtering +- Optimistic locking with ETag support -**Prerequisites:** A running MongoDB instance (standalone for basic deployments, replica set/sharded topology if you want Mongo transactions where topology permits). +**Prerequisites:** A running MongoDB instance. Use standalone for basic deployments or replica set/sharded topology for transaction bundle support. ```bash # Build with MongoDB support @@ -551,6 +554,9 @@ MongoDB runtime configuration also supports: - `HFS_MONGODB_MAX_CONNECTIONS` to control the driver pool size (default: `10`) - `HFS_MONGODB_CONNECT_TIMEOUT_MS` to control the connection timeout (default: `5000`) +For a step-by-step API verification checklist, see +[`docs/mongodb-manual-testing.md`](../../docs/mongodb-manual-testing.md). + ### MongoDB + Elasticsearch MongoDB remains the canonical write/read store while Elasticsearch owns delegated search execution. This mode mirrors the existing SQLite + Elasticsearch and PostgreSQL + Elasticsearch composite patterns. @@ -822,6 +828,7 @@ The SQLite backend includes a complete FHIR search implementation using pre-comp - [x] MongoDB Phase 4 native search, pagination/sorting, and conditional create/update/delete - [x] MongoDB Phase 5 composite MongoDB + Elasticsearch integration and runtime wiring - [x] MongoDB Phase 6 runtime wiring verification, documentation sync, and release-readiness validation +- [x] MongoDB manual testing completed: all core features verified working (see docs/mongodb-manual-testing.md) - [ ] Neo4j backend (graph queries, Cypher) - [ ] S3 backend (bulk export, object storage) diff --git a/docs/mongodb-manual-testing.md b/docs/mongodb-manual-testing.md new file mode 100644 index 00000000..af53f458 --- /dev/null +++ b/docs/mongodb-manual-testing.md @@ -0,0 +1,331 @@ +# MongoDB Backend Manual Testing + +This guide provides a practical checklist to manually validate the new MongoDB backend in HFS, including standalone `mongodb` mode and optional `mongodb-elasticsearch` mode. + +## Prerequisites + +- Docker running locally +- Rust toolchain installed (`cargo`) +- `curl` +- Optional but helpful: `jq` + +## 1) Build HFS for MongoDB + +From repo root: + +```bash +cargo build -p helios-hfs --features mongodb +``` + +If you also want to test `mongodb-elasticsearch` mode: + +```bash +cargo build -p helios-hfs --features mongodb,elasticsearch +``` + +## 2) Start MongoDB (standalone) + +```bash +docker rm -f hfs-mongo-manual >/dev/null 2>&1 || true +docker run -d --name hfs-mongo-manual -p 27017:27017 mongo:8.0 +``` + +## 3) Start HFS in `mongodb` mode + +Use a separate terminal and keep the process running. + +```bash +export HFS_STORAGE_BACKEND=mongodb +export HFS_DATABASE_URL="mongodb://localhost:27017" +export HFS_MONGODB_DATABASE="helios_manual" +export HFS_SERVER_HOST="127.0.0.1" +export HFS_SERVER_PORT="8080" + +BIN="./target/debug/hfs" +[ -f "./target/debug/hfs.exe" ] && BIN="./target/debug/hfs.exe" + +"$BIN" +``` + +## 4) Health and metadata smoke checks + +In another terminal: + +```bash +export BASE_URL="http://127.0.0.1:8080" +export TENANT="default" + +curl -s "$BASE_URL/health" +curl -s "$BASE_URL/metadata" +``` + +Expected: health status is OK and CapabilityStatement is returned from `/metadata`. + +## 5) CRUD + version/history checks + +### Create + +```bash +cat > patient-v1.json <<'JSON' +{ + "resourceType": "Patient", + "id": "mongo-manual-1", + "identifier": [ + { + "system": "http://example.org/mrn", + "value": "MONGO-001" + } + ], + "name": [ + { + "family": "Manual", + "given": ["Mongo"] + } + ], + "active": true +} +JSON + +curl -i -X PUT "$BASE_URL/Patient/mongo-manual-1" \ + -H "Content-Type: application/fhir+json" \ + -H "X-Tenant-ID: $TENANT" \ + --data-binary @patient-v1.json +``` + +### Read + +```bash +curl -s "$BASE_URL/Patient/mongo-manual-1" -H "X-Tenant-ID: $TENANT" +``` + +### Update (new version) + +```bash +cat > patient-v2.json <<'JSON' +{ + "resourceType": "Patient", + "id": "mongo-manual-1", + "identifier": [ + { + "system": "http://example.org/mrn", + "value": "MONGO-001" + } + ], + "name": [ + { + "family": "Manual", + "given": ["Mongo", "Updated"] + } + ], + "active": false +} +JSON + +curl -i -X PUT "$BASE_URL/Patient/mongo-manual-1" \ + -H "Content-Type: application/fhir+json" \ + -H "X-Tenant-ID: $TENANT" \ + --data-binary @patient-v2.json +``` + +### History + vread + +```bash +curl -s "$BASE_URL/Patient/mongo-manual-1/_history" -H "X-Tenant-ID: $TENANT" +curl -s "$BASE_URL/Patient/mongo-manual-1/_history/1" -H "X-Tenant-ID: $TENANT" +``` + +Expected: history bundle contains multiple versions and vread for version `1` returns the initial version. + +### Delete + +```bash +curl -i -X DELETE "$BASE_URL/Patient/mongo-manual-1" -H "X-Tenant-ID: $TENANT" +curl -i "$BASE_URL/Patient/mongo-manual-1" -H "X-Tenant-ID: $TENANT" +``` + +Expected: delete succeeds; read after delete returns `410 Gone` or `404 Not Found` depending server settings. + +## 6) Search, sort, and pagination checks + +Create two patients for search: + +```bash +cat > patient-a.json <<'JSON' +{ + "resourceType": "Patient", + "id": "mongo-search-a", + "identifier": [{ "system": "http://example.org/mrn", "value": "MONGO-010" }], + "name": [{ "family": "Search", "given": ["Alpha"] }] +} +JSON + +cat > patient-b.json <<'JSON' +{ + "resourceType": "Patient", + "id": "mongo-search-b", + "identifier": [{ "system": "http://example.org/mrn", "value": "MONGO-011" }], + "name": [{ "family": "Search", "given": ["Beta"] }] +} +JSON + +curl -s -X PUT "$BASE_URL/Patient/mongo-search-a" -H "Content-Type: application/fhir+json" -H "X-Tenant-ID: $TENANT" --data-binary @patient-a.json +curl -s -X PUT "$BASE_URL/Patient/mongo-search-b" -H "Content-Type: application/fhir+json" -H "X-Tenant-ID: $TENANT" --data-binary @patient-b.json + +curl -s "$BASE_URL/Patient?name=Search&_count=1&_sort=-_lastUpdated" -H "X-Tenant-ID: $TENANT" +``` + +Expected: search returns matching entries and supports `_count` + `_sort` behavior. + +## 7) Conditional operation checks + +### Conditional create (`If-None-Exist`) + +```bash +cat > patient-cond-create.json <<'JSON' +{ + "resourceType": "Patient", + "identifier": [{ "system": "http://example.org/mrn", "value": "MONGO-020" }], + "name": [{ "family": "Conditional", "given": ["Create"] }] +} +JSON + +curl -i -X POST "$BASE_URL/Patient" \ + -H "Content-Type: application/fhir+json" \ + -H "If-None-Exist: identifier=http://example.org/mrn|MONGO-020" \ + -H "X-Tenant-ID: $TENANT" \ + --data-binary @patient-cond-create.json + +curl -i -X POST "$BASE_URL/Patient" \ + -H "Content-Type: application/fhir+json" \ + -H "If-None-Exist: identifier=http://example.org/mrn|MONGO-020" \ + -H "X-Tenant-ID: $TENANT" \ + --data-binary @patient-cond-create.json + +curl -s "$BASE_URL/Patient?identifier=http://example.org/mrn|MONGO-020" -H "X-Tenant-ID: $TENANT" +``` + +Expected: second create does not produce a duplicate match. + +### Conditional update + +```bash +cat > patient-cond-update.json <<'JSON' +{ + "resourceType": "Patient", + "identifier": [{ "system": "http://example.org/mrn", "value": "MONGO-021" }], + "name": [{ "family": "Conditional", "given": ["Update"] }] +} +JSON + +curl -i -X PUT "$BASE_URL/Patient?identifier=http://example.org/mrn|MONGO-021" \ + -H "Content-Type: application/fhir+json" \ + -H "X-Tenant-ID: $TENANT" \ + --data-binary @patient-cond-update.json + +curl -s "$BASE_URL/Patient?identifier=http://example.org/mrn|MONGO-021" -H "X-Tenant-ID: $TENANT" +``` + +Expected: one matching resource is created/updated according to conditional update semantics. + +### Conditional delete + +```bash +curl -i -X DELETE "$BASE_URL/Patient?identifier=http://example.org/mrn|MONGO-021" -H "X-Tenant-ID: $TENANT" +curl -s "$BASE_URL/Patient?identifier=http://example.org/mrn|MONGO-021" -H "X-Tenant-ID: $TENANT" +``` + +Expected: conditional delete succeeds and the search no longer returns an active match. + +## 8) Optional: transaction-bundle check on replica set + +Use this only if you want to manually validate transaction behavior on a transaction-capable Mongo topology. + +### Start MongoDB replica set + +```bash +docker rm -f hfs-mongo-rs >/dev/null 2>&1 || true +docker run -d --name hfs-mongo-rs -p 27017:27017 mongo:8.0 --replSet rs0 --bind_ip_all + +docker exec hfs-mongo-rs mongosh --quiet --eval 'try { rs.status().ok } catch (e) { rs.initiate({_id:"rs0",members:[{_id:0,host:"localhost:27017"}]}) }' +``` + +Restart HFS with: + +```bash +export HFS_STORAGE_BACKEND=mongodb +export HFS_DATABASE_URL="mongodb://localhost:27017/?replicaSet=rs0&directConnection=true" +export HFS_MONGODB_DATABASE="helios_manual_rs" +``` + +### Submit a transaction bundle with `urn:uuid` reference + +```bash +cat > txn-bundle.json <<'JSON' +{ + "resourceType": "Bundle", + "type": "transaction", + "entry": [ + { + "fullUrl": "urn:uuid:pat-1", + "resource": { + "resourceType": "Patient", + "identifier": [{ "system": "http://example.org/mrn", "value": "MONGO-TXN-001" }], + "name": [{ "family": "Txn", "given": ["Patient"] }] + }, + "request": { "method": "POST", "url": "Patient" } + }, + { + "resource": { + "resourceType": "Observation", + "status": "final", + "code": { "text": "manual txn observation" }, + "subject": { "reference": "urn:uuid:pat-1" } + }, + "request": { "method": "POST", "url": "Observation" } + } + ] +} +JSON + +curl -i -X POST "$BASE_URL" \ + -H "Content-Type: application/fhir+json" \ + -H "X-Tenant-ID: $TENANT" \ + --data-binary @txn-bundle.json +``` + +Expected: transaction response bundle succeeds for both entries. + +## 9) Optional: `mongodb-elasticsearch` mode check + +### Start Elasticsearch + +```bash +docker rm -f hfs-es-manual >/dev/null 2>&1 || true +docker run -d --name hfs-es-manual -p 9200:9200 \ + -e "discovery.type=single-node" \ + -e "xpack.security.enabled=false" \ + elasticsearch:8.15.0 +``` + +### Start HFS in composite mode + +```bash +export HFS_STORAGE_BACKEND=mongodb-elasticsearch +export HFS_DATABASE_URL="mongodb://localhost:27017" +export HFS_MONGODB_DATABASE="helios_manual_composite" +export HFS_ELASTICSEARCH_NODES="http://localhost:9200" + +BIN="./target/debug/hfs" +[ -f "./target/debug/hfs.exe" ] && BIN="./target/debug/hfs.exe" +"$BIN" +``` + +Create/search a patient as above and verify search responses are returned in composite mode. + +## 10) Cleanup + +```bash +docker rm -f hfs-mongo-manual hfs-mongo-rs hfs-es-manual >/dev/null 2>&1 || true +rm -f patient-v1.json patient-v2.json patient-a.json patient-b.json \ + patient-cond-create.json patient-cond-update.json txn-bundle.json +```