
Feature/s3 ElasticSearch composite#47

Open
aacruzgon wants to merge 13 commits into main from feature/s3-elasticsearch-composite

Conversation

@aacruzgon
Contributor

feat(persistence): add S3 + ElasticSearch composite storage backend

Summary

  • Adds s3-elasticsearch (s3-es) as a new composite storage backend mode, wiring S3 as the primary store for CRUD/versioning/history and Elasticsearch as the dedicated search engine
  • Follows the same CompositeStorage pattern already used by sqlite-elasticsearch and postgres-elasticsearch
  • Fixes a bug in execute_parallel_search where UnsupportedCapability errors from the primary backend (S3 has no search) propagated and crashed full-text search queries; the fix promotes the auxiliary (ES) result to primary when the primary lacks search capability
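
The fallback described in the last bullet can be sketched as below. This is an illustrative stand-in, not the actual CompositeStorage code: the error enum, result alias, and `merge_parallel_search` name are all hypothetical.

```rust
// Hypothetical sketch of the parallel-search fallback; types and names
// are illustrative, not the crate's real API.
#[derive(Debug, PartialEq)]
enum SearchError {
    UnsupportedCapability,
    Backend(String),
}

type SearchResult = Result<Vec<String>, SearchError>;

/// Combine the primary and auxiliary search results. If the primary
/// backend reports it cannot search at all (e.g. S3), promote the
/// auxiliary (Elasticsearch) result instead of propagating the error.
/// Any other primary error is still surfaced.
fn merge_parallel_search(primary: SearchResult, auxiliary: SearchResult) -> SearchResult {
    match primary {
        Err(SearchError::UnsupportedCapability) => auxiliary,
        other => other,
    }
}

fn main() {
    // Primary (S3) has no search capability: the ES result is promoted.
    let merged = merge_parallel_search(
        Err(SearchError::UnsupportedCapability),
        Ok(vec!["Condition/42".to_string()]),
    );
    assert_eq!(merged, Ok(vec!["Condition/42".to_string()]));

    // A genuine primary failure is still propagated, not masked.
    let real_error = merge_parallel_search(
        Err(SearchError::Backend("timeout".to_string())),
        Ok(vec![]),
    );
    assert_eq!(real_error, Err(SearchError::Backend("timeout".to_string())));
}
```

The key design point is that only UnsupportedCapability triggers the promotion; real backend failures still reach the caller.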

Changes

  • crates/rest/src/config.rs — adds StorageBackendMode::S3Elasticsearch with Display, FromStr, and parse tests
  • crates/hfs/src/main.rs — adds start_s3_elasticsearch() startup function and build_search_registry() helper (S3 has no internal search registry, so ES builds one independently)
  • crates/persistence/src/composite/storage.rs — graceful handling of UnsupportedCapability in parallel search
  • crates/persistence/tests/s3_es_tests.rs — integration test file for the S3+ES composite backend using MinIO + Elasticsearch via testcontainers

Usage

cargo build -p helios-hfs --features s3,elasticsearch

HFS_STORAGE_BACKEND=s3-elasticsearch \
  HFS_S3_BUCKET=my-bucket \
  HFS_S3_REGION=us-east-1 \
  HFS_ELASTICSEARCH_NODES=http://localhost:9200 \
  ./hfs

Test Plan

  • cargo build -p helios-hfs --features s3,elasticsearch passes
  • cargo test -p helios-persistence --features s3,elasticsearch passes
  • cargo test -p helios-rest passes (config parse tests include S3Elasticsearch)
  • Live tested against a real AWS S3 bucket (my-buckets-name, us-east-1) + Elasticsearch, with a 473-entry Synthea FHIR transaction bundle ingested end-to-end

Live Server Verification

Tested against HFS_STORAGE_BACKEND=s3-elasticsearch with a 473-resource Synthea bundle:

| Test | Command | Result |
| --- | --- | --- |
| Health | GET /health | ok, backend: composite |
| String search | GET /Patient?family=Ritchie586 | 1 result |
| Token search | GET /Observation?code=72166-2 | 20 results |
| Full-text search | GET /Condition?_content=sinusitis | 6 results (Viral sinusitis) |
| Full-text search | GET /MedicationRequest?_content=Albuterol | 10 results |
| Date range | GET /Patient?birthdate=lt1970-01-01 | 1 result |
| Pagination | GET /Observation?_count=5 | 5 results |
| Direct S3 read | GET /Patient/355 | Resource returned from S3 |
| Multi-param | GET /Observation?code=72166-2&_count=5 | 5 results |
| _text (no narrative) | GET /Condition?_text=sinusitis | 0 results, no error |

smunini and others added 13 commits March 5, 2026 20:51
Add `s3-elasticsearch` as a new composite storage mode that routes all
CRUD, versioning, history, and bulk operations to S3 while offloading
all search queries to Elasticsearch.

Key differences from SQLite/PostgreSQL+ES composites:
- Uses `ElasticsearchBackend::new()` (standalone registry) instead of
  `with_shared_registry()`, since S3 has no search parameter registry
- No `set_search_offloaded()` call needed; S3's stub SearchProvider
  already returns UnsupportedCapability for all search operations
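
The registry split described above can be illustrated with a small dispatch sketch. The enum mirrors the config modes named in this PR, but `registry_strategy` is a hypothetical stand-in, not the crate's real wiring code.

```rust
// Illustrative only: which search-parameter registry the ES backend uses
// per composite mode. SQL primaries share their registry with ES
// (with_shared_registry()); S3 has none, so ES builds a standalone one
// (ElasticsearchBackend::new()).
#[derive(Debug)]
enum StorageBackendMode {
    SqliteElasticsearch,
    PostgresElasticsearch,
    S3Elasticsearch,
}

fn registry_strategy(mode: &StorageBackendMode) -> &'static str {
    match mode {
        StorageBackendMode::SqliteElasticsearch
        | StorageBackendMode::PostgresElasticsearch => "shared",
        StorageBackendMode::S3Elasticsearch => "standalone",
    }
}

fn main() {
    assert_eq!(registry_strategy(&StorageBackendMode::S3Elasticsearch), "standalone");
    assert_eq!(registry_strategy(&StorageBackendMode::SqliteElasticsearch), "shared");
    assert_eq!(registry_strategy(&StorageBackendMode::PostgresElasticsearch), "shared");
}
```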

Changes:
- rest/config.rs: add `S3Elasticsearch` variant with aliases `s3-elasticsearch`
  and `s3-es`; update error message and arg doc; add parse/display tests
- hfs/main.rs: add match arm and `start_s3_elasticsearch()` with cfg
  feature guards for `s3` + `elasticsearch`; update module docs
- CLAUDE.md: add s3-elasticsearch row to storage backends table
- README.md: add S3+ES to configurations table, running example, env var
- persistence/README.md: mark S3+Elasticsearch as implemented in both
  role matrix tables; update search offloading paragraph
S3 alone lacks search capability and cannot pass the full Inferno US Core
test suite. Replace it with s3-elasticsearch so CRUD goes to S3 while all
FHIR search queries are handled by Elasticsearch.

- Matrix: s3 → s3-elasticsearch
- Start Elasticsearch: condition now covers both sqlite-elasticsearch and s3-elasticsearch
- Start HFS: replace s3 branch with s3-elasticsearch (adds HFS_ELASTICSEARCH_NODES)
- Skip condition: matrix.backend != 's3-elasticsearch' when HFS_S3_BUCKET unset
…ite prefix

- Add HFS_S3_PREFIX env var support to start_s3 and start_s3_elasticsearch
  so callers can scope all S3 keys under an optional global prefix
- Update inferno.yml: pass HFS_S3_PREFIX=ci/<suite_id>/ per matrix job so
  parallel s3-elasticsearch jobs are fully isolated within the shared bucket
- Empty only the job-scoped S3 prefix (not the whole bucket) before each run
- Add AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY to the s3-elasticsearch HFS
  start command
- Guard all post-HFS steps with the s3-elasticsearch/HFS_S3_BUCKET condition
  so jobs are skipped cleanly when the bucket secret is not configured
- Document HFS_S3_PREFIX in README.md
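
The prefix-scoping idea above can be sketched with a tiny helper. The key layout here is hypothetical; HFS's real S3 key scheme may differ.

```rust
// Hypothetical key-layout helper showing how a global HFS_S3_PREFIX
// could scope every object key; not HFS's actual scheme.
fn scoped_key(prefix: &str, resource_type: &str, id: &str) -> String {
    format!("{prefix}{resource_type}/{id}.json")
}

fn main() {
    // A per-job CI prefix isolates parallel runs within one shared bucket.
    assert_eq!(
        scoped_key("ci/us-core-v4/", "Patient", "355"),
        "ci/us-core-v4/Patient/355.json"
    );
    // With no prefix, keys sit at the bucket root.
    assert_eq!(scoped_key("", "Patient", "355"), "Patient/355.json");
}
```

Because every read, write, and delete goes through the same prefixed key, emptying only the job-scoped prefix (rather than the whole bucket) is safe.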
… [skip ci]

Document the S3+ES composite storage configuration including env vars,
build/run commands, S3-compatible endpoint setup, key differences from
SQLite/PG+ES, and programmatic composite assembly example. Also update
the search offloading section to accurately reflect S3 behavior.
Switch S3Backend::new() to S3Backend::from_env_async() for proper async
SDK config loading, and include full error details in S3 client error
messages instead of dropping them when the message field is empty.
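
The "don't drop error details" behavior can be sketched as a fallback like the one below. The function and its string-based signature are illustrative; the real code works with AWS SDK error types.

```rust
// Illustrative fallback, assuming a service message that may be empty
// and a full debug rendering of the underlying SDK error.
fn s3_error_message(service_message: Option<&str>, full_debug: &str) -> String {
    match service_message {
        // Non-empty service message: use it directly.
        Some(msg) if !msg.is_empty() => msg.to_string(),
        // Message field empty or absent: keep the full error details
        // instead of silently producing an empty message.
        _ => format!("S3 error (no message): {full_debug}"),
    }
}

fn main() {
    assert_eq!(s3_error_message(Some("Access Denied"), "..."), "Access Denied");
    assert!(s3_error_message(None, "DispatchFailure { .. }").contains("DispatchFailure"));
    assert!(s3_error_message(Some(""), "Timeout").contains("Timeout"));
}
```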
@codecov

codecov bot commented Mar 7, 2026

Codecov Report

❌ Patch coverage is 28.94737% with 27 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| crates/persistence/src/composite/storage.rs | 0.00% | 21 Missing ⚠️ |
| crates/hfs/src/main.rs | 0.00% | 6 Missing ⚠️ |

