feat: Support for Multiple Calibrations in VA-Spec Model Exports #658

Open
bencap wants to merge 4 commits into release-2026.1.1 from feature/bencap/494/va-models-for-multiple-calibrations

bencap (Collaborator) commented Feb 13, 2026

This pull request introduces significant enhancements and refactoring to the variant annotation logic, especially around the handling of functional and pathogenicity statements, calibration selection, and evidence strength mapping. The changes improve support for multiple calibrations, provide more accurate and VA-Spec-compliant evidence strength reporting, and streamline the annotation flow for both functional and pathogenicity contexts.

Key improvements include:

Annotation API and Logic Refactoring:

  • Refactored variant_functional_impact_statement and replaced variant_pathogenicity_evidence with variant_pathogenicity_statement, now supporting multiple calibrations and returning richer, VA-Spec-compliant objects. Calibration selection logic now prioritizes the strongest available calibration, with optional inclusion of research-use-only calibrations. (src/mavedb/lib/annotation/annotate.py, L34-R134)
  • Updated the annotation flow to build evidence lines and statements for all eligible calibrations, and to propagate the strongest calibration and classification throughout the annotation process. (src/mavedb/lib/annotation/annotate.py, L34-R134)
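The selection rule described above can be illustrated with a minimal sketch. It is not the code in annotate.py: the `Strength` ordering, the `Calibration` dataclass, and both function names are hypothetical stand-ins, and the conflict handling mentioned in the commit message is not modelled here.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional, Sequence


class Strength(IntEnum):
    """Hypothetical ordering of evidence strengths, weakest to strongest."""

    SUPPORTING = 1
    MODERATE = 2
    STRONG = 3
    VERY_STRONG = 4


@dataclass(frozen=True)
class Calibration:
    """Hypothetical stand-in for a score calibration record."""

    urn: str
    strongest_evidence: Strength
    research_use_only: bool = False


def eligible_calibrations(
    calibrations: Sequence[Calibration], allow_research_use_only: bool = False
) -> List[Calibration]:
    # Research-use-only calibrations are excluded unless the caller opts in.
    return [
        c for c in calibrations if allow_research_use_only or not c.research_use_only
    ]


def select_strongest(
    calibrations: Sequence[Calibration], allow_research_use_only: bool = False
) -> Optional[Calibration]:
    candidates = eligible_calibrations(calibrations, allow_research_use_only)
    if not candidates:
        return None
    # max() keeps the first candidate on ties, so input order breaks ties.
    return max(candidates, key=lambda c: c.strongest_evidence)
```

With this shape, evidence lines can still be built from every item returned by `eligible_calibrations`, while only the result of `select_strongest` is propagated as the headline classification.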

Evidence Strength and Classification Handling:

  • Enhanced the classification functions to accept a specific ScoreCalibration and return both the matched functional classification and the VA-Spec classification, enabling precise mapping between internal and external evidence strength levels. ([1], [2])
  • Implemented a mapping from the internal MODERATE_PLUS evidence strength to the VA-Spec MODERATE level, ensuring external compatibility while preserving internal granularity. (src/mavedb/lib/annotation/classification.py, L90-R129)
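A rough sketch of the two points above, assuming a simplified data model: the enums, the `FunctionalRange` dataclass, and the `classify` and `to_va_spec_strength` names are hypothetical and do not reproduce classification.py. It shows a classifier that returns both the matched internal range and the externally reportable strength, with MODERATE_PLUS collapsed to MODERATE only at the boundary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence, Tuple


class InternalStrength(str, Enum):
    """Hypothetical internal strength levels, including MODERATE_PLUS."""

    SUPPORTING = "supporting"
    MODERATE = "moderate"
    MODERATE_PLUS = "moderate_plus"
    STRONG = "strong"


class VaSpecStrength(str, Enum):
    """Hypothetical external levels; there is no MODERATE_PLUS here."""

    SUPPORTING = "supporting"
    MODERATE = "moderate"
    STRONG = "strong"


@dataclass(frozen=True)
class FunctionalRange:
    """Hypothetical half-open score range [lower, upper) with a strength."""

    label: str
    lower: float
    upper: float
    strength: InternalStrength


def to_va_spec_strength(strength: InternalStrength) -> VaSpecStrength:
    # MODERATE_PLUS has no external counterpart, so report it as MODERATE.
    if strength is InternalStrength.MODERATE_PLUS:
        return VaSpecStrength.MODERATE
    return VaSpecStrength[strength.name]


def classify(
    score: float, ranges: Sequence[FunctionalRange]
) -> Tuple[Optional[FunctionalRange], Optional[VaSpecStrength]]:
    # Return the matched internal range alongside the external strength so
    # callers keep internal granularity while exporting a compatible value.
    for r in ranges:
        if r.lower <= score < r.upper:
            return r, to_va_spec_strength(r.strength)
    return None, None
```

Returning the internal range unchanged is what preserves granularity: the downgrade is applied to the exported value only, never to the stored one.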

Type and Model Updates:

  • Updated the GA4GH VA-Spec stubs to introduce VariantPathogenicityStatement and AcmgClassification, and refactored VariantPathogenicityEvidenceLine to inherit from EvidenceLine for improved type consistency. ([1], [2])

Validation and Error Handling:

Testing and Documentation:

  • Introduced Pytest markers for unit and integration tests to better organize and clarify test coverage. (pyproject.toml, R108-R111)
  • Updated documentation and import statements to reflect the new annotation flow and VA-Spec structure. ([1], [2])

These changes collectively modernize the annotation pipeline, improve standards compliance, and set the stage for more flexible and accurate variant annotation workflows.

…ve test infrastructure

This commit introduces major enhancements to the annotation system, including support
for multiple score calibrations, improved test infrastructure, and alignment with
VA-Spec standards.

- Refactor annotation system to support multiple score calibrations per score set
- Add calibration selection logic based on evidence strength and classification conflicts
- Implement `select_strongest_functional_calibration()` and `select_strongest_pathogenicity_calibration()`
- Update classification functions to accept explicit score_calibration parameter
- Add `score_calibration_may_be_used_for_annotation()` utility for eligibility checks
- Support both research-use-only and production calibrations with opt-in flag

- Add `src/mavedb/lib/annotation/direction.py` for evidence direction determination
- Implement `aggregate_direction_of_evidence()` for combining evidence lines
- Add `direction_of_support_for_functional_classification()` mapping
- Add `direction_of_support_for_pathogenicity_classification()` mapping
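One plausible way to combine directions across evidence lines is sketched below. The rule used here (unanimous non-neutral directions win, any conflict or absence of evidence yields neutral) is an assumption for illustration, not the behaviour of `aggregate_direction_of_evidence()`, and the `Direction` enum and function name are hypothetical.

```python
from enum import Enum
from typing import Iterable


class Direction(str, Enum):
    """Hypothetical direction values for an evidence line."""

    SUPPORTS = "supports"
    DISPUTES = "disputes"
    NEUTRAL = "neutral"


def aggregate_direction_sketch(directions: Iterable[Direction]) -> Direction:
    # Ignore neutral lines, then check whether the remainder agree.
    non_neutral = {d for d in directions if d is not Direction.NEUTRAL}
    if len(non_neutral) == 1:
        return next(iter(non_neutral))
    # Either no informative evidence or a supports/disputes conflict.
    return Direction.NEUTRAL
```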

- Create `tests/helpers/mocks/` package with comprehensive factory functions
- Add `mock_utilities.py` with MockObjectWithPydanticFunctionality and MockVariantCollection
- Add `factories.py` with 20+ factory functions for all MaveDB models
- Add documentation in `tests/helpers/mocks/README.md` for usage patterns

- Update statement/evidence line generation to use all eligible calibrations
- Refactor contribution modules to remove `excalibr_calibration_agent()`
- Add score calibration contributions with URN and metadata
- Update datetime handling to use native datetime objects instead of strings
- Add SPDX license support with `score_set_license_to_mappable_concept()`
- Implement MODERATE_PLUS to MODERATE mapping for VA-Spec compatibility

- Add `serialize_evidence_items()` for consistent evidence serialization
- Add `sequence_feature_for_mapped_variant()` for gene/transcript extraction
- Add `target_for_variant()` for multi-target score set handling
- Add `SequenceFeature` named tuple for structured feature representation

- Rename `/functional-impact` → `/functional-statement`
- Rename `/clinical-evidence` → `/pathogenicity-statement`
- Rename `/functional-study-result` → `/study-result`
- Update response models from EvidenceLine to VariantPathogenicityStatement
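For clients that call the old routes, a small path rewriter along these lines may ease migration. Only the three renamed suffixes come from the list above; the helper name and the example path prefix are hypothetical, since the full route paths are not shown here.

```python
# Old route suffix -> new route suffix, as listed in the renames above.
RENAMED_SUFFIXES = {
    "/functional-impact": "/functional-statement",
    "/clinical-evidence": "/pathogenicity-statement",
    "/functional-study-result": "/study-result",
}


def migrate_path(path: str) -> str:
    """Rewrite a path ending in a renamed suffix; leave other paths alone."""
    for old, new in RENAMED_SUFFIXES.items():
        if path.endswith(old):
            return path[: -len(old)] + new
    return path
```

Note that the response shape also changes for the pathogenicity route, so rewriting the path alone is not a complete migration.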

- Convert `ScoreCalibrationRelation` to str-based enum for JSON serialization
- Update classification functions to return tuple with range and classification
- Add gene context qualifier to pathogenicity propositions
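The reason a str-based enum helps with JSON serialization can be shown with the standard library alone. The member name `THRESHOLD` and both class names are hypothetical; the actual members of `ScoreCalibrationRelation` are not listed here.

```python
import json
from enum import Enum
from typing import Optional


class PlainRelation(Enum):
    """A plain Enum: json.dumps cannot serialize its members."""

    THRESHOLD = "threshold"


class StrRelation(str, Enum):
    """A str-based Enum: members are strings, so json.dumps accepts them."""

    THRESHOLD = "threshold"


def try_dump(value: Enum) -> Optional[str]:
    # Return the JSON text, or None when the value is not serializable.
    try:
        return json.dumps({"relation": value})
    except TypeError:
        return None
```

Because `StrRelation.THRESHOLD` is itself a `str`, it also compares equal to the raw string `"threshold"`, which keeps existing comparisons against stored values working.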

- Refactor all annotation tests with class-based structure using @pytest.mark.unit
- Add comprehensive module docstrings explaining test purpose and scope
- Add descriptive docstrings for all test methods
- Organize tests into logical groups (Unit/Integration)

- Add tests for direction.py: aggregation, functional, and pathogenicity mappings
- Add tests for constants.py: GENERIC_DISEASE_MEDGEN_CODE and MEDGEN_SYSTEM
- Add tests for exceptions.py: MappingDataDoesntExistException
- Add tests for contribution.py: creator/modifier tests with dates and resource types
- Add tests for classification.py: MODERATE_PLUS mapping validation
- Update conftest.py with annotation-specific fixtures

- Update `tests/lib/conftest.py` to use new factory functions
- Add pytest.mark.integration marker to pyproject.toml
- Create `tests/lib/annotation/conftest.py` with annotation fixtures

- Update `mypy_stubs/ga4gh/va_spec/acmg_2015.pyi` with proper inheritance
- Add VariantPathogenicityStatement and AcmgClassification classes
- Fix EvidenceLine inheritance hierarchy

- Add comprehensive inline documentation for MODERATE_PLUS mapping rationale
- Add performance TODOs for ORM relationship optimization
- Document calibration selection logic and conflict resolution strategies
- Add usage examples in mock infrastructure README

- Classification functions now require explicit score_calibration parameter
- Contribution functions now use native datetime objects instead of formatted strings
- API routes renamed to align with VA-Spec terminology

- Add TODO comments for ORM optimization in classification.py
- Document need to avoid eager loading of variant relationships
- Suggest pre-resolving classification IDs for O(1) lookups
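
The pre-resolution idea in the last item could look roughly like the following. This is a sketch of the suggested optimization, not existing code: the `Classification` dataclass, its `variant_ids` field, and `build_variant_index` are hypothetical, and it assumes each variant belongs to at most one classification within a calibration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Classification:
    """Hypothetical classification carrying only the IDs of its variants."""

    id: int
    label: str
    variant_ids: FrozenSet[int]


def build_variant_index(
    classifications: Iterable[Classification],
) -> Dict[int, Classification]:
    # Build the index once per calibration; each later lookup is a dict hit
    # rather than a scan over loaded variant relationships.
    index: Dict[int, Classification] = {}
    for classification in classifications:
        for variant_id in classification.variant_ids:
            # Assumption: no overlap. If ranges overlap, the last one wins.
            index[variant_id] = classification
    return index
```

Looking up a variant then becomes `index.get(variant_id)`, which returns `None` for unclassified variants without touching the ORM.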

bencap force-pushed the feature/bencap/494/va-models-for-multiple-calibrations branch from be01fd8 to 67e3aed, February 13, 2026 23:28
bencap force-pushed the feature/bencap/494/va-models-for-multiple-calibrations branch from 957d5cd to 6c94041, February 14, 2026 00:05
bencap marked this pull request as ready for review, February 14, 2026 00:13
bencap requested review from jstone-dev and sallybg, February 14, 2026 00:13

bencap (Collaborator, Author) commented Feb 14, 2026

Bumped Pandas to v2.2+ in these changes to avoid build failures from the removal of pkg_resources from setuptools.

Development

Successfully merging this pull request may close these issues.

Add support for multiple functional and clinical ranges in VA Spec model exports
