Feature: Add Support for PatchExtractor in Sklearn Serializer #49

@Gnpd

Summary

The OpenModels sklearn serializer does not currently support the PatchExtractor transformer from scikit-learn. Attempting to serialize or deserialize a PatchExtractor instance fails with ValueError: not enough values to unpack (expected 3, got 2). The likely cause is how patch_size or related attributes are handled internally, although this has not been confirmed.

Motivation

  • Completeness: PatchExtractor is a useful transformer for image data and should be supported like other scikit-learn estimators.
  • User Experience: Users expect all standard scikit-learn transformers to be serializable/deserializable without errors.
  • Consistency: The serializer should handle special cases or internal attribute structures that differ from typical estimators.

Error Details

  • Error:
    ValueError: not enough values to unpack (expected 3, got 2)
  • Context:
    This error occurs when serializing or deserializing a PatchExtractor instance, likely due to the structure of the patch_size or related attributes.
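One candidate cause worth ruling out first, offered as an assumption and not something confirmed in this issue: PatchExtractor.transform expects image-shaped input of shape (n_samples, image_height, image_width), optionally with a trailing channel axis. If the serializer's test harness passes an ordinary 2-D feature matrix to transform, an error with the same message can be triggered without patch_size being involved at all. A minimal sketch using only scikit-learn and NumPy (no OpenModels code):

```python
import numpy as np
from sklearn.feature_extraction.image import PatchExtractor

extractor = PatchExtractor(patch_size=(2, 2), max_patches=3, random_state=0)

# Image-shaped input: 5 images of 8x8 pixels. This is the supported case.
images = np.zeros((5, 8, 8))
patches = extractor.transform(images)
print(patches.shape)  # (15, 2, 2): 5 images x 3 patches each

# Tabular 2-D input, as a generic estimator test harness might generate.
tabular = np.zeros((5, 8))
message = None
try:
    extractor.transform(tabular)
except ValueError as exc:
    message = str(exc)
print(message)
```

If the printed message matches the one reported above, the fix may belong in how test data is generated for PatchExtractor and not only in attribute handling.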

Suggested Tasks

  • Investigate the internal structure of PatchExtractor, especially how patch_size and related attributes are stored and restored.
  • Update the serializer to correctly handle the serialization and deserialization of PatchExtractor parameters and fitted attributes.
  • Add tests to ensure that PatchExtractor can be round-tripped (serialized and deserialized) without errors.
  • Remove "PatchExtractor" from the NOT_SUPPORTED_ESTIMATORS list in sklearn_serializer.py once support is complete.
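When investigating how patch_size is stored and restored, one thing to check is whether the tuple survives the storage format. The sketch below assumes a JSON-like format and uses hypothetical helpers (encode_value, decode_value, and the "__tuple__" tag are illustrative names, not part of OpenModels). It shows that a plain JSON round trip turns a tuple into a list, and one way to tag tuples so they can be restored:

```python
import json


def encode_value(value):
    """Recursively tag tuples so they can be told apart from lists."""
    if isinstance(value, tuple):
        return {"__tuple__": [encode_value(v) for v in value]}
    if isinstance(value, list):
        return [encode_value(v) for v in value]
    if isinstance(value, dict):
        return {k: encode_value(v) for k, v in value.items()}
    return value


def decode_value(value):
    """Reverse encode_value, rebuilding tagged tuples."""
    if isinstance(value, dict):
        if set(value) == {"__tuple__"}:
            return tuple(decode_value(v) for v in value["__tuple__"])
        return {k: decode_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode_value(v) for v in value]
    return value


# Parameters shaped like PatchExtractor's constructor arguments.
params = {"patch_size": (2, 2), "max_patches": 3, "random_state": 0}

# Naive round trip: the tuple comes back as a list.
naive = json.loads(json.dumps(params))

# Tagged round trip: the tuple is preserved.
restored = decode_value(json.loads(json.dumps(encode_value(params))))
```

This matters because recent scikit-learn releases validate constructor parameters, and to my understanding patch_size is expected to be a tuple or None, so a restored list may be rejected. Whether the OpenModels serializer already handles this is not stated in the issue.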

Acceptance Criteria

  • PatchExtractor can be serialized and deserialized without errors.
  • All relevant parameters and fitted attributes are preserved.
  • Tests are added to cover typical usage of PatchExtractor.
  • "PatchExtractor" is no longer listed in NOT_SUPPORTED_ESTIMATORS.
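A round-trip test for these criteria might look like the sketch below. The round_trip function is a stand-in that goes through get_params and JSON; it is an assumption for illustration and should be replaced by the real OpenModels serialize/deserialize calls, which are not shown in this issue. Note that PatchExtractor is documented as stateless, so in practice the check is about constructor parameters and identical transform output.

```python
import json

import numpy as np
from sklearn.feature_extraction.image import PatchExtractor


def round_trip(estimator):
    """Stand-in for serializer round trip (hypothetical, not OpenModels)."""
    payload = json.dumps(estimator.get_params())
    params = json.loads(payload)
    # JSON turns the tuple into a list; restore it before rebuilding.
    if params["patch_size"] is not None:
        params["patch_size"] = tuple(params["patch_size"])
    return PatchExtractor(**params)


original = PatchExtractor(patch_size=(2, 2), max_patches=3, random_state=0)
restored = round_trip(original)

# Image-shaped input with distinct pixel values so patches are comparable.
images = np.arange(5 * 8 * 8, dtype=float).reshape(5, 8, 8)

# With a fixed integer random_state, both extractors should sample
# the same patch locations.
same_params = original.get_params() == restored.get_params()
same_output = np.array_equal(original.transform(images), restored.transform(images))
```

Using image-shaped test data here also keeps the test independent of any generic 2-D fixtures used for other estimators.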

Related file: openmodels/serializers/sklearn/sklearn_serializer.py

Metadata

  • Assignees: none
  • Labels: enhancement (New feature or request)
  • Status: Backlog
  • Milestone: none