Open
Labels
enhancement (New feature or request)
Description
Feature: Optionally Include Training Data in SklearnSerializer Output
Summary:
Add support for optionally including training data (X, y) in the output of SklearnSerializer.serialize. This would allow users to pass the original training features and targets when serializing a model, resulting in a training_data key in the serialized dictionary.
Motivation:
- Enables reproducibility and easier model inspection by storing the data used for training alongside the model parameters and attributes.
- Facilitates downstream tasks such as model validation, auditing, and sharing, where access to the original training data is beneficial.
Proposed API Change:
- Update the `serialize` method to accept optional `X` and `y` parameters.
- If provided, include a `training_data` key in the output dictionary: `{ ...existing keys..., "training_data": { "X": <serialized X>, "y": <serialized y> } }`
- If not provided, omit the `training_data` key.
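The behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `add_training_data` and `_to_jsonable` are hypothetical names, not part of the library, and `_to_jsonable` stands in for the existing conversion utilities. In the real change the logic would presumably live inside `SklearnSerializer.serialize`. The proposal does not say what happens when only one of `X` or `y` is supplied; the sketch assumes only the supplied entries are stored.

```python
from typing import Any, Dict, Optional


def _to_jsonable(value: Any) -> Any:
    """Hypothetical stand-in for the project's existing conversion utilities."""
    # Objects such as NumPy arrays expose tolist(); use it when available.
    if hasattr(value, "tolist"):
        return value.tolist()
    # Convert nested lists/tuples element by element.
    if isinstance(value, (list, tuple)):
        return [_to_jsonable(item) for item in value]
    return value


def add_training_data(
    serialized: Dict[str, Any],
    X: Optional[Any] = None,
    y: Optional[Any] = None,
) -> Dict[str, Any]:
    """Return a copy of `serialized`, adding "training_data" only if X or y is given."""
    result = dict(serialized)  # shallow copy so the caller's dict is not mutated
    if X is None and y is None:
        # Backward compatible path: output is unchanged, no "training_data" key.
        return result

    training_data: Dict[str, Any] = {}
    if X is not None:
        training_data["X"] = _to_jsonable(X)
    if y is not None:
        training_data["y"] = _to_jsonable(y)  # assumption: y may be omitted
    result["training_data"] = training_data
    return result
```

Keeping `X` and `y` as keyword arguments defaulting to `None` means existing calls of the form `serialize(model)` would keep producing the same output.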
Example Usage:
serializer = SklearnSerializer()
serialized = serializer.serialize(model, X=X_train, y=y_train)
Notes:
- Training data should be serialized using the existing conversion utilities to ensure compatibility with JSON and other formats.
- This addition should be fully optional and backward compatible.
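To illustrate the backward-compatibility note from the consumer's side, here is a small sketch of code that reads a serialized model dictionary whether or not it contains `training_data`. The helper `read_training_data` and the `"params"` key are hypothetical and used only for illustration; the payloads go through `json` to show that the proposed structure survives a JSON round trip when the values are plain lists.

```python
import json
from typing import Any, Dict, Optional, Tuple


def read_training_data(payload: Dict[str, Any]) -> Tuple[Optional[Any], Optional[Any]]:
    """Return (X, y) from a serialized model dict, or (None, None) if absent."""
    # Older payloads have no "training_data" key; treat that as empty.
    training_data = payload.get("training_data") or {}
    return training_data.get("X"), training_data.get("y")


# Payload produced without X/y (current behaviour); "params" is a made-up key.
old_payload = json.loads('{"params": {"alpha": 1.0}}')

# Payload shaped like the proposed output, after a JSON round trip.
new_payload = json.loads(
    json.dumps(
        {
            "params": {"alpha": 1.0},
            "training_data": {"X": [[1.0, 2.0]], "y": [0]},
        }
    )
)

X_old, y_old = read_training_data(old_payload)
X_new, y_new = read_training_data(new_payload)
```

Readers written this way would not need to change when the key is absent, which is what keeping the feature optional is meant to guarantee.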
Projects
Status
Backlog