🚀 Use Pydantic for Problem Definition #308

Open
An0sis wants to merge 5 commits into main from pydantic_problem_definition

An0sis commented Feb 1, 2026

Thanks for contributing! Please make sure your PR title and content follow the guidelines.
Leave this checklist below in your PR description and tick the corresponding boxes.

Checklist

  • Typing enforced
  • Documentation updated
  • Changelog updated
  • Tests and Example updates
  • Coverage should be 100%

An0sis requested a review from a team as a code owner February 1, 2026 23:47
Copilot AI review requested due to automatic review settings February 1, 2026 23:47

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

An0sis linked an issue Feb 1, 2026 that may be closed by this pull request

Copilot AI left a comment

Pull request overview

Refactors ProblemDefinition to be a Pydantic model to improve parsing/validation, and updates documentation/changelog to reflect the new behavior.

Changes:

  • Converted ProblemDefinition to inherit from pydantic.BaseModel and added field definitions/validators.
  • Replaced the directory_path deprecation logger warning with a DeprecationWarning.
  • Updated docs and changelog to describe the Pydantic-based implementation.
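
As a rough illustration of the second change, a deprecated keyword can be reported through the standard `warnings` module instead of a logger. The sketch below is not the PR's code: `resolve_path` is a hypothetical helper, and raising `ValueError` when both arguments are set is an assumption. Only the `path` / `directory_path` argument names and the two messages come from the quoted diff.

```python
import warnings
from pathlib import Path
from typing import Optional, Union


def resolve_path(
    path: Optional[Union[str, Path]] = None,
    directory_path: Optional[Union[str, Path]] = None,
) -> Optional[Path]:
    """Hypothetical helper: fold the deprecated `directory_path` into `path`."""
    if directory_path is not None:
        if path is not None:
            # Assumption: the exception type is not visible in the quoted diff.
            raise ValueError(
                "Arguments `path` and `directory_path` cannot be both set. "
                "Use only `path` as `directory_path` is deprecated."
            )
        # stacklevel=2 attributes the warning to the caller, not this helper.
        warnings.warn(
            "`directory_path` is deprecated, use `path` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        path = directory_path
    return Path(path) if path is not None else None
```

Unlike a log record, a `DeprecationWarning` can be filtered by the caller, promoted to an error with `-W error::DeprecationWarning`, or asserted in tests.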

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/plaid/problem_definition.py | Migrates ProblemDefinition to Pydantic, adds model config/validators, and updates deprecated directory_path handling. |
| docs/source/core_concepts/problem_definition.md | Notes that ProblemDefinition is Pydantic-based and discusses validation behavior. |
| CHANGELOG.md | Records the Pydantic refactor and directory_path deprecation note. |

Comment on lines +125 to +127
"`directory_path` is deprecated, use `path` instead.",
DeprecationWarning,
)

Copilot AI Feb 1, 2026

The raise line is indented with an extra space, which violates the repo’s Ruff E111 indentation rule (indentation not a multiple of four). Align the indentation inside this if block.

Copilot uses AI. Check for mistakes.
Comment on lines +134 to +136
@field_validator("task")
@classmethod
def validate_task(cls, v: Optional[str]) -> Optional[str]:

Copilot AI Feb 1, 2026

Same indentation issue here: the raise statement has indentation that isn’t a multiple of four, which will be flagged by Ruff (E111).

Comment on lines +135 to +137
@classmethod
def validate_task(cls, v: Optional[str]) -> Optional[str]:
"""Validate that the task is among the authorized tasks."""

Copilot AI Feb 1, 2026

This error message refers to "authorized tasks" but this validator is for score_function. Update the message to mention authorized score functions (and ideally include AUTHORIZED_SCORE_FUNCTIONS).
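
A minimal sketch of what the requested fix could look like, assuming Pydantic v2. `ProblemSketch` is a stand-in for `ProblemDefinition`, `validate_score_function` is a hypothetical validator name, and the list contents are placeholders for the repository's real constants; only the names `validate_task`, `task`, `score_function`, and `AUTHORIZED_SCORE_FUNCTIONS` appear in this thread.

```python
from typing import Optional

from pydantic import BaseModel, field_validator

# Placeholder values: the real lists live in the repository.
AUTHORIZED_TASKS = ["regression", "classification"]
AUTHORIZED_SCORE_FUNCTIONS = ["RRMSE"]


class ProblemSketch(BaseModel):
    """Stand-in for ProblemDefinition with only the two validated fields."""

    task: Optional[str] = None
    score_function: Optional[str] = None

    @field_validator("task")
    @classmethod
    def validate_task(cls, v: Optional[str]) -> Optional[str]:
        """Validate that the task is among the authorized tasks."""
        if v is not None and v not in AUTHORIZED_TASKS:
            raise ValueError(f"{v!r} is not an authorized task: {AUTHORIZED_TASKS}")
        return v

    @field_validator("score_function")
    @classmethod
    def validate_score_function(cls, v: Optional[str]) -> Optional[str]:
        """Validate that the score function is among the authorized ones."""
        if v is not None and v not in AUTHORIZED_SCORE_FUNCTIONS:
            # The message names score functions, not tasks, and lists the options.
            raise ValueError(
                f"{v!r} is not an authorized score function: "
                f"{AUTHORIZED_SCORE_FUNCTIONS}"
            )
        return v
```

Keeping one validator per field, each with its own message, makes it harder for a copy-pasted error string to point users at the wrong setting.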

Comment on lines 122 to 133
"Arguments `path` and `directory_path` cannot be both set. Use only `path` as `directory_path` is deprecated."
)
else:
path = directory_path
logger.warning(
"DeprecationWarning: 'directory_path' is deprecated, use 'path' instead."
)
warnings.warn(
"`directory_path` is deprecated, use `path` instead.",
DeprecationWarning,
)
path = directory_path

if path is not None:
path = Path(path)
self._load_from_dir_(path)


Copilot AI Feb 1, 2026

New behavior: task and score_function can now be validated via Pydantic on model construction. There are existing tests for set_task/set_score_function, but none covering ProblemDefinition(task=...) / ProblemDefinition(score_function=...) success and failure cases. Add tests to lock in the new validation path.
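
The requested tests might look roughly like the following. This is a sketch under assumptions: `ProblemSketch` is a stand-in so the example runs on its own (the real suite would import `ProblemDefinition` instead), the authorized values are placeholders, and plain `try`/`except` is used so the functions run with or without pytest.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator

AUTHORIZED_TASKS = ["regression", "classification"]  # placeholder values


class ProblemSketch(BaseModel):
    """Stand-in for ProblemDefinition; swap in the real class in the test suite."""

    task: Optional[str] = None

    @field_validator("task")
    @classmethod
    def validate_task(cls, v: Optional[str]) -> Optional[str]:
        if v is not None and v not in AUTHORIZED_TASKS:
            raise ValueError(f"{v!r} is not an authorized task")
        return v


def test_constructor_accepts_authorized_task() -> None:
    assert ProblemSketch(task="regression").task == "regression"


def test_constructor_accepts_missing_task() -> None:
    assert ProblemSketch().task is None


def test_constructor_rejects_unknown_task() -> None:
    try:
        ProblemSketch(task="not_a_task")
    except ValidationError as err:
        # The error location should name the offending field.
        assert err.errors()[0]["loc"] == ("task",)
    else:
        raise AssertionError("expected a ValidationError")
```

The same three cases would be repeated for `score_function`, which should also help close the coverage gap on the new validators.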

- outputs: list of FeatureIdentifiers
- split: arbitrary named splits (train/val/test, etc.) stored as JSON

`ProblemDefinition` is built on top of [Pydantic](https://docs.pydantic.dev/), providing robust data validation. Authorized tasks and score functions are strictly enforced.

Copilot AI Feb 1, 2026

The docs claim tasks/score functions are "strictly enforced", but the current implementation can bypass Pydantic validation when values are assigned after initialization (e.g., when loading from disk) unless assignment validation is enabled. Either adjust the wording or ensure the implementation validates assignments during load.

Suggested change
`ProblemDefinition` is built on top of [Pydantic](https://docs.pydantic.dev/), providing robust data validation. Authorized tasks and score functions are strictly enforced.
`ProblemDefinition` is built on top of [Pydantic](https://docs.pydantic.dev/), providing robust data validation when defining and updating problems through its public API. Authorized tasks and score functions are validated accordingly.

Comment on lines +26 to 28
from typing import Any, Dict, List, Optional, Sequence, Union

import yaml

Copilot AI Feb 1, 2026

Ruff is configured with the Pyflakes (F401) and isort (I) rules (see ruff.toml). This import block is currently unsorted, and PrivateAttr/model_validator appear to be unused in this file, so linting will fail. Reorder the typing/pydantic imports and drop the unused names (or start using them).

class ProblemDefinition(BaseModel):
"""Gathers all necessary informations to define a learning problem."""

model_config = ConfigDict(arbitrary_types_allowed=True, extra="ignore")

Copilot AI Feb 1, 2026

ProblemDefinition is mutating validated fields during _load_from_dir_/_initialize_from_problem_infos_dict via property setters after super().__init__(). With the current model_config, assignments won’t trigger Pydantic validators, so invalid task/score_function values loaded from disk can slip through. Consider enabling validate_assignment=True (or refactor loading to use model_validate) to make validation effective beyond initial construction.

Suggested change
model_config = ConfigDict(arbitrary_types_allowed=True, extra="ignore")
model_config = ConfigDict(
arbitrary_types_allowed=True,
extra="ignore",
validate_assignment=True,
)
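
To make the difference concrete, here is a small self-contained sketch, assuming Pydantic v2. `Lenient`, `Strict`, and `assignment_is_checked` are hypothetical names, the authorized values are placeholders, and direct attribute assignment stands in for whatever the loading code does after `super().__init__()`.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError, field_validator

AUTHORIZED_TASKS = ["regression", "classification"]  # placeholder values


class _Base(BaseModel):
    """Shared field and validator for the two configurations below."""

    task: Optional[str] = None

    @field_validator("task")
    @classmethod
    def validate_task(cls, v: Optional[str]) -> Optional[str]:
        if v is not None and v not in AUTHORIZED_TASKS:
            raise ValueError(f"{v!r} is not an authorized task")
        return v


class Lenient(_Base):
    # Validators run at construction only; later assignments are not checked.
    model_config = ConfigDict(extra="ignore")


class Strict(_Base):
    # Validators also run on every attribute assignment.
    model_config = ConfigDict(extra="ignore", validate_assignment=True)


def assignment_is_checked(model: BaseModel) -> bool:
    """Return True if assigning an invalid task raises a ValidationError."""
    try:
        model.task = "not_a_task"
    except ValidationError:
        return True
    return False
```

With these definitions, `assignment_is_checked(Lenient())` returns `False` and `assignment_is_checked(Strict())` returns `True`. The alternative mentioned above, building the instance from the loaded data with `model_validate`, avoids post-construction mutation altogether.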

codecov bot commented Feb 1, 2026

Codecov Report

❌ Patch coverage is 94.00000% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/plaid/problem_definition.py | 94.00% | 6 Missing ⚠️ |

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

casenave (Member) commented Feb 4, 2026

Hi, thank you for your contribution :-) Can you sign the CLA and look into the coverage issue and the suggestions from Copilot?

Development

Successfully merging this pull request may close these issues.

Use pydantic to handle ProblemDefinition

3 participants