Multi-fidelity surrogate models #721
base: dev/mfbo
Changes from all commits
995cbf8
686fc86
18ec63e
d3d0c2f
585ad16
e4b28a7
eca58ef
88db97f
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -15,6 +15,10 @@ | |||||
| from baybe.constraints.base import Constraint | ||||||
| from baybe.parameters import TaskParameter | ||||||
| from baybe.parameters.base import Parameter | ||||||
| from baybe.parameters.fidelity import ( | ||||||
| CategoricalFidelityParameter, | ||||||
| NumericalDiscreteFidelityParameter, | ||||||
| ) | ||||||
| from baybe.searchspace.continuous import SubspaceContinuous | ||||||
| from baybe.searchspace.discrete import ( | ||||||
| MemorySize, | ||||||
|
|
@@ -48,6 +52,29 @@ class SearchSpaceType(Enum): | |||||
| """Flag for hybrid search spaces resp. compatibility with hybrid search spaces.""" | ||||||
|
|
||||||
|
|
||||||
| class SearchSpaceTaskType(Enum): | ||||||
|
Collaborator
I really don't get why TL and MF things are mixed in here to have a single property; IMO that doesn't make sense. I would expect to have a … Is there any reason you are enforcing such a mixture between unrelated concepts?
Collaborator
This would then also make the |
||||||
| """Enum class for different types of task and/or fidelity subspaces.""" | ||||||
|
|
||||||
| SINGLETASK = "SINGLETASK" | ||||||
|
Collaborator
Would prefer
Collaborator
Author
For now will revise the docstring to say |
||||||
| """Flag for search spaces with no task parameters.""" | ||||||
|
|
||||||
| CATEGORICALTASK = "CATEGORICALTASK" | ||||||
| """Flag for search spaces with a categorical task parameter.""" | ||||||
|
|
||||||
| NUMERICALFIDELITY = "NUMERICALFIDELITY" | ||||||
| """Flag for search spaces with a discrete numerical (ordered) fidelity parameter.""" | ||||||
|
|
||||||
| CATEGORICALFIDELITY = "CATEGORICALFIDELITY" | ||||||
| """Flag for search spaces with a categorical (unordered) fidelity parameter.""" | ||||||
|
|
||||||
| # TODO: Distinguish between multiple task parameter and mixed task parameter types. | ||||||
| # In future versions, multiple task/fidelity parameters may be allowed. For now, | ||||||
| # they are disallowed, whether the task-like parameters are different or the same | ||||||
| # class. | ||||||
| MULTIPLETASKPARAMETER = "MULTIPLETASKPARAMETER" | ||||||
| """Flag for search spaces with mixed task and fidelity parameters.""" | ||||||
|
|
||||||
|
|
||||||
| @define | ||||||
| class SearchSpace(SerialMixin): | ||||||
| """Class for managing the overall search space. | ||||||
|
|
@@ -275,6 +302,24 @@ def task_idx(self) -> int | None: | |||||
| # --> Fix this when refactoring the data | ||||||
| return cast(int, self.discrete.comp_rep.columns.get_loc(task_param.name)) | ||||||
|
|
||||||
| @property | ||||||
| def fidelity_idx(self) -> int | None: | ||||||
| """The column index of the fidelity parameter in computational representation.""" | ||||||
|
Collaborator
Suggested change
|
||||||
| try: | ||||||
| # See TODO [16932] and TODO [11611] | ||||||
|
Collaborator
Might be stupid, but what do [16932] and [11611] refer to?
Collaborator
Probably to the generalization to more than one task parameter; the numbers refer to old Azure items. IMO remove.
Collaborator
Wow, if this refers to old Azure items, this is already a historical piece of code :D But agree, let's remove then.
Collaborator
Author
TODO: delete both TODOs here and in task_idx. |
||||||
| fidelity_param = next( | ||||||
| p | ||||||
| for p in self.parameters | ||||||
| if isinstance( | ||||||
| p, | ||||||
| (CategoricalFidelityParameter, NumericalDiscreteFidelityParameter), | ||||||
| ) | ||||||
| ) | ||||||
| except StopIteration: | ||||||
| return None | ||||||
|
|
||||||
| return cast(int, self.discrete.comp_rep.columns.get_loc(fidelity_param.name)) | ||||||
|
|
||||||
| @property | ||||||
| def n_tasks(self) -> int: | ||||||
| """The number of tasks encoded in the search space.""" | ||||||
|
|
@@ -287,6 +332,105 @@ def n_tasks(self) -> int: | |||||
| return 1 | ||||||
| return len(task_param.values) | ||||||
|
|
||||||
| @property | ||||||
| def n_fidelities(self) -> int: | ||||||
| """The number of fidelities encoded in the search space.""" | ||||||
|
Collaborator
Suggested change
|
||||||
| # See TODO [16932] | ||||||
| try: | ||||||
| fidelity_param = next( | ||||||
| p | ||||||
| for p in self.parameters | ||||||
| if isinstance( | ||||||
| p, | ||||||
| (CategoricalFidelityParameter, NumericalDiscreteFidelityParameter), | ||||||
| ) | ||||||
| ) | ||||||
| return len(fidelity_param.values) | ||||||
|
|
||||||
| # When there are no fidelity parameters, we effectively have a single fidelity | ||||||
| except StopIteration: | ||||||
| return 1 | ||||||
|
|
||||||
| @property | ||||||
| def n_task_dimensions(self) -> int: | ||||||
| """The number of task dimensions.""" | ||||||
| try: | ||||||
| # See TODO [16932] | ||||||
| fidelity_param = next( | ||||||
| p for p in self.parameters if isinstance(p, (TaskParameter,)) | ||||||
|
Collaborator
Why
Collaborator
Author
Check this! fidelity_param should not be a TaskParameter. |
||||||
| ) | ||||||
| except StopIteration: | ||||||
| fidelity_param = None | ||||||
|
|
||||||
| return 1 if fidelity_param is not None else 0 | ||||||
|
|
||||||
| @property | ||||||
| def n_fidelity_dimensions(self) -> int: | ||||||
AVHopp marked this conversation as resolved.
|
||||||
| """The number of fidelity dimensions.""" | ||||||
| try: | ||||||
| # See TODO [16932] | ||||||
| fidelity_param = next( | ||||||
| p | ||||||
| for p in self.parameters | ||||||
| if isinstance( | ||||||
| p, | ||||||
| (CategoricalFidelityParameter, NumericalDiscreteFidelityParameter), | ||||||
| ) | ||||||
| ) | ||||||
| except StopIteration: | ||||||
| fidelity_param = None | ||||||
|
|
||||||
| return 1 if fidelity_param is not None else 0 | ||||||
|
|
||||||
| @property | ||||||
| def task_type(self) -> SearchSpaceTaskType: | ||||||
|
Collaborator
IMO mixing TL and MF into the
Collaborator
Agree here. I see that the two features share a lot of similarities from the technical/implementation point of view, but for a user, these are still two unrelated features - quite literally, as we cannot even support a combination of TL and MF at the moment, so we should try to avoid potential confusion here. |
||||||
| """Return the task type of the search space. | ||||||
|
|
||||||
| Raises: | ||||||
| ValueError: If searchspace contains more than one task/fidelity parameter. | ||||||
| ValueError: An unrecognised fidelity parameter type is in SearchSpace. | ||||||
|
Comment on lines +387 to +391
Collaborator
This docstring uses |
||||||
| """ | ||||||
| task_like_parameters = ( | ||||||
|
Collaborator
The word
Collaborator
Author
TODO: |
||||||
| TaskParameter, | ||||||
| CategoricalFidelityParameter, | ||||||
| NumericalDiscreteFidelityParameter, | ||||||
| ) | ||||||
|
|
||||||
| n_task_like_parameters = sum( | ||||||
|
Collaborator
Couldn't we re-use the properties here? I'd prefer that as it re-uses existing code
Collaborator
Some parts of this PR also use
Collaborator
Also, shouldn't we be able to fully get this by using the properties? There literally are different properties for checking the number
Collaborator
Also, if we go through the
Collaborator
Author
TODO: this. |
||||||
| isinstance(p, (task_like_parameters)) for p in self.parameters | ||||||
| ) | ||||||
|
|
||||||
| if n_task_like_parameters == 0: | ||||||
| return SearchSpaceTaskType.SINGLETASK | ||||||
| elif n_task_like_parameters > 1: | ||||||
| # TODO: commute this validation further downstream. | ||||||
| # In case of user-defined custom models which allow for multiple task | ||||||
| # parameters, this should be later in recommender logic. | ||||||
| # * Should this be an IncompatibilityError? | ||||||
| raise ValueError( | ||||||
| "SearchSpace must not contain more than one task/fidelity parameter." | ||||||
| ) | ||||||
| return SearchSpaceTaskType.MULTIPLETASKPARAMETER | ||||||
|
|
||||||
| if self.n_task_dimensions == 1: | ||||||
| return SearchSpaceTaskType.CATEGORICALTASK | ||||||
|
|
||||||
| if self.n_fidelity_dimensions == 1: | ||||||
| n_categorical_fidelity_dims = sum( | ||||||
| isinstance(p, CategoricalFidelityParameter) for p in self.parameters | ||||||
| ) | ||||||
| if n_categorical_fidelity_dims == 1: | ||||||
| return SearchSpaceTaskType.CATEGORICALFIDELITY | ||||||
|
|
||||||
| n_numerical_disc_fidelity_dims = sum( | ||||||
| isinstance(p, NumericalDiscreteFidelityParameter) | ||||||
| for p in self.parameters | ||||||
| ) | ||||||
| if n_numerical_disc_fidelity_dims == 1: | ||||||
| return SearchSpaceTaskType.NUMERICALFIDELITY | ||||||
|
|
||||||
| raise RuntimeError("This line should be impossible to reach.") | ||||||
|
|
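The review thread on `task_type` asks for transfer learning (TL) and multi-fidelity (MF) to be kept apart. A minimal sketch of what such a split could look like is given here, purely for discussion. The parameter classes are empty stand-ins for the real BayBE classes, and `SearchSpaceFidelityType`, `SketchSearchSpace`, `has_task` and `fidelity_type` are hypothetical names that do not appear in the PR.

```python
from dataclasses import dataclass, field
from enum import Enum


# Empty stand-ins for the real parameter classes (illustration only).
class Parameter:
    pass


class TaskParameter(Parameter):
    pass


class CategoricalFidelityParameter(Parameter):
    pass


class NumericalDiscreteFidelityParameter(Parameter):
    pass


class SearchSpaceFidelityType(Enum):  # hypothetical enum, MF concerns only
    SINGLEFIDELITY = "SINGLEFIDELITY"
    CATEGORICALFIDELITY = "CATEGORICALFIDELITY"
    NUMERICALFIDELITY = "NUMERICALFIDELITY"


@dataclass
class SketchSearchSpace:  # hypothetical, not the real SearchSpace
    parameters: list = field(default_factory=list)

    @property
    def has_task(self) -> bool:
        """TL concern: is there a task parameter?"""
        return any(isinstance(p, TaskParameter) for p in self.parameters)

    @property
    def fidelity_type(self) -> SearchSpaceFidelityType:
        """MF concern: which kind of fidelity parameter is present, if any?"""
        fidelity_params = [
            p
            for p in self.parameters
            if isinstance(
                p,
                (CategoricalFidelityParameter, NumericalDiscreteFidelityParameter),
            )
        ]
        if len(fidelity_params) > 1:
            raise ValueError("At most one fidelity parameter is supported.")
        if not fidelity_params:
            return SearchSpaceFidelityType.SINGLEFIDELITY
        if isinstance(fidelity_params[0], CategoricalFidelityParameter):
            return SearchSpaceFidelityType.CATEGORICALFIDELITY
        return SearchSpaceFidelityType.NUMERICALFIDELITY
```

With this shape, a check that TL and MF are not combined would live in one explicit place (for example, where the surrogate is selected) rather than inside a shared enum.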
||||||
| def get_comp_rep_parameter_indices(self, name: str, /) -> tuple[int, ...]: | ||||||
| """Find a parameter's column indices in the computational representation. | ||||||
|
|
||||||
|
|
||||||
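The properties `fidelity_idx`, `n_fidelities` and `n_fidelity_dimensions` in this file repeat the same `next(...)` lookup, and reviewers ask for more re-use of existing properties. One possible way to factor the lookup out is sketched below. It is an illustration under assumptions: `StandInParameter`, `StandInFidelityParameter`, `SketchSearchSpace` and `_fidelity_parameter` are hypothetical names, and a plain list of column names stands in for `discrete.comp_rep.columns`.

```python
from dataclasses import dataclass


@dataclass
class StandInParameter:  # hypothetical stand-in for a BayBE parameter
    name: str
    values: tuple = ()


class StandInFidelityParameter(StandInParameter):
    """Stand-in for the two fidelity parameter classes."""


@dataclass
class SketchSearchSpace:  # hypothetical, not the real SearchSpace
    parameters: list
    comp_rep_columns: list  # stand-in for `discrete.comp_rep.columns`

    @property
    def _fidelity_parameter(self):
        """The single fidelity parameter, or None if there is none."""
        return next(
            (p for p in self.parameters if isinstance(p, StandInFidelityParameter)),
            None,
        )

    @property
    def fidelity_idx(self):
        """Column index of the fidelity parameter, if available."""
        p = self._fidelity_parameter
        return None if p is None else self.comp_rep_columns.index(p.name)

    @property
    def n_fidelities(self) -> int:
        """Number of fidelities; a space without fidelity parameter has one."""
        p = self._fidelity_parameter
        return 1 if p is None else len(p.values)

    @property
    def n_fidelity_dimensions(self) -> int:
        """Number of fidelity dimensions (0 or 1)."""
        return 0 if self._fidelity_parameter is None else 1
```

Passing a default to `next` also avoids the `try`/`except StopIteration` blocks used in the diff.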
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -86,6 +86,10 @@ class Surrogate(ABC, SurrogateProtocol, SerialMixin): | |
| """Class variable encoding whether or not the surrogate supports transfer | ||
| learning.""" | ||
|
|
||
| supports_multi_fidelity: ClassVar[bool] | ||
|
Collaborator
There is an inconsistent treatment of the … The latter requires many explicit lines in the derived classes, which could be avoided if we explicitly set it to … I would prefer that, but I remember we had a discussion on it some time ago. In order to unify this, can you make a note here so this is revisited / rediscussed? For this PR it's fine, I guess. |
||
| """Class variable encoding whether or not the surrogate supports multi fidelity | ||
| Bayesian optimization.""" | ||
|
|
||
| supports_multi_output: ClassVar[bool] = False | ||
| """Class variable encoding whether or not the surrogate is multi-output | ||
| compatible.""" | ||
|
|
@@ -428,6 +432,14 @@ def fit( | |
| f"support transfer learning." | ||
| ) | ||
|
|
||
| # Check if multi fidelity capabilities are needed | ||
| if (searchspace.n_fidelities > 1) and (not self.supports_multi_fidelity): | ||
| raise ValueError( | ||
| f"The search space contains fidelity parameters but the selected " | ||
| f"surrogate model type ({self.__class__.__name__}) does not " | ||
| f"support multi fidelity Bayesian optimisation." | ||
| ) | ||
|
|
||
| # Block partial measurements | ||
| handle_missing_values(measurements, [t.name for t in objective.targets]) | ||
|
|
||
|
|
@@ -472,6 +484,11 @@ def __str__(self) -> str: | |
| self.supports_transfer_learning, | ||
| single_line=True, | ||
| ), | ||
| to_string( | ||
| "Supports Multi Fidelity", | ||
| self.supports_multi_fidelity, | ||
| single_line=True, | ||
| ), | ||
| ] | ||
| return to_string(self.__class__.__name__, *fields) | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -52,6 +52,8 @@ | |
| from torch import Tensor | ||
|
|
||
|
|
||
| # TODO Jordan MHS: _ModelContext is used by fidelity surrogate models now so may deserve | ||
| # its own file. | ||
| @define | ||
| class _ModelContext: | ||
| """Model context for :class:`GaussianProcessSurrogate`.""" | ||
|
|
@@ -80,6 +82,27 @@ def n_tasks(self) -> int: | |
| """The number of tasks.""" | ||
| return self.searchspace.n_tasks | ||
|
|
||
| @property | ||
| def n_fidelity_dimensions(self) -> int: | ||
| """The number of fidelity dimensions.""" | ||
| # Possible TODO: Generalize to multiple fidelity dimensions | ||
|
Collaborator
Author
This TODO is here to spark discussion - will remove after review.
Collaborator
For now I would propose to keep everything focused around a single dimension, unless it is very clear and easy how to go on.
Collaborator
In analogy to the task, the fidelity is kept at max. one fidelity parameter, and everything else is out of scope for the MF branch. |
||
| return 1 if self.searchspace.fidelity_idx is not None else 0 | ||
|
|
||
| @property | ||
| def is_multi_fidelity(self) -> bool: | ||
| """Are there any fidelity dimensions?""" | ||
| return self.n_fidelity_dimensions > 0 | ||
|
|
||
| @property | ||
| def fidelity_idx(self) -> int | None: | ||
| """The computational column index of the fidelity parameter, if available.""" | ||
| return self.searchspace.fidelity_idx | ||
|
|
||
| @property | ||
| def n_fidelities(self) -> int: | ||
| """The number of fidelities.""" | ||
| return self.searchspace.n_fidelities | ||
|
|
||
| @property | ||
| def parameter_bounds(self) -> Tensor: | ||
| """Get the search space parameter bounds in BoTorch Format.""" | ||
|
|
@@ -93,7 +116,7 @@ def numerical_indices(self) -> list[int]: | |
| return [ | ||
| i | ||
| for i in range(len(self.searchspace.comp_rep_columns)) | ||
| if i != self.task_idx | ||
| if i not in (self.task_idx, self.fidelity_idx) | ||
| ] | ||
|
|
||
|
|
||
|
|
||
I see the respective class is implemented. Can you point out where the dispatching is done? Or is this for another PR?