
Multi-fidelity surrogate models #721

Open
jpenn2023 wants to merge 8 commits into dev/mfbo from dev-mfbo-main-surrogate-models

Conversation

@jpenn2023 (Collaborator)

Adding multi-fidelity properties for the SearchSpace class and multi-fidelity Gaussian process classes with the required fidelity kernels.

Comment on lines +37 to +38
# TODO Jordan MHS: _ModelContext is used by fidelity surrogate models now so may deserve
# its own file.
Collaborator Author

Should I move model context to its own file since it is used in multiple Gaussian process files?

Collaborator

Not important, but if wanted, feel free to put it into a gaussian_process/utils.py or similar.

# its own file.
@define
class _ModelContext:
"""Model context for :class:`GaussianProcessSurrogate`."""
Collaborator Author
@jpenn2023 · Feb 3, 2026

And how would this comment be best amended?

Collaborator

By adding the other class, OR by getting rid of the explicit link and using a generic non-linked word, which means it never has to be updated again ever in the history of the universe.

@property
def n_fidelity_dimensions(self) -> int:
"""The number of fidelity dimensions."""
# Possible TODO: Generalize to multiple fidelity dimensions
Collaborator Author

This todo is here to spark discussion - will remove after review.

Collaborator

For now I would propose to keep everything focused around a single dimension, unless it is very clear and easy how to go on.

Collaborator

In analogy to the task, the fidelity is kept at max 1 fidelity parameter, and everything else is out of scope for the MF branch.

# See base class.

kernel_factory: KernelFactory = field(init=False, default=None)
"""Design kernel is set to Matern within SingleTaskMultiFidelityGP."""
Collaborator Author

Opinions on referencing other packages' classes in docstrings?

@AdrianSosic AdrianSosic force-pushed the dev-mfbo-main-surrogate-models branch from ccac1f8 to 178772a Compare February 4, 2026 16:58
batch_shape=batch_shape,
)

fidelity_covar_module = self.fidelity_kernel_factory(
Collaborator Author

Index kernels used as categorical fidelity kernels do not need train_x and train_y to set any lengthscales, but the KernelFactory class requires these as arguments in __call__. How should we handle this difference? E.g., we could (a) redundantly pass train_x and train_y into the fidelity kernels' __call__ or (b) make train_x and train_y optional in the parent KernelFactory class. The latter choice seems correct to me, but I wanted to check before changing existing code.
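A minimal sketch of what option (b) could look like, with stand-in classes (none of these names or signatures are the actual ones from the code base): the base factory makes the training data optional, a data-dependent factory validates it, and the index-kernel-style factory simply ignores it.

```python
# Hedged sketch of option (b): training data becomes optional in the base
# factory's __call__. KernelFactory, MaternKernelFactory, and
# IndexFidelityKernelFactory are illustrative stand-ins, not the real classes.
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class KernelFactory(ABC):
    """Base factory; train_x/train_y default to None for data-free kernels."""

    @abstractmethod
    def __call__(
        self, train_x: Any | None = None, train_y: Any | None = None
    ) -> str: ...


class MaternKernelFactory(KernelFactory):
    """Data-dependent factory: needs training data to configure the kernel."""

    def __call__(self, train_x: Any | None = None, train_y: Any | None = None) -> str:
        if train_x is None or train_y is None:
            raise ValueError("Matern kernel construction requires training data.")
        return f"Matern(d={len(train_x[0])})"


class IndexFidelityKernelFactory(KernelFactory):
    """Categorical fidelity kernel: no lengthscales, so data can be omitted."""

    def __call__(self, train_x: Any | None = None, train_y: Any | None = None) -> str:
        return "IndexKernel"
```

With this shape, callers that have training data keep passing it, while index-kernel factories can be invoked with no arguments at all, avoiding the redundant pass-through of option (a).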

@AVHopp (Collaborator) left a comment

Some comments on everything but the multi_fidelity.py file, as I guess we will discuss this today in a bit more detail.

converter=optional_c(int),
validator=optional_v([instance_of(int), ge(1)]),
)
"""Matrix rank controlling the degree of correlation between outputs
Collaborator

Please add a short half-sentence about what happens when it is None.

class SearchSpaceTaskType(Enum):
"""Enum class for different types of task and/or fidelity subspaces."""

SINGLETASK = "SINGLETASK"
Collaborator

Would prefer NOTASK over SINGLETASK as the latter one could also be read as "has a single task parameter"

def fidelity_idx(self) -> int | None:
"""The column index of the task parameter in computational representation."""
try:
# See TODO [16932] and TODO [11611]
Collaborator

Might be stupid, but what do [16932] and [11611] refer to?

Collaborator

Probably to the generalization to more than one task parameter; the numbers refer to old Azure items.

imo remove

try:
# See TODO [16932]
fidelity_param = next(
p for p in self.parameters if isinstance(p, (TaskParameter,))
Collaborator

Why (TaskParameter,), shouldn't TaskParameter also do the job?

return 1 if fidelity_param is not None else 0

@property
def n_fidelity_dimensions(self) -> int:
Collaborator

Technically, this does not calculate the number of fidelity dimensions but simply returns 1 if there is at least one fidelity dimension. This thus implicitly assumes that there can only be one such dimension. Is this guaranteed at some other place?
Even if, I would propose that this still performs counting in the same way as it is done in n_task_dimensions and not hard-code the 1 as a return value
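The counting variant proposed here could look roughly like this; the parameter classes are hypothetical stand-ins, not the real search-space types:

```python
# Illustrative sketch: count fidelity parameters instead of hard-coding 1,
# mirroring how n_task_dimensions reportedly counts task parameters.
from dataclasses import dataclass, field


@dataclass
class FidelityParameter:
    """Stand-in for a fidelity parameter type."""

    name: str


@dataclass
class NumericalParameter:
    """Stand-in for an ordinary (non-fidelity) parameter."""

    name: str


@dataclass
class SearchSpace:
    parameters: list = field(default_factory=list)

    @property
    def n_fidelity_dimensions(self) -> int:
        """The number of fidelity dimensions, obtained by counting."""
        return sum(isinstance(p, FidelityParameter) for p in self.parameters)
```

Counting keeps the property honest even if the single-fidelity restriction is enforced (or lifted) elsewhere: it returns 0 without a fidelity parameter and the true count otherwise.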

Collaborator

EDIT: It seems like this is only done when accessing task_type, right? So a user could still create a search space with multiple such parameters, right? Or has this been addressed in the previous PR already?

Collaborator

This is just a copycat of n_task_dimensions, so you can assume the same assumptions etc.

) -> Kernel:
effective_dims = train_x.shape[-1] - len(
[p for p in searchspace.parameters if isinstance(p, TaskParameter)]
Collaborator

Don't we have the properties for exactly calculating this?

)


DefaultFidelityKernelFactory = IndexFidelityKernelFactory
Collaborator

Why this assignment here? For easier import in other files? Shouldn't a default be assigned there?

@property
def n_fidelity_dimensions(self) -> int:
"""The number of fidelity dimensions."""
# Possible TODO: Generalize to multiple fidelity dimensions
Collaborator

For now I would propose to keep everything focused around a single dimension, unless it is very clear and easy how to go on.


@property
def is_multi_fidelity(self) -> bool:
"""Are there any fidelity dimensions?"""
Collaborator

This is not how a docstring should look; please reformulate it properly.

Collaborator

Is there an is_multi_task property? If not, please add one for consistency.
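A sketch of how both remarks could be addressed at once, with indicative-mood docstrings and a matching is_multi_task counterpart; the classes here are minimal stand-ins, not the actual code base types:

```python
# Illustrative sketch: non-question docstrings plus a symmetric pair of
# is_multi_fidelity / is_multi_task flags derived from the parameter list.
from dataclasses import dataclass, field


@dataclass
class TaskParameter:
    """Stand-in for a task parameter type."""

    name: str


@dataclass
class FidelityParameter:
    """Stand-in for a fidelity parameter type."""

    name: str


@dataclass
class SearchSpace:
    parameters: list = field(default_factory=list)

    @property
    def is_multi_fidelity(self) -> bool:
        """Boolean flag indicating whether the space has fidelity dimensions."""
        return any(isinstance(p, FidelityParameter) for p in self.parameters)

    @property
    def is_multi_task(self) -> bool:
        """Boolean flag indicating whether the space has task dimensions."""
        return any(isinstance(p, TaskParameter) for p in self.parameters)
```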

def _make_parameter_scaler_factory(
parameter: Parameter,
) -> type[InputTransform] | None:
# For GPs, we let botorch handle the scaling. See [Scaling Workaround] above.
Collaborator

The "above" is not in this file, so please refer to the correct part of the code base

@jpenn2023 jpenn2023 force-pushed the dev-mfbo-main-surrogate-models branch from 2e64082 to 8e8547b Compare February 27, 2026 07:29
@AdrianSosic (Collaborator)

Hey @jpenn2023, I just wanted to have a look at your PR but then noticed that something went wrong with your rebase. I know you asked me about the weird diff shown on Github (and yes, that is generally still a problem), but it turns out that your diff is wrong for a different reason – namely because you somehow rebased your commits together with an entire bunch of other commits 😬 Can you quickly fix it and ping me once ready for review?

@jpenn2023 jpenn2023 force-pushed the dev-mfbo-main-surrogate-models branch from d96cf75 to 2c513b0 Compare March 6, 2026 12:25
@AdrianSosic AdrianSosic force-pushed the dev-mfbo-main-surrogate-models branch from 2c513b0 to 754d5da Compare March 6, 2026 13:48
@AdrianSosic AdrianSosic changed the base branch from dev/mfbo to main March 6, 2026 13:54
@AdrianSosic AdrianSosic changed the base branch from main to dev/mfbo March 6, 2026 13:55
@AdrianSosic AdrianSosic force-pushed the dev-mfbo-main-surrogate-models branch from 754d5da to 563b86a Compare March 6, 2026 17:21
@AdrianSosic AdrianSosic changed the base branch from dev/mfbo to main March 6, 2026 17:22
@AdrianSosic AdrianSosic changed the base branch from main to dev/mfbo March 6, 2026 17:22
@AdrianSosic AdrianSosic force-pushed the dev-mfbo-main-surrogate-models branch from 563b86a to 88db97f Compare March 6, 2026 17:26
@AdrianSosic (Collaborator)

Hey @jpenn2023, have a look, this should now give you a clean picture of your PR content, right?

@jpenn2023 (Collaborator Author)

> Hey @jpenn2023, have a look, this should now give you a clean picture of your PR content, right?

Hi @AdrianSosic. Yes, this view looks right to me.

@Scienfitz (Collaborator) left a comment

My biggest question is why the concepts of TL and MF are now deeply intertwined by using a single searchspace property combining them, instead of two properties characterizing the searchspace's task / fidelity character.

"""Flag for hybrid search spaces resp. compatibility with hybrid search spaces."""


class SearchSpaceTaskType(Enum):
Collaborator

I really don't get why TL and MF things are mixed in here to have a single property; imo that doesn't make sense.

I would expect to have a SearchSpaceTaskType (with SINGLETASK and CATEGORICALTASK) and a SearchSpaceFidelityType (with SINGLEFIDELITY, CATEGORICALFIDELITY, NUMERICALFIDELITY, etc.), each with their respective values, and not a mixed kind of thing.

Is there any reason you are enforcing such a mixture between unrelated concepts?
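The two-enum split proposed here could be sketched as follows; the member names are taken from the comment, while everything else is illustrative:

```python
# Illustrative sketch of two independent enums, one per concept, instead of a
# single mixed task/fidelity property. Member names follow the review comment.
from enum import Enum


class SearchSpaceTaskType(Enum):
    """Task character of a search space, independent of fidelity."""

    SINGLETASK = "SINGLETASK"
    CATEGORICALTASK = "CATEGORICALTASK"


class SearchSpaceFidelityType(Enum):
    """Fidelity character of a search space, independent of tasks."""

    SINGLEFIDELITY = "SINGLEFIDELITY"
    CATEGORICALFIDELITY = "CATEGORICALFIDELITY"
    NUMERICALFIDELITY = "NUMERICALFIDELITY"
```

With two enums, any task type can be combined with any fidelity type, so adding a new fidelity variant never multiplies the number of mixed members.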


@property
def fidelity_idx(self) -> int | None:
"""The column index of the task parameter in computational representation."""
Collaborator

Suggested change
"""The column index of the task parameter in computational representation."""
"""The column index of the fidelity parameter in computational representation."""

def fidelity_idx(self) -> int | None:
"""The column index of the task parameter in computational representation."""
try:
# See TODO [16932] and TODO [11611]
Collaborator

Probably to the generalization to more than one task parameter; the numbers refer to old Azure items.

imo remove


@property
def n_fidelities(self) -> int:
"""The number of tasks encoded in the search space."""
Collaborator

Suggested change
"""The number of tasks encoded in the search space."""
"""The number of fidelities encoded in the search space."""

return 1 if fidelity_param is not None else 0

@property
def n_fidelity_dimensions(self) -> int:
Collaborator

This is just a copycat of n_task_dimensions, so you can assume the same assumptions etc.

@property
def n_fidelity_dimensions(self) -> int:
"""The number of fidelity dimensions."""
# Possible TODO: Generalize to multiple fidelity dimensions
Collaborator

In analogy to the task, the fidelity is kept at max 1 fidelity parameter, and everything else is out of scope for the MF branch.


@property
def is_multi_fidelity(self) -> bool:
"""Are there any fidelity dimensions?"""
Collaborator

Is there an is_multi_task property? If not, please add one for consistency.

"""Class variable encoding whether or not the surrogate supports transfer
learning."""

supports_multi_fidelity: ClassVar[bool]
Collaborator

There is an inconsistent treatment of the supports_* properties now, where _multi_output is set in the base class but _transfer_learning and _fidelity are not set.

The latter requires many explicit lines in the derived classes, which could be avoided if we explicitly set it to False here and require overwriting it where a class is actually compatible.

I would prefer that, but remember we had a discussion on it some time ago.

In order to unify this, can you make a note here so this is revisited / rediscussed? For this PR it's fine, I guess.
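The unification suggested here, conservative False defaults on the base class with explicit overrides only where a capability exists, could be sketched like this (the class names are illustrative, not the real surrogate hierarchy):

```python
# Illustrative sketch: supports_* class variables default to False on the
# base class, so derived classes only restate what actually differs.
from typing import ClassVar


class Surrogate:
    """Base surrogate with conservative capability defaults."""

    supports_transfer_learning: ClassVar[bool] = False
    supports_multi_fidelity: ClassVar[bool] = False


class MultiFidelityGPSurrogate(Surrogate):
    """A surrogate that opts in to both capabilities."""

    supports_transfer_learning = True
    supports_multi_fidelity = True


class RandomForestSurrogate(Surrogate):
    """Inherits False for both flags; no boilerplate lines needed."""
```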

# problem since the resulting `posterior` method of that object is exposed
# to `optimize_acqf_*`, which is configured to be called on the original scale.
# Moving the scaling operation into the botorch GP object avoids this conflict.

Collaborator

why does this class not have the new supports_multi_fidelity set?

# CategoricalFidelityParameter or NumericalDiscreteFidelityParameter.
# This can be achieved without the user having to specify the surrogate model,
# e.g., by
# * using a dispatcher factory which decides surrogate model on fit time
Collaborator

I see the respective class is implemented, can you point out where the dispatching is done? or is this for another PR?
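The "dispatcher factory" idea from the quoted code comment could be sketched as follows; all class and function names here are hypothetical stand-ins, not the implementation the reviewer is asking about:

```python
# Hedged sketch of fit-time dispatching: pick a surrogate model based on the
# fidelity parameter type present in the search space. Names are illustrative.
from dataclasses import dataclass


@dataclass
class CategoricalFidelityParameter:
    """Stand-in for a categorical fidelity parameter."""

    name: str


@dataclass
class NumericalDiscreteFidelityParameter:
    """Stand-in for a numerical discrete fidelity parameter."""

    name: str


def dispatch_surrogate_name(parameters: list) -> str:
    """Decide which surrogate model to use at fit time."""
    if any(isinstance(p, CategoricalFidelityParameter) for p in parameters):
        return "CategoricalMultiFidelityGP"
    if any(isinstance(p, NumericalDiscreteFidelityParameter) for p in parameters):
        return "SingleTaskMultiFidelityGP"
    return "SingleTaskGP"  # No fidelity parameter: fall back to a plain GP.
```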

4 participants