
Rewrite Transfer Learning Example#694

Draft
AVHopp wants to merge 8 commits into main from example/rewrite_tl_example

Collaborator

AVHopp commented Nov 18, 2025

As discussed quite some time ago, I wanted to rewrite the transfer learning example such that we have a more meaningful one. This PR thus introduces the classical "Shields-Temperature-TL" example we have used in several other places.

I've tried to follow the same style as the other more recent examples (in particular Pareto and laser tuning).

Link to compiled version: https://avhopp.github.io/baybe_dev/latest/examples/Transfer_Learning/basic_transfer_learning.html

TODO: Replace temperature by labs to make it more natural

Copilot AI review requested due to automatic review settings November 18, 2025 16:06
Contributor

Copilot AI left a comment

Pull Request Overview

This PR replaces the abstract Hartmann function example with a realistic chemical optimization scenario from the Shields et al. direct arylation study. The rewrite demonstrates transfer learning by showing how historical experimental data from chemical reactions at different temperatures can accelerate optimization at a new target temperature.

Key Changes:

  • Replaced synthetic Hartmann function with real chemical synthesis data
  • Introduced chemical parameters (solvents, bases, ligands, concentration, temperature)
  • Expanded simulation to demonstrate transfer learning across three different target temperatures


AVHopp added the documentation label (Improvements or additions to documentation) Nov 24, 2025
AVHopp force-pushed the example/rewrite_tl_example branch 2 times, most recently from 34e4eb5 to bf60544, November 25, 2025 13:58
Collaborator

Please ensure this is stored as Optimized SVG (Inkscape) and not as a raw vanilla SVG (size reasons). I can also do it if you like, but it only makes sense after everything is finished.

# optimization experiments under certain reaction conditions can be transferred to
# accelerate optimization under different conditions. Specifically, we investigate a
# setting where:
# * we have **historical experimental data** from chemical reactions conducted at
Collaborator

This sounds a bit like it's some arbitrary chemical reactions.

This needs to be rephrased to say that it's literally the exact same reaction, just repeated at different temperatures.

# Imagine you're a chemist in a pharmaceutical company, tasked with optimizing a
# direct arylation reaction to maximize product yield. This reaction involves
# combining different chemical components (solvents, bases, ligands) under varying
# temperature and concentration conditions. Each experiment is expensive and
Collaborator

Temperature is not optimized, so it should be taken out of the list here.

# optimization campaigns at different temperatures in the past. The question arises:
# can you leverage this **historical data from related conditions** to accelerate
# optimization at your target temperature? This is where transfer learning becomes
# invaluable.
Collaborator

That's quite a strong word for this example. The unfortunate thing about the entire example is that we don't need TL here, as we can simply model the temperature as a normal parameter; TL has no advantage compared to that.

]

result = simulate_scenarios(
    {f"{int(100 * fraction)}": campaign},
Collaborator

I would prefer if these labels included a % sign, so that the fraction of available points used is not confused with the number of points used.
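
A minimal sketch of the labeling requested here, assuming the sampled fractions are plain floats in [0, 1]. The helper `percent_label` and the fraction values are hypothetical and not taken from the PR. The sketch uses `round` instead of `int`, since `int(100 * fraction)` truncates and can land one below the intended percentage when the fraction has no exact binary floating-point representation.

```python
def percent_label(fraction: float) -> str:
    """Return a scenario label such as '10%' for a fraction in [0, 1]."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError(f"fraction must be in [0, 1], got {fraction}")
    # round() avoids the truncation that int() applies to values like 28.999...
    return f"{round(100 * fraction)}%"


# Hypothetical fractions of the historical data used for training
fractions = [0.0, 0.01, 0.1, 0.29, 0.5]
labels = [percent_label(f) for f in fractions]
print(labels)

# Truncation vs. rounding for a fraction that is not exactly representable
print(int(100 * 0.29), round(100 * 0.29))
```

With labels of this form, any later lookup of the baseline scenario would need to use `"0%"` rather than `"0"`.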

    ax.legend().set_visible(False)
else:
    ax.legend(
        title="Training data used", bbox_to_anchor=(1.05, 1), loc="upper left"
Collaborator

This is not a good label, as it doesn't get across that it's the fraction of points.

Do you need to set a label at all? Since you rename the hue from Scenario to "% of data used", that is already the perfect label. (I'm not sure if it's shown by default if you make this call without title=, but if not, then the renamings are also pointless, just saying.)

Comment on lines +240 to +243
final_results = results.groupby("% of data used")["yield_CumBest"].max()
baseline = final_results.loc["0"]
best_transfer = final_results.drop("0").max()
improvement = best_transfer - baseline
Collaborator

Scienfitz commented Nov 27, 2025

Why is this here? It's not used anywhere.

Please do not add AI code that you don't check.
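
For readers following the thread: the four quoted lines compute the final best yield per scenario and the gap between the best transfer-learning scenario and the no-transfer baseline, but the result is never printed or referenced. A rough pure-Python equivalent on made-up numbers is sketched below. The helper `summarize_improvement` and all data values are hypothetical, and the sketch does not use the pandas calls from the PR.

```python
from collections import defaultdict


def summarize_improvement(rows, baseline_label="0"):
    """Summarize (scenario_label, cumulative_best_yield) rows.

    Returns (baseline, best_transfer, improvement), mirroring the quoted
    groupby/max/drop logic.
    """
    final = defaultdict(lambda: float("-inf"))
    for label, cum_best in rows:
        # Equivalent of groupby(label)["yield_CumBest"].max()
        final[label] = max(final[label], cum_best)
    if baseline_label not in final:
        raise KeyError(f"no rows for baseline scenario {baseline_label!r}")
    baseline = final[baseline_label]
    # Equivalent of final_results.drop(baseline_label).max()
    best_transfer = max(v for k, v in final.items() if k != baseline_label)
    return baseline, best_transfer, best_transfer - baseline


# Made-up cumulative-best yields, keyed by percentage of historical data used
rows = [
    ("0", 40.0), ("0", 62.5),
    ("10", 55.0), ("10", 80.0),
    ("50", 70.0), ("50", 91.0),
]
baseline, best_transfer, improvement = summarize_improvement(rows)
print(f"baseline={baseline}, best_transfer={best_transfer}, improvement={improvement}")
```

The baseline lookup depends on the scenario label, so it has to stay in sync with whatever label format the scenarios end up using.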

# 2. The magnitude of improvement varies by condition, reflecting differences in how
# well knowledge transfers between specific temperature pairs.
#
# 2. Even small amounts of training data can yield significant acceleration, making thi
Collaborator

Suggested change
# 2. Even small amounts of training data can yield significant acceleration, making thi
# 2. Even small amounts of training data can yield significant acceleration, making this


# The results reveal several key insights:
#
# 1. Transfer learning provides substantial improvements across all temperature
Collaborator

Most comments here are minor, but there is one big topic about the choice of example:

You've chosen a use case where TL is not needed. We can simply model the temperature as an explicit numerical parameter, because that is the perfect quantification of the different "contexts" here.

Now don't get me wrong, a study like this still makes sense, especially to gauge the TL results against the explicit-TL results (and the baseline). But this is not done in this example (and I'm not sure it should be, as it's an example and not a benchmark). If the results look OK, we could add them as a second row of results in the plot; this would need more comments and explanation, though.

But without this context, many readers might read this example and wonder why they shouldn't just model the temperature as a normal, simple parameter, with no answers or explanations provided anywhere here.

AVHopp added this to the 0.15.0 milestone Nov 28, 2025
AVHopp marked this pull request as draft November 28, 2025 14:19
AVHopp added the on hold label (PR progress is awaiting for something else to continue) Nov 28, 2025
AVHopp and others added 8 commits February 16, 2026 09:01
Co-authored-by: Martin Fitzner <martin.fitzner@merckgroup.com>
AVHopp force-pushed the example/rewrite_tl_example branch from 1875c34 to 5aa03e2, February 16, 2026 08:01

Labels

documentation: Improvements or additions to documentation
on hold: PR progress is awaiting for something else to continue


3 participants