MRB-650 maps simplified#92

Draft
jonasbhend wants to merge 57 commits into main from MRB-650-Maps-simplified

Conversation

@jonasbhend
Contributor

@jonasbhend jonasbhend commented Jan 7, 2026

Add maps of forecast verification scores

Changes

  • Store score components for maps in verif.nc
  • Make verif.nc temporary to avoid storage of large data volumes
  • The domains originally called "centraleurope" and "switzerland" are mostly the same. I suggest making the "switzerland" domain much smaller, so that more spatial detail can be seen, especially in the complex topography of the Alps.
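For reference, marking verif.nc temporary in a Snakemake workflow (which the rule names in this thread suggest is used) would look roughly like the following sketch; the rule body and output path are hypothetical, only the rule name and temp() semantics (the file is deleted once all consumers have run) are taken from the thread and Snakemake's documented behavior:

```snakemake
rule verif_metrics_aggregation:
    output:
        # temp() deletes verif.nc after downstream rules consume it,
        # which is why re-running the workflow re-computes verification.
        temp("results/{experiment}/verif.nc")
```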

@Louis-Frey Louis-Frey force-pushed the MRB-650-Maps-simplified branch from 2185fd6 to 9eb4643 Compare January 22, 2026 12:43
jonasbhend and others added 29 commits January 27, 2026 16:28
@Louis-Frey

I came across two different problems with this feature branch:

  1. On the current last commit, I run into memory issues in rule verif_metrics_aggregation. The job seems to be killed by SLURM due to running out of memory (analysis of the log files and consultation with ChatGPT strongly suggest this). This may confirm Francesco's reservations regarding the computation of the spatial metrics.

  2. When I extended the code to also plot the maps for the baselines, I got errors in rule report_experiment_dashboard several times last week. This made me test the code on the last commit today, resulting in the error described in 1.

So overall I can't really make sense of this. Apparently I was able to run verif_metrics_aggregation before, because the code only failed later in report_experiment_dashboard. Also, I ran the full pipeline on a full year of daily forecasts with the ICON-CH1 emulator about a week ago, and it completed without an error.

Also, the problem in 2. suggests that even if problem 1. does not occur, problems may arise later when the verification files are aggregated in the dashboard. This may be due to the large size of the verification files (about 17 GB for the run and about 11 GB for the ICON-CH1 baseline). However, this too ran without error in some cases (the 1-year experiment), so it does not seem to fail consistently.

Would be great if you could look into this!

@Louis-Frey Louis-Frey requested review from dnerini and frazane February 9, 2026 13:41
@frazane
Contributor

frazane commented Feb 9, 2026

> Make verif.nc temporary to avoid storage of large data volumes

By doing this, verification will have to be re-computed for every run, every time the workflow is executed. I don't think we want this, no?

@Louis-Frey

> Make verif.nc temporary to avoid storage of large data volumes
>
> By doing this, verification will have to be re-computed for every run, every time the workflow is executed. I don't think we want this, no?

Yes, that could be problematic. Jonas introduced it and I haven't given it much thought; should I change it back?

@Louis-Frey

OK, I most likely fixed problem 2; see commit 7b50809.

@Louis-Frey

And I just tried again an evalml experiment with config/forecasters-ich1.yaml, which ran without an error. So issue 1. is not reproducible.

@dnerini dnerini marked this pull request as ready for review February 12, 2026 12:07
# The domains which are originally called "centraleurope" and "switzerland"
# are mostly the same. I suggest making domain "switzerland" much smaller,
# so that more spatial detail can be seen, especially in the complex
# topography of the alps.
Member


It's a good idea! Please move this comment to the description of your PR, where you summarize the main changes introduced by this PR.


The description is the top post in this thread I guess? I put it there too.


# Load, de-duplicate lead_time, and keep best provider per source (same logic as verif_plot_metrics)
dfs = [xr.open_dataset(f) for f in args.verif_files]
drop_variables = ["TOT_PREC.MAE.spatial", "TOT_PREC.RMSE.spatial", "TOT_PREC.BIAS.spatial",
Member


Instead of hardcoding the variable names, why not just open the first dataset in the verif_files list and extract all variable names containing the keyword "spatial"?
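A minimal self-contained sketch of this suggestion; the tiny in-memory dataset below is a toy stand-in for one verification file, and only the "<var>.<metric>.spatial" naming convention comes from this thread:

```python
import numpy as np
import xarray as xr

# Toy stand-in for the first verification file: two aggregate scores
# plus one spatial field following the "<var>.<metric>.spatial" convention.
ds = xr.Dataset(
    {
        "TOT_PREC.RMSE": ("lead_time", np.zeros(3)),
        "TOT_PREC.BIAS": ("lead_time", np.zeros(3)),
        "TOT_PREC.RMSE.spatial": (("lead_time", "values"), np.zeros((3, 4))),
    }
)

# Derive the names to drop instead of hardcoding them.
drop_variables = [name for name in ds.data_vars if "spatial" in name]

# With real files, the derived list would then be passed to every open:
#   xr.open_dataset(path, drop_variables=drop_variables)
slim = ds.drop_vars(drop_variables)
```

This keeps the drop list in sync automatically if new spatial score components are added later.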


Yes, I'll do that.

ds,
forbidden_dims=("values",), # critical!
metric_dims=("source", "season", "init_hour", "region", "lead_time", "eps"),
)
Member


what's the purpose of this change? the spatial fields should have no impact on this part of the code, or am I missing something?


Yes, the spatial fields don't matter here, but they caused memory problems in the past when this rule was executed. I therefore implemented this workaround so that the spatial parts of the verification files are not loaded into memory in the first place. ChatGPT assisted in the creation of this code, so forgive me if something is off.

Member


OK, I see, but I wonder if you need it? I mean, at that point you don't expect any spatial field to be present, since those were all filtered out with drop_variables, no?


Ah, yes, I tried that. But using drop_variables alone did not solve the problem; the spatial data were still loaded into memory. So I had to add the rest of this code.

Member


Then it'd be good to understand why that isn't behaving as expected, and to use only one single strategy to get rid of the spatial fields.
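As a starting point for that investigation, the baseline behavior of drop_variables can be verified in isolation: variables dropped at open time never appear in the dataset at all, so nothing downstream should be able to pull them into memory. A toy sketch with entirely synthetic data (names follow the thread's convention; the real memory issue may of course come from elsewhere in the pipeline):

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Write a small file with one aggregate score and one spatial field.
path = os.path.join(tempfile.mkdtemp(), "verif.nc")
xr.Dataset(
    {
        "TOT_PREC.RMSE": ("lead_time", np.zeros(3)),
        "TOT_PREC.RMSE.spatial": (("lead_time", "values"), np.zeros((3, 4))),
    }
).to_netcdf(path)

# Opening with drop_variables means the spatial variable is not listed
# in the dataset at all, so later code cannot load it accidentally.
ds = xr.open_dataset(path, drop_variables=["TOT_PREC.RMSE.spatial"])
spatial_left = [name for name in ds.data_vars if "spatial" in name]
```

If this holds in the real workflow too, the extra forbidden_dims workaround should be redundant and the drop-at-open strategy alone would suffice.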

@Louis-Frey Louis-Frey marked this pull request as draft February 12, 2026 12:53