implement scoring for the sub-steps within the forecast window #1896

Open
iluise wants to merge 8 commits into develop from iluise/develop/eval-substeps

Conversation

@iluise
Collaborator

@iluise iluise commented Feb 20, 2026

Description

Implement automatic detection of sub-steps within the forecast window (implemented only for regular grids; for OBS this is not needed).

NOTE: this implementation is quite slow at the moment. It is fine for a few samples/fsteps, but already with 16 samples it becomes very slow. I know how to make it faster, but that requires a major change in the logic of the samples/fstep handling, which will be addressed in a separate PR.
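The PR body does not include the detection code itself; as a rough sketch of the idea (all names here are hypothetical, not from this repository), scoring sub-steps amounts to enumerating the valid times inside each forecast window and evaluating at each of them, rather than only at the end of the window:

```python
from datetime import datetime, timedelta

def substep_valid_times(base_time, fstep_hours, substep_hours):
    """Enumerate the valid times of the sub-steps inside one forecast window.

    A forecast step spanning `fstep_hours` hours is split into sub-steps of
    `substep_hours` hours each; every sub-step gets its own valid time at
    which scores can be computed. Hypothetical helper for illustration only.
    """
    if fstep_hours % substep_hours != 0:
        raise ValueError("sub-step length must evenly divide the forecast step")
    n_sub = fstep_hours // substep_hours
    # valid time of sub-step i is base_time + (i + 1) * substep_hours
    return [base_time + timedelta(hours=substep_hours * (i + 1)) for i in range(n_sub)]

# Example: a 6 h forecast step with 3 h sub-steps yields two valid times.
times = substep_valid_times(datetime(2026, 2, 20, 0), fstep_hours=6, substep_hours=3)
# -> [2026-02-20 03:00, 2026-02-20 06:00]
```

This matches the intent of the linked issue (#1782): scores become a function of the valid time instead of the forecast step index.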

It creates plots like:

[Screenshot 2026-02-20 at 18 41 32]

instead of:

[Screenshot 2026-02-20 at 18 41 53]

Issue Number

Closes #1782

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and written the run_id(s) in a comment: launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a HedgeDoc in the GitHub issue with all the configurations and runs for this experiment
  • I have informed and aligned with people impacted by my change:
    • for config changes: the Mattermost channels and/or a design doc
    • for changes of dependencies: the Mattermost software development channel

@github-actions github-actions bot added the eval anything related to the model evaluation pipeline label Feb 23, 2026

Labels

eval anything related to the model evaluation pipeline

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

compute scores as a function of the valid time, rather than the forecast step

1 participant