Implement parametrisation for standardising model runs#392
Implement parametrisation for standardising model runs#392mo-nikosbaltas wants to merge 11 commits intomainfrom
Conversation
zmaalick
left a comment
There was a problem hiding this comment.
All good, nothing to add
NParsonsMO
left a comment
There was a problem hiding this comment.
I remember @ehogan saying that this PR should make a variables file per dataset, which is not currently happening.

I've also queried a couple of other bits.
There was a problem hiding this comment.
I can believe it was necessary to pull this step out into its own app (if it just needs running once overall rather than separately for each dataset), so if @ehogan is happy with the addition of a new app(?) then this seems fine.
There was a problem hiding this comment.
The extra API (restrucre_dirs) is required because it acts as the fan-in api for synchronisation. We have two instances of standardise_model_data running in parallel and unless they both finish you cannot restructure the data. I had discussed this with Alistair as well.
There was a problem hiding this comment.
There may be a way to run restructure_for_cmip6 in a way that can be done in parallel, but I propose that any investigation related to this is deferred. The name of the app should be changed, but I proposed that this is also deferred (to #318).
There was a problem hiding this comment.
Following the agreement we made at the technical meeting this morning to revert configure_standardise.sh, restructure_dirs.sh should be reverted to this previous version.
There was a problem hiding this comment.
Done. Although not different it had only some useful logging which have been removed. 7d8aef1
There was a problem hiding this comment.
The version of restructure_dirs.sh in this PR has not been reverted to this previous version. Please revert the changes to restructure_dirs.sh so that it matches this previous version.
There was a problem hiding this comment.
I removed all comments and checks and reverted back to the serial version. f889818
| @@ -1,4 +1,4 @@ | |||
| # (C) Crown Copyright 2024-2025, Met Office. | |||
| # (C) Crown Copyright 2024-2026, Met Office. | |||
There was a problem hiding this comment.
It doesn't look like this file has changed, so the copyright shouldn't be updated.
There was a problem hiding this comment.
This has now changed.
| fi | ||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # 1. Create variables.txt once (shared by both runs) |
There was a problem hiding this comment.
I think we are meant to make two now, not one shared one.
There was a problem hiding this comment.
We had changed our minds!!! Perhaps you missed that. Not allowed to have a separate 'create_variables' app.
There was a problem hiding this comment.
Would it not just be a case of making a variables_<dataset>.txt filepath akin to REQUEST_PATH = ${CYLC_WORKFLOW_SHARE_DIR}/etc/request_%(dataset)s.cfg while keeping it in the same (combined) app?
There was a problem hiding this comment.
As discussed at the technical meeting this morning, the reason we didn't want to introduce a create_variables app that created a single variables file is because it's possible for the same variables to come from different streams depending on the model evaluation run (/ dataset) used, which means a variables file is required for each dataset. Keeping the variables file creation within configure_standardise is preferred, since it minimises the number of changes compared to the previous version of the code. Edited to add that since configure_standardise is running in parallel, each job should write its own variables file, rather than try to write a single file with the same name, which is not safe.
There was a problem hiding this comment.
Keep it as it is.
There was a problem hiding this comment.
All defensive programming and logging have been removed.
CMEW/flow.cylc
Outdated
| [[[environment]]] | ||
| ROSE_TASK_APP = standardise_model_data | ||
| SITE = {{ SITE }} | ||
| RUN_LABEL = %(dataset)s |
There was a problem hiding this comment.
It doesn't seem good to define these things (RUN_LABEL and REQUEST_PATH) twice (once in configure_standardise<recipe,dataset> and once in standardise_model_data) - could you just define them once then inherit?
There was a problem hiding this comment.
Also, please use the appropriate Cylc variable for the parameterisation variable, i.e. ${CYLC_TASK_PARAM_dataset} rather than %(dataset)s. An example of this for the recipe parameterised variable is available at https://github.com/MetOffice/CMEW/blob/main/CMEW/flow.cylc#L51.
There was a problem hiding this comment.
Would it be possible to use only ${CYLC_TASK_PARAM_dataset} in favour of ${RUN_LABEL}, please? (Using ${RUN_LABEL} obfuscates the recognisable and understandable variable ${CYLC_TASK_PARAM_dataset})
|
|
||
| def create_request(): | ||
| """Retrieve CDDS request information from Rose suite configuration. | ||
| def create_request() -> configparser.ConfigParser: |
There was a problem hiding this comment.
I note from an earlier comment of Emma's on another review that CMEW should not contain type hints:
#349 (comment)
I'm not sure why or if that's an absolute, again probably one for @ehogan.
Should there still be a docstring?
There was a problem hiding this comment.
Python functions must have docstrings. Documentation requirements: Python documentation provides more information.
There was a problem hiding this comment.
Type hinting has been removed and docstrings reinstated. docstrings were removed after previous changes were made.
There was a problem hiding this comment.
All type hinting and the inclusion of docstrings had been done 7d8aef1
| BASH_XTRACEFD=1 | ||
| set -xeuo pipefail | ||
|
|
||
| echo "[INFO] Running configure_standardise for REF and EVAL runs" |
There was a problem hiding this comment.
At the technical meeting this morning, we agreed that configure_standardise.sh should be reverted to this previous version. Specifically:
-o pipefailis only needed inrose-app.conffiles, as described in the Developer Guide: Rose requirements- The echoing and defensive programming is not required; the options provided to
setat the top of the script do the following:-x Print commands and their arguments as they are executed.-e Exit immediately if a command exits with a non-zero status.-u Treat unset variables as an error when substituting.
- If we want to validate variables, this should be done in the
flow.cylcfile, e.g. https://github.com/MetOffice/CMEW/blob/main/CMEW/flow.cylc#L5, and not in an individual app. However, this is out of scope of this issue (a new issue should be opened should we want to validate variables)
There was a problem hiding this comment.
The version of configure_standardise.sh in this PR has not been reverted to this previous version. Please revert the changes to configure_standardise.sh so that it matches this previous version.
There was a problem hiding this comment.
Basically removed some useful comments and checks. The only diffrence is that the create_variables_file.py script has to be executed before create_request_file.py otherwise it breaks. f889818
As discussed in the technical meeting this morning, |
| @@ -1,4 +1,4 @@ | |||
| # (C) Crown Copyright 2024-2025, Met Office. | |||
| # (C) Crown Copyright 2024-2026, Met Office. | |||
There was a problem hiding this comment.
Following the agreement we made at the technical meeting this morning to revert configure_standardise.sh, the rose-app.conf file for standardise_model_data should be reverted to this previous version and the standardise_model_data.sh script removed.
There was a problem hiding this comment.
The version of standardise_model_data/rose-app.conf in this PR has not been reverted to this previous version. Please revert the changes to standardise_model_data/rose-app.conf so that it matches this previous version, and then remove the standardise_model_data.sh script.
There was a problem hiding this comment.
standardise_model_data tasks run in parallel and they are standalone tasks so rose-app.conf had to be changed. You cannot revert to the serial version.
mo-nikosbaltas
left a comment
There was a problem hiding this comment.
Making comment visible.
There was a problem hiding this comment.
Done. Although not different it had only some useful logging which have been removed. 7d8aef1
| fi | ||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # 1. Create variables.txt once (shared by both runs) |
|
|
||
| def create_request(): | ||
| """Retrieve CDDS request information from Rose suite configuration. | ||
| def create_request() -> configparser.ConfigParser: |
There was a problem hiding this comment.
All type hinting and the inclusion of docstrings had been done 7d8aef1
| @@ -1,4 +1,4 @@ | |||
| # (C) Crown Copyright 2024-2025, Met Office. | |||
| # (C) Crown Copyright 2024-2026, Met Office. | |||
| fi | ||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # 1. Create variables.txt once (shared by both runs) |
| @@ -1,4 +1,4 @@ | |||
| # (C) Crown Copyright 2024-2025, Met Office. | |||
| # (C) Crown Copyright 2024-2026, Met Office. | |||
There was a problem hiding this comment.
This has now changed.
| @@ -1,18 +1,11 @@ | |||
| #!/bin/bash | |||
| # (C) Crown Copyright 2024-2026, Met Office. | |||
| # (C) Crown Copyright 2024-2025, Met Office. | |||
There was a problem hiding this comment.
Oh, now it has changed... So maybe it should now be 2026?

Closes #338
PR creation checklist for the developer
<issue_number>above ☝️ has been replaced with the issue number.mainhas been selected as the base branch.<issue_number>_<short_description_of_feature>.good first issuelabel) have been added to the PR.Climate Model Evaluation Workflow (CMEW)project has been added to the PR.Definition of Done for the developer
docdirectory, including the Quick Start section; select one of the following):PR creation checklist for the reviewer
<issue_number>above ☝️ has been replaced with the issue number.mainhas been selected as the base branch.<issue_number>_<short_description_of_feature>.good first issuelabel) have been added to the PR.Climate Model Evaluation Workflow (CMEW)project has been added to the PR.Definition of Done for the reviewer
docdirectory, including the Quick Start section; select one of the following):Important
#<pull_request_number>: <pull_request_title>when writing the merge commit message for the pull request, so the pull request number is immediately visible on GitHub, regardless of the length of the pull request title.