Skip to content

Conversation

@lewisjared
Copy link
Contributor

@lewisjared lewisjared commented Feb 9, 2026

Summary

  • Add CMIP7 as an alternative data source for all 16 ESMValTool diagnostics in data_requirements
  • Add CMIP7 to recipe infrastructure for CMIP7 replacing the deprecated drs attribute
  • Add get_cmip_source_type() helper so update_recipe methods dynamically select CMIP6 or CMIP7 input files

Per-diagnostic pattern

Each CMIP7 DataRequirement mirrors its CMIP6 counterpart with:

  • frequency filter instead of table_id
  • variant_label instead of member_id in group_by
  • SourceDatasetType.CMIP7 for AddSupplementaryDataset

Add CMIP7 as an alternative data source for all 16 ESMValTool
diagnostics using OR-logic (tuple-of-tuples) in data_requirements.

Infrastructure:
- Add CMIP7 to FACETS mapping in recipe.py with ESMValCore-native facets
- Add CMIP7 to drs/rootpath config in build_cmd
- Add get_cmip_source_type() helper for update_recipe methods
- Add CMIP7 to build_execution_result source type detection

Per-diagnostic pattern:
- CMIP7 requirements use frequency instead of table_id in filters
- Use variant_label instead of member_id in group_by
- Use SourceDatasetType.CMIP7 for AddSupplementaryDataset
- update_recipe methods use get_cmip_source_type() dynamically

Special handling:
- cloud_scatterplots: rename get_cmip6_data_requirements to
  get_cmip_data_requirements, use dynamic project in suptitle
- climate_at_global_warming_levels: custom cmip7_matching_facets
  without table_id
- regional diagnostics: add CMIP7 as third OR-alternative alongside
  CMIP6/obs4MIPs, plus CMIP7 test data specs
@codecov
Copy link

codecov bot commented Feb 9, 2026

Codecov Report

❌ Patch coverage is 89.05109% with 15 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...ool/src/climate_ref_esmvaltool/diagnostics/base.py 71.42% 2 Missing and 2 partials ⚠️
...ef-esmvaltool/src/climate_ref_esmvaltool/recipe.py 42.85% 2 Missing and 2 partials ⚠️
...limate-ref-core/src/climate_ref_core/esgf/cmip7.py 81.81% 2 Missing ⚠️
...climate-ref-core/src/climate_ref_core/providers.py 60.00% 1 Missing and 1 partial ⚠️
...ol/diagnostics/climate_at_global_warming_levels.py 71.42% 1 Missing and 1 partial ⚠️
...te-ref-core/src/climate_ref_core/cmip6_to_cmip7.py 98.30% 0 Missing and 1 partial ⚠️
Files with missing lines Coverage Δ
...-esmvaltool/src/climate_ref_esmvaltool/__init__.py 100.00% <100.00%> (ø)
...esmvaltool/diagnostics/climate_drivers_for_fire.py 100.00% <100.00%> (ø)
..._esmvaltool/diagnostics/cloud_radiative_effects.py 100.00% <100.00%> (ø)
...e_ref_esmvaltool/diagnostics/cloud_scatterplots.py 100.00% <100.00%> (ø)
...tool/src/climate_ref_esmvaltool/diagnostics/ecs.py 100.00% <100.00%> (ø)
...ool/src/climate_ref_esmvaltool/diagnostics/enso.py 100.00% <100.00%> (ø)
.../src/climate_ref_esmvaltool/diagnostics/example.py 100.00% <100.00%> (ø)
...valtool/diagnostics/regional_historical_changes.py 95.41% <100.00%> (ø)
...e_ref_esmvaltool/diagnostics/sea_ice_area_basic.py 100.00% <100.00%> (ø)
..._ref_esmvaltool/diagnostics/sea_ice_sensitivity.py 100.00% <100.00%> (ø)
... and 9 more

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Replace repo/tag_or_commit/url in CondaDiagnosticProvider with a
pip_packages attribute that handles all dev installs uniformly.
Pin ESMValCore to main branch commit 71ab2196 which adds CMIP7 support.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the climate-ref-esmvaltool provider to support running all ESMValTool diagnostics against CMIP7 datasets (alongside existing CMIP6/obs4MIPs support), mainly by adding CMIP7 facet mappings, CMIP7-aware recipe generation, and OR-ed data_requirements.

Changes:

  • Add CMIP7 facet mapping + ESMValTool config entries (drs/rootpath) and CMIP7 path preparation.
  • Update diagnostics to accept CMIP6 or CMIP7 via OR-ed data_requirements, and select the active CMIP source dynamically in update_recipe.
  • Add/adjust CMIP7 test cases for regional diagnostics and update the base config test expectation.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/climate-ref-esmvaltool/tests/unit/diagnostics/test_base.py Update expected rootpath entries after adding CMIP7.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/recipe.py Add CMIP7 facet mapping; tweak prepare_climate_data handling.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/base.py Add get_cmip_source_type() helper; add CMIP7 to ESMValTool config and selector extraction.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/zec.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/tcre.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/tcr.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/sea_ice_sensitivity.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/sea_ice_area_basic.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/regional_historical_changes.py Add CMIP7 data requirement alternatives + CMIP7 TestCases.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/example.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/enso.py Add CMIP7 alternative requirements + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/ecs.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/cloud_scatterplots.py Generalize CMIP requirements (CMIP6/CMIP7) + make plot title project-aware.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/cloud_radiative_effects.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/climate_drivers_for_fire.py Add CMIP7 alternative requirement + dynamic recipe update source.
packages/climate-ref-esmvaltool/src/climate_ref_esmvaltool/diagnostics/climate_at_global_warming_levels.py Add CMIP7 alternative requirement + CMIP7-specific matching/grouping facets.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…branding

ESMValCore dev version replaced rootpath/drs/search_esgf config with
a projects-style format using LocalDataSource. Update config generation
to match. Also add 3D atmospheric variables (ta, ua, va, hus, etc.)
to VARIABLE_BRANDING with vertical_label 'al', and fix symlink
re-creation in prepare_climate_data.
CMIP7 test cases used table_id: "Amon" which excludes areacella (an fx
variable). Replace with frequency: ["fx", "mon"] to match the CMIP6
test case pattern, ensuring fixed-field variables are fetched.
@lewisjared
Copy link
Contributor Author

@bouweandela I was able to run both the cmip6/cmip7 test cases for the historical annual cycle locally.

Let me know if there is any obvious mistakes before I start fleshing out more of the test cases.

Use create_cmip7_filename() to generate proper CMIP7 filenames
(e.g., tas_tavg-h2m-hxy-u_mon_glb_gn_CanESM5_historical_r1i1p1f1_185001-201412.nc)
instead of keeping CMIP6 filenames. Update the ESMValTool CMIP7
filename_template to match the official MIP-DRS7 specification.
@bouweandela
Copy link
Contributor

bouweandela commented Feb 9, 2026

Looks like a great start @lewisjared!

Regarding the CMIP7 data requirements:

  • the CMIP6 table_id consisted of both the realm and the frequency, so you may want to put entries for realm in the data requirements too
  • the ESMValTool diagnostics have been developed for global data, so I doubt that anything other than region: glb will work

I noticed some failing tests because facets are different per project (e.g. obs4MIPs doesn't have ensemble members) but that is not reflected in the current implementation, see https://github.com/Climate-REF/climate-ref/actions/runs/21719510099/job/62644893377#logs

@bouweandela
Copy link
Contributor

We may also want to consider/check which the diagnostic authors which branding_suffix should be used for the variables in the data requirements. All of them? Only one? Different per variable?

@lewisjared
Copy link
Contributor Author

the CMIP6 table_id consisted of both the realm and the frequency, so you may want to put entries for realm in the data requirements too

According to the global attributes doc "Note that "realm" may be assigned multiple realms, separated by a single space, with the first one listed considered primary.".
I think we will currently compare exactly. I'll make a issue and follow up with this.

Do we have any information about how the external_variables files will be named? I can't find anything on the CMIP7 guidance site.

@bouweandela
Copy link
Contributor

According to the global attributes doc "Note that "realm" may be assigned multiple realms, separated by a single space, with the first one listed considered primary.".

I used the CMIP6 table_id instead of just frequency in the data requirements for the ESMValTool diagnostics because I noticed that some of the diagnostics failed to run if the realm was different from expected. ESMValTool recipes will expect the value of mip, which I believe translates to realm in CMIP7 and table_id in CMIP6, to have the value for table_id (minus the "Table " bit) used at the top of the CMOR tables. For CMIP7, the table_id in the CMOR tables appears to be the same as realm: https://github.com/WCRP-CMIP/cmip7-cmor-tables/blob/6737d39d5424ad20550ad117f28512cf69fa2901/tables/CMIP7_aerosol.json#L15, while for CMIP6 it was usually a composite of the first (few) letter(s) of the realm followed by the frequency https://github.com/PCMDI/cmip6-cmor-tables/blob/087fe45d21c082e28723e0f930e4266abe91b853/Tables/CMIP6_Amon.json#L5.

Do we have any information about how the external_variables files will be named? I can't find anything on the CMIP7 guidance site.

They will follow the same naming scheme as other variables. Here is an example of the areacella variable in the CMOR tables: https://github.com/WCRP-CMIP/cmip7-cmor-tables/blob/6737d39d5424ad20550ad117f28512cf69fa2901/tables/CMIP7_atmos.json#L54
and with the CMIP6 as CMIP7 CMORizer example notebook that results in something like MIP-DRS7/CMIP7/CMIP/PCMDI/PCMDI-test-1-0/historical/r1i1p1f3/glb/fx/areacella/ti-u-hxy-u/gn/v20260109/areacella_ti-u-hxy-u_fx_glb_gn_PCMDI-test-1-0_historical_r1i1p1f3.nc

…ppings

Replace raw dict usage with a frozen attrs class for type-safe
serialisation/deserialisation of Data Request variable mappings.
The class is used both in the extract script (to_dict for JSON output)
and at load time (from_dict when reading the bundled JSON).
- Replace __file__ with stable relative path in extract script metadata
- Fix bundled JSON description containing absolute path
- Validate branding suffix format before splitting in extract script
- Fix docstring to match raise-on-duplicate behavior
- Move cache check before xr.open_dataset in CMIP7 converter
The function was using cmip6_path.name for the output filename
instead of generating a proper CMIP7 filename. Import and use
create_cmip7_filename to produce correct CMIP7 DRS filenames.
- Fix docstring grammar in extract script
- Add 60s timeout to urllib.request.urlopen
- Add assert_not_called checks for mock_open/mock_convert in cache test
…nstruction

_convert_file_to_cmip7 now looks up the DReq entry using table_id and
variable_id to inject branding_suffix and region into the facets before
calling create_cmip7_filename. This prevents empty branding components
in the generated filenames.

Tests no longer mock create_cmip7_filename, instead providing table_id
in the facets so real filename construction is exercised.
For variables where out_name differs from variable_id (e.g. tasmax ->
tas), the filename and DRS path now correctly use the CMIP7 out_name.
The variable_id attribute in the dataset stays as the CMIP6 identity.

- create_cmip7_filename and create_cmip7_path prefer out_name over
  variable_id (with fallback)
- convert_cmip6_to_cmip7_attrs sets out_name and branded_variable
  from the DReq entry
- _convert_file_to_cmip7 injects out_name during DReq enrichment
- Added end-to-end tests for tasmax filename generation
- Make _get_dreq_entry a public API (get_dreq_entry) for cross-module use
- Use out_name in DRS path construction in _convert_file_to_cmip7
- Use DReq region instead of hardcoded 'glb' in convert_cmip6_to_cmip7_attrs
- Fix pytest parametrize ids to use pytest.param with explicit id strings
- Add tests for out_name != variable_id (tasmax/tasmin) and non-glb region (ImonAnt)
…ments

* origin/dr-mappings:
  fix: remove redundant information from breaking changelog entry for Data Request API
  docs: add breaking changelog entry for DReq API changes
  fix: address third round of PR review comments
  fix: use out_name from DReq for CMIP7 filenames and paths
  fix: enrich cmip7_facets with DReq branding_suffix before filename construction
  chore: exclude setting chnage
  fix: address second round of PR review comments
  fix(core): use create_cmip7_filename in _convert_file_to_cmip7
  fix: address PR review comments
  docs: add changelog entry for PR #530
  refactor(core): update documentation and remove unused CMIP7 name mapping functions
  feat(core): add DReqVariableMapping attrs class for CMIP6-to-CMIP7 mappings

# Conflicts:
#	packages/climate-ref-core/src/climate_ref_core/cmip6_to_cmip7.py
#	packages/climate-ref-core/src/climate_ref_core/esgf/cmip7.py
Merge dr-mappings branch and update CMIP7 data requirements:
- Add region: glb filter to all CMIP7 FacetFilter instances
- Add branded_variable_name to regional historical test cases
- Add region and branded_variable_name to CMIP7Request.available_facets
- Remove duplicate filename creation code from merge artifact in cmip7.py
Copilot AI review requested due to automatic review settings February 12, 2026 14:41
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +96 to +98
"experiment_id": (
# TODO: Redetermine the scenario naming for CMIP7 and update these accordingly
),
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CMIP7 DataRequirement for ClimateAtGlobalWarmingLevels has an empty tuple for experiment_id (lines 96-98). This means no experiments will be matched, making this diagnostic non-functional for CMIP7 data. The TODO comment indicates this is intentional (scenario names haven't been determined for CMIP7 yet), but an empty tuple will cause the FacetFilter to match nothing. Consider either: (1) removing the CMIP7 DataRequirement until scenario names are finalized, or (2) commenting it out with a clearer explanation that it's a placeholder for future implementation.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants