Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vertical level extraction issue #975

Closed
lee1043 opened this issue Aug 22, 2023 · 1 comment · Fixed by #978
Closed

Vertical level extraction issue #975

lee1043 opened this issue Aug 22, 2023 · 1 comment · Fixed by #978
Assignees
Labels
bug QC Quality Control

Comments

@lee1043
Copy link
Contributor

lee1043 commented Aug 22, 2023

Vertical level extraction does not work properly when given vertical coordinate is hybrid coord (often named as lev), instead of pressure lev (plev). The mean climate annual cycle calculation code extract a layer that is incorrect, which influence statistics, making the erroneous model as outlier. Rather, such models should be excluded (short-term) or corrected (in long-term) for the summary statistics.

mean_clim_parallel_coordinate_plot_4seasons_cmip56

In the above plot, for ta-850: HadGEM3-GC31-LL, HadGEM3-GC31-MM, and UKESM1-0-LL are marked at the top of the axis, showing rms_xyt > 90, which amplifies distribution of statistics from CMIP6 models.

Below is a part of ncdump result for UKESM1-0-LL nc file.

	double lev(lev) ;
		lev:bounds = "lev_bnds" ;
		lev:units = "m" ;
		lev:axis = "Z" ;
		lev:positive = "up" ;
		lev:long_name = "hybrid height coordinate" ;
		lev:standard_name = "atmosphere_hybrid_height_coordinate" ;
		lev:formula = "z = a + b*orog" ;
		lev:formula_terms = "a: lev b: b orog: orog" ;

	float ta(time, lev, lat, lon) ;
		ta:standard_name = "air_temperature" ;
		ta:long_name = "Air Temperature" ;
		ta:comment = "Air Temperature" ;
		ta:units = "K" ;
		ta:original_name = "mo: (stash: m01s30i111, lbproc: 128)" ;
		ta:cell_methods = "area: time: mean" ;
		ta:cell_measures = "area: areacella" ;
		ta:missing_value = 1.e+20f ;
		ta:_FillValue = 1.e+20f ;

While other models:

	double plev(plev) ;
		plev:units = "Pa" ;
		plev:axis = "Z" ;
		plev:positive = "down" ;
		plev:long_name = "pressure" ;
		plev:standard_name = "air_pressure" ;

	float ta(time, plev, lat, lon) ;
		ta:standard_name = "air_temperature" ;
		ta:long_name = "Air Temperature" ;
		ta:comment = "Air Temperature" ;
		ta:units = "K" ;
		ta:original_name = "mo: (stash: m01s30i294, blev: [1000.0, 925.0, 850.0, 700.0, 600.0, 500.0, 400.0, 300.0, 250.0, 200.0, 15
0.0, 100.0, 70.0, 50.0, 30.0, 20.0, 10.0, 5.0, 1.0], lbproc: 128) / (stash: m01s30i304, blev: [1000.0, 925.0, 850.0, 700.0, 600.0, 500.0, 40
0.0, 300.0, 250.0, 200.0, 150.0, 100.0, 70.0, 50.0, 30.0, 20.0, 10.0, 5.0, 1.0], lbproc: 128)" ;
		ta:cell_methods = "time: mean" ;
		ta:cell_measures = "area: areacella" ;
		ta:missing_value = 1.e+20f ;
		ta:_FillValue = 1.e+20f ;
		ta:history = "2022-05-12T23:53:56Z altered by CMOR: Inverted axis: plev." ;

Action item

Short term fix: Exclude those models

Long term goal: apply vertical interpolation if needed

@lee1043 lee1043 added bug QC Quality Control labels Aug 22, 2023
@lee1043 lee1043 self-assigned this Aug 22, 2023
@lee1043
Copy link
Contributor Author

lee1043 commented Aug 23, 2023

Separate but related issue:

When plev is like below,

 plev = 100000.00000001, 92500.00000001, 85000.00000001, 70000.00000001, 
    60000.00000001, 50000.00000001, 40000.00000001, 30000.00000001, 
    25000.00000001, 20000.00000001, 15000.00000001, 10000.00000001, 
    7000.00000001, 5000.00000001, 3000.00000001, 2000.00000001, 
    1000.00000001, 500.00000001, 100.00000001 ;
}

ds.sel(plev=85000) from load_and_regrid.py

returns the following error:

Traceback (most recent call last):
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3653, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1698, in pandas._libs.hashtable.Float64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1722, in pandas._libs.hashtable.Float64HashTable.get_item
KeyError: 85000.0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/indexes.py", line 769, in sel
    indexer = self.index.get_loc(label_value)
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3655, in get_loc
    raise KeyError(key) from err
KeyError: 85000.0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/bin/mean_climate_driver.py", line 237, in <module>
    ds_test = load_and_regrid(data_path=test_data_full_path, varname=varname, varname_in_file=varname_testdata, level=level, t_grid=t_grid, decode_times=True, regrid_tool=regrid_tool, debug=debug)
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/pcmdi_metrics/mean_climate/lib/load_and_regrid.py", line 61, in load_and_regrid
    ds = ds.sel(plev=level)
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/dataset.py", line 3020, in sel
    query_results = map_index_queries(
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/indexing.py", line 190, in map_index_queries
    results.append(index.sel(labels, **options))
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/indexes.py", line 771, in sel
    raise KeyError(
KeyError: "not all values found in index 'plev'. Try setting the `method` keyword argument (example: method='nearest')."

Cause of error: given key (85000) is not exactly matching with key in the file (85000.00000001)

Workaround: Add method='nearest' to ds.sel(lev=85000). However this can cause error if plev does not have layer for 850 hPa and nearest layer is far below or above (e.g., 700 or 500 hPa).

@lee1043 lee1043 linked a pull request Aug 30, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug QC Quality Control
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant