Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extremes Metrics #962

Merged
merged 232 commits into from
Jan 1, 2024
Merged

Extremes Metrics #962

merged 232 commits into from
Jan 1, 2024

Conversation

acordonez
Copy link
Collaborator

@acordonez acordonez commented Jul 19, 2023

Extremes Metrics

See the README for more details.

The Github.io documentation will be updated in a second PR (#990).

This PR contains:
The extremes driver and library code
Jupyter notebook demo

The demo (Notebook #8) uses precipitation data that is already distributed as part of the demo data bundle, but processes it to make a lower resolution version that runs much more quickly. At the original resolution, it takes me around an hour to run each example.

Dependency note: This code also requires adding the "numdifftools" package as a dependency (this is available on conda-forge).

gleckler1 and others added 30 commits April 25, 2018 18:35
# Revision by MFW 08/10/2018: Removed the 4 seasons for speed. Added some comments
Update make_extrema_longrun_3day.py
@acordonez
Copy link
Collaborator Author

@acordonez I am getting a new error from Demo8 notebook's extremes_driver.py -p basic_extremes_param.py. Does it look familiar to you? (Below log message is updated)

No sftlf file found for GISS-E2-H r6i1p1

-----------------------
model, run, variable: GISS-E2-H r6i1p1 pr
test_data (model in this case) full_path:
   demo_output/extremes_tmp/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc
Generating land sea mask.
Generating precipitation block extrema.
Writing results to netCDF.
Generating metrics.
INFO::2023-12-22 13:00::pcmdi_metrics:: Results saved to a json file: [/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/GISS-E2-H_block_extremes_metrics.json](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/GISS-E2-H_block_extremes_metrics.json)
2023-12-22 13:00:39,824 [INFO]: base.py(write:251) >> Results saved to a json file: [/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/GISS-E2-H_block_extremes_metrics.json](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/GISS-E2-H_block_extremes_metrics.json)
2023-12-22 13:00:39,824 [INFO]: base.py(write:251) >> Results saved to a json file: [/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/GISS-E2-H_block_extremes_metrics.json](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/GISS-E2-H_block_extremes_metrics.json)
INFO::2023-12-22 13:00::pcmdi_metrics:: Results saved to a json file: [/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/block_extremes_metrics.json](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/block_extremes_metrics.json)
2023-12-22 13:00:56,588 [INFO]: base.py(write:251) >> Results saved to a json file: [/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/block_extremes_metrics.json](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/block_extremes_metrics.json)
2023-12-22 13:00:56,588 [INFO]: base.py(write:251) >> Results saved to a json file: [/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/block_extremes_metrics.json](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/Documents/Research/git/pcmdi_metrics_20230620_pcmdi/pcmdi_metrics/doc/jupyter/Demo/demo_output/extremes_ex1/block_extremes_metrics.json)
Generating return values.
demo_output/extremes_ex1/netcdf/GISS-E2-H_r6i1p1_land_Rx5day_2000-2005.nc
Return value for single realization
Stationary case
[/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/return_value.py:288](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/return_value.py:288): RuntimeWarning: Mean of empty slice
  scale_factor = np.abs(np.nanmean(data))
demo_output/extremes_ex1/netcdf/GISS-E2-H_r6i1p1_land_Rx1day_2000-2005.nc
Return value for single realization
Stationary case
[/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/return_value.py:288](https://file+.vscode-resource.vscode-cdn.net/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/return_value.py:288): RuntimeWarning: Mean of empty slice
  scale_factor = np.abs(np.nanmean(data))
2023-12-22 13:00:57,246 [WARNING]: dataset.py(open_dataset:128) >> "No time coordinates were found in this dataset to decode. If time coordinates were expected to exist, make sure they are detectable by setting the CF 'axis' or 'standard_name' attribute (e.g., ds['time'].attrs['axis'] = 'T' or ds['time'].attrs['standard_name'] = 'time'). Afterwards, try decoding again with `xcdat.decode_time`."
2023-12-22 13:00:57,246 [WARNING]: dataset.py(open_dataset:128) >> "No time coordinates were found in this dataset to decode. If time coordinates were expected to exist, make sure they are detectable by setting the CF 'axis' or 'standard_name' attribute (e.g., ds['time'].attrs['axis'] = 'T' or ds['time'].attrs['standard_name'] = 'time'). Afterwards, try decoding again with `xcdat.decode_time`."
Traceback (most recent call last):
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/bin/extremes_driver.py", line 607, in <module>
    tmp = compute_metrics.metrics_json_return_value(
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/compute_metrics.py", line 638, in metrics_json_return_value
    obs = obs.bounds.add_missing_bounds()
AttributeError: 'NoneType' object has no attribute 'bounds'

@lee1043 I think the new push should fix this error

@lee1043
Copy link
Contributor

lee1043 commented Dec 22, 2023

@acordonez thank you for the fix, it resolved the error. However there is a new error as below, do you have an idea?

%%bash
extremes_driver.py  -p basic_extremes_param.py \
--case_id extremes_ex2 \
--reference_data_path demo_output/extremes_tmp/pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc  \
--reference_data_set GPCP-1-3 \
--generate_sftlf \
--regrid True
Metrics output path not found.
Creating metrics output directory demo_output/extremes_ex2
No reference sftlf file template provided.

-----------------------
model, run, variable: Reference GPCP-1-3 pr
test_data (model in this case) full_path:
   demo_output/extremes_tmp/pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc
Generating land sea mask.
Generating precipitation block extrema.
Writing results to netCDF.
No sftlf file found for GISS-E2-H r6i1p1

-----------------------
model, run, variable: GISS-E2-H r6i1p1 pr
test_data (model in this case) full_path:
   demo_output/extremes_tmp/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc
Generating land sea mask.
Generating precipitation block extrema.
Writing results to netCDF.
Generating metrics.
Traceback (most recent call last):
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/bin/extremes_driver.py", line 458, in <module>
    result_dict = compute_metrics.metrics_json(
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/compute_metrics.py", line 565, in metrics_json
    obs_m = obs_dict[m].regridder.horizontal(
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/xcdat/regridder/accessor.py", line 323, in horizontal
    regridder = regrid_tool(input_grid, output_grid, **options)
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/xcdat/regridder/regrid2.py", line 52, in __init__
    self._dst_lat = self._output_grid.bounds.get_bounds("Y")
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/xcdat/bounds.py", line 247, in get_bounds
    raise KeyError(
KeyError: "No bounds data variables were found for the 'Y' axis. Make sure the dataset has bound data vars and their names match the 'bounds' attributes found on their related time coordinate variables. Alternatively, you can add bounds with `ds.bounds.add_missing_bounds()` or `ds.bounds.add_bounds()`."

@acordonez
Copy link
Collaborator Author

@acordonez thank you for the fix, it resolved the error. However there is a new error as below, do you have an idea?

%%bash
extremes_driver.py  -p basic_extremes_param.py \
--case_id extremes_ex2 \
--reference_data_path demo_output/extremes_tmp/pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc  \
--reference_data_set GPCP-1-3 \
--generate_sftlf \
--regrid True
Metrics output path not found.
Creating metrics output directory demo_output/extremes_ex2
No reference sftlf file template provided.

-----------------------
model, run, variable: Reference GPCP-1-3 pr
test_data (model in this case) full_path:
   demo_output/extremes_tmp/pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc
Generating land sea mask.
Generating precipitation block extrema.
Writing results to netCDF.
No sftlf file found for GISS-E2-H r6i1p1

-----------------------
model, run, variable: GISS-E2-H r6i1p1 pr
test_data (model in this case) full_path:
   demo_output/extremes_tmp/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc
Generating land sea mask.
Generating precipitation block extrema.
Writing results to netCDF.
Generating metrics.
Traceback (most recent call last):
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/bin/extremes_driver.py", line 458, in <module>
    result_dict = compute_metrics.metrics_json(
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/compute_metrics.py", line 565, in metrics_json
    obs_m = obs_dict[m].regridder.horizontal(
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/xcdat/regridder/accessor.py", line 323, in horizontal
    regridder = regrid_tool(input_grid, output_grid, **options)
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/xcdat/regridder/regrid2.py", line 52, in __init__
    self._dst_lat = self._output_grid.bounds.get_bounds("Y")
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/xcdat/bounds.py", line 247, in get_bounds
    raise KeyError(
KeyError: "No bounds data variables were found for the 'Y' axis. Make sure the dataset has bound data vars and their names match the 'bounds' attributes found on their related time coordinate variables. Alternatively, you can add bounds with `ds.bounds.add_missing_bounds()` or `ds.bounds.add_bounds()`."

@lee1043 At this point the obs dataset should have bounds because the last change I made was to move the call to add_missing_bounds() into the obs preprocessing section of compute_metrics. So I need to see if this is a data issue or a code issue.

@lee1043
Copy link
Contributor

lee1043 commented Dec 22, 2023

@acordonez I think this reminds me that I have seen a similar issue which was occurred by regrid2 of xcdat was dropping bounds information after interpolating. I cannot recall this issue was reported to the xcdat team, if not, I will report it. During the mean time, I think a workaround could be after the regrid2 interpolation, apply add_missing_bounds, to re-add bounds information.

@lee1043
Copy link
Contributor

lee1043 commented Dec 22, 2023

@acordonez I have to retracrt my above comment because it seems it might not be the case for this error, because the error was coming from here:

                    obs_m = obs_dict[m].regridder.horizontal(
                        season, target, tool="regrid2"
                    )

Could you check if obs_dict[m] have bounds in it?

@acordonez
Copy link
Collaborator Author

@lee1043 I think I see the issue, there is a second function that would need the add_missing_bounds call. But I'm seeing if there's somewhere in the driver I can put it so I'm not adding bounds to the same dataset over and over.

@acordonez
Copy link
Collaborator Author

@lee1043 I did some more digging and obs_dict[m] definitely has bounds. This screenshot shows the outputs of some print statements before line 565 when the code fails. The dataset has the lat_bnds, lon_bnds, and time_bnds variables, and lat, lon, and time all have the bounds attribute. Along with that, the input dataset definitely has bounds, and earlier in the code I explicitly check for missing bounds. So I'm stumped why this call is failing.

Screenshot 2023-12-22 at 1 56 10 PM

@acordonez
Copy link
Collaborator Author

@lee1043 In the error message it looks like maybe it's the target grid that's missing bounds?

@lee1043
Copy link
Contributor

lee1043 commented Dec 22, 2023

@lee1043 In the error message it looks like maybe it's the target grid that's missing bounds?

That could be, as the error was coming from the line with output_grid, but that is still strange... as I was expecting it supposed to be handled by xCDAT. Can you create a minimal code to reproduce this error, with saving the obs_dict[m] as a netcdf file as an input for the minimal code, and then open an issue in xCDAT repo?

@acordonez
Copy link
Collaborator Author

@lee1043 I added bounds to the target grid and am still seeing the issue. I'll work on an example for an issue.

@acordonez
Copy link
Collaborator Author

@lee1043 I added bounds to the target grid and am still seeing the issue. I'll work on an example for an issue.

Actually adding bounds to the target might have fixed the issue, I think I'm seeing an error in a different part of the code now.

@acordonez
Copy link
Collaborator Author

@lee1043 Sorry for all the back-and-for but this new change adds bounds to the target grid for regridding, and works on my end.

@lee1043
Copy link
Contributor

lee1043 commented Dec 22, 2023

@acordonez thank you for the fix. The code runs without the error, but the output json does not look right. Thank you for patiently addressing these issues!

extremes_driver.py  -p basic_extremes_param.py
import os
import json
output_path = os.path.join(demo_output_directory, "extremes_ex1/GISS-E2-H_block_extremes_metrics.json")
with open(output_path) as f:
    metric = json.load(f)["RESULTS"]
print(json.dumps(metric, indent=2))
{
  "GISS-E2-H": {
    "r6i1p1": {
      "Rx1day": {
        "land": {
          "mean": {
            "ANN": NaN,
            "DJF": NaN,
            "JJA": NaN,
            "MAM": NaN,
            "SON": NaN
          },
          "std_xy": {
            "ANN": NaN,
            "DJF": NaN,
            "JJA": NaN,
            "MAM": NaN,
            "SON": NaN
          }
        }
      },
      "Rx5day": {
        "land": {
          "mean": {
            "ANN": NaN,
            "DJF": NaN,
            "JJA": NaN,
            "MAM": NaN,
            "SON": NaN
          },
          "std_xy": {
            "ANN": NaN,
            "DJF": NaN,
            "JJA": NaN,
            "MAM": NaN,
            "SON": NaN
          }
        }
      }
    }
  }
}

And maybe related on this, it seems below error is coming from plotting.

extremes_driver.py  -p basic_extremes_param.py --case_id "extremes_ex3" --plots
Metrics output path not found.
Creating metrics output directory demo_output/extremes_ex3
No sftlf file found for GISS-E2-H r6i1p1

-----------------------
model, run, variable: GISS-E2-H r6i1p1 pr
test_data (model in this case) full_path:
   demo_output/extremes_tmp/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc
Generating land sea mask.
Generating precipitation block extrema.
Writing results to netCDF.
Creating maps
Traceback (most recent call last):
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/bin/extremes_driver.py", line 431, in <module>
    meta = plot_extremes.make_maps(
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/plot_extremes.py", line 19, in make_maps
    plot_extremes(data, index, model, run, yrs, output_template)
  File "/Users/lee1043/mambaforge/envs/pmp_devel_20231129/lib/python3.10/site-packages/pcmdi_metrics/extremes/lib/plot_extremes.py", line 44, in plot_extremes
    min_lev = math.floor(ds.min() [/](https://file+.vscode-resource.vscode-cdn.net/) 10) * 10
ValueError: cannot convert float NaN to integer

@acordonez
Copy link
Collaborator Author

@lee1043 This latest commit fixed the all-Nan issue you flagged on my end. The notebook ran all the way for me and I'm hoping it runs for you!

@lee1043 lee1043 mentioned this pull request Dec 23, 2023
@lee1043 lee1043 merged commit 0e1ea73 into main Jan 1, 2024
5 checks passed
@lee1043 lee1043 deleted the 534_ao_extremes branch January 1, 2024 06:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MFW extremes
6 participants