Update MoV code to use xCDAT #1020

Merged on May 2, 2024 (101 commits)
7c54910
add more stats for MoV driver
lee1043 Jan 8, 2024
86ca9b2
update
lee1043 Jan 9, 2024
a15e5c8
update eofs to v1.4.1
lee1043 Jan 10, 2024
8ca46ad
clean up
lee1043 Jan 11, 2024
9cc5783
pre-commit fix
lee1043 Jan 11, 2024
05fafeb
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Jan 11, 2024
5dc869f
some functions moved to io
lee1043 Jan 11, 2024
585c867
clean up
lee1043 Jan 12, 2024
1579cf1
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Jan 12, 2024
80dbdba
update
lee1043 Jan 14, 2024
60feadc
clean up
lee1043 Jan 15, 2024
53ec879
duplicate string constructor to io because of circular import error
lee1043 Jan 16, 2024
1e2073f
pre-commit fix
lee1043 Jan 16, 2024
ed93d69
use fill_template from io instead of utils
lee1043 Jan 16, 2024
79902b7
use calcTCOR from newer position and pre-commit fix
lee1043 Jan 16, 2024
e4d3498
update
lee1043 Jan 17, 2024
a087755
update
lee1043 Jan 17, 2024
348b859
clean up, add regrid utils
lee1043 Jan 17, 2024
ccbc08d
debug and updates
lee1043 Jan 18, 2024
02d3068
bug fix (continue)
lee1043 Jan 18, 2024
a0a716a
bug fix
lee1043 Jan 24, 2024
0fa57f6
add north test as a part of the driver
lee1043 Jan 24, 2024
ce979f7
bug fix
lee1043 Jan 24, 2024
e495fee
bug fix
lee1043 Jan 25, 2024
8a606bc
pre-commit fix
lee1043 Jan 25, 2024
ad84f7a
clean up
lee1043 Jan 25, 2024
2d89f57
pre-commit fix
lee1043 Jan 25, 2024
3c1a873
bug fix
lee1043 Jan 26, 2024
60772d4
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Jan 26, 2024
052d3e8
bug fix
lee1043 Jan 26, 2024
c1a28b1
bug fix
lee1043 Jan 26, 2024
8d10543
simplify, clean up, add link to PMP installation
lee1043 Jan 26, 2024
6bb67ae
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Jan 26, 2024
64a8b31
add logo and clean up in the demo notebook
lee1043 Jan 27, 2024
28e6cf8
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Jan 27, 2024
f968029
clean up
lee1043 Jan 30, 2024
5f039de
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Feb 1, 2024
8c8caeb
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Feb 2, 2024
f2c1593
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Feb 7, 2024
426e7fc
logic simplified
lee1043 Feb 7, 2024
8d4f9de
clean up
lee1043 Feb 8, 2024
6bc7232
update
lee1043 Feb 12, 2024
84e5f9d
Merge pull request #1060 from PCMDI/feature/lee1043-mov-modularize
lee1043 Feb 22, 2024
fbca25c
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Feb 22, 2024
c9c5d3c
make code calender flexible -- reduced calendar dependency
lee1043 Feb 22, 2024
131df5c
pre-commit fix
lee1043 Feb 22, 2024
65b69c4
bug fix
lee1043 Feb 26, 2024
cf6fb8a
move timeseries adjustment in a new separate file
lee1043 Feb 26, 2024
ae02779
Merge branch 'feature/1012_lee1043_stats-MoV_xcdat' into feature/lee1…
lee1043 Feb 26, 2024
dd09f56
Merge pull request #1062 from PCMDI/feature/lee1043-mov-modularize
lee1043 Feb 26, 2024
be22674
separate adjust timeseries
lee1043 Feb 26, 2024
5f9ca3e
clean up
lee1043 Feb 26, 2024
51af8ca
Merge branch 'feature/1012_lee1043_stats-MoV_xcdat' of github.com:PCM…
lee1043 Feb 26, 2024
dd91e78
clean up
lee1043 Feb 26, 2024
3df31a1
update
lee1043 Feb 27, 2024
0544c51
Merge branch 'feature/1012_lee1043_stats-MoV_xcdat' of github.com:PCM…
lee1043 Feb 29, 2024
12ed70b
clean up
lee1043 Feb 29, 2024
156a3e4
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Mar 4, 2024
cc569fb
rename to simplify
lee1043 Mar 4, 2024
8cdfa78
clean up + bug fix
lee1043 Mar 7, 2024
61da5b6
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Mar 7, 2024
496ed5f
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Mar 12, 2024
3b2dceb
pre-commit fix
lee1043 Mar 12, 2024
8ec3dac
clean up
lee1043 Mar 12, 2024
000fe20
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Mar 15, 2024
9c09dc4
add demo script but for pcmdi internal
lee1043 Apr 4, 2024
81c794f
bug fix
lee1043 Apr 4, 2024
db0a396
bug fix
lee1043 Apr 4, 2024
00fc735
add missing bounds for sanity check
lee1043 Apr 4, 2024
f2f6736
in progress...
lee1043 Apr 4, 2024
0e18825
clean up..
lee1043 Apr 4, 2024
31c4298
fix bug for SAM region
lee1043 Apr 4, 2024
c6f6a81
enable automatic assignment of eofn_obs and eofn_mod by mode name
lee1043 Apr 4, 2024
f84c310
pre-commit clean up
lee1043 Apr 4, 2024
dee6096
remove eofn_obs and eofn_mod from pcmdi params
lee1043 Apr 5, 2024
77158d4
clean up
lee1043 Apr 5, 2024
6b2a562
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Apr 5, 2024
a5fa6cd
bug fix
lee1043 Apr 15, 2024
219334b
bug fix for sign flip -- revealed by SAM test
lee1043 Apr 15, 2024
f48ea9a
update required xcdat version regarding https://github.com/PCMDI/pcmd…
lee1043 Apr 16, 2024
5880600
pre-commit fix
lee1043 Apr 16, 2024
5ad2b21
moved missing bounds adding to io function
lee1043 Apr 16, 2024
376bffb
bug fix for centered rmse
lee1043 Apr 17, 2024
b115384
pre-commit fix
lee1043 Apr 17, 2024
b14c5fb
reduce potential memory usage
lee1043 Apr 17, 2024
719d4a3
bug fix: normalize by map std for centered RMSE calculation
lee1043 Apr 19, 2024
c4d2791
keep updated
lee1043 Apr 24, 2024
113d045
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Apr 24, 2024
d886049
clean up and simplified
lee1043 Apr 24, 2024
6a414c6
Merge branch 'feature/1012_lee1043_stats-MoV_xcdat' of github.com:PCM…
lee1043 Apr 24, 2024
436cfc9
initial commit for custom season capability
lee1043 Apr 26, 2024
e600190
add custom season capability
lee1043 Apr 26, 2024
b9e5aea
updated notebook to include custom season
lee1043 Apr 26, 2024
c6d9e3c
Merge pull request #1085 from PCMDI/feature/1012_lee1043_stats-MoV_xc…
lee1043 Apr 26, 2024
5aac6f7
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Apr 26, 2024
abbbd9a
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 Apr 30, 2024
91bb1c9
bug fix
lee1043 May 1, 2024
1392682
clean up
lee1043 May 1, 2024
893c5b3
pre-commit fix
lee1043 May 1, 2024
04829b1
Merge branch 'main' into feature/1012_lee1043_stats-MoV_xcdat
lee1043 May 2, 2024
2bd8aec
clean up, more debug printout added
lee1043 May 2, 2024
debug and updates
lee1043 committed Jan 18, 2024
commit ccbc08df287b47d04bea5e54d970aa9d0a1f0ad3
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -25,7 +25,7 @@ repos:
- id: black

- repo: https://github.com/timothycrosley/isort
rev: 5.12.0
rev: 5.13.2
hooks:
- id: isort
args: ["--honor-noqa"]
@@ -34,7 +34,7 @@ repos:
# Python linting
# =======================
- repo: https://github.com/pycqa/flake8
rev: 6.0.0
rev: 7.0.0
hooks:
- id: flake8
args: ["--config=setup.cfg"]
3 changes: 2 additions & 1 deletion pcmdi_metrics/io/__init__.py
@@ -7,6 +7,7 @@
da_to_ds,
get_axis_list,
get_data_list,
get_grid,
get_latitude_bounds_key,
get_latitude_key,
get_latitude,
@@ -21,4 +22,4 @@
get_time_key,
select_subset,
)
from .default_regions_define import load_regions_specs, region_subset # noqa
from .regions import load_regions_specs, region_subset # noqa
Original file line number Diff line number Diff line change
@@ -3,7 +3,7 @@
import xarray as xr
import xcdat as xc

from pcmdi_metrics.io import da_to_ds
from pcmdi_metrics.io import da_to_ds, get_longitude, select_subset


def load_regions_specs() -> dict:
@@ -76,63 +76,67 @@ def load_regions_specs() -> dict:


def region_subset(
ds: Union[xr.Dataset, xr.DataArray], region: str, regions_specs: dict = None
ds: Union[xr.Dataset, xr.DataArray],
region: str,
data_var: str = "variable",
regions_specs: dict = None,
) -> Union[xr.Dataset, xr.DataArray]:
"""
ds: xarray.Dataset
regions_specs: dict
region: string
"""_summary_

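Parameters
----------
ds : Union[xr.Dataset, xr.DataArray]
Input data to subset.
region : str
Name of the region; must be a key of ``regions_specs``.
data_var : str, optional
Name given to the variable when ``ds`` is a DataArray and is converted to a Dataset, by default "variable"
regions_specs : dict, optional
Region definitions; when None, ``load_regions_specs()`` is used, by default None

Returns
-------
Union[xr.Dataset, xr.DataArray]
Regional subset, returned as the same type as the input ``ds``.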
"""
if isinstance(ds, xr.DataArray):
is_dataArray = True
varname = "variable"
ds = da_to_ds(ds, varname)
ds = da_to_ds(ds, data_var)
else:
is_dataArray = False

if regions_specs is None:
regions_specs = load_regions_specs()

if "domain" in list(regions_specs[region].keys()):
if "latitude" in list(regions_specs[region]["domain"].keys()):
if "domain" in regions_specs[region]:
if "latitude" in regions_specs[region]["domain"]:
lat0 = regions_specs[region]["domain"]["latitude"][0]
lat1 = regions_specs[region]["domain"]["latitude"][1]
# proceed subset
if "latitude" in (ds.coords.dims):
ds = ds.sel(latitude=slice(lat0, lat1))
elif "lat" in (ds.coords.dims):
ds = ds.sel(lat=slice(lat0, lat1))
ds = select_subset(ds, lat=(lat0, lat1))

if "longitude" in list(regions_specs[region]["domain"].keys()):
if "longitude" in regions_specs[region]["domain"]:
lon0 = regions_specs[region]["domain"]["longitude"][0]
lon1 = regions_specs[region]["domain"]["longitude"][1]

# check original dataset longitude range
if "longitude" in (ds.coords.dims):
lon_min = ds.longitude.min()
lon_max = ds.longitude.max()
elif "lon" in (ds.coords.dims):
lon_min = ds.lon.min()
lon_max = ds.lon.max()

# longitude range swap if needed
if (
min(lon0, lon1) < 0
): # when subset region lon is defined in (-180, 180) range
if (
min(lon_min, lon_max) < 0
): # if original data lon range is (-180, 180) no treatment needed
lon_min = get_longitude(ds).min().values.item()
lon_max = get_longitude(ds).max().values.item()

# Check if longitude range swap is needed
if min(lon0, lon1) < 0:
# when subset region lon is defined in (-180, 180) range
if min(lon_min, lon_max) < 0:
# if original data lon range is (-180, 180), no treatment needed
pass
else: # if original data lon range is (0, 360), convert swap lon
else:
# if original data lon range is (0, 360), convert and swap lon
ds = xc.swap_lon_axis(ds, to=(-180, 180))

# proceed subset
if "longitude" in (ds.coords.dims):
ds = ds.sel(longitude=slice(lon0, lon1))
elif "lon" in (ds.coords.dims):
ds = ds.sel(lon=slice(lon0, lon1))
ds = select_subset(ds, lon=(lon0, lon1))

if is_dataArray:
return ds["variable"]
return ds[data_var]
else:
return ds
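
The longitude handling in `region_subset` above comes down to one decision: swap the data onto a (-180, 180) axis only when the region is defined with a negative longitude bound and the data have no negative longitudes. A minimal pure-Python sketch of that decision follows. The helper names `needs_lon_swap` and `to_pm180` and the region values are hypothetical, chosen only for illustration; the PR itself delegates the conversion to `xc.swap_lon_axis`.

```python
from typing import Tuple


def needs_lon_swap(
    region_lon: Tuple[float, float], data_lon_min: float, data_lon_max: float
) -> bool:
    """Return True when data on a (0, 360) axis must be moved to (-180, 180).

    Mirrors only the branch shown in the diff: a region defined with a
    negative longitude bound, applied to data with no negative longitudes.
    """
    region_uses_negative = min(region_lon) < 0
    data_uses_negative = min(data_lon_min, data_lon_max) < 0
    return region_uses_negative and not data_uses_negative


def to_pm180(lon: float) -> float:
    """Map a longitude in degrees onto the half-open interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


# Hypothetical region spec, shaped like the entries region_subset reads.
regions_specs = {
    "demo": {"domain": {"latitude": (20.0, 80.0), "longitude": (-90.0, 40.0)}}
}
domain = regions_specs["demo"]["domain"]

if "longitude" in domain:  # membership test on the dict itself, as in the new code
    swap = needs_lon_swap(domain["longitude"], 0.5, 359.5)
    print(swap)
    print(to_pm180(270.0))
```

The reverse case (region given in (0, 360), data in (-180, 180)) is not handled in the lines shown in the diff, so the sketch leaves it out as well.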
22 changes: 22 additions & 0 deletions pcmdi_metrics/io/xcdat_dataset_io.py
@@ -197,3 +197,25 @@ def da_to_ds(d: Union[xr.Dataset, xr.DataArray], var: str = "variable") -> xr.Da
raise TypeError(
"Input must be an instance of either xarray.DataArray or xarray.Dataset"
)


def get_grid(
ds: xr.Dataset,
) -> xr.Dataset:
"""Get grid information

Parameters
----------
ds : xr.Dataset
xarray dataset to extract grid information that has latitude, longitude, and their bounds included

Returns
-------
xr.Dataset
xarray dataset with grid information
"""
lat_key = get_latitude_key(ds)
lon_key = get_longitude_key(ds)
lat_bnds_key = get_latitude_bounds_key(ds)
lon_bnds_key = get_longitude_bounds_key(ds)
return ds[[lat_key, lon_key, lat_bnds_key, lon_bnds_key]]
52 changes: 26 additions & 26 deletions pcmdi_metrics/stats/compute_statistics_dataset.py
@@ -6,7 +6,7 @@
import xcdat as xc


def _check_data_convert_to_ds_if_needed(
def da_to_ds(
d: Union[xr.Dataset, xr.DataArray], var: str = "variable"
):
if isinstance(d, xr.Dataset):
@@ -29,8 +29,8 @@ def annual_mean(dm, do, var="variable"):
"Comments": "Assumes input are 12 months climatology",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

dm_am = dm.temporal.average(var)
do_am = do.temporal.average(var)
@@ -84,8 +84,8 @@ def bias_xy(dm, do, var="variable", weights=None):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

dif = dm[var] - do[var]
if weights is None:
@@ -104,8 +104,8 @@ def bias_xyt(dm, do, var="variable"):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

ds = dm.copy(deep=True)
ds["dif"] = dm[var] - do[var]
@@ -124,8 +124,8 @@ def cor_xy(dm, do, var="variable", weights=None):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

if weights is None:
weights = dm.spatial.get_weights(axis=["X", "Y"])
@@ -155,7 +155,7 @@ def mean_xy(d, var="variable", weights=None):
"Contact": "[email protected]",
}

d = _check_data_convert_to_ds_if_needed(d, var)
d = da_to_ds(d, var)

lat_key = xc.axis.get_dim_keys(d, axis="Y")
lon_key = xc.axis.get_dim_keys(d, axis="X")
@@ -176,8 +176,8 @@ def meanabs_xy(dm, do, var="variable", weights=None):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

if weights is None:
weights = dm.spatial.get_weights(axis=["X", "Y"])
@@ -197,8 +197,8 @@ def meanabs_xyt(dm, do, var="variable"):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

ds = dm.copy(deep=True)
ds["absdif"] = abs(dm[var] - do[var])
@@ -219,8 +219,8 @@ def rms_0(dm, do, var="variable", weighted=True):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

dif_square = (dm[var] - do[var]) ** 2
if weighted:
@@ -240,8 +240,8 @@ def rms_xy(dm, do, var="variable", weights=None):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

dif_square = (dm[var] - do[var]) ** 2
if weights is None:
@@ -259,8 +259,8 @@ def rms_xyt(dm, do, var="variable"):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

ds = dm.copy(deep=True)
ds["diff_square"] = (dm[var] - do[var]) ** 2
@@ -280,8 +280,8 @@ def rmsc_xy(dm, do, var="variable", weights=None, NormalizeByOwnSTDV=False):
"Contact": "[email protected]",
}

dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

if weights is None:
weights = dm.spatial.get_weights(axis=["X", "Y"])
@@ -310,7 +310,7 @@ def std_xy(ds, var="variable", weights=None):
"Contact": "[email protected]",
}

ds = _check_data_convert_to_ds_if_needed(ds, var)
ds = da_to_ds(ds, var)

if weights is None:
weights = ds.spatial.get_weights(axis=["X", "Y"])
@@ -334,7 +334,7 @@ def std_xyt(d, var="variable"):
"Contact": "[email protected]",
}
ds = d.copy(deep=True)
ds = _check_data_convert_to_ds_if_needed(ds, var)
ds = da_to_ds(ds, var)
average = d.spatial.average(var, axis=["X", "Y"]).temporal.average(var)[var]
ds["anomaly"] = (d[var] - average) ** 2
variance = (
@@ -353,8 +353,8 @@ def zonal_mean(dm, do, var="variable"):
"Contact": "[email protected]",
"Comments": "",
}
dm = _check_data_convert_to_ds_if_needed(dm, var)
do = _check_data_convert_to_ds_if_needed(do, var)
dm = da_to_ds(dm, var)
do = da_to_ds(do, var)

dm_zm = dm.spatial.average(var, axis=["X"])
do_zm = do.spatial.average(var, axis=["X"])
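
Several of the statistics in this file (for example `rms_xy`) take an optional `weights` argument and otherwise fall back to spatial weights from xCDAT. As a rough illustration of what an area-weighted RMS difference computes, here is a pure-Python sketch. It assumes weights proportional to cos(latitude), which is a simplification of bounds-based area weights and not necessarily what `spatial.get_weights` returns; the function name `weighted_rms` and the sample grid are hypothetical.

```python
import math
from typing import Sequence


def weighted_rms(
    model: Sequence[Sequence[float]],
    obs: Sequence[Sequence[float]],
    lats_deg: Sequence[float],
) -> float:
    """Area-weighted root-mean-square difference of two lat x lon fields.

    Each row is one latitude; every cell in a row gets the weight
    cos(latitude), an assumed stand-in for bounds-based area weights.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row_m, row_o, lat in zip(model, obs, lats_deg):
        w = math.cos(math.radians(lat))
        for m, o in zip(row_m, row_o):
            weighted_sum += w * (m - o) ** 2
            weight_total += w
    return math.sqrt(weighted_sum / weight_total)


lats = [-60.0, 0.0, 60.0]
obs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
model = [[v + 2.0 for v in row] for row in obs]  # uniform offset of 2

print(weighted_rms(model, obs, lats))
```

With these weights, an error at the equator counts for more than the same error at 60 degrees latitude, which is the point of weighting by area rather than by grid-cell count.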
1 change: 1 addition & 0 deletions pcmdi_metrics/variability_mode/lib/__init__.py
@@ -21,6 +21,7 @@
)
from .landmask import data_land_mask_out, estimate_landmask # noqa
from .lib_variability_mode import ( # noqa
check_start_end_year,
debug_print,
get_domain_range,
read_data_in,
12 changes: 7 additions & 5 deletions pcmdi_metrics/variability_mode/lib/calc_stat.py
@@ -1,11 +1,12 @@
from time import gmtime, strftime

from pcmdi_metrics.io import region_subset
from pcmdi_metrics.io import get_grid, region_subset
from pcmdi_metrics.stats import bias_xy as calcBias
from pcmdi_metrics.stats import cor_xy as calcSCOR
from pcmdi_metrics.stats import mean_xy
from pcmdi_metrics.stats import rms_xy as calcRMS
from pcmdi_metrics.stats import rmsc_xy as calcRMSc
from pcmdi_metrics.utils import regrid


def calc_stats_save_dict(
@@ -60,12 +61,13 @@ def calc_stats_save_dict(
# . . . . . . . . . . . . . . . . . . . . . . . . .
if obs_compare:
if method in ["eof", "cbf"]:
ref_grid_global = eof_lr_obs.getGrid()
ref_grid_global = get_grid(eof_lr_obs)
# Regrid (interpolation, model grid to ref grid)
debug_print("regrid (global) start", debug)
eof_model_global = eof_lr.regrid(
ref_grid_global, regridTool="regrid2", mkCyclic=True
)
# eof_model_global = eof_lr.regrid(eof_lr,
# ref_grid_global, regridTool="regrid2", mkCyclic=True
# )
eof_model_global = regrid(eof_lr, ref_grid_global)
debug_print("regrid end", debug)
# Extract subdomain
# eof_model = eof_model_global(region_subdomain)