Any ideas on why there are large differences when regridding E3SM land sea mask with xcdat xESMF compared to cdms2 ESMF? #521
Describe your question

Overview

In e3sm_diags, I am replacing the cdms2 horizontal regridder API (ESMF, conservative) with the xcdat horizontal regridder API (xESMF, conservative). The horizontal regridder is used to regrid a land sea mask to a variable's grid before the mask is applied to the variable.

The Problem

The regridded land sea mask produced by xcdat appears to be way off compared to the one produced by cdms2. Since the regridded land sea masks differ, the masked variables differ as well. Are there any ideas on why this might be happening?

Note that I regridded other variables and the results are relatively close between both regridders. I mainly run into this problem when regridding the land sea mask.

Implementation Logic
Test Cases (from MCVE section below)
# 1. COMPARE GRIDS -- identical (PASSING)
np.testing.assert_equal(output_grid_xc.lat.data, output_grid_cd.getAxis(0))
np.testing.assert_equal(output_grid_xc.lon.data, output_grid_cd.getAxis(1))
# 2. COMPARE INPUT MASKS -- the same (PASSING)
np.testing.assert_equal(ds_mask_xr["LANDFRAC"].data, mask_tvar.data)
# 3. COMPARE REGRIDDED MASKS -- not equal, regridding results in differences (FAILING)
# FIXME: This test case is failing
np.testing.assert_equal(mask_var_regrid, mask_tvar_regrid.data)
# AssertionError:
# Arrays are not equal
# Mismatched elements: 18426 / 64800 (28.4%)
# Max absolute difference: 0.15945476
# Max relative difference: 1.
# x: array([[[1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],...
# y: array([[[1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],...
# 4. COMPARE OUTPUT MASK AVERAGES -- identical (PASSING)
mask_tvar_regrid.mean() == mask_var_regrid.mean()
# 5. COMPARE REGRIDDED MASKS WITH LOWER LIMIT -- not equal (FAILING)
# FIXME: This test case is failing
lower_limit = 0.65 # This is based on the region value for land/ocean
mask_var_limit = mask_var_regrid < lower_limit
mask_tvar_limit = mask_tvar_regrid < lower_limit
np.testing.assert_equal(mask_var_limit.data, mask_tvar_limit.data)
# AssertionError:
# Arrays are not equal
# Mismatched elements: 261 / 64800 (0.403%)
# x: array([[[False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],...
# y: array([[[False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],...

Are there any possible answers you came across?
Minimal Complete Verifiable Example (MCVE)
# %%
# flake8 noqa: F401
import cdms2
from cdms2.tvariable import TransientVariable
import numpy as np
import xcdat as xc # noqa
import xarray as xr
# Source: https://esgf-data2.llnl.gov/thredds/fileServer/user_pub_work/CMIP6/CMIP/E3SM-Project/E3SM-2-0/historical/r1i1p1f1/Amon/cl/gr/v20220830/cl_Amon_E3SM-2-0_historical_r1i1p1f1_gr_200001-201412.nc
FILEPATH = "qa/658-lat-lon-set/cl_Amon_E3SM-2-0_historical_r1i1p1f1_gr_200001-201412.nc"
# Source: https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/driver/acme_ne30_ocean_land_mask.nc
LAND_OCEAN_MASK_PATH = "e3sm_diags/driver/acme_ne30_ocean_land_mask.nc"
# %%
# ----------------------------------------------------------------------------
# xCDAT
# ----------------------------------------------------------------------------
# 1. Prepare the input variable and mask datasets.
ds_var = xr.open_dataset(FILEPATH)
ds_mask_xr = xr.open_dataset(LAND_OCEAN_MASK_PATH)
# 2. Regrid the mask variable to the input variable dataset grid.
# NOTE: xesmf "conservative_normed" is equivalent to esmf "conservative"
output_grid_xc = ds_var.regridder.grid
ds_mask_xr_regrid = ds_mask_xr.regridder.horizontal(
"LANDFRAC", output_grid_xc, tool="xesmf", method="conservative_normed"
)
mask_var_regrid = ds_mask_xr_regrid["LANDFRAC"]
# %%
# ------------------------------------------------------------------------------
# cdms2
# ------------------------------------------------------------------------------
# 1. Prepare the input datasets and variables.
ds_tvar = cdms2.open(FILEPATH)
tvar = TransientVariable(ds_tvar["cl"])
ds_mask_cd = cdms2.open(LAND_OCEAN_MASK_PATH)
mask_tvar = TransientVariable(ds_mask_cd["LANDFRAC"])
# 2. Regrid the mask variable to the input variable dataset grid.
output_grid_cd = tvar.getGrid()
mask_tvar_regrid = mask_tvar.regrid(
output_grid_cd,
regridTool="esmf",
regridMethod="conservative",
)
# ------------------------------------------------------------------------------
# TEST CASES
# ------------------------------------------------------------------------------
# 1. COMPARE GRIDS -- identical (PASSING)
np.testing.assert_equal(output_grid_xc.lat.data, output_grid_cd.getAxis(0))
np.testing.assert_equal(output_grid_xc.lon.data, output_grid_cd.getAxis(1))
# 2. COMPARE INPUT MASKS -- the same (PASSING)
np.testing.assert_equal(ds_mask_xr["LANDFRAC"].data, mask_tvar.data)
# 3. COMPARE REGRIDDED MASKS -- not equal, regridding results in differences (FAILING)
# FIXME: This test case is failing
np.testing.assert_equal(mask_var_regrid, mask_tvar_regrid.data)
# AssertionError:
# Arrays are not equal
# Mismatched elements: 18426 / 64800 (28.4%)
# Max absolute difference: 0.15945476
# Max relative difference: 1.
# x: array([[[1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],...
# y: array([[[1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],
# [1., 1., 1., ..., 1., 1., 1.],...
# 4. COMPARE OUTPUT MASK AVERAGES -- identical (PASSING)
mask_tvar_regrid.mean() == mask_var_regrid.mean()
# 5. COMPARE REGRIDDED MASKS WITH LOWER LIMIT -- not equal (FAILING)
# FIXME: This test case is failing
lower_limit = 0.65 # This is based on the region value for land/ocean
mask_var_limit = mask_var_regrid < lower_limit
mask_tvar_limit = mask_tvar_regrid < lower_limit
np.testing.assert_equal(mask_var_limit.data, mask_tvar_limit.data)
# AssertionError:
# Arrays are not equal
# Mismatched elements: 261 / 64800 (0.403%)
# x: array([[[False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],...
# y: array([[[False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],
# [False, False, False, ..., False, False, False],...
#%%
# PLOT THE RESULTS AND DIFFERENCES
mask_xesmf = mask_var_regrid.copy()
mask_xesmf.plot()
# %%
# Using `from_cdms2` to turn the TransientVariable into an xr.DataArray
# because it is easier to plot.
mask_esmf = xr.DataArray.from_cdms2(mask_tvar_regrid)
mask_esmf.plot()
# Plot the difference
(mask_xesmf - mask_esmf).plot()

Relevant log output
No response

Environment
xcdat=0.6.0rc1
INSTALLED VERSIONS
commit: None
xarray: 2023.6.0

Anything else we need to know?
API Documentation:
Any thoughts or ideas @xCDAT/core-developers?
Checking on some old results comparing cdms (conservative) vs xesmf (conservative_normed). The difference is very small (megabit diff: In [13] in validation notebook).
Thank you @chengzhuzhang! I followed your guidance and now the xESMF and ESMF regridding results are close with around 1e-7 absolute and relative max difference. I will mark my comment here as the answer.
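Since the two regridders now agree to roughly 1e-7 rather than bit-for-bit, a tolerance-based comparison is a better fit than `np.testing.assert_equal`. The following is only a sketch: `compare_regridded` is a hypothetical helper, the `rtol`/`atol` values are assumed rather than taken from e3sm_diags, and the arrays are synthetic stand-ins for the two regridded masks. The `threshold` default mirrors the `lower_limit = 0.65` used in test case 5.

```python
import numpy as np


def compare_regridded(a, b, rtol=1e-5, atol=1e-6, threshold=0.65):
    """Summarize how closely two regridded arrays agree (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    abs_diff = np.abs(a - b)
    # Relative difference is measured against b; cells where b == 0 are skipped.
    nonzero = b != 0
    rel_diff = abs_diff[nonzero] / np.abs(b[nonzero])
    return {
        "max_abs_diff": float(abs_diff.max()),
        "max_rel_diff": float(rel_diff.max()) if rel_diff.size else 0.0,
        # Cells that differ by more than the given tolerances.
        "n_mismatched": int((~np.isclose(a, b, rtol=rtol, atol=atol)).sum()),
        # Cells that land on opposite sides of the land/ocean threshold.
        "n_flipped": int(((a < threshold) != (b < threshold)).sum()),
    }


# Synthetic stand-ins for the two regridded masks (not the real data).
rng = np.random.default_rng(0)
mask_a = rng.random((180, 360))
mask_b = mask_a + 1e-7 * rng.standard_normal(mask_a.shape)

summary = compare_regridded(mask_a, mask_b)
print(summary)

# Passes when every cell agrees within the assumed tolerances.
np.testing.assert_allclose(mask_a, mask_b, rtol=1e-5, atol=1e-6)
```

Note that `n_flipped` can be nonzero even when `n_mismatched` is zero, because a cell sitting very close to the threshold can change sides under a tiny difference. That is worth keeping in mind when comparing thresholded masks as in test case 5.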
To sum up this discussion:
cdms2 was not properly importing esmf and was instead using libcf as the fallback regridding tool. You will …
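One way to catch this kind of silent fallback early is to check whether the ESMF Python bindings can be imported in the active environment before comparing results. This is a minimal sketch, not part of cdms2 or xcdat: `find_esmf_module` is a hypothetical helper, and the candidate module names `esmpy` and `ESMF` are assumptions about how the bindings are exposed, which depends on the installed version.

```python
import importlib


def find_esmf_module(candidates=("esmpy", "ESMF")):
    """Return the first candidate module name that imports cleanly, else None.

    The default candidate names are assumptions; adjust them to match
    the ESMF Python bindings installed in your environment.
    """
    for name in candidates:
        try:
            importlib.import_module(name)
        except Exception:
            # Covers a missing package as well as a broken shared-library load.
            continue
        return name
    return None


found = find_esmf_module()
if found is None:
    print("No ESMF Python bindings could be imported in this environment.")
else:
    print(f"ESMF Python bindings importable as '{found}'.")
```

A `None` result only shows that the bindings failed to import here; it does not by itself prove which regridding tool a given library ended up using.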