Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PresRat correction #215

Open
wants to merge 78 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
78 commits
Select commit Hold shift + click to select a range
d88de6f
Initializing PresRat with reference
castelao Apr 23, 2024
3d6f9ae
feat: ZeroRateMixin()
castelao Apr 23, 2024
8d3fb02
refact: Moving ZeroRateMixin to mixins
castelao May 24, 2024
c8ad983
Adding ipython into dev environment
castelao May 24, 2024
86ac9cf
fix: ZeroRateMixin shouldn't inherit DataRetrievalBase
castelao May 24, 2024
a91dad7
refact: Moving PresRat to qdm sub-module
castelao May 24, 2024
41504a0
fix: Use the given threshold (it was hardcoded before)
castelao May 24, 2024
1686520
PresRat using ZeroRateMixin
castelao May 24, 2024
b92ec3c
doc: Explicit steps for PresRat
castelao May 27, 2024
720848d
Prototype for quantiles on time windows
castelao May 28, 2024
b173b6c
fix: Missing load PresRat at module
castelao May 28, 2024
72ebb5e
Overwritting QDM methods for PresRat requirements
castelao Jun 3, 2024
67a3387
Don't need to overwrite __init__ if defining threshold in the runtime
castelao Jun 3, 2024
d53d8b6
doc: More details on what is PresRat
castelao Jun 3, 2024
cf16c82
Testing apply_zero_precipitation_rate()
castelao Jun 6, 2024
41baa19
test: zero_precipitation_rate()
castelao Jun 11, 2024
a1d484a
test: presrat_zero_rate with multiple thresholds
castelao Jun 12, 2024
d3a7ba4
fix, doc: Typo 'monthly'
castelao Jun 12, 2024
3ea12ee
test: apply_zero_precipitation_rate()
castelao Jun 13, 2024
d3a9d3f
test: Minimalist check for test_presrat_transform()
castelao Jun 13, 2024
5d9a09d
test: parallel results must equal serial
castelao Jun 13, 2024
6a89536
style: Fixing code style
castelao Jun 17, 2024
4461c41
test: test_presrat_transform_nochanges()
castelao Jun 20, 2024
df92e2c
test: Missing to update PresRat standard params for testing
castelao Jun 20, 2024
35931a3
test: Adding back a small noise on future projections
castelao Jun 20, 2024
5f414b3
test: Defining some resources used in the tests
castelao Jun 20, 2024
1074746
clean, test:
castelao Jun 20, 2024
5d172f3
test: test_presrat_zero_rate with multiple thresholds
castelao Jun 20, 2024
6b67870
test, clean: Combining tests
castelao Jun 20, 2024
d5b8647
test: A fixture with fut_cc dataset itself for efficiency
castelao Jun 20, 2024
1353350
test: Standard presrat transform test with dataset from memory
castelao Jun 20, 2024
59d8b2d
test: test_presrat_transform_nozerochanges()
castelao Jun 20, 2024
1fa54db
test: test_presrat_transform_nochanges() working
castelao Jun 20, 2024
5968726
style:
castelao Jun 20, 2024
88ce7c8
test, doc: test_presrat_transform_nochanges()
castelao Jun 20, 2024
68ab5ab
Saving used threshold in metadata
castelao Jun 21, 2024
65f54c9
Entry point for presrat functions
castelao Jun 21, 2024
98c839d
feat: apply_zero_precipitation_rate()
castelao Jun 21, 2024
6a766d7
feat: get_spatial_bc_presrat()
castelao Jun 21, 2024
b0c5beb
fix, test: test_presrat_transform_nochanges()
castelao Jun 21, 2024
bdab03d
feat: Saving requirements to estimate K factor
castelao Jun 21, 2024
f8c88e3
Adding time indices
castelao Jun 21, 2024
7cafe23
fix: zero_precipitation_rate() dealing with NaN
castelao Jun 21, 2024
b17b8d1
refact: Moved apply_zero_precipitation_rate() into bias_tranforms
castelao Jun 21, 2024
9140482
fix: Missing to load xarray
castelao Jun 21, 2024
0a69638
refact: get_spatial_bc_quantiles returing params as dict
castelao Jun 24, 2024
a68fa7a
feat: local_presrat_bc()
castelao Jun 24, 2024
757e0b0
style: Mostly moving to single quote which is the standard used in sup3r
castelao Jun 24, 2024
e0d8c56
test, clean: PresRat tests were already moved into its own module
castelao Jun 24, 2024
dd26ec4
Locking numpy to < 2.0
castelao Jun 24, 2024
ca9123a
doc: Updating one-line description of PresRat
castelao Jun 24, 2024
24e47d8
test: Updating fut_cc_notrend() to match the 'resource'
castelao Jun 24, 2024
3f46c69
test, doc: Better explaing the expected effect of NaNs
castelao Jun 24, 2024
0571a14
feat: PresRat.write_outputs()
castelao Jun 24, 2024
d869c31
fix, doc: Weird string literal issue
castelao Jun 24, 2024
2fae27d
style: Sorting imports
castelao Jun 24, 2024
82a9814
test, doc: More info in PresRat tests
castelao Jun 24, 2024
8b402ea
test: Improving test_presrat_calc()
castelao Jun 24, 2024
9dd55e9
test, refact: Combining tests
castelao Jun 24, 2024
5a439ac
style: Code formating
castelao Jun 24, 2024
7ebbe87
test, refact: Conforming normalizing dataset
castelao Jun 24, 2024
0657edc
style: Missing space
castelao Jun 24, 2024
a98838a
fix, feat: Missing to include presrat_notrend_params()
castelao Jun 24, 2024
de068e5
style: Unused import
castelao Jun 24, 2024
b87c747
test, fix: Missing a few fixtures
castelao Jun 24, 2024
fa32ff8
doc: Missing documentation on testing functions
castelao Jun 24, 2024
c9169c4
test: Global zero rate threshold (ZR_THRESHOLD)
castelao Jun 24, 2024
b5f6d94
clean: Removing ideas for later
castelao Jun 24, 2024
9647c53
Bump Python to 3.11.9
castelao Jun 25, 2024
66376f2
style: Conforming with sup3r style
castelao Jun 26, 2024
bd026d1
style: UP039
castelao Jul 1, 2024
c35f911
fix: Wrong type
castelao Jul 1, 2024
6679323
doc: local_presrat_bc()
castelao Jul 1, 2024
d6ffc84
style: E741 - ambiguous-variable-name
castelao Jul 1, 2024
5822e3a
cfg: Adding UP039 as fixable
castelao Jul 1, 2024
ec26101
style: Fixing notes to conform with numpydoc
castelao Jul 1, 2024
b1ef3b9
doc: Adding TODO label
castelao Jul 1, 2024
8a5d7b7
style: missing-trailing-comma
castelao Jul 1, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
276 changes: 274 additions & 2 deletions pixi.lock

Large diffs are not rendered by default.

9 changes: 6 additions & 3 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,9 @@ exclude = [
]

[tool.ruff.lint]
fixable = []
fixable = [
"UP039", # unnecessary-class-parentheses
]
# preview = true
# logger-objects = []
task-tags = ["TODO", "FIXME", "XXX"]
Expand Down Expand Up @@ -253,13 +255,13 @@ channels = ["conda-forge", "anaconda", "main"]
platforms = ["osx-arm64", "linux-64"]

[tool.pixi.dependencies]
python = "==3.11"
python = "~=3.11.0"
cftime = ">=1.6.2"
dask = ">=2022.0"
h5netcdf = ">=1.1.0"
pillow = ">=10.0"
matplotlib = ">=3.1"
numpy = ">=1.7.0"
numpy = "~=1.7"
pandas = ">=2.0"
scipy = ">=1.0.0"
tensorflow = ">2.4,<2.16"
Expand Down Expand Up @@ -291,3 +293,4 @@ pytest = ">=5.2"
build = ">=0.6"
twine = ">=5.0"
ruff = ">=0.4"
ipython = ">=8.0"
11 changes: 8 additions & 3 deletions sup3r/bias/__init__.py
Original file line number Diff line number Diff line change
@@ -1,19 +1,24 @@
"""Bias calculation and correction modules."""

from .bias_transforms import (global_linear_bc, local_linear_bc,
local_qdm_bc, monthly_local_linear_bc)
from .bias_calc import (LinearCorrection, MonthlyLinearCorrection,
MonthlyScalarCorrection, SkillAssessment)
from .qdm import QuantileDeltaMappingCorrection
from .bias_transforms import (apply_zero_precipitation_rate,
global_linear_bc, local_linear_bc,
local_qdm_bc, local_presrat_bc,
monthly_local_linear_bc)
from .qdm import PresRat, QuantileDeltaMappingCorrection

__all__ = [
"apply_zero_precipitation_rate",
"global_linear_bc",
"local_linear_bc",
"local_qdm_bc",
"local_presrat_bc",
"monthly_local_linear_bc",
"LinearCorrection",
"MonthlyLinearCorrection",
"MonthlyScalarCorrection",
"PresRat",
"QuantileDeltaMappingCorrection",
"SkillAssessment",
]
4 changes: 2 additions & 2 deletions sup3r/bias/bias_calc.py
Original file line number Diff line number Diff line change
Expand Up @@ -984,7 +984,7 @@ def run(self,
class MonthlyLinearCorrection(LinearCorrection):
"""Calculate linear correction *scalar +adder factors to bias correct data

This calculation operates on single bias sites on a montly basis
This calculation operates on single bias sites on a monthly basis
"""

NT = 12
Expand Down Expand Up @@ -1044,7 +1044,7 @@ class MonthlyScalarCorrection(MonthlyLinearCorrection):
scalar factors are computed as mean(base_data) / mean(bias_data). Adder
factors are still written but are exactly zero.

This calculation operates on single bias sites on a montly basis
This calculation operates on single bias sites on a monthly basis
"""

@staticmethod
Expand Down
266 changes: 256 additions & 10 deletions sup3r/bias/bias_transforms.py
Original file line number Diff line number Diff line change
Expand Up @@ -158,18 +158,19 @@ def get_spatial_bc_quantiles(lat_lon: np.array,
>>> lat_lon = np.array([
... [39.649033, -105.46875 ],
... [39.649033, -104.765625]])
>>> params = get_spatial_bc_quantiles(
... lat_lon, "ghi", "rsds", "./dist_params.hdf")
>>> params, cfg = get_spatial_bc_quantiles(
... lat_lon, "ghi", "rsds", "./dist_params.hdf")
"""
ds = {'base': f'base_{base_dset}_params',
'bias': f'bias_{feature_name}_params',
'bias_fut': f'bias_fut_{feature_name}_params'}
out = _get_factors(lat_lon, ds, bias_fp, threshold)
'bias_fut': f'bias_fut_{feature_name}_params',
}
params = _get_factors(lat_lon, ds, bias_fp, threshold)

with Resource(bias_fp) as res:
cfg = res.global_attrs

return out["base"], out["bias"], out["bias_fut"], cfg
return params, cfg


def global_linear_bc(input, scalar, adder, out_range=None):
Expand Down Expand Up @@ -488,11 +489,15 @@ def local_qdm_bc(data: np.array,
>>> unbiased = local_qdm_bc(biased_array, lat_lon_array, "ghi", "rsds",
... "./dist_params.hdf")
"""
base, bias, bias_fut, cfg = get_spatial_bc_quantiles(lat_lon,
base_dset,
feature_name,
bias_fp,
threshold)
params, cfg = get_spatial_bc_quantiles(lat_lon,
base_dset,
feature_name,
bias_fp,
threshold)
base = params["base"]
bias = params["bias"]
bias_fut = params["bias_fut"]

if lr_padded_slice is not None:
spatial_slice = (lr_padded_slice[0], lr_padded_slice[1])
base = base[spatial_slice]
Expand Down Expand Up @@ -520,3 +525,244 @@ def local_qdm_bc(data: np.array,
tmp = QDM(tmp)
# Reorgnize array back from (time, space) to (spatial, spatial, temporal)
return tmp.T.reshape(data.shape)


def apply_zero_precipitation_rate(arr: np.ndarray, rate):
"""Enforce the zero precipitation rate
Replace lowest values by zero to satisfy the given rate of zero
precipitation.

Parameters
----------
arr : np.array
An array of values to be analyzed. Usually precipitation but it
rate : float or np.array
Rate of zero, or negligible, days of precipitation.

Returns
-------
corrected : np.array
A copy of given array that satisfies the rate of zero precipitation
days, i.e. the lowest values of precipitation are changed to zero
to satisfy that rate.

Examples
--------
>>> data = np.array([5, 0.1, np.nan, 0.2, 1])
>>> apply_zero_precipitation_rate(data, 0.30)
array([5. , 0. , nan, 0.2, 1. ])
"""
assert arr.ndim == 3
assert rate.ndim >= 2
assert arr.shape[:2] == rate.shape[:2]

for i in range(rate.shape[0]):
for j in range(rate.shape[1]):
r = rate[i, j, 0]
if np.isfinite(r):
a = arr[i, j]
valid = np.isfinite(a)
idx = np.argsort(a)[valid][:round(r * valid.astype('i').sum())]
a[idx] = 0
arr[i, j] = a

return arr


def get_spatial_bc_presrat(lat_lon: np.array,
base_dset: str,
feature_name: str,
bias_fp: str,
threshold: float = 0.1):
"""Statistical distributions previously estimated for given lat/lon points

Recover the parameters that describe the statistical distribution
previously estimated with
:class:`~sup3r.bias.bias_calc.QuantileDeltaMappingCorrection` for three
datasets: ``base`` (historical reference), ``bias`` (historical biased
reference), and ``bias_fut`` (the future biased dataset, usually the data
to correct).

Parameters
----------
lat_lon : ndarray
Array of latitudes and longitudes for the domain to bias correct
(n_lats, n_lons, 2)
base_dset : str
Name of feature used as historical reference. A Dataset with name
"base_{base_dset}_params" will be retrieved from ``bias_fp``.
feature_name : str
Name of the feature that is being corrected. Datasets with names
"bias_{feature_name}_params" and "bias_fut_{feature_name}_params" will
be retrieved from ``bias_fp``.
bias_fp : str
Filepath to bias correction file from the bias calc module. Must have
datasets "base_{base_dset}_params", "bias_{feature_name}_params", and
"bias_fut_{feature_name}_params" that define the statistical
distributions.
threshold : float
Nearest neighbor euclidean distance threshold. If the coordinates are
more than this value away from the bias correction lat/lon, an error
is raised.

Returns
-------
base : np.array
Parameters used to define the statistical distribution estimated for
the ``base_dset``. It has a shape of (I, J, P), where (I, J) are the
same first two dimensions of the given `lat_lon` and P is the number
of parameters and depends on the type of distribution. See
:class:`~sup3r.bias.bias_calc.QuantileDeltaMappingCorrection` for more
details.
bias : np.array
Parameters used to define the statistical distribution estimated for
(historical) ``feature_name``. It has a shape of (I, J, P), where
(I, J) are the same first two dimensions of the given `lat_lon` and P
is the number of parameters and depends on the type of distribution.
See :class:`~sup3r.bias.bias_calc.QuantileDeltaMappingCorrection` for
more details.
bias_fut : np.array
Parameters used to define the statistical distribution estimated for
(future) ``feature_name``. It has a shape of (I, J, P), where (I, J)
are the same first two dimensions of the given `lat_lon` and P is the
number of parameters used and depends on the type of distribution. See
:class:`~sup3r.bias.bias_calc.QuantileDeltaMappingCorrection` for more
details.
cfg : dict
Metadata used to guide how to use of the previous parameters on
reconstructing the statistical distributions. For instance,
`cfg['dist']` defines the type of distribution. See
:class:`~sup3r.bias.bias_calc.QuantileDeltaMappingCorrection` for more
details, including which metadata is saved.

Warnings
--------
Be careful selecting which `bias_fp` to use. In particular, if
"bias_fut_{feature_name}_params" is representative for the desired target
period.

See Also
--------
sup3r.bias.bias_calc.QuantileDeltaMappingCorrection
Estimate the statistical distributions loaded here.

Examples
--------
>>> lat_lon = np.array([
... [39.649033, -105.46875 ],
... [39.649033, -104.765625]])
>>> params, cfg = get_spatial_bc_quantiles(
... lat_lon, "ghi", "rsds", "./dist_params.hdf")
"""
ds = {'base': f'base_{base_dset}_params',
'base_zero_rate': f'{base_dset}_zero_rate',
'bias': f'bias_{feature_name}_params',
'bias_fut': f'bias_fut_{feature_name}_params',
'bias_mean_mh': f'{feature_name}_mean_mh',
'bias_mean_mf': f'{feature_name}_mean_mf',
}
params = _get_factors(lat_lon, ds, bias_fp, threshold)

with Resource(bias_fp) as res:
cfg = res.global_attrs

return params, cfg


def local_presrat_bc(data: np.ndarray,
time: np.ndarray,
lat_lon: np.array,
base_dset: str,
feature_name: str,
bias_fp,
lr_padded_slice=None,
threshold=0.1,
relative=True,
no_trend=False):
"""Bias correction using PresRat

Parameters
----------
data : np.ndarray
Sup3r input data to be bias corrected, assumed to be 3D with shape
(spatial, spatial, temporal) for a single feature.
lat_lon : ndarray
Array of latitudes and longitudes for the domain to bias correct
(n_lats, n_lons, 2)
base_dset :
Name of feature that is used as (historical) reference. Dataset with
names "base_{base_dset}_params" will be retrieved.
feature_name : str
Name of feature that is being corrected. Datasets with names
"bias_{feature_name}_params" and "bias_fut_{feature_name}_params" will
be retrieved.
bias_fp : str
Filepath to statistical distributions file from the bias calc module.
Must have datasets "bias_{feature_name}_params",
"bias_fut_{feature_name}_params", and "base_{base_dset}_params" that
are the parameters to define the statistical distributions to be used
to correct the given `data`.
lr_padded_slice : tuple | None
Tuple of length four that slices (spatial_1, spatial_2, temporal,
features) where each tuple entry is a slice object for that axes.
Note that if this method is called as part of a sup3r forward pass, the
lr_padded_slice will be included automatically in the kwargs for the
active chunk. If this is None, no slicing will be done and the full
bias correction source shape will be used.
no_trend: bool, default=False
An option to ignore the trend component of the correction, thus
resulting in an ordinary Quantile Mapping, i.e. corrects the bias by
comparing the distributions of the biased dataset with a reference
datasets. See
``params_mf`` of :class:`rex.utilities.bc_utils.QuantileDeltaMapping`.
Note that this assumes that params_mh is the data distribution
representative for the target data.
"""

params, cfg = get_spatial_bc_presrat(lat_lon,
base_dset,
feature_name,
bias_fp,
threshold)
base = params["base"]
bias = params["bias"]
bias_fut = params["bias_fut"]

if lr_padded_slice is not None:
spatial_slice = (lr_padded_slice[0], lr_padded_slice[1])
base = base[spatial_slice]
bias = bias[spatial_slice]
bias_fut = bias_fut[spatial_slice]

if no_trend:
mf = None
else:
mf = bias_fut.reshape(-1, bias_fut.shape[-1])
# The distributions are 3D (space, space, N-params)
# Collapse 3D (space, space, N) into 2D (space**2, N)
QDM = QuantileDeltaMapping(base.reshape(-1, base.shape[-1]),
bias.reshape(-1, bias.shape[-1]),
mf,
dist=cfg['dist'],
relative=relative,
sampling=cfg["sampling"],
log_base=cfg["log_base"])

# input 3D shape (spatial, spatial, temporal)
# QDM expects input arr with shape (time, space)
tmp = data.reshape(-1, data.shape[-1]).T
# Apply QDM correction
tmp = QDM(tmp)
# Reorgnize array back from (time, space) to (spatial, spatial, temporal)
tmp = tmp.T.reshape(data.shape)

tmp = apply_zero_precipitation_rate(tmp, params["base_zero_rate"])

month = time.month
for m in range(12):
idx = month == m + 1
x_hat = tmp[:, :, idx].mean(axis=-1)
k = params["bias_mean_mf"][:, :, m] / x_hat
tmp[:, :, idx] *= k[:, :, np.newaxis]

return tmp
Loading
Loading