Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy SUM with 1D weights sometimes fails #5083

Open
schlunma opened this issue Nov 24, 2022 · 3 comments
Open

Lazy SUM with 1D weights sometimes fails #5083

schlunma opened this issue Nov 24, 2022 · 3 comments

Comments

@schlunma
Copy link
Contributor

🐛 Bug Report

Collapsing a cube with lazy data using iris.analysis.SUM and 1D weights

cube = cube.collapsed('a', iris.analysis.SUM, weights=[0, 1]) 

sometimes fails due to broadcasting errors:

Traceback (most recent call last):
  File "/home/b/b309141/scripts/iris/cube_weights.py", line 24, in <module>
    cube = cube.collapsed('a', iris.analysis.SUM, weights=[0, 1])
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/cube.py", line 3785, in collapsed
    data_result = aggregator.lazy_aggregate(
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 545, in lazy_aggregate
    return self.lazy_func(data, axis=axis, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 1136, in inner_stat
    dask_result = dask_stats_function(array, axis=axis, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 1136, in inner_stat
    dask_result = dask_stats_function(array, axis=axis, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 1408, in _lazy_sum
    wsum = da.sum(weights_in * array, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 223, in wrapper
    return f(self, other)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 2349, in __rmul__
    return elemwise(operator.mul, other, self)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 4762, in elemwise
    broadcast_shapes(*shapes)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 4690, in broadcast_shapes
    raise ValueError(
ValueError: operands could not be broadcast together with shapes (2,) (2, 3)

How To Reproduce

import dask.array as da
import iris
print("iris version:", iris.__version__)
print()
from iris.coords import AuxCoord
from iris.cube import Cube
aux_coord = AuxCoord([1, 2], var_name='a')
cube = Cube(da.arange(2 * 3).reshape(2, 3), aux_coords_and_dims=[(aux_coord, 0)])
cube = cube.collapsed('a', iris.analysis.SUM, weights=[0, 1])

Expected behaviour

No fail.

Environment

  • OS & Version: Red Hat Enterprise Linux 8.5 (Ootpa)
  • Iris Version: Tested with 3.2.1 and 3.5.0.dev3, so I guess versions in between are also affected
@rcomer
Copy link
Member

rcomer commented Nov 24, 2022

I've confirmed this failure is still reproduced in the v3.4 release candidate. This is only a problem for the lazy case because in the real case collapsed always moves the aggregation dimension(s) to the trailing dim

iris/lib/iris/cube.py

Lines 3922 to 3932 in 57647d2

# First reshape the data so that the dimensions being aggregated
# over are grouped 'at the end' (i.e. axis=-1).
dims_to_collapse = sorted(dims_to_collapse)
end_size = reduce(
operator.mul, (self.shape[dim] for dim in dims_to_collapse)
)
untouched_shape = [self.shape[dim] for dim in untouched_dims]
new_shape = untouched_shape + [end_size]
dims = untouched_dims + dims_to_collapse
unrolled_data = np.transpose(self.data, dims).reshape(new_shape)

That said, if you pass your weights as a list in the real case, you get a failure here

if weights is not None and weights.ndim > 1:

Possibly we just need to use broadcast_to_shape before

wsum = np.sum(weights_in * array, **kwargs)

@trexfeathers
Copy link
Contributor

Probably best to look at this AFTER the changes in #5084!

@schlunma
Copy link
Contributor Author

Yes, definitely! I actually noticed this bug while writing tests for #5084 (it is still present after applying the changes from this PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Status: No status
Development

No branches or pull requests

3 participants