Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unusual comparison in coord contiguity check #4473

Open
jonseddon opened this issue Dec 29, 2021 · 3 comments
Open

Unusual comparison in coord contiguity check #4473

jonseddon opened this issue Dec 29, 2021 · 3 comments
Labels
Stale A stale issue/pull-request

Comments

@jonseddon
Copy link
Contributor

📰 Custom Issue

In iris.coords.Coord._discontiguity_in_bounds() there's an unusual comparison that I'm struggling to understand the logic of. Please have a look at https://github.com/SciTools/iris/blob/main/lib/iris/coords.py#L1686

diffs_along_axis will be an array of bools, with a value of True when the difference between cells is greater than the tolerance.

points_close_enough is then comparing this array of bools against atol + rtol * cell_size, which is an array of floats, and doesn't seem to be a comparison that you would want to do.

I think that the logic's currently returning the correct result because the calculation of points_close_enough effectively inverts each bool in diffs_along_axis. np.all() only returns True (i.e. the coord is contiguous) when all of it's elements are True (so is a logical-and for inputs of bools). If any diffs_along_axis elements were greater than the tolerance then you will therefore get a False returned by np.all().

Although this does work I was wondering if this may not have been the original intention of this bit of code. If this was the intention then it's not immediately clear to the reader how it actually works and could perhaps do with being simplified. This code was introduced by #3144, merged by @pp-mo.

I'm not totally sure of the intentions of calculating points_close_enough currently and so I don't feel able to create a PR to fix this in case I'm missing anything. A simplification could be to go from:

                diffs_along_axis = diffs_between_cells > (
                    atol + rtol * cell_size
                )

                points_close_enough = diffs_along_axis <= (
                    atol + rtol * cell_size
                )
                contiguous_along_axis = np.all(points_close_enough)
                return diffs_along_axis, contiguous_along_axis

to:

                diff_too_big = diffs_between_cells > (
                    atol + rtol * cell_size
                )

                return diffs_along_axis, not diff_too_big

but I'm not sure if this would be chopping out some functionality that points_close_enough was intended to add. If you are happy with my simplification then please let me know and I can raise a PR. An alternative would be to revert to the logic that existed before #3144. Otherwise I'd be grateful if you could suggest an alternative implementation for this bit of a head scratcher of a piece of code.

@stephenworsley
Copy link
Contributor

I believe I attempted to address this, along with some other issues, in #3404. It seems like this became stale due to a lack of momentum, though it might be worth reopening now.

@bjlittle bjlittle added the Peloton 🚴‍♂️ Target a breakaway issue to be caught and closed by the peloton label Jan 12, 2022
@stephenworsley
Copy link
Contributor

To clarify the work in #3404, this tackles a number of problems, some of which are seperable and some of which relate to the problem here. As i see it, the current problems are as follows:

  1. Contiguitiy checking of 2D bounds currently does not check all corners of the bounds. (This should be entirely seperable)
  2. The current mathematics do not make sense, especially this line, since diffs_along_axis is a boolean array but is treated as a float:
    points_close_enough = diffs_along_axis <= (
  3. The returned value diffs is an array of floats for 1D coords but is a tuple of boolean arrays for 2D coords. These ought to be either both floats or both booleans.
  4. Wherever the method _discontiguity_in_bounds is used, it appears to be used as if the returned values for diffs were a boolean array and not, as is described in the docsting, a float array, e.g.

    iris/lib/iris/util.py

    Lines 1831 to 1837 in 55e6a21

    _, (diffs_x, diffs_y) = coord._discontiguity_in_bounds(
    rtol=rel_tol, atol=abs_tol
    )
    bad_points_boolean[:, :-1] = np.logical_or(
    bad_points_boolean[:, :-1], diffs_x
    )

These last three issues are fairly interlinked and the last one in particular would make a simple fix more difficult without risking potentially interfering with code elsewhere.

@trexfeathers trexfeathers assigned bjlittle and unassigned pp-mo Nov 16, 2022
@bjlittle bjlittle removed the Peloton 🚴‍♂️ Target a breakaway issue to be caught and closed by the peloton label Feb 22, 2023
@bjlittle bjlittle removed their assignment Feb 22, 2023
Copy link
Contributor

github-actions bot commented Jul 7, 2024

In order to maintain a backlog of relevant issues, we automatically label them as stale after 500 days of inactivity.

If this issue is still important to you, then please comment on this issue and the stale label will be removed.

Otherwise this issue will be automatically closed in 28 days time.

@github-actions github-actions bot added the Stale A stale issue/pull-request label Jul 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stale A stale issue/pull-request
Projects
Status: No status
Development

No branches or pull requests

5 participants