Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MeshCoord constructor always uses metadata of node coordinate regardless of location #4860

Closed
schlunma opened this issue Jul 12, 2022 · 7 comments · Fixed by #5020
Closed

Comments

@schlunma
Copy link
Contributor

schlunma commented Jul 12, 2022

The MeshCoord constructor always uses the metadata (standard_name, long_name, units, etc.) of the node coordinate, regardless of the location that is specified:

super().__init__(
points,
bounds=bounds,
standard_name=node_coord.standard_name,
long_name=node_coord.long_name,
var_name=None, # We *don't* "represent" the underlying node var
units=node_coord.units,
attributes=node_coord.attributes,
)

That feels really weird to me and can lead to unexpected behavior. For example, if you compare the original face coordinate that is used to construct a Mesh and compare it to the output of Mesh.to_MeshCoord you get different results:

# define node coordinates
(node_lat, node_lon) = ...

# define face coordinates with different metadata
(face_lat, face_lon) = ...

# construct mesh with it
mesh = Mesh(                                                                                                                                                                                                                                                           
            topology_dimension=2,                                                                                                                                                                                                                                              
            node_coords_and_axes=[(node_lat, 'y'), (node_lon, 'x')],                                                                                                                                                                                                           
            connectivities=[connectivity],                                                                                                                                                                                                                                     
            face_coords_and_axes=[(face_lat, 'y'), (face_lon, 'x')])

# extract MeshCoord
mesh_face_lat = mesh.to_MeshCoord('face', 'y')

Here, the metadata of face_lat and mesh_face_lat differ since the metadata of face_lat and node_lat differ.

I am not sure if this is an intended feature or not, so please feel free to just close this if this is intentional.

Cheers 👍


Iris version: 3.2.1.post0

@bjlittle
Copy link
Member

@schlunma Thanks for taking the time to report this, much appreciated!

I'm particularly excited simply because it means that you're using our experimental ugrid support, which is brilliant 🤩

We're mostly all busy this week attending SciPy 2022 (13-15 July), but we'll get back to you asap 👍

I'll class this issue as a bug, until we know otherwise.

@schlunma
Copy link
Contributor Author

Thanks @Bill, that's great!

Yes, I've used it quite extensively in the last couple of weeks (also in combination with iris-esmf-regrid), and I'm really happy with it 👍 Thanks for all the great work on this 🚀

@pp-mo
Copy link
Member

pp-mo commented Jul 13, 2022

I am not sure if this is an intended feature or not, so please feel free to just close this if this is intentional.

Well, it kind-of is intended this way, but we can be open to persuasion ...
The logic for this was that the "number 1 purpose" of meshcoords is to provide a location-indexed arrays of location bounds -- since that information is otherwise not easily available.
The bounds are always values of the node_coords, hence for the bounds (at least) it makes some sense to say that the applicable metadata (e.g. the units) are those from the node_coords variables.
( Note: we are also intending in future to support MeshCoords with bounds but no points, though that is currently not possible : see #4438 ).

However, I guess it is at least arguable that we should be requiring the location + noords coordinates two to have equivalent or compatible metadata , which we could test for -- e.g. what if the face_coords had different units to the node_coords?

It may also be worth considering that MeshCoords don't actually add anything to the cube.location + cube.mesh information : They are a bridging concept providing unstructured coordinate info in a form more like the structured equivalents.

What do you think will be useful ?

Just great to hear that you're using this, though 🚀 !
We don't have so much feedback just yet.

@schlunma
Copy link
Contributor Author

Thanks for these insights @pp-mo, I can definitely understand the reasoning behind this! However, I still think this is confusing and can be wrong in some (unrealistic, but not completely impossible) cases.

Example

Consider the data below where the node metadata is very different to the face metadata (even the units differ!).

Netcdf file with unstructured grid
netcdf cube {
dimensions:
        Mesh2d_node = 10242 ;
        Mesh2d_face = 20480 ;
        bnds_3 = 3 ;
        Mesh_2d_face_N_nodes = 3 ;
variables:
        int Mesh_2d ;
                Mesh_2d:cf_role = "mesh_topology" ;
                Mesh_2d:topology_dimension = 2 ;
                Mesh_2d:Conventions = "UGRID-1.0" ;
                Mesh_2d:node_coordinates = "nlon nlat" ;
                Mesh_2d:face_coordinates = "flon flat" ;
                Mesh_2d:face_node_connectivity = "mesh2d_face" ;
        double nlon(Mesh2d_node) ;
                nlon:units = "rad" ;
                nlon:standard_name = "grid_longitude" ;
                nlon:long_name = "node longitude" ;
        double nlat(Mesh2d_node) ;
                nlat:units = "rad" ;
                nlat:standard_name = "grid_latitude" ;
                nlat:long_name = "node latitude" ;
        double flon(Mesh2d_face) ;
                flon:bounds = "flon_bnds" ;
                flon:units = "degrees_east" ;
                flon:standard_name = "longitude" ;
                flon:long_name = "face longitude" ;
        double flon_bnds(Mesh2d_face, bnds_3) ;
        double flat(Mesh2d_face) ;
                flat:bounds = "flat_bnds" ;
                flat:units = "degrees_north" ;
                flat:standard_name = "latitude" ;
                flat:long_name = "face latitude" ;
        double flat_bnds(Mesh2d_face, bnds_3) ;
        int mesh2d_face(Mesh2d_face, Mesh_2d_face_N_nodes) ;
                mesh2d_face:cf_role = "face_node_connectivity" ;
                mesh2d_face:start_index = 1 ;
        int64 tas(Mesh2d_face) ;
                tas:standard_name = "air_temperature" ;
                tas:long_name = "Near-surface air temperature" ;
                tas:units = "K" ;
                tas:mesh = "Mesh_2d" ;
                tas:location = "face" ;

// global attributes:
                :Conventions = "CF-1.7" ;
}

Loading this cube gives this:

air_temperature / (K)               (-- : 20480)                                                                                                                                                                                                                              
    Mesh coordinates:                                                                                                                                                                                                                                                         
        grid_latitude                   x
        grid_longitude                  x
    Attributes:
        Conventions                 'CF-1.7'

The first thing you notice is that somehow the coordinates describing the faces have the name grid_latitude and grid_longitude, even though in the netcdf file these coordinates are called latitude and longitude. If we have a closer look at one of those coordinates, we get:

MeshCoord :  grid_latitude / (rad)
    mesh: <Mesh: 'Mesh_2d'>
    location: 'face'
    points: [ 52.61807547,  53.87577873, ..., -46.82864641, -44.85063277]
    bounds: [
        [ 0.92919108,  0.92919108,  0.89622949],
        [ 0.92919108,  0.96233691,  0.92919108],
        ...,
        [-0.8285508 , -0.82920591, -0.79519872],
        [-0.79519872, -0.76070502, -0.79398573]]
    shape: (20480,)  bounds(20480, 3)
    dtype: float64
    standard_name: 'grid_latitude'
    long_name: 'node latitude'
    axis: 'y'

You clearly see that the points and the bounds clearly do not fit together, and the units clearly do not match the point values. This is not only confusing, but also wrong. In addition, the bounds here do not match the ones given in the original file by flat_bnds.

Ideas

One possible idea on how to deal with this could be:

  1. If the mesh is constructed without a face/edge coordinate, the approach that is currently implemented is the way to go. Asking for a MeshCoord(mesh, 'face/edge', 'x/y') should result in a coordinate with empty points and bounds calculated from the connectivity and the node coordinate. In this case, the node metadata correctly describes the face/edge coordinate. However, if I interpret the code correctly, asking for a face/edge MeshCoord when no face/edge coordinate has been used to construct the mesh is not possible at the moment.

  2. If the mesh is constructed with a face/edge coordinate, the situation is more tricky. First, it should be checked if the metadata of the nodes and the face/edge coordinates match (units + standard_name at the very least, I wouldn't check others). This allows us to safely use the more natural face/edge coordinate metadata for the face/edge MeshCoords. One additional problem might arise if the face/edge coordinates already have bounds:

  • face/edge coordinates have bounds: it might be reasonably to check if these bounds match the ones calculated from the connectivity and the nodes. If they don't match, it might be worthwhile to raise an error here, even though ugrid-checks only prints a warning (WARN A205 : Mesh coordinate variable "clon" within Mesh_2d:face_coordinates has bounds="clon_bnds", which does not match the expected values calculated from the "vlon" node coordinate and the face connectivity, "mesh2d_face".).
  • face/edge coordinates do not have bounds: Calculate the bounds from the connectivity and the nodes as it is done right now.

One additional thought I have here is that it might be reasonable to perform theses checks already when creating the mesh. That way problems arise very early.

Let me know what you think 🚀

@schlunma
Copy link
Contributor Author

Could someone with the necessary permissions add the ESMValTool label to this? Thanks!! 👍

@pp-mo
Copy link
Member

pp-mo commented Oct 3, 2022

Hi @schlunma .
Just to say, we're very aware that this is top priority on the ESMValTool Iris project, and I am now looking into it !

I think that pinning down all these decisions may be a bit tricky.
So, I will make some initial decisions + implement them in a PR,
while still expecting that you may want to request more changes.
Hopefully, you can evaluate that before the workshop in 2 weeks' time -- e.g. you can try actual code. Then if it needs more work, we can work it out together.

@schlunma
Copy link
Contributor Author

schlunma commented Oct 4, 2022

Hi @pp-mo, thanks, that sounds great!

I will be travelling this week and next week, so I might be slow to respond, but I will definitely try to test it before the workshop👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants