Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly allow for either pixel or gridline registered grids #476

Merged
merged 41 commits into from
Jul 16, 2020

Conversation

weiji14
Copy link
Member

@weiji14 weiji14 commented Jun 10, 2020

Description of proposed changes

xarray assumes pixel registration (aka data values represent centre of the pixel), whereas GMT defaults to using gridline registration (aka data values represent top left corner of pixel). PyGMT currently defaults to gridline registration, following upstream GMT, but users have no way to easily set it to pixel registration.

Gridline vs Pixel registration

See http:https://xarray.pydata.org/en/stable/plotting.html#coordinates and https://docs.generic-mapping-tools.org/latest/cookbook/options.html#grid-registration-the-r-option for context. NOAA also has a good page that talks about grid registration at https://ngdc.noaa.gov/mgg/global/gridregistration.html

Grid vs Cell registration

This is a major bug, and may cause breaking changes for some users. Users will have to be aware of what type of registration their raster grids' are using, and unfortunately, those using xarray may need to set the registration manually to p (for pixel) for correctness, unless we default to using pixel registration in PyGMT. Alternatively, the salem library has an xarray accessor to represent an xarray grid as a center_grid or corner_grid, as used in weiji14/deepbedmap#150.

Will be good to have some discussion on what's the best plan forward.

Fixes #375? #390? Needed for #451.

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If adding new functionality, add an example to docstrings or tutorials.

@weiji14 weiji14 added the bug Something isn't working label Jun 10, 2020
Default registration is still "GMT_GRID_NODE_REG" as per upstream, but we might want to override that in some cases.
@weiji14
Copy link
Member Author

weiji14 commented Jun 10, 2020

Ok, it's getting late in my timezone, but I just want to jot down some notes before bedtime. Please correct me if I'm wrong! But I think we should check Line 105 in the code block below.

region = []
inc = []
# Reverse the dims because it is rows, columns ordered. In geographic
# grids, this would be North-South, East-West. GMT's region and inc are
# East-West, North-South.
for dim in grid.dims[::-1]:
coord = grid.coords[dim].values
coord_incs = coord[1:] - coord[0:-1]
coord_inc = coord_incs[0]
if not np.allclose(coord_incs, coord_inc):
raise GMTInvalidInput(
"Grid appears to have irregular spacing in the '{}' dimension.".format(
dim
)
)
region.extend([coord.min(), coord.max()])
inc.append(coord_inc)
if any([i < 0 for i in inc]): # Sort grid when there are negative increments
inc = [abs(i) for i in inc]
grid = grid.sortby(variables=list(grid.dims), ascending=True)
matrix = as_c_contiguous(grid.values[::-1])
return matrix, region, inc

Specifically, the code region.extend([coord.min(), coord.max()]) can't surely be valid for both gridline and pixel registered cases. If I'm thinking properly, it is incorrect for the pixel registered case (which xarray uses) by a pixel, the region should be something like coord.min() - inc/2, coord.max() + inc/2).

I've commited a small change at bb7afdc, and have been testing out both GMT_GRID_NODE_REG and GMT_GRID_PIXEL_REG, but they both result in the same figure. Here's a minimal test case I've been playing with.

import pygmt
import xarray as xr

fig = pygmt.Figure()
grid = xr.DataArray(data=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]])
fig.grdimage(grid=grid, region=[-2, 2, -2, 2], projection="X5c", frame=True)
fig.savefig("pixel_or_node_registration.png")
fig.show()

produces:

pixel_registration

This is with the current default gridline settings you can try at https://github.com/GenericMappingTools/try-gmt. Notice how each pixel/square is centred on non-decimal coordinates (i.e. (0,0), (0,1), (0,2), (1,1), etc), and that a 3x3 grid is produced. The same figure is produced if I try using pixel registration, and I feel that it's wrong somehow but my brain can't explain it properly. Will continue tomorrow. Edit: Got a bit caught up with other commitments, feel free to jump in anyone if you have any ideas on how to fix this bug.

Trying to use pixel registration as the default when converting an xarray.DataArray to a virtual GMT grid, instead of gridline registration.
@weiji14
Copy link
Member Author

weiji14 commented Jun 19, 2020

Specifically, the code region.extend([coord.min(), coord.max()]) can't surely be valid for both gridline and pixel registered cases. If I'm thinking properly, it is incorrect for the pixel registered case (which xarray uses) by a pixel, the region should be something like coord.min() - inc/2, coord.max() + inc/2).

Found a way to illustrate this problem: when running grdinfo at https://github.com/GenericMappingTools/try-gmt on the actual grd file vs the virtual earth_relief grid:

import pprint
import pygmt

pprint.pprint(pygmt.grdinfo("@earth_relief_60m"), width=1000)

produces:

('/home/jovyan/.gmt/server/earth_relief_60m.grd: Title: Earth Relief at 01 arc degree\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: Command: grdfilter SRTM15+V2.1.nc -Fg111.2 -D1 -I01d -rp -Gearth/earth_relief/earth_relief_01d_p.grd=ns+s0.5+o0 --IO_NC4_DEFLATION_LEVEL=9 --IO_NC4_CHUNK_SIZE=4096 --PROJ_ELLIPSOID=Sphere\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: Remark: Obtained by Gaussian Cartesian filtering (111.2 km fullwidth) from SRTM15+V2.1.nc [Tozer et al., 2019; http:https://dx.doi.org/10.1029/2019EA000658]\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: Pixel node registration used [Geographic grid]\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: Grid file format: ns = GMT netCDF format (16-bit integer), CF-1.7\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: x_min: -180 x_max: 180 x_inc: 1 name: longitude n_columns: 360\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: y_min: -90 y_max: 90 y_inc: 1 name: latitude n_rows: 180\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: z_min: -8182 z_max: 5651.5 name: elevation (m)\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: scale_factor: 0.5 add_offset: 0 packed z-range: [-16364,11303]\n'
 '/home/jovyan/.gmt/server/earth_relief_60m.grd: format: netCDF-4 chunk_size: 180,180 shuffle: on deflation_level: 9\n')

whereas

grid = pygmt.datasets.earth_relief.load_earth_relief()
pprint.pprint(pygmt.grdinfo(grid))

produces

(': Title: \n'
 ': Command: \n'
 ': Remark: \n'
 ': Gridline node registration used [Cartesian grid]\n'
 ': Unrecognized grid file format! Probably not a GMT grid\n'
 ': x_min: -179.5 x_max: 179.5 x_inc: 1 name:  n_columns: 360\n'
 ': y_min: -89.5 y_max: 89.5 y_inc: 1 name:  n_rows: 180\n'
 ': z_min: -8182 z_max: 5651.5 name: z\n'
 ': scale_factor: 1 add_offset: 0\n')

Notice how the x_min/x_max of the virtual file isn't at -180/180, ditto with y_min/y_max. Plus it thinks the grid is gridline registered, rather than pixel registered. Also, doesn't seem okay treating the grid as Cartesian instead of Geographic (but that's probably a separate issue with the virtualfile mechanism). Not sure if it's a limitation of the GMT_Create_Data C API? I've pushed a commit using GMT_GRID_PIXEL_REG as the default to see how things turn out. Edit: It runs into a Segmentation Fault 😨

@seisman
Copy link
Member

seisman commented Jun 20, 2020

There may be a potential bug in GMT. Please follow GenericMappingTools/gmt#3502 and GenericMappingTools/gmt#3503 for upstream feedback.

@seisman
Copy link
Member

seisman commented Jun 20, 2020

It's confirmed that there was a bug when passing a pixel-registered matrix to GMT C API. It was already fixed in GenericMappingTools/gmt#3503 and merged in to master. I can confirm that the current PR works well. Thus what we need to do is:

  1. decide the default registration for array data
  2. try our best to detect and guess the grid registration (especially for netcdf file loaded as xarray)
  3. let users specify the registration they want to use
  4. wait for the GMT 6.1 release.

@seisman
Copy link
Member

seisman commented Jun 22, 2020

doesn't seem okay treating the grid as Cartesian instead of Geographic (but that's probably a separate issue with the virtualfile mechanism). Not sure if it's a limitation of the GMT_Create_Data C API?

family, geometry, mode="GMT_CONTAINER_ONLY", ranges=region, inc=inc

Changing GMT_CONTAINER_ONLY to GMT_CONTAINER_ONLY|GMT_GRID_IS_GEO, then GMT will know it's a geographic grid.

Previously it thought every xarray.DataArray grid was Cartesian! So the load_earth_relief() grid passed to GMT was treated as a Cartesian virtualfile, whereas the actual `@earth_relief` file was properly read as Geographic.

Co-authored-by: Dongdong Tian <[email protected]>
@@ -1242,7 +1242,7 @@ def virtualfile_from_grid(self, grid, registration="GMT_GRID_PIXEL_REG"):
gmt_grid = self.create_data(
family,
geometry,
mode="GMT_CONTAINER_ONLY",
mode="GMT_CONTAINER_ONLY|GMT_GRID_IS_GEO",
Copy link
Member

@seisman seisman Jun 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I didn't mean this change. Users can pass both Cartesian and geographic grids to GMT. We need to use GMT_CONTAINER_ONLY or GMT_CONTAINER_ONLY|GMT_GRID_IS_GEO for different grid types.

BTW, GMT_CONTAINER_ONLY is equivalent to GMT_CONTAINER_ONLY|GMT_GRID_IS_CARTESIAN.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I thought it 'knew' automatically what the grid type was. Should actually get some tests in for this.

BTW, GMT_CONTAINER_ONLY is equivalent to GMT_CONTAINER_ONLY|GMT_GRID_IS_CARTESIAN.

Ok, good to know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to add it as a valid_modifier.
@vercel vercel bot temporarily deployed to Preview June 22, 2020 23:10 Inactive
Should be GMT_IS_OUTPUT instead of GMT_OUTPUT. Also update some docstrings that were missed in the #210 refactor PR.
For xarray grids read from disk, there is an 'encoding' dictionary, with a 'source' key that gives the path to the file. Running `grdinfo` on that file can give us the grid registration. This works on the earth_relief grids. For grids that are not read from disk, we still default to assuming gridline registration ("GMT_GRID_NODE_REG").
@weiji14
Copy link
Member Author

weiji14 commented Jun 23, 2020

It's confirmed that there was a bug when passing a pixel-registered matrix to GMT C API. It was already fixed in GenericMappingTools/gmt#3503 and merged in to master. I can confirm that the current PR works well. Thus what we need to do is:

1. decide the default registration for array data

2. try our best to detect and guess the grid registration (especially for netcdf file loaded as xarray)

I've added an automatic gridline/pixel registration detector in 6d4ece4. It simply does grdinfo on the xarray.DataArray's NetCDF source (specifically via grid.encoding["source"], and will work on the earth_relief grids. If that doesn't work (e.g. for grids generated purely in Python), it will fallback to gridline registration by default.

Values go from latitude 89S to 89N. Avoids the `grdimage [INFORMATION]: 359 (of 361) inconsistent grid values at North pole.` message.
@weiji14 weiji14 marked this pull request as ready for review July 15, 2020 23:28
Copy link
Member Author

@weiji14 weiji14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation-wise, this PR should be done. However, there are cases on Linux when a black image is returned (hard to reproduce consistently, except when running make test), possibly because of a flaky issue with our test suite, or one that might be solved with #517? Not too sure about Windows.

We could merge this in first and try to track down the issue properly (see if it's a problem in PyGMT or upstream with the GMT C API), or wait until we are confident that the bug is isolated. I have kept the test_grdimage_over_dateline.png image small (~10KB) in case we need to change it for any reason.

Comment on lines +78 to +80
@pytest.mark.runfirst
@pytest.mark.mpl_image_compare
def test_grdimage_over_dateline(xrgrid):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running test_grdimage_over_dateline first (before the other tests) makes the tests pass on Linux (see commit 3499047 and bf03e5b). This is a hacky workaround for the flakiness I described at https://github.com/GenericMappingTools/pygmt/pull/476/files#r453429872.

inc = [abs(i) for i in inc]
grid = grid.sortby(variables=list(grid.dims), ascending=True)

matrix = as_c_contiguous(grid.values[::-1])
matrix = as_c_contiguous(grid[::-1].values)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted 5a685b2 at 60c791d, will stick to just grid[::-1].values for now (i.e. flipping the grid left right in xarray rather than in numpy).

@seisman
Copy link
Member

seisman commented Jul 16, 2020

Now all the tests pass. Does it mean this PR works?

@weiji14
Copy link
Member Author

weiji14 commented Jul 16, 2020

Now all the tests pass. Does it mean this PR works?

Short answer: Yes (and it has always worked for macOS).

Long answer: See #476 (comment). The tests pass when I run test_grdimage_over_dateline before the other tests. On my Linux box, it works fine if I just call grdimage as normal in a script. But the make test fails consistently for some unknown reason.

@@ -29,7 +29,8 @@ test:
@echo ""
@cd $(TESTDIR); python -c "import $(PROJECT); $(PROJECT).show_versions()"
@echo ""
cd $(TESTDIR); pytest $(PYTEST_ARGS) $(PROJECT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some comments here, explain why we need two pytest.

@@ -51,3 +73,19 @@ def test_grdimage_fails():
fig = Figure()
with pytest.raises(GMTInvalidInput):
fig.grdimage(np.arange(20).reshape((4, 5)))


@pytest.mark.runfirst
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add a brief comment here, explain why we need "runfirst".

Ensure no gaps are plotted over the 180 degree international dateline.
Specifically checking that `xrgrid.gmt.gtype = 1` sets `GMT_GRID_IS_GEO`,
and that `xrgrid.gmt.registration = 0` sets `GMT_GRID_NODE_REG`. Note that
there would be a gap over the dateline if a pixel registered grid is used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a link to issue #375?

fig = Figure()
assert xrgrid.gmt.registration == 0 # gridline registration
xrgrid.gmt.gtype = 1 # geographic coordinate system
fig.grdimage(grid=xrgrid, region="g", projection="A0/0/1c", V="d")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fig.grdimage(grid=xrgrid, region="g", projection="A0/0/1c", V="d")
fig.grdimage(grid=xrgrid, region="g", projection="A0/0/1c")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll switch it to V="i" for now, so we still get some extra information.

@seisman
Copy link
Member

seisman commented Jul 16, 2020

There is a pytest warning:

python3.7/site-packages/_pytest/mark/structures.py:327: PytestUnknownMarkWarning: Unknown pytest.mark.runfirst - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    PytestUnknownMarkWarning,

@weiji14
Copy link
Member Author

weiji14 commented Jul 16, 2020

Just for completeness, although this PR does fix #375, it doesn't fix @MarkWieczorek's reported bug at #390. I tried running the example but couldn't get the cylindrical 'Q' projection to plot in xarray (bottom row), and the Mollweide 'W' projection doesn't work too (middle row). Using a NetCDF file works fine though (top row). This results are the same when ran on the code in this PR or the current master branch at f401f85, so it might be an upstream GMT 6.1 issue.

test test
This PR Master branch at f401f85

@seisman
Copy link
Member

seisman commented Jul 16, 2020

I once tried to debug the issue in #390, it seems the boundary conditions are slightly different for an xarray input and a netCDF input. Mostly likely a GMT bug.

@weiji14
Copy link
Member Author

weiji14 commented Jul 16, 2020

Yeah, the resampling doesn't work properly on virtualfiles perhaps. Will merge and try to track down the problem another time.

@weiji14 weiji14 merged commit 9a4c0a0 into master Jul 16, 2020
Release v0.2.x automation moved this from In progress to Done Jul 16, 2020
@weiji14 weiji14 deleted the grid_registration branch July 16, 2020 05:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

grdimage does not correctly plot xarray DataArrays when they don't contain the redundant longitude at 360 E
4 participants