Wrap GMT's standard data type GMT_GRID for grids #2398

seisman · 2023-03-05T15:33:05Z

Description of proposed changes

This PR wraps GMT's standard data type GMT_GRID for grids, so that we can store the output grid in GMT virtual files and don't have to write the output grid into a temporary file and load it as xarray.DataArray.

Expected enhancements/features

Here is a list of enhancements/features related to the GMT_GRID data type:

Wrap GMT's data structure GMT_GRID_HEADER for grid metadata [Wrap GMT's data structure GMT_GRID_HEADER for grid/image/cube headers #3127]
Wrap GMT's standard data type GMT_GRID for grids
Add the Session.virtualfile_to_raster method for handling output to an actual file or a virtual file [Similar to the Session.virtualfile_to_dataset in PR clib: Add virtualfile_to_dataset method for converting virtualfile to a dataset #3083]
Refactor wrappers that write grids: (1) Simplify the codes; (2) Get rid of temporary files; (3) Improve performance
Passing GMT_GRID directly to GMT API. Currently, for an xarray.DataArray grid, we first convert it to a matrix, then pass the matrix with some metadata (like inc, registration and gtype) to GMT. This way has proven to cause troubles (e.g., Cylindrical projections 'Q' plot incorrectly in pygmt when using xarrays #390), mainly due to grid paddings. It would be better if we could internally convert xarray.DataArray to GMT_GRID and then pass GMT_GRID to GMT API. It may help us fix many long-term issues. [Will do in future PRs]

The `GMT_GRID`/`GMT_GRID_HEADER` data structure

The GMT_GRID data structure is defined at gmt/src/gmt_resources.h:

struct GMT_GRID {	/* To hold a GMT gmt_grdfloat grid and its header in one container */
	struct GMT_GRID_HEADER *header;	/* Pointer to full GMT header for the grid */
	gmt_grdfloat *data;             /* Pointer to the gmt_grdfloat grid */
	double *x, *y;                  /* Vector of coordinates */
	void *hidden;                   /* Row-by-row machinery information [NULL] */
};

it contains the x/y array, the data array and the grid header. The grid header GMT_GRID_HEADER is defined at gmt/src/gmt_resources.h. It contains many important metadata information about the grid. Here are some important header variables:

n_columns: Number of columns
n_rows: Number of rows
registration: 0 for gridline registration and 1 for pixel registration.
wesn: min/max of x/y coordinates, i.e., [xmin, xmax, ymin, ymax]
z_min/z_max: min/max of z values
inc: x and y increment [should be equal to x[1] - x[0] and y[1] - y[0]]
z_scale_factor/z_add_offset: Used for data compressing by storing floating-point data as smaller integers in a netCDF file. The relation between the actual data z' and the data stored in netCDF file (z) is z' = z * scale_factor + offset (i.e., unpacking). Need to note that when reading a netCDF file, GMT already does the conversion, so z_min/z_max and the data array are already unpacked so we usually don't need to care about these two variables.
x_units/y_units: x/y units in the form of long_name [units]. The [units] part is optional. Some examples are: x/y/longitude [degrees_east]/latitude [degrees_north].
z_units: Also in the form of long_name [units].
pad: Extra columns/rows on west/east/south/north sides. Paddings are mainly used for grid manipulations at grid boundary. Defaults are [2, 2, 2, 2].
mx: actual columns in memory. Equal to n_columns + pad[0] + pad[1]
my: actual rows in memory. Equal to n_rows + pad[2] + pad[3]

Conversions between netCDF and `GMT_GRID`

Although netCDF is the default grid format for I/O, GMT uses the GMT_GRID data structure as its internal representation of grids. Thus, GMT needs to convert netCDF data/metadata to GMT_GRID for input and do the inverse for output. The conversions between netCDF and GMT_GRID are done by functions gmt_nc_read_grd/gmt_nc_write_grd in gmt/src/gmt_nc.c. Below Here are some technical notes:

x_units/y_units/z_units has the form of long_name [units], in which [units] is optional. long_name and units are netCDF attributes. gmtnc_get_units
When reading, GMT reads the variable's long_name and units attributes and then sets the x_units/y_units/z_units string. If long_name is not defined, then use the variable's name instead. If units is not defined, then [units] is not appended. gmtnc_get_units
For writing, GMT extracts the long_name and units attributes by parsing the x_units/y_units/z_units string, and setting the netCDF attributes if they're defined. More attributes are added if it's a geographic coordinate. If units="degrees_east", set attributes standard_name to longitude and axis to X; If units="degrees_north", set attributes standard_name to latitude and axis to Y. gmtnc_put_units
The default dimension names are lon/time/x for x axis and lat/time/y for y axis. reference
The default name for the data is always z for 2-D grid or cube for 3-D cube, unless the variable name is given. The variable name is stored in the hidden structure HH->varname, but how is HH->varname determined?. reference
When writing a netCDF file, GMT defines some global attributes, including Conventions (CF-1.7), title, history, description, GMT_version, node_offset (only for pixel grids). reference
When writing x/y vectors, axis attributes actual_range (from header-wesn) and axis (X or Y) are also set. reference
When writing the data, some attributes are defined: scale_factor (if not 1.0), add_offset (if not 0.0), cpt (if defined by GMT), _FillValue (for NaN), actual_range.

Benefits of current implementation

Simplify the codes of module wrappers
No longer to write output grid to temporary files
Performance improvements (Wrap GMT's standard data type GMT_GRID for grids #2398 (comment))

Limitations of current implementation

In addition to the metadata stored in the GMT_GRID_HEADER object, GMT also has some hidden metadata which is stored in a hidden data structure, for example, grdtype (Cartesian or Geographic grid?), cpt (the default CPT). The hidden data structure is private and may change between different GMT versions. So it's not a good practice to wrap and use the hidden data structure. But it also means we can't know the grdtype/cpt attributes when converting GMT_GRID to xarray.DataArray.

For grdtype, it's reasonable to say that the grid is geographic if x's unit is degrees_east and y's unit is degrees_north (see the note above and also CF conventions), but for cpt, we can't know if a grid has a default CPT or not.

Another limitation is that, grdcut now can't cut an image because we assume that the output is a grid. An initial fix is at #3115. But, as far as I can see, although many GMT modules can accept an image (e.g., grdcut, grdimage, grdview), only grdcut and grdmix can output an image.

References

Function gmtnc_put_units in gmt_nc.c
Function gmtnc_get_units in gmt_nc.c
Function gmtnc_grd_info in gmt_nc.c
Function gmt_grd_init in gmt_grdio.c

pygmt/src/grdcut.py

seisman · 2023-03-05T16:03:01Z

pygmt/datatypes.py

+ pad = header.pad[:]
+ x = self.x[: header.n_columns]
+ y = self.y[: header.n_rows]
+ data = np.array(self.data[: header.mx * header.my]).reshape(


The grid has n_columns columns and n_rows rows, but usually the number of data points in memory are not n_columns * n_rows due to grid pads. For some specific reasons, GMT usually adds two more rows/columns on west, east, south, north sides. So, we should have:

mx = pad[0] + n_columns + pad[1] my = pad[2] + n_rows + pad[3]

That's why we need to use the following codes to get the data matrix:

data = np.array(self.data[: header.mx * header.my]).reshape( (header.my, header.mx) )[pad[2] : header.my - pad[3], pad[0] : header.mx - pad[1]]

weiji14

This is amazing, thanks for pulling this together @seisman! I'll need to take a closer look to review this properly, but it's looking very promising 🤩

Currently the unit tests are failing, mostly because the output xarray.DataArray doesn't have the same attributes/metadata as when we were using the NetCDF tempfile. Will need to find a way to copy those metadata over somehow.

pygmt/datatypes.py

weiji14

Most of the tests on grdview (which uses grdcut for the fixture) are failing with an incorrect figure size. Is there something to be fixed with the padding?

pygmt/datatypes.py

seisman · 2023-03-06T07:33:57Z

Most of the tests on grdview (which uses grdcut for the fixture) are failing with an incorrect figure size. Is there something to be fixed with the padding?

No, if you compare the baseline and result images, you'll see that the differences are caused by the gtype property. gtype should be 1 for the grid used in grdview tests, but currently the GMT_GRID structure doesn't store the gtype information.

pygmt/clib/session.py

seisman · 2023-03-06T14:11:51Z

Most of the tests on grdview (which uses grdcut for the fixture) are failing with an incorrect figure size. Is there something to be fixed with the padding?

No, if you compare the baseline and result images, you'll see that the differences are caused by the gtype property. gtype should be 1 for the grid used in grdview tests, but currently the GMT_GRID structure doesn't store the gtype information.

Added a temporary workaround at 6280e39. And now all tests pass except test_accessor_grid_source_file_not_exist which is expected to fail.

pygmt/datatypes.py

pygmt/clib/session.py

codspeed-hq · 2024-03-14T01:16:41Z

CodSpeed Performance Report

Merging #2398 will improve performances by ×2

_{⚠️ No base runs were found}

_{Falling back to comparing datatypes/gmtgrid (84d6546) with main (e3d52e2)}

Summary

⚡ 18 improvements
✅ 80 untouched benchmarks

Benchmarks breakdown

	Benchmark	`main`	`datatypes/gmtgrid`	Change
⚡	`test_binstats_no_outgrid`	159.7 ms	141.2 ms	+13.06%
⚡	`test_load_remote_dataset_benchmark_with_region`	147.8 ms	121.3 ms	+21.82%
⚡	`test_dimfilter_no_outgrid`	87 ms	59.8 ms	+45.6%
⚡	`test_grdclip_no_outgrid`	86.8 ms	59 ms	+46.99%
⚡	`test_grdcut_dataarray_in_dataarray_out`	86.6 ms	58.9 ms	+46.96%
⚡	`test_grdfill_dataarray_out`	86.6 ms	59.4 ms	+45.88%
⚡	`test_grdfilter_dataarray_in_dataarray_out`	87.3 ms	59.7 ms	+46.32%
⚡	`test_grdgradient_no_outgrid`	86.9 ms	59.3 ms	+46.46%
⚡	`test_equalize_grid_no_outgrid`	86.6 ms	58.9 ms	+46.96%
⚡	`test_grdproject_no_outgrid[+proj=merc +ellps=WGS84 +units=m +width=10]`	103.7 ms	69 ms	+50.22%
⚡	`test_grdproject_no_outgrid[EPSG:3395 +width=10]`	104.3 ms	62.2 ms	+67.65%
⚡	`test_grdproject_no_outgrid[M10c]`	101.5 ms	59.3 ms	+71.3%
⚡	`test_grdsample_dataarray_out`	91.5 ms	68 ms	+34.51%
⚡	`test_sphdistance_no_outgrid`	582.9 ms	512.3 ms	+13.77%
⚡	`test_surface_input_xyz`	127.2 ms	63.1 ms	×2
⚡	`test_regular_grid_no_outgrid`	312.2 ms	247.5 ms	+26.17%
⚡	`test_xyz2grd_input_array[Dataset]`	245.3 ms	180.8 ms	+35.67%
⚡	`test_xyz2grd_input_array[array]`	248.6 ms	184 ms	+35.12%

…rid method, except grdcut

…o support grid/image/cube

pygmt/clib/session.py

seisman · 2024-04-08T13:49:21Z

I think this PR is now ready for review. Generally speaking, this PR does three things:

Wrap GMT_GRID data structure
Add Session.virtualfile_to_raster method which can read a virtualfile content as GMT_GRID/GMT_IMAGE/GMT_CUBE
Refactor all wrappers to the virtualfiles for grid output and fix any failing tests

I think it makes more sense to split this big PR into three smaller PRs. Maybe after this PR is reviewed and approved.

seisman · 2024-04-08T14:06:23Z

pygmt/clib/session.py

@@ -1946,6 +1951,70 @@ def virtualfile_to_dataset(
 return result.to_numpy()
 return result # pandas.DataFrame output

+ def virtualfile_to_raster(


In this method, kind can be grid, cube, image or None. If None, we will call Session.inquire_virtualfile to determine the family of the virtualfile.

Most GMT modules can write grids only, so we can set kind="grid" to avoid calling Session.inquire_virtualfile.

Here are some exceptions:

grdmix writes images

grdcut writes grids or images

grdimage -A writes images [Unimplemented in PyGMT]

greenspine writes grids or cubes

grdinterpolate writes grids or cubes

Most GMT modules can write grids only, so we can set kind="grid" to avoid calling Session.inquire_virtualfile.

Yes, I think we can set kind="grid", unless we know if GMT will allow passing GMT_IMAGE to certain grd* modules in the future. For example, maybe grdclip would allow GMT_IMAGE inputs at some point, and so we might want to set kind=None for future-proofing.

We need to do more refactoring if grdclip would support GMT_IMAGE in the future, since we need to check the kind of the input data (grid in, grid out or image in, image out), similar to what we have to do to grdcut (xref: #2398 (comment)).

Ok, I meant if upstream GMT were to support passing GMT_IMAGE into grdclip in the future, but not sure when that would happen. Ok to just default to kind='grid' for now, since that's the current behaviour I think.

pygmt/datatypes/grid.py

weiji14 · 2024-04-08T19:26:11Z

I think it makes more sense to split this big PR into three smaller PRs. Maybe after this PR is reviewed and approved.

Can probably keep it as one PR, I think you've done a good job of cherrypicking the GMT grid header and inquire virtualfile stuff out already, and the code here is reasonable to review.

I'll give this a proper final review after Thursday-ish as I'm attending the MIGARS conference this week (have already received thanks from a couple of PyGMT users here, so I'll pass that on 😃).

Co-authored-by: Yvonne Fröhlich <[email protected]>

pygmt/clib/session.py

pygmt/datatypes/grid.py

weiji14 · 2024-04-13T20:27:58Z

pygmt/datatypes/grid.py

+ # Flip the coordinates and data if necessary so that coordinates are ascending.
+ # `grid.sortby(list(grid.dims))` sometimes causes crashes.
+ # The solution comes from https://github.com/pydata/xarray/discussions/6695.
+ for dim in grid.dims:
+ if grid[dim][0] > grid[dim][1]:
+ grid = grid.isel({dim: slice(None, None, -1)})


Could you elaborate a bit on this part? Sort ascending is probably needed for the x (longitude) dimension. But for the y (latitude) dimension, it is quite common to sort in descending order (y going from North Pole +90 to South Pole -90). What is the typical sorting order returned from GMT?

I think GMT requires ascending order for longitude/latitude. Similar codes are already used at (The codes below use sortby, but we can't use it here due to a mysterious reason (xref #2398 (comment)).)

pygmt/pygmt/clib/conversion.py

Lines 122 to 124 in 70d666e

if any(i < 0 for i in inc): # Sort grid when there are negative increments

inc = [abs(i) for i in inc]

grid = grid.sortby(variables=list(grid.dims), ascending=True)

I tried to find some evidence from the GMT source codes but failed because the codes are so complicated. Perhaps the most straightforward evidence is from following lines (https://github.com/GenericMappingTools/gmt/blob/e493ccc220d9e997abb655cb461142ed2e7cc777/src/gmt_nc.c#L754):

/* Determine grid step */ header->inc[GMT_X] = gmt_M_get_inc (GMT, dummy[0], dummy[1], header->n_columns, registration); if (gmt_M_is_dnan(header->inc[GMT_X]) || gmt_M_is_zero (header->inc[GMT_X])) header->inc[GMT_X] = 1.0; if (header->n_columns == 1) registration = GMT_GRID_PIXEL_REG; /* The only way to have a grid like that */ /* Extend x boundaries by half if we found pixel registration */ if (registration == GMT_GRID_NODE_REG && header->registration == GMT_GRID_PIXEL_REG) header->wesn[XLO] = dummy[0] - header->inc[GMT_X] / 2.0, header->wesn[XHI] = dummy[1] + header->inc[GMT_X] / 2.0; else /* Use as is */ header->wesn[XLO] = dummy[0], header->wesn[XHI] = dummy[1];

in which dummy[0] and dummy[1] is the min/max value of the x vector. For pixel registration, GMT extends the x boundaries by half a grid. The codes only make sense when head->inc[GMT_X] is positive.

Ok, from what I can tell, this means GMT expects grids with coordinates sorted in ascending order. But here in the to_dataarray() method, we're going from GMT virtualfile to xarray.DataArray, so is this sorting still needed or can we return the grid as is? Maybe the sort ascending logic is only needed in virtualfile_from_grid (which is implemented in the dataarray_to_matrix conversion function already)?

But here in the to_dataarray() method, we're going from GMT virtualfile to xarray.DataArray, so is this sorting still needed or can we return the grid as is? Maybe the sort ascending logic is only needed in virtualfile_from_grid (which is implemented in the dataarray_to_matrix conversion function already)?

Hmmm, your comments make more sense. I tried to remove these sorting codes locally. Here is the output:

In [1]: import xarray as xr In [2]: from pygmt import which In [3]: path = which("@earth_relief_01d_g", download="a") In [4]: grid1 = xr.load_dataarray(path) In [6]: from pygmt.clib import Session ...: with Session() as lib: ...: with lib.virtualfile_out(kind="grid") as voutgrd: ...: lib.call_module("read", f"{path} {voutgrd} -Tg") ...: grid = lib.read_virtualfile(voutgrd, kind="grid") ...: grid2 = grid.contents.to_dataarray() ...: In [7]: grid1 Out[7]: <xarray.DataArray 'z' (lat: 181, lon: 361)> Size: 523kB array([[ 2865. , 2865. , 2865. , ..., 2865. , 2865. , 2865. ], [ 3088. , 3087.5, 3087. , ..., 3088.5, 3088. , 3088. ], [ 3100.5, 3100.5, 3101. , ..., 3101.5, 3101. , 3100.5], ..., [-3745.5, -3729. , -3722.5, ..., -3734. , -3742. , -3745.5], [-2940. , -2945. , -2951. , ..., -2895.5, -2921.5, -2940. ], [-3861. , -3861. , -3861. , ..., -3861. , -3861. , -3861. ]]) Coordinates: * lon (lon) float64 3kB -180.0 -179.0 -178.0 -177.0 ... 178.0 179.0 180.0 * lat (lat) float64 1kB -90.0 -89.0 -88.0 -87.0 ... 87.0 88.0 89.0 90.0 Attributes: actual_range: [-7174. 5350.] long_name: elevation (m) In [8]: grid2 Out[8]: <xarray.DataArray 'z' (lat: 181, lon: 361)> Size: 523kB array([[-3861. , -3861. , -3861. , ..., -3861. , -3861. , -3861. ], [-2940. , -2945. , -2951. , ..., -2895.5, -2921.5, -2940. ], [-3745.5, -3729. , -3722.5, ..., -3734. , -3742. , -3745.5], ..., [ 3100.5, 3100.5, 3101. , ..., 3101.5, 3101. , 3100.5], [ 3088. , 3087.5, 3087. , ..., 3088.5, 3088. , 3088. ], [ 2865. , 2865. , 2865. , ..., 2865. , 2865. , 2865. ]]) Coordinates: * lat (lat) float64 1kB 90.0 89.0 88.0 87.0 ... -87.0 -88.0 -89.0 -90.0 * lon (lon) float64 3kB -180.0 -179.0 -178.0 -177.0 ... 178.0 179.0 180.0 Attributes: Conventions: CF-1.7 title: SRTM15 Earth Relief v2.5.5 at 01 arc degree history: description: Reduced by Gaussian Cartesian filtering (314.5 km fullwidt... long_name: elevation (m) actual_range: [-7174. 5350.]

It seems xr.load_dataarray returns an xarray.DataArray with ascending lon/lat coodinates, but now GMT returns descending latitude. I think I was trying to make GMT_GRID.to_dataaray() return the exactly same xarray.DataArray object as xr.load_dataarray. But, maybe there are some misunderstandings here.

It seems xr.load_dataarray returns an xarray.DataArray with ascending lon/lat coodinates, but now GMT returns descending latitude. I think I was trying to make GMT_GRID.to_dataaray() return the exactly same xarray.DataArray object as xr.load_dataarray. But, maybe there are some misunderstandings here.

Hmm yeah, I don't think we need to match the behaviour of xr.load_dataarray here. I tried with rioxarray.open_rasterio and it gave descending latitude:

import pygmt import rioxarray import xarray as xr path = pygmt.which("@earth_relief_01d_g", download="a") grid2 = rioxarray.open_rasterio(path) print(grid2) # <xarray.DataArray 'z' (band: 1, y: 181, x: 361)> Size: 131kB # [65341 values with dtype=int16] # Coordinates: # * band (band) int64 8B 1 # * x (x) float64 3kB -180.0 -179.0 -178.0 ... 178.0 179.0 180.0 # * y (y) float64 1kB 90.0 89.0 88.0 87.0 ... -87.0 -88.0 -89.0 -90.0 # spatial_ref int64 8B 0 # Attributes: (12/20) # lat#actual_range: [-90. 90.] # lat#axis: Y # lat#long_name: latitude # lat#standard_name: latitude # lat#units: degrees_north # lon#actual_range: [-180. 180.] # ... ... # title: SRTM15 Earth Relief v2.5.5 at 01 arc degree # actual_range: [-7174. 5350.] # long_name: elevation (m) # scale_factor: 0.5 # _FillValue: -32768 # add_offset: 0.0

so we could probably remove the sorting in to_dataarray(), assuming this doesn't break any of the unit tests.

I don't think we need to match the behaviour of xr.load_dataarray here.

Matching the behavior of xr.load_dataarray makes it easier to test the output of the virtualfiles using xr.DataArray.indentical.

I don't think we need to match the behaviour of xr.load_dataarray here.

Matching the behavior of xr.load_dataarray makes it easier to test the output of the virtualfiles using xr.DataArray.identical.

The returned attributes are not the same, so xr.testing.assert_identical will fail anyway.

Actually, we're using xr.testing.assert_allclose in the tests. If we don't sort the coordinates ascendingly, then we have to update all these tests.

Hmm yeah, I just tried running make test locally after commenting out the sorting code, and got 19 test failures. Most of these are from the expected_grid fixture returning a DataArray with y (latitude) sorted in ascending order:

================================================================================== short test summary info =================================================================================== FAILED ../pygmt/datatypes/grid.py::pygmt.datatypes.grid._GMT_GRID.to_dataarray FAILED ../pygmt/tests/test_dimfilter.py::test_dimfilter_no_outgrid - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdclip.py::test_grdclip_no_outgrid - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdfill.py::test_grdfill_dataarray_out - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdfill.py::test_grdfill_asymmetric_pad - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdfilter.py::test_grdfilter_dataarray_in_dataarray_out - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdgradient.py::test_grdgradient_no_outgrid - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdhisteq.py::test_equalize_grid_no_outgrid - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdlandmask.py::test_grdlandmask_no_outgrid - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdproject.py::test_grdproject_no_outgrid[M10c] - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdproject.py::test_grdproject_no_outgrid[EPSG:3395 +width=10] - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdproject.py::test_grdproject_no_outgrid[+proj=merc +ellps=WGS84 +units=m +width=10] - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_grdsample.py::test_grdsample_dataarray_out - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_surface.py::test_surface_input_file - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_surface.py::test_surface_input_data_array - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_surface.py::test_surface_input_xyz - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_triangulate.py::test_regular_grid_no_outgrid - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_xyz2grd.py::test_xyz2grd_input_array[array] - AssertionError: Left and right DataArray objects are not close FAILED ../pygmt/tests/test_xyz2grd.py::test_xyz2grd_input_array[Dataset] - AssertionError: Left and right DataArray objects are not close ======================================================== 19 failed, 657 passed, 4 skipped, 5 xfailed, 2 xpassed, 5 warnings in 58.94s ========================================================

Let's just keep the ascending sort for now then.

pygmt/datatypes/grid.py

weiji14 · 2024-04-13T20:32:10Z

pygmt/tests/test_datasets_load_remote_datasets.py

@@ -35,8 +33,9 @@ def test_load_remote_dataset_benchmark_with_region():
 assert data.gmt.registration == 0
 assert data.shape == (11, 21)
 # The cpt attribute was added since GMT 6.4.0
- if Version(__gmt_version__) >= Version("6.4.0"):
- assert data.attrs["cpt"] == "@earth_age.cpt"
+ # Can't access the cpt attribute using virtual files


Ok, so auto colormap won't work after this PR is merged? We'll need to remember to document this in the PyGMT v0.12.0 changelog.

Co-authored-by: Wei Ji <[email protected]>

pygmt/clib/session.py

weiji14

Thanks again @seisman for the wonderful work here, from digging into the GMT C code and handling all the grid header parsing! There are still a few limitations such in regard to the CPT attribute and grdtype detection, as you've mentioned in the top post, but we can address that in the future. The speedup improvements will be worth it 😄

weiji14 · 2024-04-16T02:40:58Z

pygmt/tests/test_datasets_load_remote_datasets.py

@@ -35,8 +33,9 @@ def test_load_remote_dataset_benchmark_with_region():
 assert data.gmt.registration == 0
 assert data.shape == (11, 21)
 # The cpt attribute was added since GMT 6.4.0
- if Version(__gmt_version__) >= Version("6.4.0"):
- assert data.attrs["cpt"] == "@earth_age.cpt"
+ # Can't access the cpt attribute using virtual files


Ok, let's keep track about this .cpt attribute idea at #499 (comment).

seisman changed the title ~~Support GMT's standard data type GMT_GRID for grids~~ WIP: Support GMT's standard data type GMT_GRID for grids Mar 5, 2023

seisman added maintenance Boring but important stuff for the core devs enhancement Improving an existing feature labels Mar 5, 2023

seisman commented Mar 5, 2023

View reviewed changes

pygmt/src/grdcut.py Outdated Show resolved Hide resolved

seisman commented Mar 5, 2023

View reviewed changes

weiji14 removed the maintenance Boring but important stuff for the core devs label Mar 5, 2023

weiji14 reviewed Mar 5, 2023

View reviewed changes

pygmt/datatypes.py Outdated Show resolved Hide resolved

pygmt/datatypes.py Outdated Show resolved Hide resolved

This comment was marked as outdated.

Sign in to view

seisman mentioned this pull request Mar 6, 2023

Do not raise a KeyError exception if attribute actual_range doesn't exist #2400

Merged

7 tasks

weiji14 reviewed Mar 6, 2023

View reviewed changes

pygmt/datatypes.py Outdated Show resolved Hide resolved

michaelgrund reviewed Mar 6, 2023

View reviewed changes

pygmt/clib/session.py Outdated Show resolved Hide resolved

pygmt/clib/session.py Outdated Show resolved Hide resolved

pygmt/clib/session.py Outdated Show resolved Hide resolved

weiji14 mentioned this pull request Mar 23, 2023

Add Figure.tilemap to plot XYZ tile maps #2394

Merged

7 tasks

weiji14 reviewed Jun 10, 2023

View reviewed changes

pygmt/datatypes.py Outdated Show resolved Hide resolved

pygmt/clib/session.py Outdated Show resolved Hide resolved

This was referenced Oct 9, 2023

Wrap GMT's standard data type GMT_DATASET for table inputs #2729

Merged

Get rid of temporary files from pygmt functions and plotting methods #2730

Closed

seisman force-pushed the datatypes/gmtgrid branch from 849ed60 to 26f71ab Compare October 10, 2023 07:35

seisman changed the title ~~WIP: Support GMT's standard data type GMT_GRID for grids~~ POC: Support GMT's standard data type GMT_GRID for grids Oct 11, 2023

seisman force-pushed the datatypes/gmtgrid branch from 0459339 to 880bff9 Compare October 11, 2023 05:46

seisman added this to the 0.12.0 milestone Dec 11, 2023

seisman mentioned this pull request Jan 14, 2024

clib: Wrap the GMT API function GMT_Read_VirtualFile #2993

Merged

7 tasks

seisman mentioned this pull request Feb 20, 2024

clib: Add the virtualfile_out method for creating output virtualfile #3057

Merged

7 tasks

seisman changed the title ~~POC: Support GMT's standard data type GMT_GRID for grids~~ Support GMT's standard data type GMT_GRID for grids Feb 20, 2024

seisman removed this from the 0.12.0 milestone Feb 26, 2024

seisman mentioned this pull request Mar 6, 2024

grdsample (pixel to gridline) returns nans #3090

Open

seisman added the run/benchmark Trigger the benchmark workflow in PRs label Mar 13, 2024

seisman force-pushed the datatypes/gmtgrid branch from 26977ed to f2445d1 Compare March 15, 2024 12:23

seisman mentioned this pull request Mar 16, 2024

Add function to load Blue Marble dataset #2235

Draft

6 tasks

seisman added 4 commits April 1, 2024 19:46

Add Session.virtualfile_to_grid method for outputing a grid

daad3af

Refactor all grid-related modules to use the Session.virtualfile_to_g…

cd2c5d7

…rid method, except grdcut

Improve the docstrings of the outgrid parameter

c5435b0

Fix two failing tests

ddb7a81

seisman force-pushed the datatypes/gmtgrid branch from 4ba67df to ddb7a81 Compare April 1, 2024 11:46

seisman added 4 commits April 7, 2024 20:15

Merge branch 'main' into datatypes/gmtgrid

0b81cbb

Rename Session.virtualfile_to_grid to Session.virtualfile_to_raster t…

500568d

…o support grid/image/cube

Fix static type issues

6b38734

Explicitly set kind='grid' to avoid calling Session.inquire_virtualfile

ab9a838

weiji14 reviewed Apr 7, 2024

View reviewed changes

pygmt/clib/session.py Outdated Show resolved Hide resolved

Merge branch 'main' into datatypes/gmtgrid

5735122

seisman added the needs review This PR has higher priority and needs review. label Apr 8, 2024

seisman commented Apr 8, 2024

View reviewed changes

yvonnefroehlich reviewed Apr 8, 2024

View reviewed changes

pygmt/datatypes/grid.py Outdated Show resolved Hide resolved

seisman and others added 2 commits April 9, 2024 13:07

Fix a typo [skip ci]

01664f8

Co-authored-by: Yvonne Fröhlich <[email protected]>

Merge branch 'main' into datatypes/gmtgrid

863ac43

weiji14 reviewed Apr 13, 2024

View reviewed changes

Apply suggestions from code review

2e51ba7

Co-authored-by: Wei Ji <[email protected]>

seisman commented Apr 15, 2024

View reviewed changes

pygmt/clib/session.py Outdated Show resolved Hide resolved

seisman added 2 commits April 16, 2024 08:43

Merge branch 'main' into datatypes/gmtgrid

b3ce412

Session.virtualfile_to_raster: let kind default to 'grid'

97c3cc9

weiji14 mentioned this pull request Apr 16, 2024

Make a gmt xarray accessor to store metadata from grdinfo #499

Open

weiji14 approved these changes Apr 16, 2024

View reviewed changes

weiji14 added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels Apr 16, 2024

seisman removed the final review call This PR requires final review and approval from a second reviewer label Apr 16, 2024

seisman merged commit 809880c into main Apr 16, 2024
20 of 21 checks passed

seisman deleted the datatypes/gmtgrid branch April 16, 2024 13:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Wrap GMT's standard data type GMT_GRID for grids #2398

Wrap GMT's standard data type GMT_GRID for grids #2398

seisman commented Mar 5, 2023 •

edited

Loading

seisman Mar 5, 2023

weiji14 left a comment

This comment was marked as outdated.

weiji14 left a comment

seisman commented Mar 6, 2023

seisman commented Mar 6, 2023

codspeed-hq bot commented Mar 14, 2024 •

edited

Loading

seisman commented Apr 8, 2024

seisman Apr 8, 2024

weiji14 Apr 13, 2024

seisman Apr 15, 2024

weiji14 Apr 15, 2024

weiji14 commented Apr 8, 2024

weiji14 Apr 13, 2024

seisman Apr 15, 2024

weiji14 Apr 15, 2024 •

edited

Loading

seisman Apr 16, 2024

weiji14 Apr 16, 2024 •

edited

Loading

seisman Apr 16, 2024

weiji14 Apr 16, 2024

seisman Apr 16, 2024

weiji14 Apr 16, 2024 •

edited

Loading

weiji14 Apr 13, 2024

weiji14 left a comment

weiji14 Apr 16, 2024

	if any(i < 0 for i in inc): # Sort grid when there are negative increments
	inc = [abs(i) for i in inc]
	grid = grid.sortby(variables=list(grid.dims), ascending=True)

Wrap GMT's standard data type GMT_GRID for grids #2398

Wrap GMT's standard data type GMT_GRID for grids #2398

Conversation

seisman commented Mar 5, 2023 • edited Loading

Expected enhancements/features

The GMT_GRID/GMT_GRID_HEADER data structure

Conversions between netCDF and GMT_GRID

Benefits of current implementation

Limitations of current implementation

References

Choose a reason for hiding this comment

weiji14 left a comment

Choose a reason for hiding this comment

This comment was marked as outdated.

weiji14 left a comment

Choose a reason for hiding this comment

seisman commented Mar 6, 2023

seisman commented Mar 6, 2023

codspeed-hq bot commented Mar 14, 2024 • edited Loading

Merging #2398 will improve performances by ×2

Summary

Benchmarks breakdown

seisman commented Apr 8, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weiji14 commented Apr 8, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weiji14 Apr 15, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weiji14 Apr 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weiji14 Apr 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weiji14 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

seisman commented Mar 5, 2023 •

edited

Loading

The `GMT_GRID`/`GMT_GRID_HEADER` data structure

Conversions between netCDF and `GMT_GRID`

codspeed-hq bot commented Mar 14, 2024 •

edited

Loading

weiji14 Apr 15, 2024 •

edited

Loading

weiji14 Apr 16, 2024 •

edited

Loading

weiji14 Apr 16, 2024 •

edited

Loading