Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GMT_Write_Data for netCDF grids changes the grid #3800

Open
PaulWessel opened this issue Aug 1, 2020 · 6 comments
Open

GMT_Write_Data for netCDF grids changes the grid #3800

PaulWessel opened this issue Aug 1, 2020 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@PaulWessel
Copy link
Member

Description of the problem

The netCDF grid writing library in GMT deals with chunking and optional or auto scaling/offsetting the grid to take up as little space as possible. This works well for most cases and all modules since the modules were written as stand-alone command-line tools and when one finished the process died. However, when one is using the API to write a grid and then continue to use the grid after writing, the values may have changed due to the fact that the data may have been scaled and offset to fit some specific storage unit like short int. Thus, a module calling another module passing a virtual file may find its grid no longer making sense.

Solutions:

The only way to completely avoid this problem is to undo the translation after the writing has completed. This will add more CPU cycles but will treat the grid as "read-only". Otherwise, we can never guarantee that a grid can continue to be used after writing. This is especially bad if the input grid is a reference from the outside. So I think we simply have to do this.

The only relief would be if one masses in a bit-mode flag like GMT_GRID_FINISHED or something, so that someone who is writing a specific set of steps and know that the process is done after teh writing done spend time undoing a scaling, since it won't be used again. We can never be sure of that in the API since anyone can write a tool that calls surface with -Gresult.nc=ns+s0.1 and then call grdimage, and be surprised at the result.

So @joa-quim and @seisman, I will undo any translation done in gmt_nc.c unless a flag (I may think of another name) is passed, and GMT will not pass that flag unless it is a grid that cannot be returned anywhere.

@PaulWessel PaulWessel added the bug Something isn't working label Aug 1, 2020
@PaulWessel PaulWessel self-assigned this Aug 1, 2020
@PaulWessel
Copy link
Member Author

A related issue is the fact that gmt_nc_write_grd changes the default chunk setting based on the grid at hand, but never reset it. Thus, when I tried to do two calls to GMT_Write_Data I got an error since the chunk size for the first grid was inappropriate for the second. This is being addressed in another PR, but it just points out that there were two assumptions built into this code:

  1. The grid would not be used after the writing
  2. There would only be one call to write a grid during the module lifetime

Neither of those make sense in the API in general.

@PaulWessel
Copy link
Member Author

Any comments on this @GenericMappingTools/core ? Seems we should correct this in one way or another, perhaps when external use is detected via API->external and GMT_IS_REFERENCE is passed?

@maxrjones
Copy link
Member

Seems necessary to me. Did the solution to the related issue get solved already?

@PaulWessel
Copy link
Member Author

I think so, but I would be in a better position to answer if the idiot who posted the comment above had cared to add the link to that PR he referred to...

@PaulWessel
Copy link
Member Author

I agree it seems necessary - provided the grid is passed in is read-only. But let's think about this a bit more:

  • GMT/MEX: grids are column-wise so GMT must duplicate anyway to switch from row-orientation, so this cannot happen.
  • GMT.jl: I think the same thing is true here, so no grid matrix is passed as is even via virtual files, right @joa-quim ?
  • PyGMT: Here I think there is room for mischief in that any grid passed in via GMT_IS_MATRIX and GMT_IS_REFERENCE might be modified in-memory, right @seisman?

So I wonder if this is specifically limited to situations where

  1. The grid was passed in from PyGMT
  2. The grid was flagged as a read-only matrix
  3. The grid was used to write a netCDF output grid file

Please comment on this to see the extent of fixes needed.

@joa-quim
Copy link
Member

Yes, GMT.jl is like GMTMEX. Column-wise so a copy is needed. Sometime in future I may try to send in a Julia allocated memory ... but then I would have to deal with the PAD in Julia (chills ...).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants