Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change GMT_IN|GMT_IS_REFERENCE to GMT_IN #517

Draft
wants to merge 35 commits into
base: main
Choose a base branch
from
Draft

Change GMT_IN|GMT_IS_REFERENCE to GMT_IN #517

wants to merge 35 commits into from

Conversation

seisman
Copy link
Member

@seisman seisman commented Jul 12, 2020

Description of proposed changes

Changing GMT_IN|GMT_IS_REFERENCE to GMT_IN when passing data to GMT, to find more potential bugs in GMT.

Fixes #

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If adding new functionality, add an example to docstrings or tutorials.

@weiji14
Copy link
Member

weiji14 commented Jul 12, 2020

Wouldn't this be reverting #490? But test_plot_lines_with_arrows passes, so does that mean just GMT_IN is okay?

@seisman
Copy link
Member Author

seisman commented Jul 12, 2020

Yes, it's a little complicated. See GenericMappingTools/gmt#3528 for the long discussions. Please let me know if it's still unclear to you.

@weiji14 weiji14 changed the title Chaneg GMT_IN|GMT_IS_REFERENCE to GMT_IN Change GMT_IN|GMT_IS_REFERENCE to GMT_IN Jul 12, 2020
@weiji14
Copy link
Member

weiji14 commented Jul 12, 2020

Yeah, it looks complicated. The discussion looks to be focused on vectors, I'm not quite sure how grids are dealt with.

As an aside, it looks like the issue with #515 is gone here?!! Edit no, the below image is wrong (See correct image at v0.1.2) See https://pygmt-k7id52784.vercel.app/gallery/grid/track_sampling.html

image

@seisman
Copy link
Member Author

seisman commented Jul 12, 2020

The discussion looks to be focused on vectors, I'm not quite sure how grids are dealt with.

They're all handled in the same way. I'll try to write a short summary for this.

External programs like PyGMT can pass any data (vector, matrix, grid) to GMT API. It's safe if GMT only read the memory, but sometimes GMT also changes the memory or even free them. That's why we have so many crashes with GMT 6.0.

For any input data, we can add a modifier to GMT_IN to tell GMT what to do with the input memory. The modifier can be either GMT_IS_REFERENCE or GMT_IS_DUPLICATE.

  • GMT_IN|GMT_IS_DUPLICATE is the same as GMT_IN. In this mode, GMT will make a copy of the input memory, and do all operations (read, write, or free) to the copy only. It's the safest ways, but it requires more memory due to the duplications.
  • GMT_IN|GMT_IS_REFERENCE: In this mode, GMT uses the memory directly. If the module we call don't change/free the memory, then everything works. However, if a module changes the input memory, we may see crashes or random results.

Actually, when GMT_IN|GMT_IS_REFERENCE is used, GMT does more things for us. As I understand it, some GMT functions always changes the memory, thus, it's a wrong choice to use GMT_IN|GMT_IS_REFERENCE when calling these functions. GMT is very clever and changes GMT_IN|GMT_IS_REFERENCE to GMT_IN|GMT_IS_DUPLICATE to avoid the crashes. However, it's difficult to find all the places where GMT needs to do the change. That's why GMT_IN|GMT_IS_REFERENCE usually works, but sometime crashes.

@seisman
Copy link
Member Author

seisman commented Jul 12, 2020

Using GMT_IN or GMT_IN|GMT_IS_DUPLICATE is the safest and the recommended way for PyGMT, but using GMT_IN|GMT_IS_REFERENCE can help GMT find more bugs. I believe we want a more stable PyGMT.

@weiji14
Copy link
Member

weiji14 commented Jul 12, 2020

Using GMT_IN or GMT_IN|GMT_IS_DUPLICATE is the safest and the recommended way for PyGMT, but using GMT_IN|GMT_IS_REFERENCE can help GMT find more bugs. I believe we want a more stable PyGMT.

I don't know, I'd actually prefer to find bugs in GMT 😆. On a more serious note, if we're duplicating memory, then we might as well write the data to file and pass that into PyGMT. There should be a smarter way to handle this.

@seisman
Copy link
Member Author

seisman commented Jul 12, 2020

if we're duplicating memory, then we might as well write the data to file and pass that into PyGMT.

Sorry, I don't understand what you mean here. Could you explain it a little more?

@weiji14
Copy link
Member

weiji14 commented Jul 12, 2020

Sorry, what I mean is that one of PyGMT's core features is being able to pass in PyData data structures (e.g. numpy, pandas, xarray) directly into GMT, that is, without going through the pain of writing it to a .txt, .csv or .nc file. I know that GMT_IS_DUPLICATE isn't writing to file, but to memory/RAM, but it seems wasteful to have an intermediate copy, and it'll probably slow down PyGMT code quite significantly too.

@weiji14
Copy link
Member

weiji14 commented Jul 12, 2020

That, said, we were using GMT_IN (i.e. GMT_IS_DUPLICATE?) before for vectors, so maybe it's not that big of an issue. But I'm not sure if it's wise to duplicate xarray.DataArray grid files that might be more memory heavy.

@weiji14
Copy link
Member

weiji14 commented Jul 12, 2020

  • GMT_IN|GMT_IS_DUPLICATE is the same as GMT_IN. In this mode, GMT will make a copy of the input memory, and do all operations (read, write, or free) to the copy only. It's the safest ways, but it requires more memory due to the duplications.

Are you sure GMT_IN and GMT_IN|GMT_IS_DUPLICATE is the same? I tried changing this line for #476:

args = (family, geometry, "GMT_IN|GMT_IS_REFERENCE", gmt_grid)

When using GMT_IN, only test_grdimage_over_dateline fails (same as GMT_IS_REFERENCE).

If I switch to using GMT_IN|GMT_IS_DUPLICATE, I get pygmt.exceptions.GMTCLibError: Failed to create a virtual file. on 15 tests

FAILED ../pygmt/clib/session.py::pygmt.clib.session.Session
FAILED ../pygmt/clib/session.py::pygmt.clib.session.Session.virtualfile_from_grid
FAILED ../pygmt/tests/test_grdimage.py::test_grdimage - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_grdimage.py::test_grdimage_slice - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_grdimage.py::test_grdimage_over_dateline - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_grdinfo.py::test_grdinfo - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_grdview.py::test_grdview_wrong_kind_of_drapegrid - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_to_plot_grid - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_truncated_to_zlow_zhigh - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_truncated_at_zlow_only - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_truncated_at_zhigh_only - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_reverse_color_only - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_reverse_zsign_only - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_reverse_color_and_zsign - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.
FAILED ../pygmt/tests/test_makecpt.py::test_makecpt_continuous - pygmt.exceptions.GMTCLibError: Failed to create a virtual file.

@seisman
Copy link
Member Author

seisman commented Jul 14, 2020

Are you sure GMT_IN and GMT_IN|GMT_IS_DUPLICATE is the same?

That's based on my understanding of What Paul said in GenericMappingTools/gmt#3528.

@weiji14
Copy link
Member

weiji14 commented Jul 16, 2020

The examples get weirder and weirder, from black to gray to white! 🤣

image image image
b64f371 72f1c24 c54d54d

@seisman
Copy link
Member Author

seisman commented Feb 11, 2021

/test-gmt-dev

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
longterm Long standing issues that need to be resolved
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants