Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap gmtselect #1429

Merged
merged 16 commits into from
Oct 29, 2021
Merged

Wrap gmtselect #1429

merged 16 commits into from
Oct 29, 2021

Conversation

weiji14
Copy link
Member

@weiji14 weiji14 commented Aug 10, 2021

Description of proposed changes

Wrapping the gmtselect function which "Selects data table subsets based on multiple spatial criteria". For a GMT example, see https://docs.generic-mapping-tools.org/6.2/gallery/ex24.html.

Preview docs at https://pygmt-git-wrap-gmtselect-gmt.vercel.app/api/generated/pygmt.select.html

Crossref GMT.jl docs at https://www.generic-mapping-tools.org/GMT.jl/v0.31/#GMT.gmtselect.

Parameters/Aliases to wrap:

  • A area_thresh
  • C dist2pt?
  • D resolution?
  • E boundary
  • F polygon
  • G gridmask
  • I reverse
  • L dist2line?
  • R region
  • N mask
  • Z z_subregion/zsubregion

To keep this PR small, I would prefer implementing only 4 out of 7 of the possible gmtselect filters, specifically -A (once #1426 is done), -R rectangular region subset, -D/-N coastline subset, and -Z z value data subset. These, plus the reverse (-I) and grid subset (-G) option. The others can be done in a follow-up PR.

Other implementation points to consider (may need to be done in other PRs):

Addresses first part of #1427.

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If adding new functionality, add an example to docstrings or tutorials.

Slash Commands

You can write slash commands (/command) in the first line of a comment to perform
specific operations. Supported slash commands are:

  • /format: automatically format and lint the code
  • /test-gmt-dev: run full tests on the latest GMT development version

@weiji14 weiji14 added the feature Brand new feature label Aug 10, 2021
@weiji14 weiji14 added this to the 0.5.0 milestone Aug 10, 2021
@weiji14 weiji14 self-assigned this Aug 10, 2021
Initial commit for wrapping the gmtselect function for #1427
which selects data table subsets based on multiple spatial
criteria. Original GMT `gmtselect` documentation is at
https://docs.generic-mapping-tools.org/6.2/gmtselect.html.
Aliased non-common optional parameters reverse (I) and
z_subregion (Z).
@weiji14 weiji14 requested a review from a team October 3, 2021 11:04
@willschlitzer
Copy link
Contributor

@weiji14 Is the intent just to have the output of this as the file or a DataFrame, or should their be an option for a numpy array as well?

@weiji14
Copy link
Member Author

weiji14 commented Oct 5, 2021

@weiji14 Is the intent just to have the output of this as the file or a DataFrame, or should their be an option for a numpy array as well?

Yeah, maybe just a file or pandas.DataFrame for now, until someone makes a reusable function to handle #1318.

Comment on lines +63 to +64
Pass in either a file name to an ASCII data table, a 2D
{table-classes}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Pass in either a file name to an ASCII data table, a 2D
{table-classes}.
Pass in either a file name to an ASCII data table or a 2D
{table-classes}.

I"m not familiar with using gmtselect, but can't it accept 3D tables (as indicated by z_subregion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring gets rendered like so:

image

So 2D numpy.array is correct. I don't think it's common to speak of a 3D table (though I did find https://stackoverflow.com/questions/24290495/constructing-3d-pandas-dataframe), but I know what you mean by having a z-column.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like "2D or 3D {table-classes}"? I understand that 3D tables aren't the norm, but it doesn't make sense to have a parameter specifically for a 3D table but then say that gmtselect only accepts a 2D table.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I get when I search for 3D data table:

image

If we are talking about a 3D numpy array, the shape will be something like (1, 2, 3), but in this case, our input cannot be like that. It can only be a 2D shape like (2, 3). The extra z-column isn't considered an extra dimension, more like an extra attribute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay! That makes sense to me; I was thinking a 3D table meant one with data for the third dimension.

{A}
reverse : str
[**cflrsz**].
Reverses the sense of the test for each of the criteria specified:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Reverses the sense of the test for each of the criteria specified:
Reverses the criteria of the test for each of the criteria specified:

The use of "sense" doesn't make much sense to me. I know it's from the GMT docs, but would there be a better way to phrase this?

Copy link
Member Author

@weiji14 weiji14 Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it doesn't sound nice, but I think the following statements ("select records NOT inside/within ...") should (hopefully) make it clear to users what is meant by reverse. So maybe just keep it as with the GMT docs?

Copy link
Contributor

@willschlitzer willschlitzer Oct 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's easier to keep it in line with the GMT docs, I'm fine with it. But I might open a PR over at GMT; I don't think the current version makes much sense.

@willschlitzer
Copy link
Contributor

@weiji14 Are you intending to get this pull request complete for v0.5.0?

@weiji14 weiji14 marked this pull request as ready for review October 25, 2021 22:25
@weiji14 weiji14 mentioned this pull request Oct 25, 2021
35 tasks
@weiji14
Copy link
Member Author

weiji14 commented Oct 25, 2021

@weiji14 Are you intending to get this pull request complete for v0.5.0?

Yes, I've added this PR to the priority list at #1576 (comment).

pygmt/src/select.py Outdated Show resolved Hide resolved
Specifically, resolution (D), gridmask (G) and mask (N).
These aliases are currently undocumented/disabled,
but will be implemented/enabled in the future.
pygmt/src/select.py Outdated Show resolved Hide resolved
@willschlitzer willschlitzer added the final review call This PR requires final review and approval from a second reviewer label Oct 28, 2021
@willschlitzer willschlitzer merged commit bc140cb into main Oct 29, 2021
@willschlitzer willschlitzer deleted the wrap/gmtselect branch October 29, 2021 04:34
Release v0.5.x automation moved this from In progress to Done Oct 29, 2021
@willschlitzer willschlitzer moved this from Done to In progress in Release v0.5.x Oct 29, 2021
@seisman seisman removed the final review call This PR requires final review and approval from a second reviewer label Oct 29, 2021
@weiji14 weiji14 moved this from In progress to Done in Release v0.5.x Oct 31, 2021
@seisman seisman mentioned this pull request Mar 9, 2022
sixy6e pushed a commit to sixy6e/pygmt that referenced this pull request Dec 21, 2022
* Wrap  the gmtselect function which selects data table subsets based on multiple spatial
criteria. Aliased non-common optional parameters reverse (I) and
z_subregion (Z), area_thresh (A), resolution (D), gridmask (G) and mask (N).
*Add tests for select
*Add imports for select

Co-authored-by: Will Schlitzer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Brand new feature
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants