How to automate copying of GMT documentation to PyGMT #895

Open
weiji14 opened this issue Feb 14, 2021 · 9 comments
Labels
help wanted Helping hands are appreciated question Further information is requested

Comments

@weiji14
Member

weiji14 commented Feb 14, 2021

The time we spend on writing and reviewing documentation for PyGMT is getting insane. Specifically, I'm talking about the docstrings in modules (not the tutorials/gallery examples). These are some of the formatting fixes we apply:

  1. Turn short aliases to long aliases (Turn all short aliases into long form #474)
  2. Check that italics/bold/code-block formatting is applied correctly (How to reproduce original GMT arguments in PyGMT documentation #631)
  3. Wrap to 79 characters (Wrap docstrings to 79 chars and check with flake8 #384)

And there are perhaps more 'standards' to be applied soon (#884, #886, etc).

There should be an automated or semi-automated script to just copy things from the canonical GMT function docstring, apply the above formatting standards, and paste it to the PyGMT function. Say when we want to:

  1. wrap a new module
  2. keep up to date with a new GMT release

This would likely require some coordination with upstream GMT. I think there are smart people who might have some ideas around this.
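
A minimal sketch of what such a semi-automated script could look like, covering only fixes 1 and 3 from the list above. The alias table, the function names, and the assumption that GMT flags appear as bold `**-X**` in the source text are illustrative only, not existing PyGMT tooling.

```python
import re
import textwrap

# Illustrative short-to-long alias table; in practice each module has its own.
ALIASES = {"R": "region", "J": "projection", "B": "frame"}


def to_long_aliases(text, aliases):
    """Replace bold GMT flags such as **-R** with bold long names such as **region**."""

    def swap(match):
        name = aliases.get(match.group(1))
        # Leave flags without a known alias untouched so they can be reviewed by hand.
        return f"**{name}**" if name else match.group(0)

    return re.sub(r"\*\*-([A-Za-z])\*\*", swap, text)


def wrap_docstring(text, width=79):
    """Re-wrap each blank-line-separated paragraph to at most `width` characters."""
    paragraphs = text.split("\n\n")
    return "\n\n".join(
        textwrap.fill(" ".join(p.split()), width=width) for p in paragraphs
    )
```

Note that this naive wrapper flattens indentation, so lists and directive blocks would still need manual handling.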

Are you willing to help implement and maintain this feature? Yes, but need teamwork

@weiji14 weiji14 added question Further information is requested help wanted Helping hands are appreciated labels Feb 14, 2021
@liamtoney
Member

We should be aware of cases where it doesn't quite make sense to take the GMT docs without appending extra info. For example, the docs for the region arg on master currently show:

[Screenshot: documentation for the region arg on master]

This is confusing since, for almost all use cases, we recommend passing a list of the form [xmin, xmax, ymin, ymax] rather than the slash-separated string syntax from GMT. But the script could account for these types of args!
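
To illustrate the difference between the two forms, here is a small hypothetical helper that turns the GMT slash string into the recommended list. It assumes a plain four-number region; other forms that GMT accepts (region codes, modifiers, date-time bounds) are deliberately not handled.

```python
def region_to_list(region):
    """Convert 'xmin/xmax/ymin/ymax' into [xmin, xmax, ymin, ymax] as floats."""
    parts = region.split("/")
    if len(parts) != 4:
        raise ValueError(f"expected four slash-separated values, got {region!r}")
    return [float(value) for value in parts]
```

A copying script could apply something like this to the examples it finds in the GMT text, and flag anything that raises for manual review.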

@maxrjones
Member

Related to both this and #1042, I am interested in whether there is a way to use some NLP tricks to identify shared non-common options between different GMT modules. This would be helpful for ensuring that we aren't redundantly creating PyGMT documentation when helpers could be used and would generally make it easier to develop aliases in a consistent manner.
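
As a very crude stand-in for the NLP idea, plain string similarity from the standard library can already surface options whose descriptions are nearly identical across modules. The input format (a dict keyed by module and flag) and the threshold are assumptions made for this sketch.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_options(docs, threshold=0.8):
    """Return pairs of options from different modules with similar descriptions.

    `docs` maps (module, flag) tuples to the description text of that option.
    """
    pairs = []
    for (key_a, text_a), (key_b, text_b) in combinations(docs.items(), 2):
        if key_a[0] == key_b[0]:
            continue  # only compare options across different modules
        ratio = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((key_a, key_b, round(ratio, 2)))
    return pairs
```

Pairs reported this way would be candidates for a shared helper or a shared docstring snippet, subject to human review.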

@maxrjones
Member

Even though it's a lot of work to write/review the docstrings, I prefer that for now because there are a lot of pending improvements to the GMT documentation (see GenericMappingTools/gmt#4678 for some examples). So, copying the GMT documentation as is may create more work later on by propagating some of the current limitations.

I think a compromise would be to automatically create a link to the relevant option in GMT for each PyGMT option. The easiest place to do this would be in fmt_docstring(). I think this would be more useful than the current format of linking to the GMT documentation using "Full option list at <gmt-link>", since the use of single-character options is undocumented (#1203 (comment)) and will eventually be deprecated (#262). Each of the single letters in the alias list (example shown below) could link to the related option in the GMT documentation (e.g., https://docs.generic-mapping-tools.org/latest/basemap.html#l). Even after the use of single-character options is deprecated, I would prefer that a table relating PyGMT options to their GMT counterparts remains in the docstrings for users with a background in GMT.
[Image: alias list from a PyGMT docstring]
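
A rough sketch of how the per-option links could be generated. The URL pattern is inferred from the single basemap example above (module page plus the lowercased flag as anchor), so it is an assumption and would break for flags that differ only in case. The function names are hypothetical and this is not how fmt_docstring() currently works.

```python
GMT_DOCS = "https://docs.generic-mapping-tools.org/latest"


def gmt_option_link(module, flag):
    """Build the URL of a GMT option, assuming the anchor is the lowercased flag."""
    return f"{GMT_DOCS}/{module}.html#{flag.lower()}"


def alias_line(module, flag, long_name):
    """Format one alias entry as reStructuredText with the flag linking to GMT."""
    return f"- `{flag} <{gmt_option_link(module, flag)}>`__ = {long_name}"
```

For example, `alias_line("basemap", "L", "map_scale")` yields an entry whose `L` links to the basemap page anchor shown above.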

In addition, we could create more automated checks rather than automated copying. For example, we could check that parameters are formatted as code, which I think is one of the easier things to miss when writing the docstrings.
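
One possible shape for such a check, as a sketch only: it reports every line where a parameter name appears as a bare word rather than wrapped in backticks, skipping the numpydoc definition lines (`name : type`). The function name and the numpydoc layout are assumptions.

```python
import inspect
import re


def unformatted_params(func):
    """Return (line number, name) pairs where a parameter name is not in backticks."""
    doc = inspect.getdoc(func) or ""
    names = list(inspect.signature(func).parameters)
    hits = []
    for lineno, line in enumerate(doc.splitlines(), start=1):
        for name in names:
            if re.match(rf"\s*{re.escape(name)}\s*:", line):
                continue  # the definition line itself is not code-formatted
            # A bare occurrence: not touching a backtick or another word character.
            if re.search(rf"(?<![`\w]){re.escape(name)}(?![`\w])", line):
                hits.append((lineno, name))
    return hits


def demo(region, frame):
    """
    Plot something.

    Parameters
    ----------
    region : list
        Bounds of the plot.
    frame : bool
        Add a border; ignored if region is unset. See ``region``.
    """
```

Calling `unformatted_params(demo)` flags the bare "region" on the last docstring line. Parameter names that are also ordinary English words (such as "frame") would produce false positives, so the output is better treated as a review hint than as a hard failure.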

@seisman
Member

seisman commented Apr 12, 2021

> I think a compromise would be to automatically create a link to the relevant option in GMT for each PyGMT option. The easiest place to do this would be in fmt_docstring().

Sounds like a good idea.

BTW, I feel that the long list of aliases wastes a lot of space on the right side. It may be better to use the .. hlist directive instead (see https://sphinx-rtd-theme.readthedocs.io/en/stable/demo/lists_tables.html#hlists for what it looks like).
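
For illustration, a small sketch of how the alias list could be emitted as an hlist. The function name and the `short = long` entry format are assumptions; `.. hlist::` and its `:columns:` option are standard Sphinx.

```python
def aliases_as_hlist(aliases, columns=3):
    """Render a {short: long} alias mapping as a Sphinx hlist directive."""
    lines = [".. hlist::", f"   :columns: {columns}", ""]
    lines += [f"   - {short} = {long}" for short, long in sorted(aliases.items())]
    return "\n".join(lines)
```

The returned string could then be spliced into the docstring wherever the alias list is currently generated.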

@weiji14
Member Author

weiji14 commented Apr 12, 2021

> Related to both this and #1042, I am interested in whether there is a way to use some NLP tricks to identify shared non-common options between different GMT modules. This would be helpful for ensuring that we aren't redundantly creating PyGMT documentation when helpers could be used and would generally make it easier to develop aliases in a consistent manner.

One of the ideas I had when I opened this issue (that requires a lot of work) was to have upstream GMT use standardized placeholders for parameters/arguments, e.g. using < -X > or { -X } (as with Jinja). My thinking was that this problem isn't restricted to just PyGMT, but also GMT.jl and potential wrappers in the future (e.g. GMT for R?). I do realize this is a near impossible task, but just would like to put this idea out here so that people don't keep reinventing the wheel for each GMT wrapper.

@maxrjones
Member

> > Related to both this and #1042, I am interested in whether there is a way to use some NLP tricks to identify shared non-common options between different GMT modules. This would be helpful for ensuring that we aren't redundantly creating PyGMT documentation when helpers could be used and would generally make it easier to develop aliases in a consistent manner.

> One of the ideas I had when I opened this issue (that requires a lot of work) was to have upstream GMT use standardized placeholders for parameters/arguments, e.g. using < -X > or { -X } (as with Jinja). My thinking was that this problem isn't restricted to just PyGMT, but also GMT.jl and potential wrappers in the future (e.g. GMT for R?). I do realize this is a near impossible task, but just would like to put this idea out here so that people don't keep reinventing the wheel for each GMT wrapper.

I'm not sure that I completely understand how you're envisioning the standardized placeholders, if you don't mind explaining a bit more. In GenericMappingTools/gmt#4915 I tried to make it so that all modules that use -F to put a box behind something include the same source file for the documentation (and am now noticing that I could have done this better). Is this similar to what you think would be useful for the wrappers?

@weiji14
Member Author

weiji14 commented Apr 12, 2021

> I'm not sure that I completely understand how you're envisioning the standardized placeholders, if you don't mind explaining a bit more. In GenericMappingTools/gmt#4915 I tried to make it so that all modules that use -F to put a box behind something include the same source file for the documentation (and am now noticing that I could have done this better). Is this similar to what you think would be useful for the wrappers?

Sure. The idea is to make it easy to search and replace parameters (that should be bold) and arguments (that should be italics). To be honest, I don't have this well thought out, but am taking inspiration from templating engines like Jinja and Liquid.

Original text

**-F**\ [**l**\|\ **t**][**+c**\ *clearances*][**+g**\ *fill*][**+i**\ [[*gap*/]\ *pen*]][**+p**\ [*pen*]]\
[**+r**\ [*radius*]][**+s**\ [[*dx*/*dy*/][*shade*]]]

Template text

{{ -F | param }}\ [{{ l }}\|\ {{ t }}][{{ +c }}\ { clearances }][{{ +g }}\ { fill }][{{ +i }}\ [[{ gap }/]\ { pen }]][{{ +p }}\ [{ pen }]]\
[{{ +r }}\ [{ radius }]][{{ +s }}\ [[{ dx }/{ dy }/][{ shade }]]]

At its simplest, we could replace curly brackets '{' and '}' with '*' (double means bold **, single means italics *). The param filter in {{ -F | param }} would indicate that -F is a parameter that needs to be replaced with a long alias (box in this case).
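
The replacement described above can be sketched with two regular-expression passes. This is only an illustration of the idea: the template syntax is the informal one proposed in this comment, not real Jinja, and the `render` function and the alias table are hypothetical.

```python
import re


def render(template, aliases):
    """Turn {{ x }} into **x** and { x } into *x*; apply aliases where '| param' is used."""

    def bold(match):
        token, has_filter = match.group(1), match.group(2)
        if has_filter:
            # '{{ -F | param }}' becomes the long alias if one is known.
            token = aliases.get(token.lstrip("-"), token)
        return f"**{token}**"

    # Double braces first, so the single-brace pass cannot match inside them.
    text = re.sub(r"\{\{\s*(\S+?)\s*(\|\s*param\s*)?\}\}", bold, template)
    return re.sub(r"\{\s*(\S+?)\s*\}", r"*\1*", text)
```

With `aliases = {"F": "box"}`, the start of the template text above renders as `**box**\ [**l**\|\ **t**][**+c**\ *clearances*]`, which matches the original text except for the long alias.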

@seisman
Member

seisman commented Apr 13, 2021

I'm afraid any automatic documentation-copying mechanism means that we have to use the GMT syntax in PyGMT, e.g., +gred+p1p,blue. However, we may want more Pythonic alternatives (#1082). Anyway, I think this issue has the lowest priority due to its high difficulty.
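
To make the contrast concrete, here is a hypothetical helper that builds the GMT modifier string from keyword arguments. The function and its parameter names are invented for this sketch and do not reflect any design decided in #1082. The +g (fill) and +p (pen) modifiers are taken from the -F syntax quoted earlier in this thread.

```python
def build_modifiers(fill=None, pen=None):
    """Assemble a GMT modifier string such as '+gred+p1p,blue' from keyword arguments."""
    parts = []
    if fill is not None:
        parts.append(f"+g{fill}")
    if pen is not None:
        parts.append(f"+p{pen}")
    return "".join(parts)
```

Documentation for an interface like this could not be copied verbatim from GMT, which is the concern raised above.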

@maxrjones
Member

Just a cross-reference that we are considering ways to automate some of the GMT documentation during the addition of long-options at GenericMappingTools/gmt#5561, which could be useful for PyGMT in the future. At the least, it will probably be helpful to add somewhere in PyGMT's contributing/maintenance information where to find the unstable long-option names for core GMT (e.g. https://github.com/GenericMappingTools/gmt/blob/86b39ff1dcee9ffc1b232b8c21a21965dd38dca5/src/blockmean.c#L43-L53).
