Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Debugging PyGMT with GMT dev #4745

Closed
wants to merge 19 commits into from
Closed

WIP: Debugging PyGMT with GMT dev #4745

wants to merge 19 commits into from

Conversation

seisman
Copy link
Member

@seisman seisman commented Feb 4, 2021

Description of proposed changes

DO NOT MERGE

It seems that recent GMT changes break PyGMT on Windows (see GenericMappingTools/pygmt#801).

Some debugging efforts indicate the bug was introduced in these commits c94e83f...98bb060 (GenericMappingTools/pygmt#758 (comment)).

In this branch, we build different GMT versions, and run PyGMT tests, to find which commits introduced the bug.

This branch is for debugging only

c94e83f...98bb060

c94e83f works


$ git bisect start 98bb060828eeb6fe0e7f80d49223e45c567f0632 c94e83f3537c3a5ddadd3641d1b9aaec003b4bef
Bisecting: 58 revisions left to test after this (roughly 6 steps)
[c08e7146933519e5dc0e7d4814b1aa167f49a017] Fix typos

c08e714 works


git bisect good
Bisecting: 29 revisions left to test after this (roughly 5 steps)
[6d8d2f08bf16f4b030b35c487cc004ebe3f9ebc4] Allow Latex expressions in single-line titles and Cartesian axes labels (#4566)

6d8d2f0 works.


git bisect good
Bisecting: 14 revisions left to test after this (roughly 4 steps)
[8181c705379619379547b3dd55427eeab3b28c34] Make sure we free GMT items before memtrack_report is called (#4623)

8181c70 fails.


git bisect bad
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[8ddbb3fa923f3bebc192df5c3459bc65103f6d1d] Fix outline/filled font for quoted lines (straight) (#4610)

8ddbb3f works


git bisect good
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[b16cc28b4b62b22f6e6f45584d80347cec739f54] Enable writing of 3-D netCDF data cubes from greenspline and grdinterpolate (#4581)

b16cc28 fails.


git bisect bad
Bisecting: 1 revision left to test after this (roughly 1 step)
[74f61ab24c94eef4ccb445cfbcbe48b750d1fb42] Update gmt.conf.rst

74f61ab works.


git bisect good
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[bbe76012934d64655881d3cc0feaefbc41696f5d] Replace fancy+ by fancy-rounded (#4612)

bbe7601 works.

@seisman
Copy link
Member Author

seisman commented Feb 4, 2021

PyGMT works well with GMT bbe7601 but starts to fail with GMT b16cc28.

@seisman
Copy link
Member Author

seisman commented Feb 4, 2021

@PaulWessel PyGMT fails with recent GMT versions on Windows. Some debugging efforts find that commit b16cc28 (i.e., PR #4581) causes the failures.

A failing PyGMT test looks like this:

    with GMTTempFile(suffix=".nc") as tmpfile:
        result = grdcut("@earth_relief_01d", outgrid=tmpfile.name, region="0/180/0/90")
        assert result is None  # return value is None
        assert os.path.exists(path=tmpfile.name)  # check that outgrid exists
        result = grdinfo(tmpfile.name, C=True)
        assert result == "0 180 0 90 -8182 5651.5 1 1 180 90 1 1\n"

It's equivalent to the following script:

gmt grdcut @earth_relief_01d -R0/180/0/90 -Gtmp.grd
gmt grdinfo tmp.grd -C
rm tmp.grd

However, on Windows it gives the following error message:

E           PermissionError: [WinError 32] The process cannot access the file 
because it is being used by another process: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\pygmt-0t_f3bac.nc'

but it works on Linux and macOS.

Perhaps in #4581 you forgot to close the file? Could you review the changes in #4581 and see if you can find the bug?

@PaulWessel
Copy link
Member

PaulWessel commented Feb 4, 2021

Most (all) of that commit is related to the 3-D data cube, but your example is a regular 2-D grid. Which file do you suspect is not being closed? The tmp.grd output file?

@seisman
Copy link
Member Author

seisman commented Feb 4, 2021

The raw error message is:

The process cannot access the file because it is being used by another process: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\pygmt-0t_f3bac.nc'

in which C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\pygmt-0t_f3bac.nc is the temporary file, which PyGMT fails to delete.

I guess it means the temporary file is still used by GMT.

@joa-quim
Copy link
Member

joa-quim commented Feb 4, 2021

But it's PyGMT who writes that file, right? GMT does not write temporary files.

@seisman
Copy link
Member Author

seisman commented Feb 4, 2021

But it's PyGMT who writes that file, right? GMT does not write temporary files.

Yes, PyGMT writes that temporary file and then tries to delete it, then we have the permission error.

I should emphasize that I use the PyGMT master branch, and it fails with GMT b16cc28 but works with GMT bbe7601 (bbe7601 is the parent commit of b16cc28). So I'm afraid it must be changes in GMT that breaks.

@joa-quim
Copy link
Member

joa-quim commented Feb 4, 2021

I know it's not a solution to the root issue but, it's writing a temp file just to run grdinfo on it? Better to run a system call, no?

@seisman
Copy link
Member Author

seisman commented Feb 4, 2021

I know it's not a solution to the root issue but, it's writing a temp file just to run grdinfo on it? Better to run a system call, no?

We are avoiding system calls in PyGMT, and the PyGMT tests are testing different combinations of input/output data types (grid file, xarray array etc.), so it won't solve our problem.

@joa-quim
Copy link
Member

joa-quim commented Feb 4, 2021

Ok, that's a different discussion (as a user I would not like to lean that). But PyGMT always closes the session between GMT calls. Can you try to kill the session before attempting to delete the temp file? Because it's going to be very difficult to debug this from outside Windows.

@PaulWessel
Copy link
Member

I am deep in two-three branches right now, but once I come up for air I am willing to step through the debugger and confirm that the -Gtmp.grd file is indeed closed by the program. I cannot imagine that it would not do that. There must be something else subtle in that commit.

@joa-quim
Copy link
Member

joa-quim commented Feb 4, 2021

Yep. It's he netCDF lib who takes charge of closing the files it writes.

@PaulWessel
Copy link
Member

OK, I have stepped through the grdcut command and of course it closes the file at the end. Here we have completed that step in the debugger:

Untitled

There must be something else going on and if it is a Windows-only thing perhaps @joa-quim could try to reproduce it?

@joa-quim
Copy link
Member

joa-quim commented Feb 8, 2021

What about my suggestion to make sure the session is destroyed before deleting the temp file? After all PyGMT always destroys the sessions between commands.

@seisman
Copy link
Member Author

seisman commented Feb 8, 2021

After all PyGMT always destroys the sessions between commands.

As I understand it, PyGMT only uses one session for all commands.

@joa-quim
Copy link
Member

joa-quim commented Feb 8, 2021

That's what MEX and Julia do but from previous talks I have the impression that in PyGMT each command starts and ends a new session.

@PaulWessel
Copy link
Member

That was my understanding from what @leouieda explained to me: Each module is wrapped by a Create/Destroy session pair. There is no API void pointer carried around throughout the PyGMT session as a persistent variable, no?

@seisman
Copy link
Member Author

seisman commented Feb 8, 2021

That's what MEX and Julia do but from previous talks I have the impression that each command starts and ends a new session.

Just checked the codes. I think you're right.

@seisman
Copy link
Member Author

seisman commented Feb 8, 2021

That was my understanding from what @leouieda explained to me: Each module is wrapped by a Create/Destroy session pair. There is no API void pointer carried around throughout the PyGMT session as a persistent variable, no?

Yes.

@seisman
Copy link
Member Author

seisman commented Feb 8, 2021

What about my suggestion to make sure the session is destroyed before deleting the temp file?

I'm reading the codes. I think the session is destroyed first, and then PyGMT tries to delete the temp file.

@joa-quim
Copy link
Member

joa-quim commented Feb 8, 2021

Something must be holding a grip to the file. If it's not GMT, what else can it be? Can you try not to delete it with python and at the end delete it with a cmd command?

@seisman
Copy link
Member Author

seisman commented Feb 8, 2021

Something must be holding a grip to the file. If it's not GMT, what else can it be? Can you try not to delete it with python and at the end delete it with a cmd command?

Will try.

@seisman
Copy link
Member Author

seisman commented Feb 9, 2021

Here is a minimal example to reproduce the issue (using PyGMT latest + GMT dev). The equivalent bash command is gmt grdinfo ingrid.nc:

import os
import subprocess as sp
from pygmt import grdinfo

# Runing the following command in CLI to generate the sample grid
# gmt grdcut @earth_relief_01d -R0/180/0/90 -Gingrid.nc

result = grdinfo("ingrid.nc", verbose="d")
os.remove("ingrid.nc")
# sp.run("del ingrid.nc", shell=True)

Here I tried three different ways to delete the ingrid.nc file:

  1. os.remove("ingrid.nc"): fails
  2. sp.run("del ingrid.nc", shell=True): calling Windows del command, still fails
  3. Run del ingrid.nc after the script finishes: success.

The variable result is correct, so grdinfo successfully reads the grid. Is it possible that it doesn't close the file after reading?

Here are the debugging messages:

grdinfo [DEBUG]: Found readable file ingrid.nc
grdinfo [DEBUG]: Replace file ingrid.nc with path ingrid.nc
grdinfo [DEBUG]: Replace file ingrid.nc with ingrid.nc
grdinfo [DEBUG]: Will write output record(s) with 0 leading numerical columns and with trailing text
grdinfo [DEBUG]: Number of numerical output columns has been set to 0
grdinfo [DEBUG]: gmtapi_init_export: Passed family = Data Table and geometry = Non-Geographical
grdinfo [DEBUG]: Object ID 0 : Registered Data Table File C:\Users\seisman\AppData\Local\Temp\pygmt-tvfw18hx.txt as an Output resource with geometry Non-Geographical [n_objects = 1]
grdinfo [DEBUG]: gmtapi_init_export: Added 1 new destination
grdinfo [DEBUG]: GMT_Init_IO: Returned first Output object ID = 0
grdinfo [DEBUG]: GMT_Begin_IO: Initialize record-by-record access for Output
grdinfo [DEBUG]: gmtapi_next_io_source: Selected object 0
grdinfo [INFORMATION]: Writing Data Table to file C:\Users\seisman\AppData\Local\Temp\pygmt-tvfw18hx.txt
grdinfo [DEBUG]: GMT_Begin_IO: Output resource access is now enabled [record-by-record]
grdinfo [DEBUG]: Found readable file ingrid.nc
grdinfo [DEBUG]: Replace file ingrid.nc with path ingrid.nc
grdinfo [DEBUG]: Found readable file ingrid.nc
grdinfo [DEBUG]: Replace file ingrid.nc with path ingrid.nc
grdinfo [DEBUG]: Found readable file ingrid.nc
grdinfo [DEBUG]: Object ID 1 : Registered Grid File ingrid.nc as an Input resource with geometry Surface [n_objects = 2]
grdinfo [DEBUG]: gmtapi_begin_io: Input resource access is now enabled [container]
grdinfo [DEBUG]: gmtapi_import_grid: Passed ID = 1 and mode = 1
grdinfo [DEBUG]: Found readable file ingrid.nc
grdinfo [DEBUG]: Call gmtgrdio_doctor_geo_increments on a geographic grid
grdinfo [DEBUG]: Geographic input grid, longitudes span less than 360
grdinfo [DEBUG]: GMT_End_IO: Input resource access is now disabled
grdinfo [INFORMATION]: Processing grid ingrid.nc
grdinfo [DEBUG]: GMT_Destroy_Data: freed memory for a Grid for object 1
grdinfo [DEBUG]: gmtlib_unregister_io: Unregistering object no 1 [n_objects = 1]
grdinfo [DEBUG]: gmtlib_unregister_io: Object no 1 has non-NULL resource pointer
grdinfo [DEBUG]: GMT_End_IO: Output resource access is now disabled
grdinfo [DEBUG]: gmtlib_unregister_io: Unregistering object no 0 [n_objects = 0]
Traceback (most recent call last):
  File "C:\Users\seisman\Desktop\test.py", line 9, in <module>
    os.remove("ingrid.nc")
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'ingrid.nc'

@PaulWessel
Copy link
Member

I will see what grdinfo does in debug, first in Xcode from command line, then later via PyGMT. But hungry right now...

@PaulWessel
Copy link
Member

I stepped through grdinfo ingrid.nc in Xcode and it definitively calls nc_close at the end. I do not know what might be holding onto it in python.

@joa-quim
Copy link
Member

joa-quim commented Feb 9, 2021

I can reproduce it in Julia. If I just do

julia> result = grdinfo("ingrid.nc");

and try to delete the file from windows manager, I get a pop up window saying
image

@joa-quim
Copy link
Member

joa-quim commented Feb 9, 2021

And the issue is netCDF related (not only grdinfo, grdimage behaves the same). If I do the same but with a surfer grid then I can delete the grid.

More, this blocks the grid too but now is GDAL who does the job. We must be opening it somewhere else and not closing.

imshow("ingrid.nc=gd")

@seisman
Copy link
Member Author

seisman commented Feb 9, 2021

Good to know that you can reproduce it! b16cc28 is the commit to blame.

@PaulWessel
Copy link
Member

I acknowledge there is a problem, but not clear if we are not closing the file or not. My tests so far shows we are closing it when grdinfo has received the header. I am thinking of replacing nc_open and nc_close with gmt_nc_open and gmt_nc_close so I can issue a debug message and we can at least know for sure when these functions are called. OK with that?

@joa-quim
Copy link
Member

joa-quim commented Feb 9, 2021

OK. But dis you see my other message saying that imshow("ingrid.nc=gd") when reading by GDAL the problem also happens? It should not go through gmt_nc in this case.

@PaulWessel
Copy link
Member

Right... so that would argue that netCDF might cause this. @seisman, any releases of netCDF/HDF that coincides with our commit? Does building with an older netcdf library show the same problem?

@joa-quim
Copy link
Member

joa-quim commented Feb 9, 2021

I didn't change my netcdf lib, so can't be that either.

And to tell the truth, I've been witnessing this type of behavior for a long time in Mirone run from Matlab. Time to time I realized that I could not delete a nc file that was created by Mirone. Bu in Mirone I create nc with own code, or with GMT and never investigated the subject.

@PaulWessel
Copy link
Member

Using gmt_nc_*, here is grdinfo:

(base) pwessel@macnut:~-> gmt grdinfo t.nc 
grdinfo [NOTICE]: Calling nc_open on t.nc, ncid = 65536, err = 0
grdinfo [NOTICE]: Calling nc_open on t.nc, ncid = 131072, err = 0
grdinfo [NOTICE]: Calling nc_close on ncid 131072, err = 0
grdinfo [NOTICE]: Calling nc_open on t.nc, ncid = 131072, err = 0
grdinfo [NOTICE]: Calling nc_close on ncid 131072, err = 0
t.nc: Title: Produced by grdcut
t.nc: Command: grdcut @earth_relief_01d_p -R0/180/0/90 -Gt.nc
t.nc: Remark: Obtained by Gaussian Cartesian filtering (111.2 km fullwidth) from SRTM15+V2.1.nc [Tozer et al., 2019; http:https://dx.doi.org/10.1029/2019EA000658]
t.nc: Pixel node registration used [Geographic grid]
t.nc: Grid file format: nf = GMT netCDF format (32-bit float), CF-1.7
t.nc: x_min: 0 x_max: 180 x_inc: 1 name: longitude n_columns: 180
t.nc: y_min: 0 y_max: 90 y_inc: 1 name: latitude n_rows: 90
t.nc: v_min: -8182 v_max: 5651.5 name: elevation (m)
t.nc: scale_factor: 1 add_offset: 0
t.nc: format: classic

I am seeing three opens and two close... think we got our boy.

PaulWessel added a commit that referenced this pull request Feb 9, 2021
* Ensure we close file in gmt_nc_is_cube

Closes #4745.

* Update gmt_nc.c
@seisman seisman deleted the debug-pygmt branch February 9, 2021 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants