How to resolve flaky tests resulting from using a single GMT session #1242

weiji14 · 2021-04-27T01:44:27Z

Description of the problem

There have been instances of flaky tests in PyGMT's test suite, as reported in #1217 (comment). This likely stems from the fact that PyGMT uses a single GMT session (initiated during import pygmt) instead of separate GMT sessions for each figure (see #327 (comment)).

@meghanrjones asked in #1217 (comment) whether we should stick with a single GMT session or use independent sessions per figure:

I understand the original logic behind a single GMT session for all tests in #327 (comment). Still, I don't expect that users will be attempting to use the entire PyGMT library in a single session, which is the goal of the test suite. So I think it would be worth revisiting this decision. Could it be possible to periodically test the examples/tutorials against baseline images to ensure that producing multiple plots in a single session is consistent and have the unit tests each use individual sessions?
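To make the trade-off concrete, here is a toy model in plain Python of what "individual sessions per unit test" would buy. It is only a sketch and not PyGMT code: ToySession, isolated_session, make_figure, check_default_cpt and the cpt attribute are hypothetical names standing in for whatever state a real GMT session carries between figures.

```python
from contextlib import contextmanager


class ToySession:
    """Hypothetical stand-in for a GMT session; it only holds mutable state."""

    def __init__(self):
        self.cpt = None    # pretend "current colour palette"
        self.figures = []  # figures created in this session


# Shared session, analogous to one created once at import time.
GLOBAL_SESSION = ToySession()


@contextmanager
def isolated_session():
    """Yield a brand-new session and wipe it when the block exits."""
    session = ToySession()
    try:
        yield session
    finally:
        session.figures.clear()
        session.cpt = None


def make_figure(session, name):
    """Pretend to plot: registers a figure and sets a CPT as a side effect."""
    session.figures.append(name)
    session.cpt = f"cpt-for-{name}"


def check_default_cpt(session):
    """A 'test' that assumes no CPT has been set yet."""
    return session.cpt is None


# Shared session: state from an earlier figure leaks into the later check.
make_figure(GLOBAL_SESSION, "docstring-example")
shared_result = check_default_cpt(GLOBAL_SESSION)

# Isolated sessions: each unit of work starts from a clean slate.
with isolated_session() as s1:
    make_figure(s1, "docstring-example")
with isolated_session() as s2:
    isolated_result = check_default_cpt(s2)

print(shared_result, isolated_result)
```

With the shared session the check depends on what ran before it, while the isolated variant gives the same answer in any order. The cost is that isolated sessions would no longer exercise the "many plots in one session" path, which is what the periodic examples/tutorials run suggested above would need to cover.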

Full code that generated the error

Flaky tests are hard to reproduce (that is their definition, actually), but in PyGMT's case they show up when, for example, a single test that passes with pytest pygmt/tests/test_somemodule.py fails when run using make test, or vice versa.

For example, as reported by @meghanrjones in #1217 (comment):

edit: I have not yet been able to figure out a solution. The two makecpt tests fail if there is a docstring example that imports pygmt and instantiates a figure (e.g., extract_region() in pygmt/clib/session.py and pygmt/src/grdfilter.py) and is tested before pygmt/tests/test_makecpt.py.
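When a test only fails if something else ran before it, one way to narrow down the culprit is to bisect the list of earlier tests. The sketch below assumes exactly one earlier test is responsible and that it is sufficient on its own. The names find_polluter and fails_after are hypothetical, and the file names and the fake runner are invented purely for demonstration.

```python
def find_polluter(earlier_tests, target, fails_after):
    """Return the one earlier test that makes `target` fail, or None.

    `fails_after(prefix, target)` must return True when running the tests in
    `prefix` followed by `target` makes `target` fail. Assumes a single
    polluter that triggers the failure by itself.
    """
    candidates = list(earlier_tests)
    if not candidates or not fails_after(candidates, target):
        return None
    # Halve the candidate list until only the polluter remains.
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        candidates = first if fails_after(first, target) else second
    return candidates[0]


# Demonstration with a fake runner (invented rule, not a real finding).
POLLUTER = "docs/example_b.py"


def fake_fails_after(prefix, target):
    """Pretend the target fails whenever POLLUTER ran before it."""
    return POLLUTER in prefix


order = [
    "docs/example_a.py",
    "docs/example_b.py",
    "tests/test_c.py",
    "tests/test_d.py",
    "tests/test_e.py",
]
print(find_polluter(order, "tests/test_target.py", fake_fails_after))
```

In a real run, fails_after could shell out to pytest with the prefix followed by the target and check the exit code. That relies on the tests being executed in the order given, which should be verified for the setup in question.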

Related issues affected by having a single GMT session:

System information

Please paste the output of python -c "import pygmt; pygmt.show_versions()":

PyGMT information:
  version: v0.3.2.dev117+g7466dc31
System information:
  python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37)  [GCC 9.3.0]
  executable: ~/username/miniconda3/envs/pygmt/bin/python
  machine: Linux-5.4.0-72-generic-x86_64-with-debian-bullseye-sid
Dependency information:
  numpy: 1.17.1
  pandas: 1.2.3
  xarray: 0.17.0
  netCDF4: 1.5.6
  packaging: 20.9
  ghostscript: 9.53.3
  gmt: 6.2.0rc1
GMT library information:
  binary dir: ~/username/miniconda3/envs/pygmt/bin
  cores: 6
  grid layout: rows
  library path: ~/username/miniconda3/envs/pygmt/lib/libgmt.so
  padding: 2
  plugin dir: ~/username/miniconda3/envs/pygmt/lib/gmt/plugins
  share dir: ~/username/miniconda3/envs/pygmt/share/gmt
  version: 6.2.0rc1


weiji14 · 2021-04-30T23:37:14Z

Ok, the flakiness appears to have been an upstream GMT issue that was fixed in GenericMappingTools/gmt#3344. There are some tests that are wrong but currently passing (i.e. false positives) identified in #1217 (comment) and #1217 (comment) that need to be updated once we bump to GMT 6.2.0rc2.

pygmt/tests/test_subplot.py: Update fig.basemap, fig.subplot and fig.wiggle baseline images for GMT 6.2.0rc2 #1291
pygmt/tests/test_text.py: Update fig.text baseline images for GMT 6.2.0rc2 #1292
pygmt/tests/test_wiggle.py: Update fig.basemap, fig.subplot and fig.wiggle baseline images for GMT 6.2.0rc2 #1291

maxrjones · 2021-05-29T22:03:05Z

The past few flaky tests revealing GMT bugs have convinced me of the usefulness of the current structure, even though it would be nice to have the option to run tests in parallel.

maxrjones · 2021-10-28T15:24:09Z

We seem to be semi-regularly getting failures on windows-latest - Python 3.7 / NumPy 1.18 with:
..\tests\test_sph2grd.py::test_sph2grd_outgrid FAILED [ 87%]
..\tests\test_sph2grd.py::test_sph2grd_no_outgrid FAILED [ 87%]
due to issues with the remote file.

weiji14 · 2021-10-28T22:54:39Z

We seem to be semi-regularly getting failures on windows-latest - Python 3.7 / NumPy 1.18 with: ..\tests\test_sph2grd.py::test_sph2grd_outgrid FAILED [ 87%] ..\tests\test_sph2grd.py::test_sph2grd_no_outgrid FAILED [ 87%] due to issues with the remote file.

Yes, this has been popping up recently, but I don't think it is related to flakiness from a single GMT session, since the error is Error: [ERROR]: Libcurl Error: Timeout was reached. Maybe open a separate issue for this.

weiji14 added the bug and question labels Apr 27, 2021

This was referenced Apr 27, 2021

Bump to GMT 6.2.0rc1 #1217

Closed

Wrap velo #525

Merged

weiji14 mentioned this issue May 16, 2021

Wrap grdgradient #1269

Merged


This was referenced May 25, 2021

Bump to GMT 6.2.0rc2 #1289

Closed

Update fig.text baseline images for GMT 6.2.0rc2 #1292

Merged

weiji14 added the longterm label Jun 9, 2021

timrlaw mentioned this issue Oct 13, 2021

Strange behaviour when using Session.call_module(). #1582

Open

maxrjones mentioned this issue Nov 1, 2021

Windows CI tests regularly failing due to issues with cached remote files for sph2grd #1602

Closed

weiji14 modified the milestones: 0.6.0, 1.0.0 Nov 3, 2021

weiji14 mentioned this issue Dec 30, 2023

Improve performance by avoiding loading the GMT library repeatedly #2930

Merged
