This repository has been archived by the owner on Sep 1, 2022. It is now read-only.

Segfault on OS X when reading structure data #515

Open
cwardgar opened this issue Apr 5, 2016 · 1 comment

cwardgar commented Apr 5, 2016

On OS X, running the TestNc4IospWriting tests frequently (but not always) causes a segfault, e.g.

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000012de33890, pid=71436, tid=13315
#
# JRE version: Java(TM) SE Runtime Environment (8.0_65-b17) (build 1.8.0_65-b17)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.65-b01 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [libhdf5.10.dylib+0x90890]  H5FL_garbage_coll+0x120
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/cwardgar/dev/projects/thredds2/cdm-test/hs_err_pid71436.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/

The error always occurs in the HDF5 native library, but strangely, not always in the same stack frame. I've seen H5SL_release_common, H5FL_fac_term, H5FL_blk_gc, H5SL_remove_first, and H5FL_garbage_coll. There are probably others. I've attached the error report file, for completeness.

I narrowed the problem to the writeNetcdf4Compound() test, which reads in NetCDF-4 files with structure data, writes them out using FileWriter2, and then compares the old and new files. For the comparison, the old file is read using H5iosp (pure-Java) and the new file is read using Nc4Iosp (delegates to the native C lib).

For some reason, running just the TestNc4IospWriting.writeNetcdf4Compound() test by itself does not reliably reproduce the error; you have to run all the tests in TestNc4IospWriting. This suggests some sort of interference among the tests. I've been using the Gradle command:

./gradlew :cdm-test:cleanTest :cdm-test:test --tests *TestNc4IospWriting 1>out.txt 2>&1

In https://github.com/cwardgar/thredds/commit/de5de008d1df3c8bc213e4ea40ea32d450a3f5e0, I disabled comparison of values when testing on OS X, which eliminates the problematic native read. I have no idea how to debug this, and it looks like the problem is in hdf5 anyway. But why is it only showing up on Mac OS X?
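For readers who want the general shape of that workaround without opening the commit: the following is only a minimal sketch of a platform guard, not the code from the commit. The class name `OsGuard` and both method names are hypothetical, and it assumes that checking the standard `os.name` system property is an acceptable way to detect OS X.

```java
import java.util.Locale;

public class OsGuard {
    /** Returns true when the given os.name value looks like OS X / macOS. */
    static boolean isMacOsX(String osName) {
        if (osName == null) {
            return false;
        }
        String lower = osName.toLowerCase(Locale.ROOT);
        return lower.startsWith("mac") || lower.contains("darwin");
    }

    /**
     * Hypothetical guard: compare values only on platforms where the
     * native read has not been seen to crash.
     */
    static boolean shouldCompareValues() {
        return !isMacOsX(System.getProperty("os.name"));
    }

    public static void main(String[] args) {
        System.out.println("compare values: " + shouldCompareValues());
    }
}
```

A test could call `shouldCompareValues()` before the value comparison and fall back to a metadata-only comparison when it returns false. This only hides the crash; it does nothing about the underlying problem in the native library.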

cwardgar commented Apr 5, 2016

Here's the error report file.

hs_err_pid71436.txt
