Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG assigning file variable to field_tree in index #25

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

jsanchez-tiempo
Copy link

When _codes_index_add_file is invoked from codes_index_add_file, a new file
structure variable (newfile) is created and this variable is added to the
list of index files. However, later when the field_tree struct is
generated, the file variable obtained from file pool (grib_file_open) is assigned instead. The ID of file variable and newfile variable may not match. Later, this can produce errors when index file is read.

To reproduce an error:

Download and uncompress tar file examples.tar.gz:
tar xvfz examples.tar.gz

Compile both programs:

gcc -leccodes readindex.c -o readindex
gcc -leccodes writeindex.c -o writeindex

First execute:
./writeindex

Finally execute:
./readindex

@FussyDuck
Copy link

FussyDuck commented May 7, 2020

CLA assistant check
All committers have signed the CLA.

@shahramn
Copy link
Collaborator

shahramn commented Jul 18, 2020

Thank you for your contribution.
I tried your change locally and several tests failed:

        130 - eccodes_f_grib_index (Failed)
        160 - eccodes_p_grib_index_test (Failed)
        173 - eccodes_p_high_level_api_test (Failed)

For example the Fortran test examples/F90/grib_index.sh fails with

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0  0x7f393fbce94f in ???
#1  0x7f393fc159b4 in ???
#2  0x7f393fc0f3a9 in ???
#3  0x7f393fc04d5c in ???
#4  0x7f393fc0a924 in ???
#5  0x7f39420da584 in codes_index_get_handle
        at /tmp/eccodes_develop/src/grib_index.c:1666
#6  0x7f39420dbcc4 in codes_new_from_index
        at /tmp/eccodes_develop/src/grib_index.c:1926
#7  0x7f39420db3fd in grib_handle_new_from_index
        at /tmp/eccodes_develop/src/grib_index.c:1857
#8  0x7f394295e0be in grib_f_new_from_index_
        at /tmp/eccodes_develop/fortran/grib_fortran.c:1930
#9  0x7f39429727d2 in __grib_api_MOD_grib_new_from_index
        at /tmp/cmake_build/build-eccodes/fortran/grib_f90.f90:787
#10  0x7f394297cb49 in __eccodes_MOD_codes_new_from_index
        at /tmp/cmake_build/build-eccodes/fortran/eccodes_f90.f90:695
#11  0x401ca3 in index
        at /tmp/eccodes_develop/examples/F90/grib_index.f90:84
#12  0x402156 in main
        at /tmp/eccodes_develop/examples/F90/grib_index.f90:16
/tmp/eccodes_develop/examples/F90/grib_index.sh: line 14:
67663 Segmentation fault
(core dumped) ${examples_dir}/eccodes_f_grib_index > index_f90.out

And also I ran valgrind on the C example:

valgrind --log-file=vlog tests/grib_indexing data/index.grib

and many errors were issued. e.g.

==80468== Invalid read of size 4
==80468==    at 0x69F18AC: fseek (in /lib64/libc-2.22.so)
==80468==    by 0x53EF584: codes_index_get_handle (grib_index.c:1666)
==80468==    by 0x53F0CC4: codes_new_from_index (grib_index.c:1926)
==80468==    by 0x53F03FD: grib_handle_new_from_index (grib_index.c:1857)
==80468==    by 0x401A63: main (grib_indexing.c:120)
==80468==  Address 0x8964f90 is 0 bytes inside a block of size 552 free'd
==80468==    at 0x4C2A55B: free (vg_replace_malloc.c:540)
==80468==    by 0x69E99B8: fclose@@GLIBC_2.2.5 (in /lib64/libc-2.22.so)
==80468==    by 0x5482981: grib_file_close (grib_filepool.c:330)
==80468==    by 0x53EDCBF: _codes_index_add_file (grib_index.c:1253)
==80468==    by 0x53EC250: grib_index_add_file (grib_index.c:1064)
==80468==    by 0x4010F9: main (grib_indexing.c:45)
==80468==  Block was alloc'd at
==80468==    at 0x4C2932F: malloc (vg_replace_malloc.c:309)
==80468==    by 0x69EA1DC: __fopen_internal (in /lib64/libc-2.22.so)
==80468==    by 0x5482269: grib_file_open (grib_filepool.c:249)
==80468==    by 0x53EC3C1: _codes_index_add_file (grib_index.c:1099)
==80468==    by 0x53EC250: grib_index_add_file (grib_index.c:1064)
==80468==    by 0x4010F9: main (grib_indexing.c:45)

@shinji-s
Copy link
Contributor

shinji-s commented Aug 6, 2020

It appears to me that this PR is fine but triggers a latent bug in the underlying code set.
A file pointer gets duplicated by the assignment 'newfile->hanlde = file->handle' but I don't see any code that tries to avoid closing them individually. Once file->handle is closed, then newfile->handle is not valid any more. The file pointer should not be closed until all copies of the pointer are gone.

@shahramn
Copy link
Collaborator

shahramn commented Aug 6, 2020

It appears to me that this PR is fine but triggers a latent bug in the underlying code set.
A file pointer gets duplicated by the assignment 'newfile->hanlde = file->handle' but I don't see any code that tries to avoid closing them individually. Once file->handle is closed, then newfile->handle is not valid any more. The file pointer should not be closed until all copies of the pointer are gone.

Dear Shinji,
Do you have a fix in mind? we would welcome your valued contribution

@shinji-s
Copy link
Contributor

shinji-s commented Aug 6, 2020

I'm thiking about adding 'grib_file * dependents' member to the grib_file structure and queue 'newfile' in the chain so that grib_file_close_file (and grib_file_delete file too?) can avoid closing the file pointer as long as some 'dependents' exist. I will post a PR if I manage to cook a patch that sees ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants