-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ccall test races with GC #12122
Comments
Pasting my gdb session for this issue from #12179: This is weird. (GDB session below, it's a concatenation of two sessions with address space randomization off so the addresses should be coherent but the gdb variable names might be off..). @carnaval Allocations: 1358172 (Pool: 1357200; Other: 972); GC: 1499
Hardware watchpoint 1: *(void**)0x7ffdf2dd8498
Old value = (void *) 0x7ffdf2dd84a8
New value = (void *) 0x0
__pool_alloc (end_offset=16360, osize=16, p=0x7ffff7dbf2f8 <norm_pools+24>)
at gc.c:1017
1017 p->nfree--;
(gdb) c
Continuing.
Hardware watchpoint 1: *(void**)0x7ffdf2dd8498
Old value = (void *) 0x0
New value = (void *) 0x7ffdf1364550
jl_box_int64 (x=553) at alloc.c:760
760 SIBOX_FUNC(int64, int64_t, 1)
(gdb) c
Continuing.
Hardware watchpoint 1: *(void**)0x7ffdf2dd8498
Old value = (void *) 0x7ffdf1364550
New value = (void *) 0x7ffdf2dd85a8
sweep_page (osize=16, sweep_mask=3, pfl=0x7ffdf2dd85a8, pg=0x7ffff137d3c0,
p=0x7ffff7dbf2f8 <norm_pools+24>) at gc.c:1240
1240 pfl_begin = pfl_begin ? pfl_begin : pfl;
(gdb) p jl_(0x7ffdf1364550)
Int64
$1 = void
(gdb) c
Continuing.
Hardware watchpoint 1: *(void**)0x7ffdf2dd8498
Old value = (void *) 0x7ffdf2dd85a8
New value = (void *) 0x7ffdf2dd84d8
sweep_page (osize=16, sweep_mask=3, pfl=0x7ffdf2dd84d8, pg=0x7ffff137d3c0,
p=0x7ffff7dbf2f8 <norm_pools+24>) at gc.c:1240
1240 pfl_begin = pfl_begin ? pfl_begin : pfl;
(gdb) c
Continuing.
GC error (probable corruption) :
<?:::cconvert>
Program received signal SIGABRT, Aborted.
0x00007ffff6441528 in raise () from /usr/lib/libc.so.6
(gdb) p jl_(0x7ffdf2dd85a8)
<?::svec(UInt64, UInt32)>
$2 = void
(gdb) up
#1 0x00007ffff644293a in abort () from /usr/lib/libc.so.6
(gdb) up
#2 0x00007ffff6b55db1 in push_root (v=v@entry=0x7ffdf2dd84a0, d=5, d@entry=4,
bits=0) at gc.c:1725
1725 abort();
(gdb) bt
#0 0x00007ffff6441528 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff644293a in abort () from /usr/lib/libc.so.6
#2 0x00007ffff6b55db1 in push_root (v=v@entry=0x7ffdf2dd84a0, d=5, d@entry=4,
bits=0) at gc.c:1725
#3 0x00007ffff6b5401a in gc_push_root (d=4, v=0x7ffdf2dd84a0) at gc.c:1441
#4 push_root (v=v@entry=0x7ffdf38f2590, d=4, d@entry=3, bits=3) at gc.c:1664
#5 0x00007ffff6b5388b in gc_push_root (d=3, v=0x7ffdf38f2590) at gc.c:1441
#6 push_root (v=v@entry=0x7ffdf34d5cf0, d=3, d@entry=2, bits=<optimized out>)
at gc.c:1714
#7 0x00007ffff6b5388b in gc_push_root (d=2, v=0x7ffdf34d5cf0) at gc.c:1441
#8 push_root (v=v@entry=0x7ffdf34d5e70, d=2, d@entry=1, bits=<optimized out>)
at gc.c:1714
#9 0x00007ffff6b56055 in gc_push_root (d=1, v=<optimized out>) at gc.c:1441
#10 gc_mark_stack (ta=ta@entry=0x7ffdf35007e0, s=0x7fffffffcb38, offset=0, d=1)
at gc.c:1478
#11 0x00007ffff6b5761f in gc_mark_task_stack (d=1, ta=0x7ffdf35007e0)
at gc.c:1540
#12 gc_mark_task (ta=ta@entry=0x3, d=d@entry=1) at gc.c:1558
#13 0x00007ffff6b54992 in push_root (v=0x3, d=1, d@entry=0,
bits=<optimized out>, bits@entry=1) at gc.c:1678
#14 0x00007ffff6b59608 in jl_gc_collect (full=<optimized out>) at gc.c:1993
#15 0x00007ffff6b5bec3 in jl_gc_alloc_2w () at gc.c:774
#16 0x00007ffff6b513e5 in newobj (nfields=2, type=0x7ffdf1364070)
at julia_internal.h:25
#17 jl_alloc_svec_uninit (n=1) at simplevector.c:56
#18 0x00007ffff6b51531 in jl_svec_append (a=0x7ffdf2716410, b=0x7ffdf1368010)
at simplevector.c:74
#19 0x00007ffff6b515ef in jl_svec_copy (a=<optimized out>) at simplevector.c:89
#20 0x00007ffff6ae65d0 in cache_method (mt=mt@entry=0x7ffdf2728790,
type=type@entry=0x7ffdf27251b0, method=0x7ffdf2716430, decl=0x7ffdf27251b0,
sparams=0x7ffdf1368010, isstaged=0 '\000') at gf.c:469
#21 0x00007ffff6ae7e54 in jl_mt_assoc_by_type (mt=mt@entry=0x7ffdf2728790,
tt=0x7ffdf27251b0, cache=cache@entry=1, inexact=inexact@entry=0) at gf.c:1027
#22 0x00007ffff6ae8542 in jl_apply_generic (F=0x7ffdf27163b0,
args=0x7fffffffcbb8, nargs=<optimized out>) at gf.c:1659
#23 0x00007ffdf123b3cc in julia_do_test_20851 (body=<optimized out>,
qex=<optimized out>) at test.jl:53
#24 0x00007ffff6ae848b in jl_apply (nargs=2, args=0x7fffffffcc98,
f=<optimized out>) at julia.h:1268
#25 jl_apply_generic (F=0x7ffdf2715f50, args=0x7fffffffcc98,
nargs=<optimized out>) at gf.c:1651
#26 0x00007ffff6b3a3a3 in jl_apply (nargs=2, args=0x7fffffffcc98,
f=0x7ffdf2715f50) at julia.h:1268
#27 do_call (f=f@entry=0x7ffdf2715f50, args=args@entry=0x7ffdf38f2328,
nargs=nargs@entry=2, eval0=eval0@entry=0x0, locals=locals@entry=0x0,
nl=nl@entry=0, ngensym=0) at interpreter.c:65
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) p jl_(0x7ffdf38f2590)
Array{Any, 1}[<?::<?:::cconvert>>, #<function>, 553]
$13 = void
(gdb) p jl_(0x7ffdf34d5cf0)
Expr(:comparison, <?::<?:::cconvert>>, #<function>, 553)::Any
$14 = void |
P.S. The command line I used was |
I've just reproduced the same error (with slightly different gc count because of the difference in configuration) on Edit: I'm not suggesting debugging it on a server though since it's not a good idea to turn off address space randomization there... |
FWIW, the |
Well, if I read the ccall test correctly this is supposed to happen right ? Couldn't we just manually root the IntLike instance by hoisting it into a variable ? |
But the |
Ahh, I see, so the echo function returns a pointer and that pointer is not valid anymore after |
yep, and even putting it in a variable is not technically safe if we become smarter about eliminating dead roots. At some point we may need a gc_root intrinsic in julia. It's not the case for now. |
Right ... I'm just creating a wrapper function that does the load for now and I realized this because it feels really wierd that putting it in a function can fix it. I'll leave a comment there to state this (which is because the function argument "arbitrarily" extended the lifetime of the object...)... |
agreed. one other idea i had for that was that any usage of a variable in value position (even if the compiler can eliminate any codegen from it) should suffice to ensure that the original object is still alive. this would allow you to place a statement that just lists the variables to keep alive at some point in the function and be certain that the gc-analysis algorithm will respect it. |
that would simplify things but it would remove latitude from the (future ?) optimizer. Or maybe I didn't understand your idea and you want to special case statements which consists only of a variable to mean that ? I think we might as well have some kind of intrinsic for that then |
agreed, it would be a bit more conservative (like stack scanning?) and may be better to just have the intrinsic from the start. |
I once saw this in MEMDEBUG mode. It depends on precise timing but it's pretty clear what's going on:
-4919131752989213765
is0xbbbbbbbbbbbbbbbb
, so theIntLike
value must have been freed before its contents could be loaded.The text was updated successfully, but these errors were encountered: