Required patch in multi.jl causes a seg fault #3437

Closed
amitmurthy opened this issue Jun 18, 2013 · 1 comment

This required patch:

--- /tmp/ZAzh0S_multi.jl
+++ /home/amitm/Work/julia/julia/base/multi.jl
@@ -841,8 +841,7 @@
                 #TODO : Notify all RemoteRefs linked to this Worker who just died....
                 # How?

-                # FIXME: Without the below throw, the main process results in a segmentation fault.
-                throw("DisconnectedException")
+                return nothing
             else
                 # TODO : Treat any exception as death of node / major screw-up and cleanup?
                 rethrow(e)

causes a segmentation fault when running:

addprocs(5)
rmprocs(2,5,6)

valgrind output:

amitm@amitm-laptop:~/Work/julia/julia$ valgrind julia
==9045== Memcheck, a memory error detector
==9045== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==9045== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==9045== Command: julia
==9045== 
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.2.0-2060.r57bcb34b.dirty
 _/ |\__'_|_|_|\__'_|  |  Commit 57bcb34bb3 2013-06-18 09:31:31*
|__/                   |  x86_64-linux-gnu

julia> addprocs(5)
==9084== Warning: noted but unhandled ioctl 0x5450 with no size/direction hints
==9084==    This could cause spurious value errors to appear.
==9084==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==9085== Warning: noted but unhandled ioctl 0x5450 with no size/direction hints
==9085==    This could cause spurious value errors to appear.
==9085==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==9086== Warning: noted but unhandled ioctl 0x5450 with no size/direction hints
==9086==    This could cause spurious value errors to appear.
==9086==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==9087== Warning: noted but unhandled ioctl 0x5450 with no size/direction hints
==9087==    This could cause spurious value errors to appear.
==9087==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==9088== Warning: noted but unhandled ioctl 0x5450 with no size/direction hints
==9088==    This could cause spurious value errors to appear.
==9088==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
:ok

julia> rmprocs(2,5,6)

julia> ==9045== Syscall param msync(start) points to uninitialised byte(s)
==9045==    at 0x506B3DD: ??? (syscall-template.S:81)
==9045==    by 0x5B62BE2: msync_validate (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B62D0F: validate_mem (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B62E4C: access_mem (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B60B18: dwarf_get (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B60DD2: _ULx86_64_access_reg (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B6069E: _ULx86_64_get_reg (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B65881: apply_reg_state (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B65F70: _ULx86_64_dwarf_find_save_locs (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B6231F: _ULx86_64_dwarf_step (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5B60F89: _ULx86_64_step (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5353068: rec_backtrace (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==  Address 0x7fefff000 is on thread 1's stack
==9045== 
Worker 2 terminated.
Worker 6 terminated.
==9045== Invalid read of size 4
==9045==    at 0x53811D8: uv__io_active (core.c:728)
==9045==    by 0x538C5CD: uv_read_stop (stream.c:1346)
==9045==    by 0x409B946: ???
==9045==    by 0x531AD86: jl_apply_generic (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x409708E: ???
==9045==    by 0x5352EB6: start_task (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5352F8F: start_task (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5352FE7: jl_switch_stack (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5351D48: julia_trampoline (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x403A9C: main (in /home/amitm/Work/julia/julia/usr/bin/julia-release-readline)
==9045==  Address 0x98 is not stack'd, malloc'd or (recently) free'd
==9045== 
==9045== 
==9045== Process terminating with default action of signal 11 (SIGSEGV)
==9045==  Access not within mapped region at address 0x28
==9045==    at 0x53811D8: uv__io_active (core.c:728)
==9045==    by 0x538C5CD: uv_read_stop (stream.c:1346)
==9045==    by 0x409B946: ???
==9045==    by 0x531AD86: jl_apply_generic (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x409708E: ???
==9045==    by 0x5352EB6: start_task (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5352F8F: start_task (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5352FE7: jl_switch_stack (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x5351D48: julia_trampoline (in /home/amitm/Work/julia/julia/usr/lib/libjulia-release.so)
==9045==    by 0x403A9C: main (in /home/amitm/Work/julia/julia/usr/bin/julia-release-readline)
==9045==  If you believe this happened as a result of a stack
==9045==  overflow in your program's main thread (unlikely but
==9045==  possible), you can try to increase the size of the
==9045==  main thread stack using the --main-stacksize= flag.
==9045==  The main thread stack size used in this run was 8388608.
==9045== 
==9045== HEAP SUMMARY:
==9045==     in use at exit: 52,029,910 bytes in 101,328 blocks
==9045==   total heap usage: 2,166,045 allocs, 2,064,717 frees, 716,242,485 bytes allocated
==9045== 
==9045== LEAK SUMMARY:
==9045==    definitely lost: 10,432 bytes in 21 blocks
==9045==    indirectly lost: 4 bytes in 4 blocks
==9045==      possibly lost: 1,572,080 bytes in 12,402 blocks
==9045==    still reachable: 50,447,394 bytes in 88,901 blocks
==9045==         suppressed: 0 bytes in 0 blocks
==9045== Rerun with --leak-check=full to see details of leaked memory
==9045== 
==9045== For counts of detected and suppressed errors, rerun with: -v
==9045== Use --track-origins=yes to see where uninitialised values come from
==9045== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 2 from 2)
Killed
@amitmurthy

cc @JeffBezanson
