Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure jl_get_excstack can't read from concurrently running tasks #30899

Merged
merged 1 commit into from
Feb 5, 2019

Conversation

c42f
Copy link
Member

@c42f c42f commented Jan 30, 2019

A minor tweak extracted from #29901 to prevent users from accidentally calling catch_stack on a concurrently running task. I don't think this is something which makes sense to do and the current implementation would cause crashes once PARTR lands.

@c42f
Copy link
Member Author

c42f commented Jan 30, 2019

Looks like the FreeBSD build slave ran out of disk space.

@c42f c42f requested a review from vtjnash February 1, 2019 04:38
@c42f
Copy link
Member Author

c42f commented Feb 1, 2019

CI passes and this is a minor internal fix which forbids erroneous user code, so I'll merge this in a day or two if there's no objections.

src/stackwalk.c Show resolved Hide resolved
@c42f c42f force-pushed the cjf/disallow-concurrent-catch_stack branch from 9005be2 to 073a7d1 Compare February 3, 2019 01:24
This might not be a major problem right now, but could become a source
of crashes once PARTR lands.
@c42f c42f force-pushed the cjf/disallow-concurrent-catch_stack branch from 073a7d1 to bd78876 Compare February 5, 2019 04:47
@c42f
Copy link
Member Author

c42f commented Feb 5, 2019

OSX failure is #30949

@c42f c42f merged commit c396179 into master Feb 5, 2019
@c42f c42f deleted the cjf/disallow-concurrent-catch_stack branch February 5, 2019 23:51
@c42f c42f mentioned this pull request Feb 6, 2019
@Jutho
Copy link
Contributor

Jutho commented Apr 10, 2019

Could it be that this broke (testing of) functions that currently use @threads:

with export JULIA_NUM_THREADS = 4 (or larger than 1)

using Test, LinearAlgebra

function trmul!(a, b)
       Threads.@threads for i = 1:length(a)
           a[i] *= b
       end
       return a
end

a = randn(Float64, 1000);
b = randn(Float64);
a2 = copy(a);

trmul!(a2, b);
rmul!(a, b);
a2 == a # true
trmul!(a2, b) == rmul!(a, b) # sometimes works, returning true, sometimes hangs
@test rmul!(copy(a), b) == trmul!(copy(a), b)  # always hangs

and upon CTRL + C or any other keystroke:

fatal: error thrown and no exception handler available.
ErrorException("concurrency violation detected")
rec_backtrace at /usr/local/julia/src/stackwalk.c:94
record_backtrace at /usr/local/julia/src/task.c:210
jl_throw at /usr/local/julia/src/task.c:417
error at ./error.jl:33
assert_havelock at ./condition.jl:20 [inlined]
assert_havelock at ./condition.jl:43 [inlined]
assert_havelock at ./condition.jl:67 [inlined]
notify at ./condition.jl:118
#notify#463 at ./condition.jl:116 [inlined]
notify at ./condition.jl:116 [inlined]
notify at ./condition.jl:116 [inlined]
readcb_specialized at ./stream.jl:558
jfptr_readcb_specialized_4919 at /usr/local/julia/usr/lib/julia/sys.dylib (unknown line)
uv_readcb at ./stream.jl:599
jlcapi_uv_readcb_4129 at /usr/local/julia/usr/lib/julia/sys.dylib (unknown line)
uv__read at /workspace/srcdir/libuv/src/unix/stream.c:1179
uv__stream_io at /workspace/srcdir/libuv/src/unix/stream.c:1339
uv__io_poll at /workspace/srcdir/libuv/src/unix/kqueue.c:311
uv_run at /workspace/srcdir/libuv/src/unix/core.c:361
jl_task_get_next at /usr/local/julia/src/partr.c:303
poptaskref at ./task.jl:564
wait at ./task.jl:591
task_done_hook at ./task.jl:327
jl_apply at /usr/local/julia/src/./julia.h:1604 [inlined]
jl_finish_task at /usr/local/julia/src/task.c:167
start_task at /usr/local/julia/src/task.c:593

@Jutho
Copy link
Contributor

Jutho commented Apr 11, 2019

Bump, could this be looked at before 1.2 becomes final? I think it would be bad to break the current multithreading facilities?

@StefanKarpinski
Copy link
Sponsor Member

@vtjnash, can you take a look?

@c42f
Copy link
Member Author

c42f commented Apr 16, 2019

@Jutho superficially your error doesn't look related to this change. Is there a reason to believe that jl_get_excstack() is involved?

@Jutho
Copy link
Contributor

Jutho commented Apr 16, 2019

Probably not, I don't know or understand anything about the internals and the C code. I just noticed that this PR was one of the last to modify stackwalk.c. I assumed the @test macro tries to get backtraces, and that did maybe involved catching the stack, but I have no idea what that means :-). My apologies for the noise. There seem to be other issues already about the sort of behaviour I observe.

@c42f
Copy link
Member Author

c42f commented Apr 16, 2019

No problems. The stack trace looks like it's hitting some other problem maybe related to some larger concurrency improvements / refactorings that Jameson and Jeff have merged lately. But I've not had time to follow those developments closely so that's just a guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants