RFC: a safepoint in yield #40473

Merged: 1 commit into JuliaLang:master on Apr 14, 2021
Conversation

tkf (Member) commented on Apr 14, 2021

Currently, to avoid deadlocks, we need to manually insert GC.safepoint() next to yield() in non-allocating tasks when multiple tasks communicate through low-level concurrency primitives such as atomics. I find it confusing that yield() does not imply GC.safepoint(), so that you have to write both. My mental model of yield is "let the julia runtime do whatever it needs to do at the moment", and GC is a part of the julia runtime. Since it is possible to write non-allocating concurrent tasks, maybe it makes sense to invoke GC.safepoint() at the locations where the programmer has already marked that a context switch is OK? As yield() is more than 100 times as expensive as GC.safepoint(), I think the performance impact is negligible.
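
To make that concrete, here is a minimal sketch of the pattern (the function and argument names are made up for illustration; this is not code from the patch):

function busywait(stop::Threads.Atomic{Bool})
    while !stop[]
        GC.safepoint()  # needed today: lets a GC.gc() started on another thread proceed
        yield()         # lets other tasks run; with this patch it would also reach a safepoint
    end
end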

More specifically, this patch adds a single unconditional safepoint in jl_process_events.

MWE

An easy way to observe this is with a very naive spin lock. (This PR is not advocating spin locks, though.)

core_flush(io) = ccall(:jl_uv_flush, Cvoid, (Ptr{Cvoid},), Core.io_pointer(io))

function mwe(; with_yield = true, with_safepoint = true)
    @assert Threads.nthreads() > 1
    println()
    println("with_safepoint = $with_safepoint with_yield = $with_yield")

    scheduled = Threads.Atomic{Bool}(false)
    started = Threads.Atomic{Bool}(false)
    done = Threads.Atomic{Bool}(false)
    n = Threads.nthreads()
    @sync try
        Threads.@threads :static for _ in 1:Threads.nthreads()
            Threads.threadid() == 1 && continue
            Threads.@async begin
                if !Threads.atomic_xchg!(scheduled, true)
                    started[] = true
                    Core.print("$(Threads.threadid()): spinning...\n")
                    while !done[]
                        with_safepoint && GC.safepoint()
                        with_yield && yield()
                    end
                end
                Core.print("$(Threads.threadid()): DONE\n")
            end
        end
        while !started[]
            GC.safepoint()
            yield()
        end
        Core.print("GC.gc()...\n")
        core_flush(Core.stdout)
        GC.gc()
    catch err
        @error "root" exception = (err, catch_backtrace())
        rethrow()
    finally
        Core.print("stopping...\n")
        done[] = true
    end
end

mwe()
mwe(with_yield = false)
mwe(with_safepoint = false)
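
For reference, one way to launch the script above with two threads (assuming it is saved as mwe.jl; the file name is arbitrary):

$ julia --threads=2 mwe.jl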

Running this script with 2 threads prints

with_safepoint = true with_yield = true
GC.gc()...
2: spinning...
stopping...
2: DONE

with_safepoint = true with_yield = false
2: spinning...
GC.gc()...
stopping...
2: DONE

with_safepoint = false with_yield = true
GC.gc()...
2: spinning...

and then hangs: with yield() alone, the spinning task never reaches a GC safepoint, so GC.gc() on thread 1 blocks forever waiting for the other thread to stop. That is to say, yield alone is not enough to trigger the safepoint. This patch fixes the deadlock.

Benchmarking yield and GC.safepoint
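
The REPL session below assumes BenchmarkTools.jl has been loaded, since @btime and @benchmark come from that package:

julia> using BenchmarkTools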

julia> @btime yield()
  435.854 ns (0 allocations: 0 bytes)

julia> @btime GC.safepoint()
  1.809 ns (0 allocations: 0 bytes)

Invoking them in (more or less) parallel does not change the rough estimate:

julia> function foreach_thread(f, n = Threads.nthreads())
           ys = Vector{Any}(undef, n)
           Threads.@threads :static for i in 1:n
               ys[i] = f(i)
           end
           return ys
       end
foreach_thread (generic function with 2 methods)

julia> by = foreach_thread() do _
           @benchmark yield()
       end
8-element Vector{Any}:
 Trial(426.213 ns)
 Trial(460.337 ns)
 Trial(962.747 ns)
 Trial(604.731 ns)
 Trial(600.040 ns)
 Trial(596.795 ns)
 Trial(227.000 ns)
 Trial(963.750 ns)

julia> bs = foreach_thread() do _
           @benchmark GC.safepoint()
       end
8-element Vector{Any}:
 Trial(1.809 ns)
 Trial(1.809 ns)
 Trial(1.809 ns)
 Trial(1.809 ns)
 Trial(1.509 ns)
 Trial(1.509 ns)
 Trial(1.509 ns)
 Trial(1.509 ns)

@tkf tkf requested a review from vtjnash April 14, 2021 00:38
@tkf tkf added the domain:multithreading Base.Threads and related functionality label Apr 14, 2021
StefanKarpinski (Sponsor Member) commented:

Very naive opinion, this seems like a good idea to me.

vtjnash (Sponsor Member) left a comment:

Yeah, this seems like a reasonable place to put it for now

@vtjnash vtjnash merged commit 7112c89 into JuliaLang:master Apr 14, 2021
@tkf tkf deleted the yield-safepoint branch April 14, 2021 06:09
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
yield is already a potential safepoint, this just ensures it is always one
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
yield is already a potential safepoint, this just ensures it is always one
johanmon pushed a commit to johanmon/julia that referenced this pull request Jul 5, 2021
yield is already a potential safepoint, this just ensures it is always one