Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling of racy @spawn in @sync #41927

Merged
merged 7 commits into from
Nov 14, 2021
Merged

Better handling of racy @spawn in @sync #41927

merged 7 commits into from
Nov 14, 2021

Conversation

tkf
Copy link
Member

@tkf tkf commented Aug 19, 2021

This "racy" use of isready

julia/base/task.jl

Lines 344 to 347 in 543386d

function sync_end(c::Channel{Any})
local c_ex
while isready(c)
r = take!(c)

can introduce a confusing behavior when @spawn or @async that are lexically in scope of @sync can be scheduled in a racy manner:

using Base.Threads: @spawn

hidden_spawn(f) = @spawn f()

function sync_end_race()
    y = Ref(:notset)
    local t
    @sync begin
        t = hidden_spawn() do
            Threads.@spawn y[] = :completed
        end
    end
    try
        wait(t)
    catch
        return :notscheduled
    end
    return y[]
end

I think it's reasonable to expect that the task Threads.@spawn y[] = :completed is either not scheduled or completed by the end of @sync block. I wouldn't recommend using a pattern like this. But, it's still nice to report an error from the lexically scoped tasks (if any) reliably via @sync as much as possible.

But, in current Julia, I can check that sync_end_race() can return :notset by running

function demo()
    for i in 1:1000
        sync_end_race() == :notset && return i
    end
    return nothing
end

in the REPL. (Getting a similar behavior in a script requires a bit more code. See the test included in the PR.)

@tkf tkf added the domain:multithreading Base.Threads and related functionality label Aug 19, 2021
base/task.jl Outdated
Comment on lines 376 to 381
close(c)
if @isdefined(c_ex)
# Racy `@spawn`/`@async` many manage to `put!` tasks in the channel `c`.
# Let's synchronize them as well, to report possible errors:
for r in c
c_ex = _sync1!(c_ex, r)
end
Copy link
Member Author

@tkf tkf Aug 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, maybe it's better to always throw an error when we find something is scheduled/@sync_added after close?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could somehow cheaply prevent the use of the channel from any task that is not (transitively) part of the @sync group. Probably not though (perhaps similar cost to determining it this causes a deadlock?) and so throwing an error in detected races sounds good to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cheaply prevent the use of the channel from any task that is not (transitively) part of the @sync group

I guess we can lower @sync $body to

task_local_storage(SYNC_KEY, Channel(Inf)) do
    $body
end

and @spawn $body to

let ch = get(task_local_storage(), SYNC_KEY, nothing)
    function wrapper()
        task_local_storage(SYNC_KEY, ch)
        $body
    end
    t = Task(wrapper)
    t.sticky = false
    if ch !== nothing
        put!(ch, t)
    end
    schedule(t)
end

?

Not sure if you'd consider task_local_storage reasonably cheap, though :)

But maybe more importantly, it's a breaking change since I think it's reasonable to write something like

@sync begin
    Threads.foreach(xs) do x
        y = cpu_intensive_work(x)
        if rare_condition(y)
            @spawn io_intensive_work(y)
        end
    end
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tweaked sync_end in 19a946f. Now it won't wait for the racy tasks and it always throws when a racy task is found.

base/channels.jl Outdated Show resolved Hide resolved
base/channels.jl Outdated Show resolved Hide resolved
@tkf tkf added the status:merge me PR is reviewed. Merge when all tests are passing label Nov 12, 2021
base/channels.jl Outdated Show resolved Hide resolved
@tkf tkf merged commit 5665e8b into JuliaLang:master Nov 14, 2021
@tkf tkf deleted the sync_end branch November 14, 2021 01:02
@DilumAluthge DilumAluthge removed the status:merge me PR is reviewed. Merge when all tests are passing label Nov 16, 2021
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants