Performance issue in DArray creation when myid() holds a chunk #206

Open · raminammour opened this issue Jun 6, 2019 · 0 comments

raminammour commented Jun 6, 2019

Hello,

If the pid that creates a DArray also holds a chunk and the per-chunk work is heavy enough, creation slows down: the local work interferes with the asynchronous dispatch of work to the other pids.

using Distributed
addprocs(2)
using DistributedArrays
using BenchmarkTools  # for @btime

# a lot of work...
@everywhere function init(I)
    rr=rand(map(length,I)...)
    ss=0.
    for _ in 1:10
        ss+=sum(rr.^2)
    end
    exp.(ss*exp.(rr).^3)
end

n=2000
@btime d1=DArray(init,(n,n),workers()[1:2])
@btime d1=DArray(init,(n,n),procs()[1:2])
@btime d1=DArray(init,(n,n),procs()[2:-1:1]);

  126.553 ms (290 allocations: 22.45 KiB)
  156.219 ms (253 allocations: 213.64 MiB)
  112.563 ms (249 allocations: 213.64 MiB)

Note that the last experiment includes the same pids, but pid 1 comes last and the slowdown disappears: pid 1 dispatches the work to the other pids before doing its own. This suggests an easy fix:

@everywhere @eval DistributedArrays function DistributedArrays.DArray(id, init, dims, pids, idxs, cuts)
    localtypes = Vector{DataType}(undef,length(pids))
    
    # Move myid() (if present) to the end of pids, so the local chunk is
    # constructed only after work has been dispatched to the other pids.
    pids = copy(pids)
    ind = findfirst(isequal(myid()), pids)
    if ind !== nothing
        pids[end], pids[ind] = pids[ind], pids[end]
    end

    @sync begin
        for i = 1:length(pids)
            @async begin
                local typA
                if isa(init, Function)
                    typA = remotecall_fetch(construct_localparts, pids[i], init, id, dims, pids, idxs, cuts)
                else
                    # constructing from an array of remote refs.
                    typA = remotecall_fetch(construct_localparts, pids[i], init[i], id, dims, pids, idxs, cuts)
                end
                localtypes[i] = typA
            end
        end
    end

    if length(unique(localtypes)) != 1
        @sync for p in pids
            @async remotecall_fetch(release_localpart, p, id)
        end
        throw(ErrorException("Constructed localparts have different `eltype`: $(localtypes)"))
    end
    A = first(localtypes)

    if myid() in pids
        d = registry[id]
        d = isa(d, WeakRef) ? d.value : d
    else
        T = eltype(A)
        N = length(dims)
        d = DArray{T,N,A}(id, dims, pids, idxs, cuts, empty_localpart(T,N,A))
    end
    d
end

And after the fix:

@btime d1=DArray(init,(n,n),workers()[1:2])
@btime d1=DArray(init,(n,n),procs()[1:2])
@btime d1=DArray(init,(n,n),procs()[2:-1:1]);

  79.150 ms (296 allocations: 22.64 KiB)
  111.481 ms (258 allocations: 213.64 MiB)
  111.870 ms (258 allocations: 213.64 MiB)
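
For intuition about why the ordering matters, here is a stripped-down sketch (not part of this report) of the same scheduling effect without DArray. It assumes the usual cooperative task scheduling: a remotecall_fetch to myid() runs the work on the calling task, so when the local pid is scheduled first it blocks the @async tasks that have not yet dispatched work to the remote workers. The busywork helper is made up for illustration.

using Distributed
addprocs(2)

# Hypothetical stand-in for an expensive per-chunk init.
@everywhere busywork() = sum(abs2, rand(2000, 2000))

# Mimics the @sync/@async dispatch loop in the DArray constructor.
function run_on(pids)
    @sync for p in pids
        @async remotecall_fetch(busywork, p)
    end
end

@time run_on(procs())           # myid() first: remote dispatch waits for the local work
@time run_on(reverse(procs()))  # myid() last: remote work overlaps with the local work

In the first call the local computation finishes before anything has been sent to the workers, which matches the slow benchmark above; putting myid() last restores the overlap, which is exactly what the fix does.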

Will submit a PR with the fix promptly :)

Cheers!

raminammour added a commit to raminammour/DistributedArrays.jl that referenced this issue Jun 6, 2019
Fixes issue JuliaParallel#206, please see the issue description for an explanation of the fix.