setindex! not defined? #155

Closed
barche opened this issue Feb 18, 2018 · 3 comments
Comments

@barche

barche commented Feb 18, 2018

Hi,

Is it possible that setindex! is not defined? I tried the following:

@everywhere using DistributedArrays
x = dzeros((20,), procs(), (nprocs(),))
x[20] = 3

But got

ERROR: indexing not defined for DistributedArrays.DArray{Float64,1,Array{Float64,1}}
Stacktrace:
 [1] setindex!(::DistributedArrays.DArray{Float64,1,Array{Float64,1}}, ::Int64, ::Int64) at ./abstractarray.jl:967

I could set a value using this workaround (assuming the above was on 4 processes):

@sync @spawnat 4 localpart(x)[end] = 3.0

So it seems a general setindex! would not be that difficult to implement (presumably using the same mechanism to get local indices as used in getindex), or am I missing something here?
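
Something along these lines is what I had in mind (a rough, untested sketch, not the package's implementation; scalar_setindex! is just a placeholder name, and localindices was called localindexes on older versions of DistributedArrays):

using Distributed, DistributedArrays

# placeholder, not part of DistributedArrays: each worker checks whether it owns
# global index i and, if so, writes the value into its local part
function scalar_setindex!(d::DArray{T,1}, v, i::Int) where T
    @sync for p in procs(d)
        @spawnat p begin
            r = localindices(d)[1]  # range of global indices held on this worker
            if i in r
                localpart(d)[i - first(r) + 1] = convert(T, v)
            end
        end
    end
    return d
end

# e.g. scalar_setindex!(x, 3.0, 20) instead of x[20] = 3

One remote call goes to every worker even though only one of them owns the index, so this is clearly not free, but it shows the shape of what I mean.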

@andreasnoack
Member

The setindex! machinery in Base assumes that it is fine to set a single element at a time. That would be very expensive for distributed arrays, which is why we don't define scalar setindex!. You could argue that it would be better to provide the functionality even though it is slow, but in my experience it is better to get an error than to fall back on the many generic AbstractArray methods in Base, which are extremely slow for DistributedArrays. Then there is also the question of setindex! for ranges. Feel free to give it a shot, but I fear it might be complicated.
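
To make the cost concrete: once a scalar setindex! exists, generic fallbacks in Base effectively reduce to something like the following, i.e. one remote round trip per element (naive_fill! is just an illustration, not actual Base code):

# every assignment below would be a separate remote call to the worker owning index i
function naive_fill!(d, v)
    for i in eachindex(d)
        d[i] = v
    end
    return d
end

# the distributed-aware version does a single call per worker instead:
# @sync for p in procs(d); @spawnat p fill!(localpart(d), v); end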

@barche
Author

barche commented Feb 19, 2018

Aha, that makes sense indeed. I was thinking about implementing a parallel array type over MPI (using locked windows) and that would have the same problem (lock/unlock being expensive). I wonder if it would be possible to somehow queue all remote updates in a code block and send the data in one go when the block is closed. Possibly some kind of do-block method could be used to avoid forgetting to close. Some pseudo-code to make it a bit clearer:

darr = dzeros(...)
w = darraywriter(darr)
for i in some_global_index_collection
  w[i] = my_complicated_calculation(i, ...)
end
close(w) # this actually does the communication
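
To be a bit more concrete, here is a rough, untested sketch of how such a writer could work (DArrayWriter and darraywriter are hypothetical names, nothing like this exists in the package): setindex! only records the write locally, and close ships all pending writes to the workers in one go, with each worker applying the ones it owns.

using Distributed, DistributedArrays

# hypothetical type, not part of DistributedArrays
struct DArrayWriter{T}
    darr::DArray{T,1}
    pending::Vector{Pair{Int,T}}   # global index => value, flushed on close
end

darraywriter(d::DArray{T,1}) where T = DArrayWriter(d, Pair{Int,T}[])

# record the write locally; no communication happens here
Base.setindex!(w::DArrayWriter{T}, v, i::Int) where T =
    (push!(w.pending, i => convert(T, v)); v)

function Base.close(w::DArrayWriter)
    writes = w.pending
    d = w.darr
    @sync for p in procs(d)
        # one message per worker; each worker applies only the writes it owns
        @spawnat p begin
            r = localindices(d)[1]
            lp = localpart(d)
            for (i, v) in writes
                i in r && (lp[i - first(r) + 1] = v)
            end
        end
    end
    empty!(w.pending)
    return nothing
end

Grouping the pending writes per owning worker would cut the traffic further, and a do-block method darraywriter(f, darr) that calls close at the end would avoid forgetting the flush.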

@andreasnoack
Member

We'd probably have to do something like that to make it efficient.

If you are considering an MPI-based array type, you might want to take a look at https://github.com/kpamnany/Gasp.jl, which we used for the big Celeste runs last year.
