RFC : process ids are independent of nprocs() and others #3394

Merged 3 commits on Jun 17, 2013

amitmurthy
Contributor

This patch is a starting point for cleanly removing workers:

  • Process ids are now independent of nprocs().
  • rmprocs() removes the specified list of processes.
  • The death of a worker process results in it being cleanly removed.
  • list_allprocs() returns the list of valid process ids.
  • list_workers() returns the same list minus the client process (pid 1), unless nprocs() is 1, in which case it contains only the client process.
  • Application code written using for pid in 1:nprocs() will break. It needs to be changed to for pid in list_allprocs() or for pid in list_workers().
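
To illustrate why the 1:nprocs() pattern breaks once pids are decoupled from the process count, here is a minimal sketch in plain Julia. The pid list is hypothetical (it stands in for what list_allprocs() might return after worker 3 has been removed), and the variable names are made up for illustration; nothing here calls the patched runtime.

```julia
# Hypothetical state: a 4-process cluster from which worker 3 was removed.
# In the patched runtime this list would come from list_allprocs().
live_pids = [1, 2, 4]

# Old pattern: assumes the pids are exactly 1..nprocs().
old_style = collect(1:length(live_pids))

# New pattern: iterate over the pids that actually exist.
new_style = copy(live_pids)

# Pids the old loop would try to contact although they no longer exist.
stale = setdiff(old_style, live_pids)
# Live pids the old loop would never reach.
missed = setdiff(live_pids, old_style)

println("old loop visits: ", old_style)
println("new loop visits: ", new_style)
println("stale: ", stale, ", missed: ", missed)
```

Under these assumptions the range-based loop both addresses a removed pid and skips a live one, which is why iterating over the returned pid list is the safer pattern.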

To be done:

  • RemoteRefs related to the exited worker are currently not being notified or cleaned up.
    I could not follow all the various data structures from where they have to be removed. Need some help here.
  • If no exception is thrown when the socket connection breaks due to an exiting worker, the client (console) process segfaults.

if (myid() == 1) println("Worker $iderr terminated.") end

#TODO : Notify all RemoteRefs linked to this Worker who just died....
# How?
Contributor Author

Need some help here. I could not follow the various data structures that need to be cleaned up.

@samtkaplan

Cool! Can 'list_allprocs()' be shortened to 'procs()', and 'list_workers()' be shortened to 'workers()'? Note that procs(d::DArray) is used to list all process ids used by a distributed array. I guess it would be nice to keep consistent with that.

@ViralBShah
Member

Being a huge fan of tab completion, I prefer all these functions to be procs*. It is just that procsadd does not have the same ring to it as addprocs.

In any case, the list suffixes should be dropped. We do need bikeshedding of the names here. It is not immediately obvious how procs and workers are related.

@ViralBShah
Member

That said - this is probably the first in a series of PRs, and the bikeshedding does not have to block this one.

@JeffBezanson
Sponsor Member

This is great. I can do the RemoteRef part and I'll look into the segfault. I also prefer the names procs and workers. The latter could perhaps be procs(client=false).

@amitmurthy
Contributor Author

rebased and changed names to procs() and workers().

procs(client=false) will not work, since the client process is also treated as a worker when nprocs() == 1.
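
The single-process case described above can be checked with a short sketch. It assumes a current Julia 1.x session started without extra workers (no -p flag) and uses the Distributed standard library, which did not exist under that name in the 2013 code base; run_on_workers is a hypothetical helper introduced only for illustration.

```julia
using Distributed  # Julia 1.x standard library; an assumption relative to this 2013 patch

# With no workers added, the client (pid 1) is the only process,
# and it is also reported as the sole worker.
println("nprocs  = ", nprocs())
println("procs   = ", procs())
println("workers = ", workers())

# Hypothetical helper: run f on every worker by iterating over the
# returned pid list instead of assuming the range 1:nprocs().
run_on_workers(f) = [remotecall_fetch(f, pid) for pid in workers()]

# Each process reports its own id; with a single process the call runs on the client.
println(run_on_workers(myid))
```

Because workers() falls back to the client when it is the only process, a loop written this way still does useful work on a single-process session, which a keyword such as procs(client=false) would not express.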

JeffBezanson added a commit that referenced this pull request Jun 17, 2013
RFC : process ids are independent of nprocs() and others
@JeffBezanson JeffBezanson merged commit a651fb4 into JuliaLang:master Jun 17, 2013
@amitmurthy amitmurthy deleted the amitm/multi branch June 18, 2013 04:30

5 participants