RFC: detach() and attach() from/to worker processes #3428
Closed
This PR provides a means for the client REPL to be detached from the set of worker processes and reattached later. Typical use cases:
Currently implemented are `detach` and `attach`. Both can only be executed on the client (id = 1) process.
`detach(connection_file::String)` safely removes the client from the process group and writes complete connection information to the `connection_file`.
`attach(connection_file::String)` uses the information garnered during detach to reconnect to the cluster.
Only one client process can be connected to the cluster at any time.
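The connection-file round trip described above can be modelled without a cluster. The following is only a sketch under stated assumptions: the `WorkerInfo` record, the tab-separated file layout, and the helper names `write_connection_file` and `read_connection_file` are all hypothetical and are not taken from this PR, which does not specify the file format.

```julia
# Hypothetical record of what a client would need to reconnect to one worker.
# The actual fields written by detach() are not specified in the RFC.
struct WorkerInfo
    id::Int
    host::String
    port::Int
end

# Models the file-writing half of detach(): one tab-separated line per worker.
function write_connection_file(connection_file::String, workers::Vector{WorkerInfo})
    open(connection_file, "w") do io
        for w in workers
            println(io, w.id, '\t', w.host, '\t', w.port)
        end
    end
end

# Models the file-reading half of attach(): parse each line back into a record.
function read_connection_file(connection_file::String)
    return map(readlines(connection_file)) do line
        id, host, port = split(line, '\t')
        WorkerInfo(parse(Int, id), String(host), parse(Int, port))
    end
end

# Round trip through a temporary file (addresses are made-up placeholders).
path = tempname()
write_connection_file(path, [WorkerInfo(2, "127.0.0.1", 9009),
                             WorkerInfo(3, "127.0.0.1", 9010)])
restored = read_connection_file(path)
```

A real implementation would also have to tell the workers that the client has gone away and enforce the one-client-at-a-time rule; the sketch covers only the persistence step.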
TO BE DONE
Since the above results in the client (id = 1) being detached, it leads to certain issues with the parallel computing infrastructure we currently have, for example @parallel and pmap. In typical usage the client process (id = 1) is the controller process for the entire distributed job execution and is currently meant to be interactive, in the sense that the terminal (REPL) is expected to be kept open.
Since the client process can now be detached and closed, we should provide an alternative mechanism for the user to push computation work to the "background", query its status and retrieve results independent of the client process.
We can have the following new macros/functions:
`@bg_exec(key::String, code_block)` - runs the block of code, `code_block`, in the background, i.e., on the worker with the lowest process id other than 1. The result of the block of code will be stored in a Dict on that worker under the specified `key`.
`bg_clear()` - clears the Dict on the worker where background jobs are controlled from.
`bg_take(key)`, `bg_fetch(key)` - take and fetch the responses from the Dict.
`bg_put(key)` - application code can add its own information to this Dict, for example progress information on long-running computations that can be queried periodically.
NOTE: Any issues, suggestions, or different schemes for having a clean and consistent implementation of client detach/attach are welcome.
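The semantics of the proposed background-job functions can be illustrated with a single-process model. This is a sketch rather than the PR's implementation: the Dict is a local global instead of living on the lowest-numbered worker, the job runs as a local task instead of a remote call, and the `value` argument to `bg_put` is an assumption (the RFC lists only `bg_put(key)`).

```julia
# Single-process model of the proposed background-job store.
# Assumption: in the RFC this Dict would live on the lowest-numbered worker.
const BG_RESULTS = Dict{String,Any}()
const BG_LOCK = ReentrantLock()

# Store a value under `key` (the `value` argument is an assumption).
function bg_put(key::String, value)
    lock(BG_LOCK) do
        BG_RESULTS[key] = value
    end
end

# Return the stored value and leave it in the Dict.
bg_fetch(key::String) = lock(() -> BG_RESULTS[key], BG_LOCK)

# Return the stored value and remove it from the Dict.
bg_take(key::String) = lock(() -> pop!(BG_RESULTS, key), BG_LOCK)

# Remove every entry.
bg_clear() = lock(() -> empty!(BG_RESULTS), BG_LOCK)

# Run `code_block` as a task and store its result under `key`.
# Returns the task so the caller can wait on it.
macro bg_exec(key, code_block)
    quote
        schedule(Task(() -> bg_put($(esc(key)), $(esc(code_block)))))
    end
end

job = @bg_exec "total" sum(1:10)
wait(job)
result = bg_fetch("total")   # 55; the entry stays in the Dict
taken = bg_take("total")     # 55; the entry is removed
```

Keeping `bg_fetch` non-destructive and `bg_take` destructive mirrors the fetch/take distinction in the function names, so a reattached client can poll a result repeatedly or claim it once.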