intermittent warnings of forcibly interrupting busy workers #70

Open
trathi05 opened this issue Aug 21, 2020 · 1 comment

Comments

@trathi05

My code adds processes using addprocs and then parallelizes work using pmap. It sometimes terminates with the following warning. This doesn't affect the output in any way, but the warning shows up at the end, especially with scripts that run for a significant amount of time (at least an hour).

┌ Warning: Forcibly interrupting busy workers
│   exception = rmprocs: pids [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24] not terminated after 5.0 seconds.
└ @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/cluster.jl:1234
┌ Warning: rmprocs: process 1 not removed
└ @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/cluster.jl:1030

I am not sure whether this is a machine-related issue or has something to do with the Distributed package.
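For reference, the pattern described above (addprocs followed by pmap) can be sketched as below, with one possible workaround added at the end: removing the workers explicitly with a longer timeout instead of leaving it to the cleanup that runs at exit. This is only a sketch under assumptions. The worker count, the function name slow_square, and the 60-second waitfor value are all placeholders, and it is not confirmed that this avoids the warning in every case.

```julia
using Distributed

# Assumption: 2 local workers and a trivial task stand in for the real workload.
addprocs(2)

# `slow_square` is a hypothetical placeholder for the real per-item computation.
@everywhere slow_square(x) = (sleep(0.1); x^2)

# Distribute the work across the workers and collect the results in order.
results = pmap(slow_square, 1:8)

# Remove the workers explicitly rather than relying on the exit-time cleanup.
# `waitfor` is in seconds; 60 is an arbitrary value, more generous than the
# 5.0 seconds mentioned in the warning.
if nprocs() > 1
    rmprocs(workers(); waitfor=60)
end
```

The nprocs() > 1 guard keeps rmprocs from being called when only the master process is left.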

@tlarock

tlarock commented May 5, 2022

I am getting similar behavior to @trathi05 with some code using Distributed. I call Julia with julia -p 4 --project=. my_script.jl. Some functions run independently in parallel using pmap with 4 workers, and everything behaves as I expect (including the correct outputs). But when the script terminates, I get the following warnings:

┌ Warning: Forcibly interrupting busy workers
│   exception = rmprocs: pids [3] not terminated after 5.0 seconds.
└ @ Distributed ~/julia/usr/share/julia/stdlib/v1.9/Distributed/src/cluster.jl:1253
┌ Warning: rmprocs: process 1 not removed
└ @ Distributed ~/julia/usr/share/julia/stdlib/v1.9/Distributed/src/cluster.jl:1049

One guess (without real evidence) is that once all of my tasks have been assigned, at least one of the workers that is no longer needed is still active for some reason while the other workers finish their tasks, and so it needs to be forcibly terminated because it is "hanging" (for lack of a more precise term/understanding of what might be happening).

Another guess is that process 1 refers to the host process, and for some reason it is not shutting down properly. I'm not sure if that is even possible, since I assume the host process is the one sending the warnings. Potentially relevant here could be that I am calling pmap from inside a function that is defined in the script.
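On the second guess: process 1 is indeed the master (host) process in Distributed. One detail that may be relevant is that workers() returns [1] when no worker processes exist, so any call of the form rmprocs(workers()) made after the workers are already gone would be asked to remove process 1. Whether that is the actual source of the second warning here is an assumption, not something confirmed in this thread. The fallback itself is easy to check in a fresh session started without -p:

```julia
using Distributed

# Assumes a fresh session started without `-p`, so no workers exist yet.
println(nprocs())   # number of processes, including the master; prints 1
println(workers())  # with no workers, falls back to the master's id; prints [1]
```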

Since it doesn't seem to be causing a problem with my code execution or performance, it is not a big deal at all. However, it is a concerning-looking warning nonetheless, especially if others use my code down the line.

I am running Julia 1.9.0-DEV (2022-05-04, Commit 862018b20d) on macOS Monterey with Apple Silicon.

@vtjnash vtjnash transferred this issue from JuliaLang/julia Feb 11, 2024