
fixes #5091; Ensure we don't wait on an exited process on Linux #23743

Merged: 5 commits into nim-lang:devel on Jul 1, 2024

Conversation

maleyva1
Contributor

Fixes #5091.

Ensure we don't wait on an exited process on Linux

@alex65536
Contributor

alex65536 commented Jun 25, 2024

I think that the proposed fix is kinda racy, because if the process is terminated between your check and blocking the signals, then we will not receive SIGCHLD for it and wait until the timeout expires.

A better approach would be to check for running() after SIGCHLD is blocked, I think.
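The ordering suggested here can be sketched as follows. This is a Python illustration of the POSIX calls involved (the actual implementation under discussion is Nim's std/osproc, and the function name here is illustrative): block SIGCHLD first, then probe non-blockingly, so a process that dies just before the probe cannot slip through unnoticed.

```python
import os
import signal
import time

def wait_for_exit(pid: int, timeout: float):
    """Return the exit code, or None if the timeout expired (sketch)."""
    # 1. Block SIGCHLD *before* checking, so a delivery between the
    #    probe and the timed wait stays pending instead of being lost.
    old = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        deadline = time.monotonic() + timeout
        while True:
            # 2. Non-blocking probe: catches already-exited processes.
            wpid, status = os.waitpid(pid, os.WNOHANG)
            if wpid == pid:
                return os.waitstatus_to_exitcode(status)
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            # 3. Sleep until a SIGCHLD arrives or the timeout expires.
            #    The loop re-probes, since the SIGCHLD may be for a
            #    different child.
            signal.sigtimedwait({signal.SIGCHLD}, remaining)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old)
```

With this ordering, a process that terminated at any point before the probe is reaped immediately instead of costing the full timeout.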

@alex65536
Contributor

Also, the fix seems to be incomplete. It doesn't make the situation worse, but there are still cases when wait() will hang. For example, when we have multiple threads, it is not guaranteed that we will receive SIGCHLD in the correct thread and we will hang because of this. Or, if we wait for more than one process simultaneously from multiple threads. Though, I haven't tried these scenarios myself and got these ideas just by looking at the code.

@maleyva1
Contributor Author

> I think that the proposed fix is kinda racy, because if the process is terminated between your check and blocking the signals, then we will not receive SIGCHLD for it and wait until the timeout expires.
>
> A better approach would be to check for running() after SIGCHLD is blocked, I think.

That was my initial approach, but I don't see much difference between having the check after the pthread_sigmask or before. The expectation is that calling waitForExit on a terminated process should not wait. If the process is not yet terminated when waitForExit is invoked, we should wait (up to timeout) for the process' termination.

Invoking running() updates p's internal state if terminated, and allows the code already there to handle processes that have already terminated.

> Also, the fix seems to be incomplete. It doesn't make the situation worse, but there are still cases when wait() will hang. For example, when we have multiple threads, it is not guaranteed that we will receive SIGCHLD in the correct thread and we will hang because of this. Or, if we wait for more than one process simultaneously from multiple threads. Though, I haven't tried these scenarios myself and got these ideas just by looking at the code.

The std/osproc library does not use wait, it uses waitpid which specifically requires a PID to wait on. Since PIDs are unique, I don't see how waitpid will not receive a change in the process' state. It is also invoked with WNOHANG which ensures it is non-blocking.
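The waitpid()/WNOHANG behavior described here can be demonstrated directly. A Python sketch for reference (the library under discussion is Nim):

```python
import os
import time

pid = os.fork()
if pid == 0:        # child: stay alive briefly, then exit cleanly
    time.sleep(0.3)
    os._exit(0)

# While the child is still running, WNOHANG makes waitpid() return
# (0, 0) immediately instead of blocking.
wpid, _ = os.waitpid(pid, os.WNOHANG)
assert wpid == 0

# Once the child has exited, the same non-blocking call reaps it and
# reports its pid and exit status.
time.sleep(1.0)
wpid, status = os.waitpid(pid, os.WNOHANG)
assert wpid == pid and os.waitstatus_to_exitcode(status) == 0
```

So waitpid() itself is indeed targeted and non-blocking; as the next reply points out, the issue is in how the timed wait blocks in between those calls.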

@alex65536
Contributor

alex65536 commented Jun 26, 2024

> That was my initial approach, but I don't see much difference between having the check after the pthread_sigmask or before. The expectation is that calling waitForExit on a terminated process should not wait. If the process is not yet terminated when waitForExit is invoked, we should wait (up to timeout) for the process' termination.

If the process terminates between your check and blocking the signals, then we will wait for the whole timeout even if the process had already terminated, as in the original bug.

> The std/osproc library does not use wait, it uses waitpid which specifically requires a PID to wait on. Since PIDs are unique, I don't see how waitpid will not receive a change in the process' state. It is also invoked with WNOHANG which ensures it is non-blocking.

The code path without timeout uses waitpid() indeed. But, when there is a timeout, then waiting is implemented by blocking SIGCHLD and waiting synchronously until we receive it. In this case, all my concerns above apply, and waiting for more than one process simultaneously will result in a bug.

@maleyva1
Contributor Author

> If the process terminates between your check and blocking the signals, then we will wait for the whole timeout even if the process had already terminated, as in the original bug.

I understand, but your example assumes the process has not terminated before the invocation of waitForExit which is different from the original bug. More explicitly, consider two processes, one running and one terminated:

var running: Process
var terminated: Process

An invocation waitForExit(terminated) should return immediately. Meanwhile, an invocation waitForExit(running) should wait up to timeout. In your hypothetical, we are dealing with running and not with terminated at the point of invocation. So supposing that running terminates between the check and the sigprocmask (or the pthread_sigmask) within waitForExit is not the same as the original bug.

> The code path without timeout uses waitpid() indeed. But, when there is a timeout, then waiting is implemented by blocking SIGCHLD and waiting synchronously until we receive it. In this case, all my concerns above apply, and waiting for more than one process simultaneously will result in a bug.

I understand your concern, but I think it's only valid if we were using wait. Since we are using waitpid, even if a SIGCHLD is broadcast for a different process than the one we are waiting on, the subsequent invocation of waitpid ensures we check that the SIGCHLD we received is for the process we are waiting on. Furthermore, even supposing two or more processes terminated at the same time, subsequent invocations of waitpid ensure that the PID we're waiting on has terminated.

@alex65536
Contributor

> I understand, but your example assumes the process has not terminated before the invocation of waitForExit which is different from the original bug.

Maybe the bug is a bit different, but the problem remains the same: if the process terminates right before sigprocmask is called, then we will have to wait for the full timeout instead of returning immediately after calling sigprocmask.

> I understand your concern, but I think it's only valid if we were using wait. Since we are using waitpid, even if a SIGCHLD is broadcasted for a different process than the one we are waiting on, the subsequent invocation of waitpid ensures we check that the SIGCHLD we received is for the process we are waiting on.

In case of timed waitForExit(), we don't wait on waitpid(), we use it in non-blocking mode only to ensure that the process has terminated. Instead, we block on sigtimedwait(). If we don't receive SIGCHLD because another thread has received it, then our sigtimedwait() will block until the timeout expires. Thus, for example, if the program terminates in 1 second after calling waitForExit() and timeout is 1000 seconds, then we will wait for 1000 seconds instead of 1 because of missed SIGCHLD. And no, waitpid() doesn't save us from such a long wait, because we block on sigtimedwait() and cannot really unblock earlier because we have missed the signal.
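This failure mode is easy to reproduce in isolation. A Python sketch with a deliberately short, illustrative timeout: SIGCHLD is blocked but never delivered (as if another thread had consumed it), so sigtimedwait() has nothing to return and sleeps for the entire timeout.

```python
import signal
import time

# Model the "missed SIGCHLD" case: the signal is blocked but never
# arrives, so the timed wait pays the full timeout.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
start = time.monotonic()
info = signal.sigtimedwait({signal.SIGCHLD}, 0.3)
elapsed = time.monotonic() - start
assert info is None     # no signal was caught
assert elapsed >= 0.3   # we waited out the entire timeout
```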

@maleyva1
Contributor Author

> Maybe the bug is a bit different, but the problem remains the same: if the process terminates right before sigprocmask is called, then we will have to wait for the full timeout instead of returning immediately after calling sigprocmask.

I think I understand where you're coming from. The current change reaps zombie processes too early. It's not the same issue, but it is inefficient to wait out the full timeout when we can easily avoid it. By reaping just after blocking the signal, we can prevent a full-timeout wait. Moving the check to just after sigprocmask should address both issues.

> In case of timed waitForExit(), we don't wait on waitpid(), we use it in non-blocking mode only to ensure that the process has terminated. Instead, we block on sigtimedwait(). If we don't receive SIGCHLD because another thread has received it, then our sigtimedwait() will block until the timeout expires. Thus, for example, if the program terminates in 1 second after calling waitForExit() and timeout is 1000 seconds, then we will wait for 1000 seconds instead of 1 because of missed SIGCHLD. And no, waitpid() doesn't save us from such a long wait, because we block on sigtimedwait() and cannot really unblock earlier because we have missed the signal.

I think I may have misunderstood you. When you originally said "hang", I thought you meant it blocks indefinitely. Mea culpa!

I agree that we will wait the full timeout if the SIGCHLD is delivered to a thread other than ours, but I don't see an optimal way around this on Linux. A naive approach is to busy-wait on waitpid() up to timeout, but I'm not sure that's better.

@alex65536
Contributor

alex65536 commented Jun 29, 2024

> I agree that we will wait the full timeout if the SIGCHLD is delivered to a thread other than ours, but I don't see an optimal way around this on Linux. A naive approach is to busy-wait on waitpid() up to timeout, but I'm not sure that's better.

I also don't know a simple and good solution here, at least a portable one. New Linux kernels (since 5.3) have pidfd's, which may allow us to write timed wait for a process without using signals, but it's not portable to other UNIX-like systems and limited only to newer kernels.

@maleyva1
Contributor Author

maleyva1 commented Jul 1, 2024

> I also don't know a simple and good solution here, at least a portable one. New Linux kernels (since 5.3) have pidfd's, which may allow us to write timed wait for a process without using signals, but it's not portable to other UNIX-like systems and limited only to newer kernels.

pidfd looks great, but you're right, it limits the usage of waitForExit on other platforms. I'm not sure how else to make it more efficient while keeping it portable except for busy waiting on the PID using waitpid. My suggested solution is below. Let me know what you think.

import std/times
proc waitForExit(p: Process, timeout: int = -1): int =
  # Maximum sleep interval in milliseconds
  const maxWait = 100
  if p.exitFlag:
    return exitStatusLikeShell(p.exitStatus)

  let wait =
    if timeout <= 0:
      initDuration()
    else:
      initDuration(milliseconds = timeout)
  let deadline = getTime() + wait
  # Starting sleep interval in milliseconds
  var delay = 1

  while true:
    var status: cint = 0
    let pid = waitpid(p.id, status, WNOHANG)
    if p.id == pid:
      p.exitFlag = true
      p.exitStatus = status
      break
    elif pid.int == -1:
      raiseOSError(osLastError())
    else:
      # Continue waiting if needed
      if getTime() >= deadline:
        # Previous versions of `waitForExit`
        # forcibly killed the process on timeout.
        # We keep this so we don't break programs
        # that depend on this behavior.
        if posix.kill(p.id, SIGKILL) < 0:
          raiseOSError(osLastError())
      else:
        let newWait = getTime() + initDuration(milliseconds = delay)
        var waitSpec: TimeSpec
        waitSpec.tv_sec = posix.Time(newWait.toUnix())
        waitSpec.tv_nsec = newWait.nanosecond.clong
        var remaining: TimeSpec
        discard posix.clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, waitSpec, remaining)
        delay = min(delay * 2, maxWait)

  result = exitStatusLikeShell(p.exitStatus)

Note that this adds the TimeEffect tag to waitForExit, but it was already using POSIX time functions in its body. This version makes it explicit that there are time-related functions being used. Tagging it with TimeEffect also lines up with how some std/net procs with timeouts are tagged as well (I'm looking at recv).

@Araq
Member

Araq commented Jul 1, 2024

Looks good to me.

Araq merged commit 288d5c4 into nim-lang:devel on Jul 1, 2024
18 checks passed
Contributor

github-actions bot commented Jul 1, 2024

Thanks for your hard work on this PR!
The lines below are statistics of the Nim compiler built from 288d5c4

Hint: mm: orc; opt: speed; options: -d:release
179043 lines; 8.395s; 664.703MiB peakmem

@alex65536
Contributor

alex65536 commented Jul 6, 2024

> My suggested solution is below. Let me know what you think.

@maleyva1 I am unsure that programming on sleeps is a good idea, since it means at least a 100 ms delay for every process, which can dramatically affect performance for short-lived processes.

One of the alternative ideas is to make a global SIGCHLD handler and maintain a set of unfinished PIDs globally. Then, make that SIGCHLD handler notify everyone that one of the processes has terminated. Unfortunately, this solution is quite complex and requires global state and extra care while writing signal handlers (async-signal-safety and all this stuff).
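One possible shape of that design is the classic self-pipe trick, sketched here in Python (the names and the 5-second timeout are illustrative, and a real implementation would track a set of PIDs rather than one). The SIGCHLD handler does only an async-signal-safe write, and waiters select() on the pipe's read end with their own timeout instead of blocking in sigtimedwait(), so no waiter can "steal" another's wakeup.

```python
import os
import select
import signal

rfd, wfd = os.pipe()
os.set_blocking(wfd, False)

def on_sigchld(signum, frame):
    try:
        os.write(wfd, b"\0")   # just a wakeup token
    except BlockingIOError:
        pass                   # pipe full: a wakeup is already pending

signal.signal(signal.SIGCHLD, on_sigchld)

pid = os.fork()
if pid == 0:
    os._exit(0)

# Wait for a wakeup (or a timeout), then poll for *our* pid specifically.
readable, _, _ = select.select([rfd], [], [], 5.0)
assert readable
os.read(rfd, 4096)             # drain pending wakeup tokens
wpid, _ = os.waitpid(pid, os.WNOHANG)
assert wpid == pid
```

Even in this sketch the caveats mentioned above apply: the handler and the bookkeeping around it must stay async-signal-safe, and coalesced SIGCHLDs mean every waiter has to re-probe its own PID after each wakeup.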

Unluckily, there is AFAIK no good generic solution for UNIX systems. See also: https://catern.com/process.html#orgc15165e.

@maleyva1
Contributor Author

maleyva1 commented Jul 7, 2024

> One of the alternative ideas is to make a global SIGCHLD handler and maintain a set of unfinished PIDs globally. Then, make that SIGCHLD handler notify everyone that one of the processes has terminated. Unfortunately, this solution is quite complex and requires global state and extra care while writing signal handlers (async-signal-safety and all this stuff).

I am aware of this approach. I did not suggest it, not only because of the aforementioned issues, but also because SIGCHLD signals can coalesce, leading to the same problem as before: if two or more processes terminate at the same time, the handler may receive only a single SIGCHLD.

> I am unsure that programming on sleeps is a good idea, since it means at least a 100 ms delay for every process, which can dramatically affect performance for short-lived processes.

A valid concern, but we can adjust the sleep intervals to whatever optimal period we need (with nanosecond precision) or even expose it to users as an argument like timeout for finer control. It would be useful to have real-world performance measurements regarding this approach.
For what it's worth, Python's subprocess library implements a similar approach.

@alex65536
Contributor

> For what it's worth, Python's subprocess library implements a similar approach.

Wow, I didn't know that before :)

But, unlike your idea, it seems that Python's subprocess doesn't wait with a fixed timeout, but implements some kind of exponential backoff instead, starting from 500µs and doubling the sleep interval until it reaches 50ms. Such a waiting pattern looks much better than waiting for 100ms every time.
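The backoff pattern described here can be mirrored like so (a sketch; the 500 µs and 50 ms constants are taken from the description of CPython above, and the function name is illustrative):

```python
import os
import time

def wait_with_backoff(pid: int, timeout: float):
    """Poll waitpid(WNOHANG) with exponential backoff.
    Returns the exit code, or None if the timeout expired."""
    deadline = time.monotonic() + timeout
    delay = 0.0005                       # start at 500 microseconds
    while True:
        wpid, status = os.waitpid(pid, os.WNOHANG)
        if wpid == pid:
            return os.waitstatus_to_exitcode(status)
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        time.sleep(min(delay, remaining))
        delay = min(delay * 2, 0.05)     # double, capped at 50 ms
```

Short-lived processes are reaped within a millisecond or so, while long waits converge to one wakeup every 50 ms.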

@maleyva1
Contributor Author

> But, unlike your idea, it seems that Python's subprocess doesn't wait with a fixed timeout, but implements some kind of exponential backoff instead, starting from 500µs and doubling the sleep interval until it reaches 50ms. Such a waiting pattern looks much better than waiting for 100ms every time.

Actually, if you look back at the suggested rewrite, we do what Python's subprocess library does: sleep for an interval, double that interval, and cap it at some maximum. The difference in the above rewrite is that it starts from 1 ms and moves up to 100 ms.

Though you're right; perhaps waiting up to 100ms is not the right choice.

@alex65536
Contributor

> Actually, if you look back at the suggested rewrite, we similarly do what Python's subprocess library does: sleep until timeout, double that timeout, and cap off at some max amount. The difference in the above rewrite is that it starts from 1 ms and moves up to 100ms.

Oops, sorry, you are right, I haven't noticed that your suggested rewrite also contains the same exponential backoff. Then, maybe your solution is good enough, considering that all the others are either very complex or don't work in some cases.

Araq pushed a commit that referenced this pull request on Jul 12, 2024:

Addresses #23825 by using the approach described in
#23743 (comment).

This takes the approach from Python's `subprocess` library which calls
`waitid` in a loop, while sleeping at regular intervals.

CC @alex65536
Successfully merging this pull request may close these issues.

waitForExit() with timeout hangs on already dead processes on Linux