Possible hang with Async_detach #569

Closed
gabelevi opened this issue Mar 30, 2018 · 10 comments

@gabelevi
Contributor

I have been debugging a hang in my project. I'll try to describe the symptoms and what I think is happening.

Symptoms

  • When I run my test suite, it runs my binary hundreds of times. On most test suite runs, one or two tests get stuck
  • I have figured out that the binary gets stuck after a Lwt_unix.write call to a pipe. I have confirmed that the fd is ready to be written and someone is listening on the other side.
  • When I strace a stuck binary, I observe the main thread calling select, waiting for various fds to be readable. I also observe a single system thread worker, performing a blocking read call.
  • My binary is a server. I can get the binary unstuck by sending it a request.
  • Upon receiving the request, the binary creates another system thread worker and finally performs the write call.

My guess at a cause

(I'm assuming async_detach since that's the default and I don't ever specify an async method)

I poked around in lwt_unix_stubs.c a bit. I'm guessing I'm hitting a deadlock that looks like this:

  1. The binary starts with 0 system threads
  2. The read call either (A) sees 0 system threads and creates a new one, or (B) sees 1 free system thread and just adds the read job to the pool
  3. Before the system thread starts on the read and can decrement thread_waiting_count, the main thread adds the write to the pool too
  4. The system thread starts on the read job and blocks indefinitely
  5. The write job is stuck in the pool
  6. When I send a request to get it unstuck, the request sends another write command, which creates a second system thread. In my strace, I see two write system calls on the new system thread - one to an fd and one to eventfd
  7. The new second system thread runs the blocked write, and the binary is now unhung

Things to do

Avoid long blocking system calls

Before Lwt, I would use the select-then-read pattern. With Lwt, I've just been using read. Perhaps I should always use wait_read and wait_write before a read or write, to avoid tying up a system thread for a long time.
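
Concretely, that pattern might look like this (a hedged sketch, not code from my project; guarded_read and guarded_write are hypothetical helpers):

open Lwt.Infix

(* Sketch of the wait-then-read/write pattern, assuming a
   Lwt_unix.file_descr in blocking mode. wait_read/wait_write resolve
   once the fd is ready, so the syscall handed to the worker thread
   should return promptly instead of tying the worker up. *)
let guarded_read fd buffer offset length =
  Lwt_unix.wait_read fd >>= fun () ->
  Lwt_unix.read fd buffer offset length

let guarded_write fd buffer offset length =
  Lwt_unix.wait_write fd >>= fun () ->
  Lwt_unix.write fd buffer offset length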

If this is a best practice, it might be worth documenting too.

Fix the deadlock

Perhaps the number of jobs in the pool_queue should never exceed thread_waiting_count. If we assume that every job might take a Very Long Time, then putting N+1 jobs in the queue when there are only N free system workers will block that last job for a Very Long Time.

So we could do something like the following (see the sketch after this list):

  • Keep a pool_queue_size
  • If pool_queue_size >= thread_waiting_count || thread_waiting_count == 0, create a new system thread
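
In hedged OCaml pseudocode (the real logic is C, in lwt_unix_stubs.c; spawn_worker and enqueue below are placeholder stubs, not Lwt functions):

(* Hypothetical sketch of the proposed policy, not Lwt's actual code. *)
let pool_queue_size = ref 0
let thread_waiting_count = ref 0

let spawn_worker () = ()               (* placeholder: start a system thread *)
let enqueue (_job : unit -> unit) = () (* placeholder: push onto pool_queue *)

let submit_job job =
  (* Spawn a worker whenever the queue would outgrow the number of idle
     workers, so no job can be stranded behind a long-blocking one. *)
  if !pool_queue_size >= !thread_waiting_count
     || !thread_waiting_count = 0
  then spawn_worker ();
  incr pool_queue_size;
  enqueue job
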
@gabelevi
Contributor Author

It's worth mentioning that all the fds I'm using are blocking. Switching to non-blocking fds would probably also help, which might also explain why I don't see other reports of this behavior.

@aantron
Collaborator

aantron commented Mar 30, 2018

I haven't fully understood the issue yet, but I want to immediately note that:

  • Indeed, pipes are non-blocking by default in Lwt, so interacting with them does not, by default, involve the thread pool. The above can/does apply if you have deliberately opened the pipe in blocking mode.
  • IIRC, Lwt internally does wait_write before a blocking write, so it should not be necessary for the user to do that.

@gabelevi
Contributor Author

pipes are non-blocking by default in Lwt

Unfortunately, my project has been incrementally adopting Lwt, so the process that creates the pipe and spawns the Lwt-using server doesn't use Lwt. So we have a blocking pipe (which I can fix with Unix.set_nonblock).
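
For example, the fix in the non-Lwt parent could look roughly like this (a hedged sketch; error handling and the actual spawn omitted):

let () =
  (* Mark both ends of the pipe non-blocking right after creating it,
     before handing them to the Lwt-using server. *)
  let read_end, write_end = Unix.pipe () in
  Unix.set_nonblock read_end;
  Unix.set_nonblock write_end
  (* ...spawn the server and pass it the fds as before... *)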

IIRC, Lwt internally does wait_write before a blocking write, so it should not be necessary for the user to do that.

Yep, you're right! Interestingly enough, the wait_read before a blocking read was commented out back in October (c285495). Do you remember why?

@aantron
Collaborator

aantron commented Mar 30, 2018

If I understood correctly: at the time the binary gets stuck, you would expect to see the main thread and two system threads, not one, with one system thread waiting in read, and the other tasked with performing a potentially-blocking write (which we hope doesn't block because Lwt, IIRC, did wait_write first). However, the read and write tasks are "racing" for the same thread from the thread pool, due to a race condition in how the free thread count is managed. As a result, the write is never initiated, and if there is nothing to read either, the server is stuck forever. Is that a correct understanding?

@gabelevi
Contributor Author

Yep! That's exactly what I think is happening!

@aantron
Collaborator

aantron commented Mar 30, 2018

Unfortunately, my project has been incrementally adopting Lwt, so the process that creates the pipe and spawns the Lwt-using server doesn't use Lwt. So we have a blocking pipe (which I can fix with Unix.set_nonblock).

No problem. You may want to set it to non-blocking mode as a quick workaround, but if this is accurate, then this is a classic type of bug that definitely needs to be fixed on the Lwt side as well :)

I don't remember off the top of my head why that was commented out. I'll have to look into it. Given that I have a rule never to leave code commented out without a note saying why, I suspect it was supposed to be temporary for debugging, and I missed restoring it during self-review.

@aantron
Collaborator

aantron commented Mar 30, 2018

Also, to make it non-blocking from Lwt's point of view, you should use the functions in Lwt_unix. If you are using of_unix_file_descr, I think that can be ensured with ~blocking:false ~set_flags:true. I also recommend checking Lwt_unix.blocking on the fd, to see if Lwt thinks it is blocking or not. One source of problems is when you have a blocking fd that Lwt thinks is non-blocking, and Lwt happily blocks on it in the main thread. Given your description, though, I don't think that is happening.
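
A hedged sketch of both suggestions together (wrap is a hypothetical helper; Lwt_unix.blocking returns a promise):

open Lwt.Infix

(* Wrap an inherited Unix fd, telling Lwt to treat it as non-blocking
   and to actually set O_NONBLOCK on it, then sanity-check what Lwt
   thinks the blocking mode is. *)
let wrap (fd : Unix.file_descr) : Lwt_unix.file_descr Lwt.t =
  let lwt_fd = Lwt_unix.of_unix_file_descr ~blocking:false ~set_flags:true fd in
  Lwt_unix.blocking lwt_fd >>= fun is_blocking ->
  assert (not is_blocking);
  Lwt.return lwt_fd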

@aantron
Collaborator

aantron commented Mar 30, 2018

...and this suggests another potential workaround: try Lwt 3.1.0 for the time being. It should still have this bug, and/or the even more incorrect thread count management Lwt had last year, but because it actually does wait_read, it is very unlikely that a read will block in a worker thread and create this condition. You could also call wait_read manually for now.

I will look into this issue in detail shortly :)

@aantron
Collaborator

aantron commented Mar 30, 2018

...and, confirmed, that call shouldn't be commented out. I remember clearly now that I did that in order to be able to debug the other issue (linked to the commit).

However, Lwt should work whether wait_read is commented out or not, just not as well when it is. So the underlying issue exposed by commenting it out is still there.

@aantron
Collaborator

aantron commented Mar 31, 2018

@gabelevi Your analysis was correct. See the attached commit.

Can you please pin Lwt to https://github.com/ocsigen/lwt.git#569 and see if that fixes your tests? That branch is based on Lwt 4.0.0, which isn't merged into opam yet, but it should be backwards-compatible with 3.x.x, except for the previously-announced breaking changes (#453).

The short description of the problem is that worker threads were decrementing the idle thread count after being awakened from the main thread. The main thread should instead decrement the idle worker count itself, so that thread wakeup and counting are done atomically.

With the previous code, when the main thread saw one worker idle, it would awaken it. The worker would then be racing with the rest of the main thread, which was going on to submit another job to the thread pool. If that next job submission happened before the worker thread started running, the main thread would still see a count claiming that one worker thread was idle, even though that was no longer the case.
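
In schematic OCaml (the actual change is in the C code in lwt_unix_stubs.c; the mutex, condition variable, and the enqueue/spawn_worker parameters below are stand-ins, and this needs the threads library):

(* Schematic sketch of the fix, not Lwt's actual code. *)
let pool_mutex = Mutex.create ()
let work_available = Condition.create ()
let thread_waiting_count = ref 0

let submit_job ~enqueue ~spawn_worker job =
  Mutex.lock pool_mutex;
  enqueue job;
  if !thread_waiting_count > 0 then begin
    (* The fix: decrement the idle count here, in the main thread, in
       the same critical section that signals the worker. A later job
       submission can then never observe a stale "one worker idle"
       count while the awakened worker is still starting up. *)
    decr thread_waiting_count;
    Condition.signal work_available
  end
  else
    spawn_worker ();
  Mutex.unlock pool_mutex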

I was able to reproduce this (or a similar) issue on my end with this program:

open Lwt.Infix

let () =
  (* This blocking pipe is just a way to get Lwt_unix to make a worker thread
     block indefinitely. Without wait_read, Lwt_unix hands a worker thread the
     job of reading from a pipe that will never become readable. *)
  let read_from, _ = Unix.pipe () in
  let read_from =
    read_from |> Lwt_unix.of_unix_file_descr ~blocking:true ~set_flags:false in

  (* A writable blocking pipe. *)
  let _, write_to = Unix.pipe () in
  let write_to =
    write_to |> Lwt_unix.of_unix_file_descr ~blocking:true ~set_flags:false in

  (* Perform any blocking operation, so that Lwt spawns one worker thread. *)
  let _ = Lwt_unix.stat "." |> Lwt_main.run in

  (* Check we have one worker thread, and it is waiting for work. *)
  assert (Lwt_unix.thread_count () = 1);
  assert (Lwt_unix.thread_waiting_count () = 1);

  (* We want the highest chance that the write is queued before the read worker
     starts running, so do both allocations first to reduce the amount of code
     that has to run between Lwt_unix.read and Lwt_unix.write. *)
  let read_buffer = Bytes.create 1 in
  let write_buffer = Bytes.create 1 in

  Lwt_unix.read read_from read_buffer 0 1
  |> ignore;

  Lwt_unix.write write_to write_buffer 0 1
  |> Lwt_main.run
  |> ignore

(* ocamlfind opt -linkpkg -package lwt.unix race.ml && ./a.out *)

This sometimes exits after a successful write, but it usually hangs.

It's difficult to make this into a permanent test, because I am going to fix the commented-out wait_read next, which will make this race condition difficult to reproduce. There are no other Lwt_unix calls that we can use to reliably block in a worker, so I'll probably need to write a custom one for that later, using Lwt_unix's bindings APIs.

I didn't fully fix the code for Async_switch, because that is deprecated.

facebook-github-bot pushed a commit to facebook/flow that referenced this issue Apr 3, 2018
Summary:
This is the result of a week long investigation into Flow
occasionally hanging. Here's what you should know:

* A read/write of a blocking fd will block until the fd is readable/writable
* A read/write of a nonblocking fd will return -1 immediately if the fd is not readable/writable
* Lwt super extra hates running blocking system calls. It will create system threads and run the blocking system call on them
* There's a bug in Lwt in scheduling the system threads: ocsigen/lwt#569

From our point of view, there's very little difference in using blocking
or non-blocking fds. We had just been using blocking fds since I
couldn't really tell if one was better than the other.

However, using non-blocking keeps us from needing system threads and
works around the bug, so let's use them!

Reviewed By: samwgoldman

Differential Revision: D7464034

fbshipit-source-id: e0ba602381a8bef7dd374ee1cd5fb0fdef9ad7d9
hhvm-bot pushed a commit to facebook/hhvm that referenced this issue Apr 3, 2018
aantron added a commit that referenced this issue Apr 4, 2018
See #569. The readability check was accidentally commented out in
c285495.
aantron closed this as completed in a5ae15f Apr 4, 2018
hhvm-bot pushed a commit to facebook/hhvm that referenced this issue Apr 12, 2018
Summary:
I reported the Lwt issue here:
ocsigen/lwt#574. The long story short is that
reading from non-blocking fds on Windows doesn't seem to yield to other
threads, even when the fd isn't ready to read. I'm not sure the cause.

While we could just revert back to using blocking fds, we'd need to
either workaround ocsigen/lwt#569 or ask for
a lwt release and update lwt.

Reviewed By: ljw1004

Differential Revision: D7603653

fbshipit-source-id: d6f9b4ed256cfa5b1fb286f9bd41c75748c28ddc
facebook-github-bot pushed a commit to facebook/flow that referenced this issue Apr 12, 2018
avikchaudhuri pushed a commit to facebook/flow that referenced this issue Apr 13, 2018