Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some improvements to readdir, readdir_n #292

Merged
merged 6 commits into from
Nov 23, 2016
Merged

Some improvements to readdir, readdir_n #292

merged 6 commits into from
Nov 23, 2016

Conversation

aantron
Copy link
Collaborator

@aantron aantron commented Nov 21, 2016

This eliminates the unnecessary allocations in readdir_n mentioned in #290. It also fixes several bugs, including buffer overflows.

With a slightly expanded version of the reproduction case from #290, the timings on my machine are

                              before    after
Lwt_unix.files_of_directory    4.2s      3.8s      
Lwt_unix.readdir_n             2.2s      1.8s
Sys.readdir                    1.4s      1.4s
Unix.readdir                   1.0s      1.0s
Lwt_unix.readdir              45.4s     45.4s

So there is some performance improvement, but more would be better. I will look into what else can be done, for a separate potential PR.


It occurs to me that readdir_n and files_of_directory shouldn't be in the interface at all, however.

The underlying C library's readdir_r internally retrieves multiple directory entries by system call, and stores them in a buffer. Subsequent calls to readdir_r just read this buffer until it is drained.

However, Lwt always runs readdir_r in separate worker threads, because Lwt does not know when readdir_r will only read the buffer, and when it will run a potentially-blocking system call. It is expensive to synchronize with those worker threads. As a result, Lwt's readdir re-introduces the same kind of performance problem that C's readdir_r already tries to solve by buffering.

So, I think Lwt's readdir should just read ahead and buffer the results of the underlying readdir_r on the Lwt side. When the Lwt buffer has unread entries remaining, it will not be necessary to run anything in a separate thread.

Lwt_unix.readdir_n is basically there for the user to do manually what should be done by Lwt_unix.readdir internally. And, files_of_directory is a wrapper around readdir_n, because readdir_n is awkward.

If changing the implementation of readdir works, and is fast enough, I will deprecate readdir_n and files_of_directory. I will look into the performance of Lwt_stream itself at a later date, as Lwt_stream has a lot of other problems.


cc @samoht, @yallop

@aantron aantron modified the milestone: 2.7.0 Nov 21, 2016
@yallop
Copy link
Contributor

yallop commented Nov 21, 2016

Thanks for looking at this!

The code now performs fewer allocations, but allocates more space for reads of small directories, since it always allocates an array of 1024 dirents regardless of how many entries are actually read. Do you see any performance changes for directories with few entries?

@aantron
Copy link
Collaborator Author

aantron commented Nov 21, 2016

Good question. There seems to be no significant change in performance. I tried 50000 reads of a 5-file directory:

  • Sys.readdir does this in about 1.0 seconds,
  • files_of_directory from both master and this PR take about 4.1 seconds,
  • readdir_n with count set to 10000 also takes about 4.1 seconds.
  • readdir_n with count set to 10 takes about 4.1 seconds.

So, it would seem that the size of the initial allocation is not really important, and my guess is that readdir_n and files_of_directory have similar performance on small directories because the main slowdown for both comes from dealing with the worker thread. This is only a guess, however.

Note that the total number of directory entries read during this test is about ten times less than in the test in the description of this PR, where there were 500 reads of a 5000-file directory. So, all these functions are 5-20 times slower on small directories.

I should also note that there is an additional sense in which the code now allocates more space, at least on my OS X system – this is due to the buffer overflow fix commit. Before, on my system, 277 bytes were allocated for each dirent. Now, 1048 bytes are allocated.

Before this commit, a double call to Lwt_unix.closedir resulted in a
double call to free in my system's C library's implementation of
closedir.

Calls to Lwt_unix.readdir, etc., after Lwt_unix.closedir resulted in
prolonged hangs and/or segfaults.

For comparison, the corresponding functions in OCaml's stdlib raise
Unix.Unix_error (EBADF, ...).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants