-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BugFix: fs_posix.cc GetFreeSpace uses wrong value non-root users #8370
Conversation
env/fs_posix.cc
Outdated
*free_space = | ||
((uint64_t)sbuf.f_bsize * (geteuid() ? sbuf.f_bavail : sbuf.f_bfree)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you break this up into an if/else and add a comment as to what you are doing in the different cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrambacher Done. Please let me know if more edits desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Just curious if there is a reason not to always use "avail" rather than "free". That appears to be the way C++ is leaning (https://en.cppreference.com/w/cpp/filesystem/space)
@mrambacher I speak only for me. Given that many rental environments default to the 5% reserve space on even terabyte storage areas ... I would want the different treatment if I could run as root. That is an extra 50G of storage on a terabyte rental. (Ok, at home all my datasets run on mounts that do not have reserve space for root ... but have hit this error so many times with customers in the last 7 or so years, even before rental servers became popular) |
@mrambacher Do I need to do anything about build-windows-vs2019-cxx20? Looks unrelated to me. |
if (geteuid()) { | ||
// non-zero user is unprivileged, or -1 if error. take more conservative | ||
// size | ||
*free_space = ((uint64_t)sbuf.f_bsize * sbuf.f_bavail); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use f_frsize
instead of f_bsize
? See cockroachdb/pebble#1072 and https://en.cppreference.com/w/cpp/filesystem/space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbowens: I am using f_bsize because that is what the original Facebook code used in this routine. I am not going to advocate strongly in either direction since I am not staring at the Linux source code, much less any other operating system's source code. But my last reference below seems the most strong and supports f_bsize.
I did research and found several things that basically say "seems like both are the same".
This link leans toward f_frsize with some Linux source references:
https://stackoverflow.com/questions/54823541/what-do-f-bsize-and-f-frsize-in-struct-statvfs-stand-for
This link leans toward f_frsize implying fragments that can be less than block size, but also says definitions are vague:
https://unix.stackexchange.com/questions/463369/what-can-f-bsize-be-used-for-is-it-similar-to-st-blksize
Google with search key of "linux f_frsize" list low on the page a link to show a page image from "The Linux Programming Interface: A Linux and UNIX System Programming Handbook". I have to manual type the paragraph from bottom of page 276: "For most Linux file systems, the values of f_bsize and f_frsize are the same. However, some file systems support the notion of block fragments, which can be used to allocate a smaller unit of storage at the end of a file if a full block is not required. This avoids the waste of space that would otherwise occur if a full block was allocated. On such file systems, f_frsize is the size of a fragment, and f_bsize is the size of a whole block. (The notion of fragments in UNIX file system first appeared in early 1980s with the 4.2BSD Fast File System described in [McKusick et al., 1984].)"
This last quote strongly points me in the direction of f_bsize. Your mileage may vary.
Some environments reserve free disk blocks for root users, and a non-root user has less disk space available than the value calculated from `Bfree` indicates. Note that CockroachDB already uses `Bavail` for computing store capacity metrics, although it relies on the github.com/elastic/gosigar module for the calculation. https://github.com/elastic/gosigar/blob/v0.14.1/sigar_unix.go#L23 See facebook/rocksdb#8370.
@jay-zhuang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@jay-zhuang merged this pull request in 5a2b4ed. |
Some environments reserve free disk blocks for root users, and a non-root user has less disk space available than the value calculated from `Bfree` indicates. Note that CockroachDB already uses `Bavail` for computing store capacity metrics, although it relies on the github.com/elastic/gosigar module for the calculation. https://github.com/elastic/gosigar/blob/v0.14.1/sigar_unix.go#L23 See facebook/rocksdb#8370.
Some environments reserve free disk blocks for root users, and a non-root user has less disk space available than the value calculated from `Bfree` indicates. Note that CockroachDB already uses `Bavail` for computing store capacity metrics, although it relies on the github.com/elastic/gosigar module for the calculation. https://github.com/elastic/gosigar/blob/v0.14.1/sigar_unix.go#L23 See facebook/rocksdb#8370.
fs_posix.cc GetFreeSpace() calculates free space based upon a call to statvfs(). However, there are two extremely different values in statvfs's returned structure: f_bfree which is free space for root and f_bavail which is free space for non-root users. The existing code uses f_bfree. Many disks have 5 to 10% of the total disk space reserved for root only. Therefore GetFreeSpace() does not realize that non-root users may not have storage available.
This PR detects whether the effective posix user is root or not, then selects the appropriate available space value.