Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix copy_file_range to use ABI specified len types #499

Merged
merged 1 commit into from
Jan 8, 2023
Merged

Fix copy_file_range to use ABI specified len types #499

merged 1 commit into from
Jan 8, 2023

Conversation

SUPERCILEX
Copy link
Contributor

This is an 0.37 breaking change. The lengths are specified as size_ts which means everything will break if we ever get 128 bit machines. I think this is trying to emulate the stdlib copy APIs, but they have different requirements: their APIs always copy the full buffer and just return the number of bytes copied as a convenience. I'm pretty sure you can easily overflow the returned length by doing an io::copy between pipes for example. In rustix land, people need to do the looping themselves. On that note, I believe using a u64 also breaks 32 bit implementations:

        let mut total_copied = 0;
        loop {
            let byte_copied = copy_file_range(&from, None, &to, None,
                // This will stay usize::MAX which will cause an EOVERFLOW
                u64::MAX - total_copied)?;
            if byte_copied == 0 {
                return Ok(());
            }
            total_copied += byte_copied;
        }

Using usize also makes this API consistent with splice.

@sunfishcode
Copy link
Member

Great find!

This is an 0.37 breaking change. The lengths are specified as size_ts which means everything will break if we ever get 128 bit machines.

Can you be more specific about what would break? You might not be able to copy as much as the syscall theoretically could in a single call, but I don't see how it breaks.

I think this is trying to emulate the stdlib copy APIs, but they have different requirements: their APIs always copy the full buffer and just return the number of bytes copied as a convenience. I'm pretty sure you can easily overflow the returned length by doing an io::copy between pipes for example.

I don't think io::copy's return value is just a convenience; if it overflows, I'd assume the io::copy call should fail.

In rustix land, people need to do the looping themselves. On that note, I believe using a u64 also breaks 32 bit implementations:

        let mut total_copied = 0;
        loop {
            let byte_copied = copy_file_range(&from, None, &to, None,
                // This will stay usize::MAX which will cause an EOVERFLOW
                u64::MAX - total_copied)?;
            if byte_copied == 0 {
                return Ok(());
            }
            total_copied += byte_copied;
        }

I haven't tried it (and it might take a few years to overflow a u64 ;-)) but it looks like it should work. Rustix's current code saturates the len to usize::MAX, so copy_file_range will do usize::MAX and the loop will iterate and do the next usize::MAX and so on until it completes.

Using usize also makes this API consistent with splice.

That's a great point. I think we have two options here:

  • Change splice's len to also be a u64, also with a saturating conversion to usize.
  • Change copy_file_range's len to be a usize.

I can see the argument to make len be usize: these are Linux-specific APIs, and Linux itself does use size_t, so using usize is more transparent.

The argument for u64 is that copying data from one file/pipe to another doesn't conceptually have anything to do with the size of the virtual address space of the process doing the copying, and users will often already be using u64 for the values holding the sizes of the things they want to copy, so this makes it easier for users to write code that correctly handles 64-bit I/O sizes on 32-bit platforms.

@SUPERCILEX SUPERCILEX closed this Jan 7, 2023
@SUPERCILEX
Copy link
Contributor Author

shit, bumped the button

@SUPERCILEX SUPERCILEX reopened this Jan 7, 2023
@SUPERCILEX
Copy link
Contributor Author

Can you be more specific about what would break? You might not be able to copy as much as the syscall theoretically could in a single call, but I don't see how it breaks.

Bad wording, I meant that all the APIs will be broken: rustix will have to force everyone over to u128s whether they want to or not.

I don't think io::copy's return value is just a convenience; if it overflows, I'd assume the io::copy call should fail.

I generally agree, but I don't see any overflow checks of the len variables in these implementations: https://github.com/rust-lang/rust/blob/ee0412d1ef81efcfabe7f66cd21476ca85d618b1/library/std/src/io/copy.rs#L89.

will do usize::MAX and the loop will iterate and do the next usize::MAX and so on until it completes.

That's actually the problem. If you write this:

loop {
  if copy_file_range(&from, None, &to, None, u64::MAX)? == 0 {
    return Ok(());
  }
}

It will fail on the second iteration of the loop with EOVERFLOW since the file offset would overflow if it did actually copy offset + MAX bytes. I just don't know if 32-bit linux uses a 32 bit offset or not. Though looking at it from the 128-bit side, a rustix binary that uses u128s in its copy_file_range will definately be broken on present day 64 bits machines.

so this makes it easier for users to write code that correctly handles 64-bit I/O sizes on 32-bit platforms.

That's true, and I think I would agree if it wasn't for the EOVERFLOW issue. Maybe I can muck around with CI to find out whether or not 32-bit machines are broken? If they aren't, then I'm probs ok with going the u64 route (though it does still feel weird to lie about the kernel API).

@sunfishcode
Copy link
Member

Can you be more specific about what would break? You might not be able to copy as much as the syscall theoretically could in a single call, but I don't see how it breaks.

Bad wording, I meant that all the APIs will be broken: rustix will have to force everyone over to u128s whether they want to or not.

Do you mean that rustix would need to make the len argument 128-bit? copy_file_range's off_in and off_out arguments already hard-code u64 as the file offset type, so it could never copy more than 16 exbibytes anyway.

I don't think io::copy's return value is just a convenience; if it overflows, I'd assume the io::copy call should fail.

I generally agree, but I don't see any overflow checks of the len variables in these implementations: https://github.com/rust-lang/rust/blob/ee0412d1ef81efcfabe7f66cd21476ca85d618b1/library/std/src/io/copy.rs#L89.

That may be a bug. It'd be a bug that few people would ever be able to get anywhere near, but a bug nonetheless.

will do usize::MAX and the loop will iterate and do the next usize::MAX and so on until it completes.

That's actually the problem. If you write this:

loop {
  if copy_file_range(&from, None, &to, None, u64::MAX)? == 0 {
    return Ok(());
  }
}

It will fail on the second iteration of the loop with EOVERFLOW since the file offset would overflow if it did actually copy offset + MAX bytes. I just don't know if 32-bit linux uses a 32 bit offset or not. Though looking at it from the 128-bit side, a rustix binary that uses u128s in its copy_file_range will definately be broken on present day 64 bits machines.

Ironically, that program fails for me on 64-bit x86_64, and passes on 32-bit i686. On x86_64, it appears Linux doesn't like that the second iteration requests u64::MAX on a file which by that point has a non-zero offset, because the offset calculation would overflow, even when the input has no more bytes left and there wouldn't be an overflow if it did the I/O first. On i686, rustix saturates the u64::MAX to 32-bit usize::MAX, and Linux's 64-bit file offset calculation doesn't overflow.

Hmm, and that's actually an ugly difference. Ideally, i686 should fail in the same way as x86_64 does, but rustix has to saturate before Linux does the overflow check, and there's no way for rustix to do the overflow check itself (without doing an lseek to read the position, but that'd be unacceptable overhead).

I'll need to think it through a little more, but that may be the thing that convinces me that len should be usize.

@SUPERCILEX
Copy link
Contributor Author

Do you mean that rustix would need to make the len argument 128-bit? copy_file_range's off_in and off_out arguments already hard-code u64 as the file offset type, so it could never copy more than 16 exbibytes anyway.

Huh, you're right. Why did they go with a usize then? Weird.

I'll need to think it through a little more, but that may be the thing that convinces me that len should be usize.

Since your experiments suggest that 32-bit is still using 64-bit offsets (which duh now that I noticed 64 bit offsets in the function signature), I'm a little less worried about using u64s as I thought 32-bit architectures were totally broken right now.

But I do agree that the inconsistency is troubling. I'm especially worried about the 32-bit to 64-bit transition. We're obviously developing on 64-bit machines, but if someone was still using a 32-bit machine for some reason, they would think their code is working fine with u64::MAX and then be confused when nothing works on other people's 64-bit machines.

@sunfishcode
Copy link
Member

Ok, let's do this. I don't think of the kinds of things rustix does as lying per se; the idea is that there's often a conceptual API that the actual kernel API is an approximation of, within the constraints of the C language, backwards compatibility, and the syscall calling convention, and rustix seeks to expose the conceptual API, in Rust terms.

But with this overflow behavior, it feels like rustix crossed the line. It's not just that we just copy fewer bytes than requested sometimes, which is already part of the API, it's that overflow behavior is different.

@sunfishcode sunfishcode merged commit d8afee9 into bytecodealliance:main Jan 8, 2023
@SUPERCILEX
Copy link
Contributor Author

the idea is that there's often a conceptual API that the actual kernel API is an approximation of, within the constraints of the C language, backwards compatibility, and the syscall calling convention, and rustix seeks to expose the conceptual API, in Rust terms.

That's a really elegant way of framing rustix' approach to syscall adaptation, hadn't thought of it that way. Thanks for sharing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants