Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReadPointerAsBytes error in ::std::str::from_utf8() #190

Closed
dwrensha opened this issue Jun 8, 2017 · 13 comments · Fixed by #324
Closed

ReadPointerAsBytes error in ::std::str::from_utf8() #190

dwrensha opened this issue Jun 8, 2017 · 13 comments · Fixed by #324
Labels
C-bug Category: This is a bug.

Comments

@dwrensha
Copy link
Contributor

dwrensha commented Jun 8, 2017

// test.rs

fn main() {
    let _ = ::std::str::from_utf8(b"a");
}
$ cargo run --bin miri -- --sysroot ~/.xargo/HOST test.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/miri --sysroot /home/dwrensha/.xargo/HOST test.rs`
WARN:miri::terminator: ignoring C ABI call: pthread_attr_init
WARN:miri::terminator: ignoring C ABI call: pthread_self
WARN:miri::terminator: ignoring C ABI call: pthread_getattr_np
WARN:miri::terminator: ignoring C ABI call: pthread_attr_getstack
WARN:miri::terminator: ignoring C ABI call: pthread_attr_destroy
WARN:miri::terminator: ignoring C ABI call: pthread_mutex_lock
WARN:miri::terminator: ignoring C ABI call: pthread_mutex_unlock
WARN:miri::terminator: ignoring C ABI call: pthread_mutexattr_init
WARN:miri::terminator: ignoring C ABI call: pthread_mutexattr_settype
WARN:miri::terminator: ignoring C ABI call: pthread_mutex_init
WARN:miri::terminator: ignoring C ABI call: pthread_mutexattr_destroy
WARN:miri::terminator: ignoring C ABI call: pthread_condattr_init
WARN:miri::terminator: ignoring C ABI call: pthread_condattr_setclock
WARN:miri::terminator: ignoring C ABI call: pthread_cond_init
WARN:miri::terminator: ignoring C ABI call: pthread_condattr_destroy
WARN:miri::terminator: ignoring C ABI call: pthread_mutex_lock
WARN:miri::terminator: ignoring C ABI call: pthread_mutex_unlock
error: a raw memory access tried to access part of a pointer value as raw bytes
     |
note: inside call to core::str::run_utf8_validation
note: inside call to std::str::from_utf8
    --> test.rs:4:13
     |
4    |     let _ = ::std::str::from_utf8(b"a");
     |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside call to main
note: inside call to <fn() as std::ops::FnOnce<()>>::call_once - shim(fn())
note: inside call to std::panicking::try::do_call::<fn(), ()>
note: inside call to std::panicking::try::<(), fn()>
note: inside call to std::panic::catch_unwind::<fn(), ()>
note: inside call to std::rt::lang_start

error: aborting due to previous error(s)

thread 'main' panicked at 'Box<Any>', /checkout/src/librustc_errors/lib.rs:514
note: Run with `RUST_BACKTRACE=1` for a backtrace.
@dwrensha
Copy link
Contributor Author

dwrensha commented Jun 8, 2017

@dwrensha
Copy link
Contributor Author

dwrensha commented Jun 8, 2017

Indeed, this program hits a similar error:

// test_ptr.rs

fn main() {
    let x: &[u8] = &[1,2,3,4,5];

    let ptr = x.as_ptr();
    let _align = (ptr as usize) & (::std::mem::size_of::<usize>() - 1);
}
$ cargo run --bin miri -- test_ptr.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/miri test_ptr.rs`
error: a raw memory access tried to access part of a pointer value as raw bytes
 --> test_ptr.rs:7:18
  |
7 |     let _align = (ptr as usize) & (::std::mem::size_of::<usize>() - 1);
  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
note: inside call to main
 --> test_ptr.rs:3:1
  |
3 | / fn main() {
4 | |     let x: &[u8] = &[1,2,3,4,5];
5 | |
6 | |     let ptr = x.as_ptr();
7 | |     let _align = (ptr as usize) & (::std::mem::size_of::<usize>() - 1);
8 | | }
  | |_^

error: aborting due to previous error(s)

thread 'main' panicked at 'Box<Any>', /checkout/src/librustc_errors/lib.rs:514

@oli-obk
Copy link
Contributor

oli-obk commented Jun 8, 2017

Yea... that's a known limitation. We can make it work by making such computations work on the pointer offset + the allocation alignment. It would be deterministic, but different from what real world code is doing. Unfortunately you still couldn't read u16 even after fixing the alignment of the pointer, because in miri the allocation makes the alignment...

@eddyb do you think it would be OK to support reading u16 from an alloc with u8 alignment if the pointer is aligned?

We can simulate maximal misalignment by making every allocation bigger than requested and offsetting its inner value by the alignment

@eddyb
Copy link
Member

eddyb commented Jun 8, 2017

We can make it work by making such computations work on the pointer offset + the allocation alignment

I am against doing anything of the sort - it's possible to support modifying bits under the alignment without having different semantics from runtime, but that won't make this code work.

Don't think there's much we can do here, that's not SMT-ish.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 8, 2017

What we could do is add an intrinsic to rustc: fn is_aligned<T>(ptr: * const T, align: usize), which replaces all such hacks from the stdlib. Miri will simply check the allocation's alignment instead of the offset.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 9, 2017

I opened an issue on the rustc repo: rust-lang/rust#42561

@dwrensha
Copy link
Contributor Author

dwrensha commented Jun 9, 2017

We can make it work by making such computations work on the pointer offset + the allocation alignment

I am against doing anything of the sort

I gather that the problem here is that the execution under miri could diverge from the execution under rustc.

@oli-obk, it seems to me that your proposed intrinsic has the same problem. E.g. with it, miri would never execute this branch.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 9, 2017

That branch is just an optimization as I understand the code.

@eddyb
Copy link
Member

eddyb commented Jun 9, 2017

Oh, I see, the way it is done you can pretend every byte is unaligned and it will use the outer loop.
Makes me a bit twitchy because it's more unknowns leaking as approximates but it's better than nothing.

@RalfJung
Copy link
Member

So all of this should "just work" if/when miri gets an implementation of the intptrcast-model, right? Or is the plan to support some of this even when doing CTFE?

@oli-obk
Copy link
Contributor

oli-obk commented Jul 10, 2017

I'd like to support some of it in ctfe. But this issue will be solved by the rfc I opened and then trivially work and improve many homebrewn alignment optimizations.

@RalfJung
Copy link
Member

Ah, I see. Makes sense.

@oli-obk oli-obk added the C-bug Category: This is a bug. label Aug 30, 2017
@oli-obk
Copy link
Contributor

oli-obk commented Aug 30, 2017

The align_offset intrinsic is now in the compiler. This should be trivial to implement with the next nightly (just run your above code, find the failing intrinsic, and implement it as writing 0xFFFFFFFF to the destination, done)

We could make it smarter, but there's no real need to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants