Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix some uses of pointer intrinsics with invalid pointers #53804

Merged
merged 2 commits into from
Sep 16, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 4 additions & 6 deletions src/liballoc/vec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2410,9 +2410,8 @@ impl<T> Iterator for IntoIter<T> {
// same pointer.
self.ptr = arith_offset(self.ptr as *const i8, 1) as *mut T;

// Use a non-null pointer value
// (self.ptr might be null because of wrapping)
Some(ptr::read(1 as *mut T))
Copy link
Contributor

@gnzlbg gnzlbg Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible for ptr::read to contain a debug_assert_eq!(ptr.align_offset(align_of::<T>()), 0) ?

If that assert fails, it would currently be UB, so we might as well just panic in debug builds to at least catch these inside of std on CI. Maybe someday users will benefit from these without having to use miri to detect these issues.

Copy link
Member Author

@RalfJung RalfJung Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My plan was to open an E-easy issue (once #53783 lands) to decorate all these functions with a debug_assert! testing non-NULL and being aligned.

I would however use ptr as usize % mem::align_of<T>() == 0, because your test will always fail on miri. Remember, align_offset can spuriously "fail" (return usize::max).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea.

Remember, align_offset can spuriously fail.

I didn't know this :/ whyyyy

Copy link
Member Author

@RalfJung RalfJung Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we cannot implement it completely in miri/CTFE -- allocations do not actually have "real" base addresses there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. This is getting off topic, but couldn't miri track the alignment requested for an allocation and use that in align_offset ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, and it could give a better return value in that case.

But align_offset is also used e.g. on a large u8 allocation to find a subslice that is 256bit-aligned for use with SSE. There is no way for miri to provide an alignment stronger than what the allocation promises.

So maybe it could be made good enough for your test, not sure... the % works today though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit worrying that we might prefer slower run-time code because it works on miri to fast run-time code. Does the % lower to the same LLVM-IR than align_offset, at least after running opt ?

An alternative is to use conditional compilation, and use % for miri and align_offset for run-time rust.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how % could be any slower than the intrinsic. If anything, align_offset is doing the harder job (computing which offset to add, not just testing it for 0) so I'd expect it to be slower.

Currently, LLVM optimizes the to the same code to the extend that they get deduplicated.^^

// Read from a properly aligned pointer to make up a value of this ZST.
Some(ptr::read(NonNull::dangling().as_ptr()))
} else {
let old = self.ptr;
self.ptr = self.ptr.offset(1);
Expand Down Expand Up @@ -2451,9 +2450,8 @@ impl<T> DoubleEndedIterator for IntoIter<T> {
// See above for why 'ptr.offset' isn't used
self.end = arith_offset(self.end as *const i8, -1) as *mut T;

// Use a non-null pointer value
// (self.end might be null because of wrapping)
Some(ptr::read(1 as *mut T))
// Read from a properly aligned pointer to make up a value of this ZST.
Some(ptr::read(NonNull::dangling().as_ptr()))
} else {
self.end = self.end.offset(-1);

Expand Down
4 changes: 3 additions & 1 deletion src/libstd/collections/hash/table.rs
Original file line number Diff line number Diff line change
Expand Up @@ -742,7 +742,9 @@ impl<K, V> RawTable<K, V> {
) -> Result<RawTable<K, V>, CollectionAllocErr> {
unsafe {
let ret = RawTable::new_uninitialized_internal(capacity, fallibility)?;
ptr::write_bytes(ret.hashes.ptr(), 0, capacity);
if capacity > 0 {
Copy link
Contributor

@gnzlbg gnzlbg Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding this workaround, I would prefer if here:

const EMPTY: usize = 1;

we would change EMPTY from a const to a const fn empty<T>() -> *mut T; that just returns NonNull::dangling().as_ptr(). I think EMPTY is only used twice in the whole module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// Note: when the pointer is initialized to EMPTY `.ptr()` will return
/// null and the tag functions shouldn't be used.

The table states this, but there are no debug_assert!s preventing misuse AFAICT, might be worth it to add these as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid of touching HashMap, so I'd prefer if someone who knows anything about that code does those changes...

The pointer here was previously NULL, so I don't think it came from EMPTY.

Copy link
Contributor

@gnzlbg gnzlbg Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If capacity == 0, then new_uninitialized_internal can only return (

if capacity == 0 {

) hashes with this value:

hashes: TaggedHashUintPtr::new(EMPTY as *mut HashUint),

which basically calls (

unsafe fn new(ptr: *mut HashUint) -> Self {
):

Unique::new_unchecked(EMPTY as *mut HashUint)

Unique::ptr() will then just return EMPTY as *mut HashUint which is never null, so I have no idea how this pointer there could ever be NULL. It should be EMPTY. That is still wrong because of incorrect alignment, but that's a different type of wrong. Or what am I missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table states this, but there are no debug_assert!s preventing misuse AFAICT, might be worth it to add these as well.

Well, the tag functions cannot assert anything reasonable because the last bit can be either 0 or 1.

I think this also uses values as sentinel values in ways it should not. But that really needs someone to actually dig in the code. I can open an issue about that if you want, right now I only have time to do the most obvious thing.

I have no idea how this pointer there could ever be NULL. It should be EMPTY.

As the comment you quoted says, ptr() will return NULL for EMPTY.

If EMPTY was properly aligned and ptr did not do anything, we would not even need any test here on write_bytes because non-NULL aligned zero-sized writes are perfectly fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can open an issue about that if you want, right now I only have time to do the most obvious thing.

Yeah, this is already an improvement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I opened #53853.

Anything left to do in this PR?

ptr::write_bytes(ret.hashes.ptr(), 0, capacity);
}
Ok(ret)
}
}
Expand Down