Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tidb_query_vec_executors: Fixed stack-borrowing undefined-behavior #7709

Merged
merged 3 commits into from
Apr 30, 2020

Conversation

brson
Copy link
Contributor

@brson brson commented Apr 30, 2020

What problem does this PR solve?

This fixes undefined behavior in tidb_query_vec_executors. Miri reports a stacked borrowing error as described below.

Note that stacked borrowing is a way to model Rust's use of unsafe pointers, but is not officially part of Rust's memory model. Thus, this is not truly undefined behavior today, but could be eventually.

I don't understand the stacked borrowing rules enough to understand why the
existing code is a violation. It is creating a mutable unsafe pointer; then
taking some immutable borrows to the same data while doing nothing with that
pointer; then later using the pointer.

Simply moving the creation of the unsafe pointer to immediately before it
is used removes the error.

cc @RalfJung not sure if you like being tagged on miri stacked borrow reports. If you have time though maybe you could explain what's going on here.

Found with miri, which reports:

error: Undefined Behavior: trying to reborrow for SharedReadOnly, but parent tag <1121039> does not have an appropriate item in the borrow sta
ck
   --> components/tidb_query_vec_executors/src/slow_hash_aggr_executor.rs:281:32
    |
281 |             let offset_begin = self.group_key_buffer.len();
    |                                ^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for SharedReadOnly, but parent tag <1121039> does not have an ap
propriate item in the borrow stack
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information

Signed-off-by: Brian Anderson [email protected]

I don't understand the stacked borrowing rules enough to understand why the
existing code is a violation. It is creating a mutable unsafe pointer; then
taking some immutable borrows to the same data while doing nothing with that
pointer; then later using the pointer.

Simply moving the creation of the unsafe pointer to immediately before it
is used removes the error.

Found with miri, which reports:

```
error: Undefined Behavior: trying to reborrow for SharedReadOnly, but parent tag <1121039> does not have an appropriate item in the borrow sta
ck
   --> components/tidb_query_vec_executors/src/slow_hash_aggr_executor.rs:281:32
    |
281 |             let offset_begin = self.group_key_buffer.len();
    |                                ^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for SharedReadOnly, but parent tag <1121039> does not have an ap
propriate item in the borrow stack
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
```

Signed-off-by: Brian Anderson <[email protected]>
Copy link
Contributor

@nrc nrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is UB because the old line 280 takes a reference to self.group_key_buffer, but then inside the for loop self.group_key_buffer is mutated. This is equivalent to holding two &mut pointers, which is of course not allowed. This is only possible at all because the into creates a NonNull, which is basically a *mut pointer. It is not unsafe to have multiple *mut pointers as long as they are not dereferenced, but I guess miri is still treating it as UB.

Anyway, the fix LGTM

Copy link
Member

@breezewish breezewish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for both @nrc @brson !

The code is trying to inserting string keys to a HashMap. Strings are generated dynamically. To avoid allocating memory for each string, strings are appended to a memory buffer first. And then, the pointer to the buffer, as well as the appending offset are inserted to the HashMap instead.

In this pattern, a non-mut reference (however implemented by using NonNull) to the buffer is always kept and stored in HashMap, while the buffer is meanwhile mutated to append more strings.

@nrc
Copy link
Contributor

nrc commented Apr 30, 2020

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

Your auto merge job has been accepted, waiting for:

  • 7705

@sre-bot sre-bot added the status/can-merge Status: Can merge to base branch label Apr 30, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

@brson merge failed.

@nrc
Copy link
Contributor

nrc commented Apr 30, 2020

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

@brson merge failed.

@breezewish breezewish merged commit 9f1a97a into tikv:master Apr 30, 2020
@breezewish breezewish added the needs-cherry-pick-release-4.0 Type: Need cherry pick to release 4.0 label Apr 30, 2020
@breezewish
Copy link
Member

/run-cherry-picker

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

cherry pick to release-4.0 in PR #7714

@RalfJung
Copy link

RalfJung commented Apr 30, 2020

I believe it is UB because the old line 280 takes a reference to self.group_key_buffer, but then inside the for loop self.group_key_buffer is mutated. This is equivalent to holding two &mut pointers, which is of course not allowed. This is only possible at all because the into creates a NonNull, which is basically a *mut pointer. It is not unsafe to have multiple *mut pointers as long as they are not dereferenced, but I guess miri is still treating it as UB.

self.group_key_buffer is not a *mut though, is it? I have a hard time understanding what the code does here, but from your summary it sounds like you have a *mut and an &mut and both are being used. That is definitely UB because &mut must not have any aliases being used.

If you truly have two *mut, then indeed they can alias and both be used, and Miri should not complain about that.

(Actually isn't buffer_ptr a *const? It is created from a shared reference, so it must not be used for mutation, unless there are UnsafeCell.)

@nrc
Copy link
Contributor

nrc commented Apr 30, 2020

Thanks @RalfJung !

self.group_key_buffer is Box<Vec<u8>>. self is &mut, so self.group_key_buffer is being mutated via an implicit &mut.

buffer_ptr is a NonNull, which aiui is equivalent to *mut for this sort of thing (although I think you are right that there should be an UnsafeCell somewhere).

@RalfJung
Copy link

RalfJung commented Apr 30, 2020

buffer_ptr is a NonNull, which aiui is equivalent to *mut for this sort of thing

It is, but you are creating it from a shared reference it looks like, which means you must not use it for mutation.

*(&x as *const _ as *mut _) = 5 is UB.

OTOH, Miri catches those bugs, so I am not entirely sure what is actually happening here. Unfortunately the into() makes it impossible to see which types are involved.

(although I think you are right that there should be an UnsafeCell somewhere).

UnsafeCell is only needed if you truly want to mutate data behind a shared reference. If that is not what you want, then just create your NonNull from an &mut.

@RalfJung
Copy link

RalfJung commented Apr 30, 2020

buffer_ptr is a NonNull, which aiui is equivalent to *mut for this sort of thing

I should clarify: Raw pointers, once created, do not care about their type.

But how you create a raw pointer initially (from a safe reference) matters. Casting a reference to *const is very different from casting a reference to *mut. The former creates a raw pointer that is immutable like shared references are, the latter creates a raw pointer that may be used for mutation.

NonNull::from(&x) creates a read-only raw pointer (modulo interior mutability, aka UnsafeCell). This example demonstrates the UB. The UB arises because this takes a shared reference, casts it around a bit, and then writes to it -- whereas the aliasing rules state that you must under no circumstances use shared references for mutation (outside UnsafeCell).

@nrc
Copy link
Contributor

nrc commented Apr 30, 2020

Thanks @RalfJung !

c1ay pushed a commit to c1ay/tikv that referenced this pull request May 9, 2020
brson added a commit to brson/tikv that referenced this pull request May 10, 2020
MyonKeminta pushed a commit that referenced this pull request May 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-cherry-pick-release-4.0 Type: Need cherry pick to release 4.0 status/can-merge Status: Can merge to base branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants