Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it. How can it be an undefined behavior? I think accessing the data it points to is undefined, but get the pointer address is pure a math calculation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the rules enough to say for sure, but here's what miri says about this code: https://gist.github.com/brson/26d2111ce4c1a555d0e9a817c37ff646
I take this to mean that even creating the pointer is UB.
I can't find language in https://doc.rust-lang.org/beta/reference/behavior-considered-undefined.html to support the idea that even doing pointer math to create out-of-bounds pointers is undefined. The strongest it says is that dereferencing dangling pointers is undefined, which is obvious to me.
cc @oli-obk @RalfJung sorry for pinging you so much, but maybe you can explain why it's not ok to do pointer math that produces a bad pointer here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah doing pointer arithmetic on out of bounds pointers is UB in LLVM, see for example this blog post: https://blog.regehr.org/archives/1518 - "it is UB to create a pointer that lies more than one element outside of an allocated object. It is UB merely to create such a pointer, it does not need to be dereferenced. Also, it is still UB even if the overflowed pointer happens to refer to some other allocated object.". This is about C, but applies equally to Rust
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it. So it's an undefined behavior because the calculation can produce an overflow value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That list only enumerates UB caused by language primitives. When you call a method, the method docs mention the UB conditions for that method. In the case of
add
, it says:So no, it's not just overflow that is a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are asking "why does
add
say this is UB", the short answer is "because this is what LLVM requires forgetelementptr inbounds
, and the long answer is that this helps the compiler analyze which pointers could potentially point where.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should report targetted (to the intrinsic) errors instead of generic errors when they are caused by the intrinsic's shim?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intrinsic is an unstable internal implementation detail though -- how it is called and which intrinsic is used for what can change any time. Seems a bit strange to surface that in error messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right :/ the calling function's docs are important, not the bare intrinsic. Yea not sure what we can do to improve the miri error then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the help @RalfJung @oli-obk