const-eval: detect more pointers as definitely not-null #133700
r? @davidtwco rustbot has assigned @davidtwco. Use |
Some changes occurred to the CTFE machinery cc @rust-lang/wg-const-eval Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri |
r? @lcnr |
r=me after lang approval (idk if it needs a full FCP, it is observable by users after all) |
@RalfJung @JakobDegen Could you elaborate on the motivation? I agree it would be nice if #133523 compiled but I find myself asking "how clever is clever enough" for these checks. Do you feel comfortable with writing this behavior into the language spec? |
Oh dear. @rfcbot cancel @labels -T-compiler |
what am I doing :) |
@nikomatsakis proposal cancelled. |
@rfcbot fcp merge Based on discussion in the lang-team meeting we felt this needed an FCP. We discussed a few points we'd like to see clarified
but it is still complicating the spec, and it's not obvious when this function (or any other) will be "smart enough", so @tmandry was looking for better motivation than an issue (does this represent a real-world pattern?). The other question came from @pnkfelix who was wondering if the logic could be invalidated by people casting unaligned pointers or doing other things that don't respect alignment. |
Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
@rustbot labels +T-compiler |
Yeah this does complicate the spec, but not unduly so I would say. As you say, the change is entirely local, only making better use of information we already have. So it seemed like an easy win to me. We can of course also wait until someone shows up with a compelling real-world use case. (@JakobDegen maybe you already have one?) If/when we ever get more clever niches for references (i.e., based on their alignment), we'll need to add CTFE logic of this sort (probably more complicated) to ensure that CTFE can still determine the active enum variant.
I don't see how. All we do is look at the pointer value. We don't "trust" how the value got computed or anything. Given a pointer value of offset X into an allocation with alignment A, we know that the absolute address will be The one thing we do trust is that CTFE allocations (turning into LLVM globals) truly end up with the alignment that they are declared with. But we already rely on that anyway (and we have some open soundness issues because some platforms don't always get this right for alignments on the order of a page or larger). |
So I do think the issue is representative of a somewhat plausible real-world pattern: storing a one-past-the-end pointer which additionally packs extra data into the low bits, in a const context. Wanting to do all three of those at once is probably somewhat rare, but none of them are unusual on their own. With regard to this complicating the spec, I think it'd be good to write down what this check even really is (if someone has a different mental model, please share). It looks to me like it stems from the observation that the following code: `let mut x = 0_u16; NonNull::new_unchecked((&raw mut x).wrapping_byte_add(4))`
is unconditionally UB, both at runtime and at const time (albeit only with suitable non-deterministic choices). This check appears to me to be an attempt to shift detection of that programming error left, and as a result is somewhat similar in spirit to alignment checks on raw pointer derefs in debug mode. To me, this indicates that our basic operating principle here should be that, to the extent that this check is imprecise in detecting UB, it should have false negatives, not false positives. A couple consequences/reasons for that:
Regardless, I don't actually think the new version of the check is imprecise in either direction in detecting the UB it intends to catch, and I agree with Ralf that I don't see a reason we can't maintain that property going forward. |
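The pattern described above (a one-past-the-end pointer with extra data packed into its low bits) might look roughly like the following. This is only an illustrative sketch: the function name `tagged_end_is_null` and the array contents are made up for this example and do not come from #133523.

```rust
/// Hypothetical example, not taken from the linked issue.
/// Builds a one-past-the-end pointer into a `[u16; 4]`, packs a 1-bit tag
/// into its low bit, and asks whether the result is null.
pub const fn tagged_end_is_null() -> bool {
    let data = [0_u16; 4];
    // One past the end: byte offset 8 into an 8-byte, 2-aligned allocation.
    let end = data.as_ptr().wrapping_add(4);
    // Pack a tag into the low bit: byte offset 9, which is out of bounds and odd.
    let tagged = end.wrapping_byte_add(1);
    // At runtime the address is odd, so it cannot be 0. During const evaluation
    // the answer depends on how much CTFE can deduce about the address.
    tagged.is_null()
}
```

Calling the function at runtime returns `false`. Evaluating it in a const context (for example `const C: bool = tagged_end_is_null();`) is the case this PR is about: the offset is out of bounds, so CTFE can only answer if it takes the allocation's alignment into account.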
What @RalfJung and @JakobDegen said makes sense to me. @rfcbot reviewed |
That's a good example, thanks.
I can't follow this part of your post. I don't see what this proposal has to do with your example, nor with the arguments that follow. (I wouldn't say it is unconditional UB, it is UB if the address happens to be This is about the function that CTFE uses to determine whether a pointer may be null. This function is used in multiple situations:
This is a "may be null" since we don't know the absolute address of the pointer, so we can only do an approximation based on incomplete information. This PR makes the logic determining that approximation a bit smarter. This is basically a standard symbolic evaluation / abstract interpretation situation: we have partial knowledge about the address the pointer will have, and have to determine whether "null" is in the set of possible values. |
Ah, sorry, I hadn't looked at the code change in a great amount of detail so I missed this. Hopefully what I said makes more sense if we imagined this was only about const validation or other removable UB checks - given that this is also about |
In terms of just the UB checks, one could argue that we are overeager reporting UB when a pointer might be null or might be misaligned. But that's pre-existing before this PR. In your example, raising UB seems justified if we go with the "compiler resolves non-determinism" interpretation, agreed. I am not sure if UB is justified in all cases that "may be null" returns |
Force-pushed from 4df3570 to b438e46.
The "panic in const if CTFE doesn't know the answer" behavior was discussed to be the desired behavior in rust-lang#74939, and is currently how the function actually behaves. I intentionally wrote this documentation to allow for the possibility that a panic might not occur even if the pointer is out of bounds, because of rust-lang#133700 and other potential changes in the future.
Oh wait Felix isn't on the team any more oops. I'll tick their box, then. |
Correctly document CTFE behavior of is_null and methods that call is_null. The "panic in const if CTFE doesn't know the answer" behavior was discussed to be the desired behavior in rust-lang#74939, and is currently how the function actually behaves. I intentionally wrote this documentation to allow for the possibility that a panic might not occur even if the pointer is out of bounds, because of rust-lang#133700 and other potential changes in the future. This is beta-nominated since `const fn is_null` stabilization is in beta already but the docs there are wrong, and it seems better to have the docs be correct at the time of stabilization.
Rollup merge of rust-lang#134325 - theemathas:is_null-docs, r=RalfJung Correctly document CTFE behavior of is_null and methods that call is_null. The "panic in const if CTFE doesn't know the answer" behavior was discussed to be the desired behavior in rust-lang#74939, and is currently how the function actually behaves. I intentionally wrote this documentation to allow for the possibility that a panic might not occur even if the pointer is out of bounds, because of rust-lang#133700 and other potential changes in the future. This is beta-nominated since `const fn is_null` stabilization is in beta already but the docs there are wrong, and it seems better to have the docs be correct at the time of stabilization.
The "panic in const if CTFE doesn't know the answer" behavior was discussed to be the desired behavior in rust-lang#74939, and is currently how the function actually behaves. I intentionally wrote this documentation to allow for the possibility that a panic might not occur even if the pointer is out of bounds, because of rust-lang#133700 and other potential changes in the future. (cherry picked from commit 9388917)
@joshtriplett @scottmcm @tmandry have the answers above resolved your concerns, or is the plan that this will be discussed at some future lang team meeting before we proceed with FCP? |
Thanks for the discussion. Reading #133523 (comment) (which I missed before) was helpful for understanding a real-world use case. Making const eval less willing to give up in a case where we know a pointer can't be null makes sense to me. @rfcbot reviewed Do you think the behavior should be documented in the reference somewhere? I guess it stays out of detailed CTFE mechanics like this? |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
AFAIK we don't say anything about |
Looking at how it works, I feel good about doing this. Adding the modulo check is straightforward to explain, and fits into the existing data well. If it took a whole bunch of new complicated tracking I'd be more skeptical, but this check seems fine. |
This fixes #133523 by making the `scalar_may_be_null` check smarter: for instance, an odd offset in any 2-aligned allocation can never be null, even if it is out-of-bounds.
More generally, if an allocation with unknown base address B is aligned to alignment N, and a pointer is at offset X inside that allocation, then B mod N = 0, so we know that `(B + X) mod N = (B mod N + X mod N) mod N = X mod N`. Since `0 mod N` is definitely 0, if we learn that `X mod N` is not 0 we can deduce that `B + X` is not 0.
This is immediately visible on stable, via `ptr.is_null()` (and, more subtly, by not raising a UB error when such a pointer is used somewhere that a non-null pointer is required). Therefore nominating for @rust-lang/lang.
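The alignment argument above can be sketched in a few lines. This is a minimal illustration, not the actual `scalar_may_be_null` code from rustc: the helper name `offset_may_be_null` and its signature are assumptions made for this example, and it models only the modulo step, ignoring whatever other checks the real function performs.

```rust
/// Hypothetical helper illustrating the modulo argument; not rustc's code.
/// `align` is the allocation's alignment N (a power of two) and `offset` is
/// the pointer's byte offset X from the unknown base address B.
/// Returns `true` if `B + X` might be 0 as far as alignment alone can tell.
pub fn offset_may_be_null(offset: u64, align: u64) -> bool {
    assert!(align.is_power_of_two(), "alignments are powers of two");
    // B is a multiple of N, so (B + X) mod N == X mod N.
    // 0 mod N == 0, so a non-zero remainder rules out a null address.
    offset % align == 0
}
```

For example, `offset_may_be_null(1, 2)` is `false` (an odd offset in a 2-aligned allocation), while `offset_may_be_null(4, 2)` is `true`, because alignment alone cannot rule out a null address in that case.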