Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MIR sanity check: validate types on assignment #72796

Merged
merged 12 commits into from
Jun 28, 2020
44 changes: 18 additions & 26 deletions src/librustc_mir/interpret/eval_context.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ use rustc_middle::mir::interpret::{
};
use rustc_middle::ty::layout::{self, TyAndLayout};
use rustc_middle::ty::{
self, fold::BottomUpFolder, query::TyCtxtAt, subst::SubstsRef, Ty, TyCtxt, TypeFoldable,
self, query::TyCtxtAt, subst::SubstsRef, ParamEnv, Ty, TyCtxt, TypeFoldable,
};
use rustc_span::{source_map::DUMMY_SP, Span};
use rustc_target::abi::{Align, HasDataLayout, LayoutOf, Size, TargetDataLayout};
Expand All @@ -24,6 +24,7 @@ use super::{
Immediate, MPlaceTy, Machine, MemPlace, MemPlaceMeta, Memory, OpTy, Operand, Place, PlaceTy,
ScalarMaybeUninit, StackPopJump,
};
use crate::transform::validate::equal_up_to_regions;
use crate::util::storage::AlwaysLiveLocals;

pub struct InterpCx<'mir, 'tcx, M: Machine<'mir, 'tcx>> {
Expand Down Expand Up @@ -220,11 +221,14 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> LayoutOf for InterpCx<'mir, 'tcx,
/// This test should be symmetric, as it is primarily about layout compatibility.
pub(super) fn mir_assign_valid_types<'tcx>(
tcx: TyCtxt<'tcx>,
param_env: ParamEnv<'tcx>,
src: TyAndLayout<'tcx>,
dest: TyAndLayout<'tcx>,
) -> bool {
if src.ty == dest.ty {
// Equal types, all is good.
// Equal types, all is good. Layout will also be equal.
// (Enum variants would be an exception here as they have the type of the enum but different layout.
// However, those are never the type of an assignment.)
return true;
}
if src.layout != dest.layout {
Expand All @@ -234,35 +238,23 @@ pub(super) fn mir_assign_valid_types<'tcx>(
return false;
}

// Type-changing assignments can happen for (at least) two reasons:
// 1. `&mut T` -> `&T` gets optimized from a reborrow to a mere assignment.
// 2. Subtyping is used. While all normal lifetimes are erased, higher-ranked types
// with their late-bound lifetimes are still around and can lead to type differences.
// Normalize both of them away.
let normalize = |ty: Ty<'tcx>| {
ty.fold_with(&mut BottomUpFolder {
tcx,
// Normalize all references to immutable.
ty_op: |ty| match ty.kind {
ty::Ref(_, pointee, _) => tcx.mk_imm_ref(tcx.lifetimes.re_erased, pointee),
_ => ty,
},
// We just erase all late-bound lifetimes, but this is not fully correct (FIXME):
// lifetimes in invariant positions could matter (e.g. through associated types).
// We rely on the fact that layout was confirmed to be equal above.
lt_op: |_| tcx.lifetimes.re_erased,
// Leave consts unchanged.
ct_op: |ct| ct,
})
};
normalize(src.ty) == normalize(dest.ty)
// Type-changing assignments can happen when subtyping is used. While
// all normal lifetimes are erased, higher-ranked types with their
// late-bound lifetimes are still around and can lead to type
// differences. So we compare ignoring lifetimes.
//
// Note that this is not fully correct (FIXME):
// lifetimes in invariant positions could matter (e.g. through associated types).
// We rely on the fact that layout was confirmed to be equal above.
Copy link
Contributor

@lcnr lcnr Jun 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of surprising for me, afaik any kind of dispatch based on lifetimes is unsound and should not be possible.
i.e. one either has to implement something only for 'static or for all lifetimes. Am I missing something here?

edit: even if an impl is bound on an outlives relation ('a, 'b: 'a) it should still be unique

Do you have a somewhat concrete example where this can be relevant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh you'll need to ask @eddyb for that.^^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, when I check equal_up_to_regions first and then just debug-assert that the layouts are equal, all tests pass. There might be a latent ICE there but I am not sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, when I check equal_up_to_regions first and then just debug-assert that the layouts are equal, all tests pass. There might be a latent ICE there but I am not sure.

I personally would love to do this by default, but it's kind of bad to cause ICE just to satisfy my own curiosity 😆

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is important that we understand our compiler.^^ So I changed this now to ICE on layout mismatch. If we get a bug report, at least we have a testcase. ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dispatch is on things like for<'a> fn(&'a T) which are distinct from fn(&'x T) (where 'x gets erased).

Besides associated types, type_name and TypeId can observe it. This goes as deep as symbol names: the v0 mangling has to encode bound lifetimes in types because there's nothing else telling them apart.

@nikomatsakis may have a better explanation as to why things are the way they are.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddyb what would be a concrete example of this? If the current code here can ICE, would be nice to have a testcase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have something that behaves differently for for<'a> fn(&'a ()) and fn(&'a ()). It does give a coherence_leak_check back-compat warning though:

trait A {
    fn a(&self);
}

impl<'a> A for fn(&'a ()) {
    fn a(&self) {
        println!("fn(&'_ ())");
    }
}

impl A for for<'a> fn(&'a ()) {
    fn a(&self) {
        println!("for<'a> fn(&'a ())");
    }
}

fn main() {
    ((|_| {}) as fn(&'_ ())).a();
    ((|_| {}) as fn(&'static ())).a();
}
warning: conflicting implementations of trait `A` for type `fn(&())`:
  --> src/main.rs:11:1
   |
5  | impl<'a> A for fn(&'a ()) {
   | ------------------------- first implementation here
...
11 | impl A for for<'a> fn(&'a ()) {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `fn(&())`
   |
   = note: `#[warn(coherence_leak_check)]` on by default
   = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
   = note: for more information, see issue #56105 <https://github.com/rust-lang/rust/issues/56105>
   = note: this behavior recently changed as a result of a bug fix; see rust-lang/rust#56105 for details

warning: 1 warning emitted

    Finished dev [unoptimized + debuginfo] target(s) in 0.49s
     Running `target/debug/playground`
for<'a> fn(&'a ())
fn(&'_ ())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this is somewhat historical, and indeed I do have some hopes of changing things here, although I think I've realized that we can't make all the changes I had hoped for. This example I think will eventually be accepted, despite the future compatibility warning, since we really need to accept that to accept existing patterns in the wild. But the fact that these impls are accepted therefore implis the importance of binding for e.g. symbol names.

equal_up_to_regions(tcx, param_env, src.ty, dest.ty)
}

/// Use the already known layout if given (but sanity check in debug mode),
/// or compute the layout.
#[cfg_attr(not(debug_assertions), inline(always))]
pub(super) fn from_known_layout<'tcx>(
tcx: TyCtxtAt<'tcx>,
param_env: ParamEnv<'tcx>,
known_layout: Option<TyAndLayout<'tcx>>,
compute: impl FnOnce() -> InterpResult<'tcx, TyAndLayout<'tcx>>,
) -> InterpResult<'tcx, TyAndLayout<'tcx>> {
Expand All @@ -271,7 +263,7 @@ pub(super) fn from_known_layout<'tcx>(
Some(known_layout) => {
if cfg!(debug_assertions) {
let check_layout = compute()?;
if !mir_assign_valid_types(tcx.tcx, check_layout, known_layout) {
if !mir_assign_valid_types(tcx.tcx, param_env, check_layout, known_layout) {
span_bug!(
tcx.span,
"expected type differs from actual type.\nexpected: {:?}\nactual: {:?}",
Expand Down Expand Up @@ -475,7 +467,7 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
// have to support that case (mostly by skipping all caching).
match frame.locals.get(local).and_then(|state| state.layout.get()) {
None => {
let layout = from_known_layout(self.tcx, layout, || {
let layout = from_known_layout(self.tcx, self.param_env, layout, || {
let local_ty = frame.body.local_decls[local].ty;
let local_ty =
self.subst_from_frame_and_normalize_erasing_regions(frame, local_ty);
Expand Down
4 changes: 3 additions & 1 deletion src/librustc_mir/interpret/operand.rs
Original file line number Diff line number Diff line change
Expand Up @@ -472,6 +472,7 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
// Sanity-check the type we ended up with.
debug_assert!(mir_assign_valid_types(
*self.tcx,
self.param_env,
self.layout_of(self.subst_from_current_frame_and_normalize_erasing_regions(
place.ty(&self.frame().body.local_decls, *self.tcx).ty
))?,
Expand Down Expand Up @@ -554,7 +555,8 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
// documentation).
let val_val = M::adjust_global_const(self, val_val)?;
// Other cases need layout.
let layout = from_known_layout(self.tcx, layout, || self.layout_of(val.ty))?;
let layout =
from_known_layout(self.tcx, self.param_env, layout, || self.layout_of(val.ty))?;
let op = match val_val {
ConstValue::ByRef { alloc, offset } => {
let id = self.tcx.create_memory_alloc(alloc);
Expand Down
5 changes: 3 additions & 2 deletions src/librustc_mir/interpret/place.rs
Original file line number Diff line number Diff line change
Expand Up @@ -652,6 +652,7 @@ where
// Sanity-check the type we ended up with.
debug_assert!(mir_assign_valid_types(
*self.tcx,
self.param_env,
self.layout_of(self.subst_from_current_frame_and_normalize_erasing_regions(
place.ty(&self.frame().body.local_decls, *self.tcx).ty
))?,
Expand Down Expand Up @@ -855,7 +856,7 @@ where
) -> InterpResult<'tcx> {
// We do NOT compare the types for equality, because well-typed code can
// actually "transmute" `&mut T` to `&T` in an assignment without a cast.
if !mir_assign_valid_types(*self.tcx, src.layout, dest.layout) {
if !mir_assign_valid_types(*self.tcx, self.param_env, src.layout, dest.layout) {
span_bug!(
self.cur_span(),
"type mismatch when copying!\nsrc: {:?},\ndest: {:?}",
Expand Down Expand Up @@ -912,7 +913,7 @@ where
src: OpTy<'tcx, M::PointerTag>,
dest: PlaceTy<'tcx, M::PointerTag>,
) -> InterpResult<'tcx> {
if mir_assign_valid_types(*self.tcx, src.layout, dest.layout) {
if mir_assign_valid_types(*self.tcx, self.param_env, src.layout, dest.layout) {
// Fast path: Just use normal `copy_op`
return self.copy_op(src, dest);
}
Expand Down
157 changes: 145 additions & 12 deletions src/librustc_mir/transform/validate.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,11 @@ use rustc_middle::{
BasicBlock, Body, Location, Operand, Rvalue, Statement, StatementKind, Terminator,
TerminatorKind,
},
ty::{self, ParamEnv, TyCtxt},
ty::{
self,
relate::{Relate, RelateResult, TypeRelation},
ParamEnv, Ty, TyCtxt,
},
};

#[derive(Copy, Clone, Debug)]
Expand All @@ -28,6 +32,93 @@ impl<'tcx> MirPass<'tcx> for Validator {
}
}

/// Returns whether the two types are equal up to lifetimes.
/// All lifetimes, including higher-ranked ones, get ignored for this comparison.
/// (This is unlike the `erasing_regions` methods, which keep higher-ranked lifetimes for soundness reasons.)
///
/// The point of this function is to approximate "equal up to subtyping". However,
/// the approximation is incorrect as variance is ignored.
pub fn equal_up_to_regions(
tcx: TyCtxt<'tcx>,
param_env: ParamEnv<'tcx>,
src: Ty<'tcx>,
dest: Ty<'tcx>,
) -> bool {
struct LifetimeIgnoreRelation<'tcx> {
tcx: TyCtxt<'tcx>,
param_env: ty::ParamEnv<'tcx>,
}

impl TypeRelation<'tcx> for LifetimeIgnoreRelation<'tcx> {
fn tcx(&self) -> TyCtxt<'tcx> {
self.tcx
}

fn param_env(&self) -> ty::ParamEnv<'tcx> {
self.param_env
}

fn tag(&self) -> &'static str {
"librustc_mir::transform::validate"
}

fn a_is_expected(&self) -> bool {
true
}

fn relate_with_variance<T: Relate<'tcx>>(
&mut self,
_: ty::Variance,
a: &T,
b: &T,
) -> RelateResult<'tcx, T> {
// Ignore variance, require types to be exactly the same.
self.relate(a, b)
}

fn tys(&mut self, a: Ty<'tcx>, b: Ty<'tcx>) -> RelateResult<'tcx, Ty<'tcx>> {
if a == b {
// Short-circuit.
return Ok(a);
}
ty::relate::super_relate_tys(self, a, b)
}

fn regions(
&mut self,
a: ty::Region<'tcx>,
_b: ty::Region<'tcx>,
) -> RelateResult<'tcx, ty::Region<'tcx>> {
// Ignore regions.
Ok(a)
}

fn consts(
&mut self,
a: &'tcx ty::Const<'tcx>,
b: &'tcx ty::Const<'tcx>,
) -> RelateResult<'tcx, &'tcx ty::Const<'tcx>> {
ty::relate::super_relate_consts(self, a, b)
}

fn binders<T>(
&mut self,
a: &ty::Binder<T>,
b: &ty::Binder<T>,
) -> RelateResult<'tcx, ty::Binder<T>>
where
T: Relate<'tcx>,
{
self.relate(a.skip_binder(), b.skip_binder())?;
Ok(a.clone())
}
}

// Instantiate and run relation.
let mut relator: LifetimeIgnoreRelation<'tcx> = LifetimeIgnoreRelation { tcx: tcx, param_env };
relator.relate(&src, &dest).is_ok()
}

struct TypeChecker<'a, 'tcx> {
when: &'a str,
source: MirSource<'tcx>,
Expand Down Expand Up @@ -81,6 +172,31 @@ impl<'a, 'tcx> TypeChecker<'a, 'tcx> {
self.fail(location, format!("encountered jump to invalid basic block {:?}", bb))
}
}

/// Check if src can be assigned into dest.
/// This is not precise, it will accept some incorrect assignments.
fn mir_assign_valid_types(&self, src: Ty<'tcx>, dest: Ty<'tcx>) -> bool {
if src == dest {
// Equal types, all is good.
return true;
}
// Normalize projections and things like that.
// FIXME: We need to reveal_all, as some optimizations change types in ways
// that require unfolding opaque types.
let param_env = self.param_env.with_reveal_all();
let src = self.tcx.normalize_erasing_regions(param_env, src);
let dest = self.tcx.normalize_erasing_regions(param_env, dest);
// It's worth checking equality again.
if src == dest {
return true;
}

// Type-changing assignments can happen when subtyping is used. While
// all normal lifetimes are erased, higher-ranked types with their
// late-bound lifetimes are still around and can lead to type
// differences. So we compare ignoring lifetimes.
equal_up_to_regions(self.tcx, param_env, src, dest)
}
}

impl<'a, 'tcx> Visitor<'tcx> for TypeChecker<'a, 'tcx> {
Expand All @@ -99,20 +215,37 @@ impl<'a, 'tcx> Visitor<'tcx> for TypeChecker<'a, 'tcx> {
}

fn visit_statement(&mut self, statement: &Statement<'tcx>, location: Location) {
// The sides of an assignment must not alias. Currently this just checks whether the places
// are identical.
if let StatementKind::Assign(box (dest, rvalue)) = &statement.kind {
match rvalue {
Rvalue::Use(Operand::Copy(src) | Operand::Move(src)) => {
if dest == src {
self.fail(
location,
"encountered `Assign` statement with overlapping memory",
);
match &statement.kind {
StatementKind::Assign(box (dest, rvalue)) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a follow up we should also check function calls and DropAndReplace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetDiscriminant can also get checked as to whether the type has the mentioned discriminant id and whether the local's type is actually an enum

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's collect these ideas in #73832

// LHS and RHS of the assignment must have the same type.
let left_ty = dest.ty(&self.body.local_decls, self.tcx).ty;
let right_ty = rvalue.ty(&self.body.local_decls, self.tcx);
if !self.mir_assign_valid_types(right_ty, left_ty) {
self.fail(
location,
format!(
"encountered `Assign` statement with incompatible types:\n\
left-hand side has type: {}\n\
right-hand side has type: {}",
left_ty, right_ty,
),
);
}
// The sides of an assignment must not alias. Currently this just checks whether the places
// are identical.
match rvalue {
Rvalue::Use(Operand::Copy(src) | Operand::Move(src)) => {
if dest == src {
self.fail(
location,
"encountered `Assign` statement with overlapping memory",
);
}
}
_ => {}
}
_ => {}
}
_ => {}
}
}

Expand Down