Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit and explicit HRTB are not equivalent #38714

Closed
vickenty opened this issue Dec 30, 2016 · 14 comments
Closed

Implicit and explicit HRTB are not equivalent #38714

vickenty opened this issue Dec 30, 2016 · 14 comments
Assignees
Labels
A-type-system Area: Type system C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Milestone

Comments

@vickenty
Copy link
Contributor

A Fn trait bound with implicit lifetimes is not equivalent to the trait bound with explicit lifetimes.

fn foo<F>(f: F) where F: for<'a> Fn(&'a str) -> &'a str {}
fn bar<F>(f: F) where F: Fn(&str) -> &str {}

fn main() {
    foo(|a: &str| a); // Fails
    bar(|a: &str| a); // Works
}

(playpen: https://is.gd/nKZx1O)

I expected that foo and bar would be equivalent and that both calls would compile (or fail to compile the same way).

instead, calling bar compiles fine, but calling foo results in the following error:

rustc 1.14.0 (e8a012324 2016-12-16)
error[E0312]: lifetime of reference outlives lifetime of borrowed content...
 --> <anon>:6:19
  |
6 |     foo(|a: &str| a); // Fails
  |                   ^
  |
note: ...the reference is valid for the lifetime 'a as defined on the block at 6:18...
 --> <anon>:6:19
  |
6 |     foo(|a: &str| a); // Fails
  |                   ^
note: ...but the borrowed content is only valid for the anonymous lifetime #1 defined on the block at 6:18
 --> <anon>:6:19
  |
6 |     foo(|a: &str| a); // Fails
  |                   ^
help: consider using an explicit lifetime parameter as shown: fn main()
 --> <anon>:5:1
  |
5 | fn main() {
  | ^

error: aborting due to previous error

Somewhat related (but different) problem about a closure with annotated argument was reported in #22557

@JDemler
Copy link
Contributor

JDemler commented Feb 20, 2017

I think I ran into the same problem. Alas, for me, it is not possible to use implicit lifetimes:
playpen

struct UsizeRef<'a> {
    a: &'a usize
}

type RefTo = Box<for<'r> Fn(&'r Vec<usize>) -> UsizeRef<'r>>;

//Compiles
fn ref_to<'a>(vec: &'a Vec<usize>) -> UsizeRef<'a> {
    UsizeRef{ a: &vec[0]}
}

fn main() {
    //Does not compile
    let a: RefTo = Box::new(|vec: &Vec<usize>| {
            UsizeRef{ a: &vec[0] }
        }
    );
}

results in:

rustc 1.15.1 (021bd294c 2017-02-08)
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
  --> <anon>:15:27
   |
15 |             UsizeRef{ a: &vec[0] }
   |                           ^^^^^^
   |
help: consider using an explicit lifetime parameter as shown: fn main()
  --> <anon>:12:1
   |
12 |   fn main() {
   |  _^ starting here...
13 | |     //Does not compile
14 | |     let a: RefTo = Box::new(|vec: &Vec<usize>| {
15 | |             UsizeRef{ a: &vec[0] }
16 | |         }
17 | |     );
18 | | }
   | |_^ ...ending here

error: aborting due to previous error

@vickenty
Copy link
Contributor Author

@JDemler This problem only shows up if the closure parameter is typed explicitly. Your example compiles if the type annotation is removed: https://is.gd/e5ILB5

@JDemler
Copy link
Contributor

JDemler commented Feb 20, 2017

Aha. Thank you. Didn't know that was possible!

@Mark-Simulacrum Mark-Simulacrum added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-type-system Area: Type system labels May 23, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 26, 2017
@F001
Copy link
Contributor

F001 commented Sep 2, 2017

I ran into this issue too. The compiler panicked when I was testing @vickenty 's test case.

rustc -V
rustc 1.22.0-nightly (f861b6ee4 2017-09-01)

For the good case, if I declare the closure as a local variable, it also leads to compile error.

fn foo<F>(f: F) where F: for<'a> Fn(&'a str) -> &'a str {}
fn bar<F>(f: F) where F: Fn(&str) -> &str {}

fn main() {
//    foo(|a: &str| a); // Compiler panic
    bar(|a: &str| a); // Works

    let local = |a: &str| a;
    bar(local);  // compile failed
}

@arielb1 arielb1 added I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ regression-from-stable-to-beta Performance or correctness regression from stable to beta. labels Sep 3, 2017
@arielb1
Copy link
Contributor

arielb1 commented Sep 3, 2017

This causes ICE: no representative region on beta.

The code not working with a local would be expected, because of the way higher-ranked inference works.

@nikomatsakis
Copy link
Contributor

In what sense is this a regression? That is, which code specifically works and where...?

@arielb1
Copy link
Contributor

arielb1 commented Sep 7, 2017

fn foo<F>(f: F) where F: for<'a> Fn(&'a str) -> &'a str {}
fn bar<F>(f: F) where F: Fn(&str) -> &str {}

fn main() {
    foo(|a: &str| a); // Compiler panic
    bar(|a: &str| a); // Works

    let local = |a: &str| a;
    bar(local);  // compile failed
}

Causes an ICE in beta, none in stable.

@nikomatsakis
Copy link
Contributor

triage: P-high

Assigning to myself to investigate.

@rust-highfive rust-highfive added the P-high High priority label Sep 7, 2017
@nikomatsakis nikomatsakis self-assigned this Sep 7, 2017
@Mark-Simulacrum
Copy link
Member

@nikomatsakis This is a reminder ping for you to investigate if that hasn't happened yet!

@alexcrichton alexcrichton added this to the 1.21 milestone Sep 13, 2017
@nikomatsakis
Copy link
Contributor

OK, so I've tracked down the problem. The problem is that astconv produces a kind of "incoherent" signature for the closure by combining the expected return type (which is expressed in terms of some regions) with the explicit annotations that the user gave (which are expressed in terms of some other regions). In the case where for<'a> was used explicitly, those do not match. But in the case where elision was used, it happens to work out.

This has probably always been a bug, but I think that the reason the behavior changed recently was that we removed some of the code in the projection cache that "compensated" for this kind of bad behavior (that code was needed because we were keeping some backwards compat hacks around for a while to give people a chance to update).

Specifically, what happens is this. In the case of for<'a> Fn(&'a str) -> &'a str, the "expected signature" is something like: &'a str as argument and &'a str as return type. The user, on the other hand, wrote &str (an anonymous region). We prefer the thing the user wrote (the anonymous region). But then, since no explicit return type was given, we take the return type from the expected signature: &'a str. The final signature is thus fn(&str) -> &'a str, where both are late-bound. This is actually an illegal signature -- late-bound regions must appear in the parameters of the function, but 'a does not. This makes the projection cache do bad things and ICE.

I'm not sure of how best to fix this. The scenario gets more complex with multiple arguments. Imagine an expected signature of for<'a> fn(&'a u32, &'a u32) -> &'a u32, but the user wrote foo(|x: &str, y| ...). Here are some strategies I can imagine:

  • The most conservative thing would be to ignore the expected return type if any arguments had an explicit type provided. Presuming we still use the expected type for other arguments, that would give us fn(&u32, &'a u32) -> _ (we'd infer the return type from the body).
  • Ignore the expected type if it contains late-bound regions that don't appear in the inputs. That still kind of permits "weirdness", though... e.g., We would wind up with a signature of fn(&u32, &'a u32) -> &'a u32.
  • Another way to go would be to track regions that are invalidated by an explicit type -- that is, that appear in the expected type of an input but are overridden -- in which case we would wind up with a signature like fn(&u32, &'a u32) -> _ again.
  • Maybe the best thing would be to make ty_of_arg steal the region from the expected type, so that we wind up with fn(&'a u32, 'a u32) -> &'a u32. That would require some careful coding. It would also alter the current behavior of something like foo(|x: &str, y: &str| ...), but that would probably only make more code compile in practice.

Right now, the signatures that can be manually specified in closures are kind of impoverished anyway: one cannot for example write foo(for<'a> |&'a u32, &'a u32| -> &'a u32 { ... }), though you probably should be able to.

@arielb1
Copy link
Contributor

arielb1 commented Sep 27, 2017

@nikomatsakis

I'm tempted to just do some sort of LUB of the expected type and the user-specified type. Of course, we can't just replace the regions with normal "outside-universe" _ and merge them because that'll cause an error, but there's maybe something we can do.

The sort of "frankenmerging" bound regions between the expected return type with the user is certainly broken and I'll want none of it.

@nikomatsakis
Copy link
Contributor

@arielb1

I don't think "LUB" precisely makes sense, but it does seem pretty clear that the intent when you write foo(|x: &str| ...) -- and an expected type exists -- is always going to be "use the regions from the expected type". In particular, I think that anonymous regions in closure signatures seem like they want to come from the expected type.

Would it be too hacky to do something like this:

  • if there is an expected signature S
  • and all the regions in the user-provided signature are late-bound and anonymous
  • and no user-provided types include _ (e.g., nothing like |x: Vec<_>|)
  • and the expected argument types + user-provided types are unifiable, once the (late-bound, anonymous) regions in the user-provided types are replaced with fresh inference variables

then we would assign the types from the expected signature and ignore the user-provided types. In the example given here, the closure |x: &str| would thus be considered to be for<'a> |x: &'a str| -> &'a str (if that were legal syntax), since that comes from the expected type (and &'_ str is unifiable with &'a str). In terms of my "franken" example where the user provides |x: &str, y|, we would wind up with the signature from the expected type again, because the one argument that the user gave is unifiable.

These sorts of cases would use the signatures that the user gave:

// non-anonymous region appears in `x`, so keep both `x` and `y` just as they wrote it
foo(|x: &'a str, y: &str| ..)

// `_` in the user's signature is just weird, ignore the expected type, which I think we would also do today.
bar(|x: Vec<_>, y| ..)

// signatures not unifiable, just use what they wrote
fn baz<F>(f: F) where F: FnOnce(&str)
baz(|x: &u8| ...)

Actually that seems kind of ok.

@nikomatsakis
Copy link
Contributor

We backported a fix for this to beta. I'm going to adjust the tags to mark as regression only on nightly.

@nikomatsakis nikomatsakis added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. and removed regression-from-stable-to-beta Performance or correctness regression from stable to beta. labels Oct 9, 2017
bors added a commit that referenced this issue Nov 5, 2017
new rules for merging expected and supplied types in closure signatures

As uncovered in #38714, we currently have some pretty bogus code for combining the "expected signature" of a closure with the "supplied signature". To set the scene, consider a case like this:

```rust
fn foo<F>(f: F)
where
  F: for<'a> FnOnce(&'a u32) -> &'a u32
  // ^ *expected* signature comes from this where-clause
{
    ...
}

fn main() {
    foo(|x: &u32| -> &u32 { .. }
     // ^^^^^^^^^^^^^^^^^ supplied signature
     // comes from here
}
```

In this case, the supplied signature (a) includes all the parts and (b) is the same as the expected signature, modulo the names used for the regions. But often people supply only *some* parts of the signature. For example, one might write `foo(|x| ..)`, leaving *everything* to be inferred, or perhaps `foo(|x: &u32| ...)`, which leaves the return type to be inferred.

In the current code, we use the expected type to supply the types that are not given, but otherwise use the type the user gave, except for one case: if the user writes `fn foo(|x: _| ..)` (i.e., an underscore at the outermost level), then we will take the expected type (rather than instantiating a fresh type variable). This can result in nonsensical situations, particularly with bound regions that link the types of parameters to one another or to the return type. Consider `foo(|x: &u32| ...)` -- if we *literally* splice the expected return type of `&'a u32` together with what the user gave, we wind up with a signature like `for<'a> fn(&u32) -> &'a u32`. This is not even permitted as a type, because bound regions like `'a` must appear also in the arguments somewhere, which is why #38714 leads to an ICE.

This PR institutes some new rules. These are not meant to be the *final* set of rules, but they are a kind of "lower bar" for what kind of code we accept (i.e., we can extend these rules in the future to be smarter in some cases, but -- as we will see -- these rules do accept some things that we then would not be able to back off from).

These rules are derived from a few premises:

- First and foremost, anonymous regions in closure annotation are mostly requests for the code to "figure out the right lifetime" and shouldn't be read too closely. So for example when people write a closure signature like `|x: &u32|`, they are really intended for us to "figure out" the right region for `x`.
    - In contrast, the current code treats this supplied type as being more definitive. In particular, writing `|x: &u32|` would always result in the region of `x` being bound in the closure type. In other words, the signature would be something like `for<'a> fn(&'a u32)` -- this is derived from the fact that `fn(&u32)` expands to a type where the region is bound in the fn type.
    - This PR takes a different approach. The "binding level" for reference types appearing in closure signatures can be informed in some cases by the expected signature. So, for example, if the expected signature is something like `(&'f u32)`, where the region of the first argument appears free, then for `|x: &u32|`, the new code would infer `x` to also have the free region `'f`.
        - This inference has some limits. We don't do this for bindings that appear within the selected types themselves. So e.g. `|x: fn(&u32)|`, when combined with an expected type of `fn(fn(&'f u32))`, would still result in a closure that expects `for<'a> fn(&'a u32)`. Such an annotation will ultimately result in an error, as it happens, since `foo` is supplying a `fn(&'f u32)` to the closure, but the closure signature demands a `for<'a> fn(&'a u32)`. But still we choose to trust it and have the user change it.
        - I wanted to preserve the rough intuition that one can copy-and-paste a type out of the fn signature and into the fn body without dramatically changing its meaning. Interestingly, if one has `|x: &u32|`, then regardless of whether the region of `x` is bound or free in the closure signature, it is also free in the region body, and that is also true when one writes `let x: &u32`, so that intuition holds here. But the same would not be true for `fn(&u32)`, hence the different behavior.
- Second, we must take either **all** the references to bound regions from the expected type or **none**. The current code, as we saw, will happily take a bound region in the return type but drop the other place where it is used, in the parameters. Since bound regions are all about linking multiple things together, I think it's important not to do that. (That said, we could conceivably be a bit less strict here, since the subtyping rules will get our back, but we definitely don't want any bound regions that appear only in the return type.)
- Finally, we cannot take the bound region names from the supplied types and "intermix" them with the names from the expected types.
    - We *could* potentially do some alpha renaming, but I didn't do that.
- Ultimately, if the types the user supplied do not match expectations in some way that we cannot recover from, we fallback to deriving the closure signature solely from those expected types.
    - For example, if the expected type is `u32` but the user wrote `i32`.
    - Or, more subtle, if the user wrote e.g. `&'x u32` for some named lifetime `'x`, but the expected type includes a bound lifetime (`for<'a> (&'a u32)`). In that case, preferring the type that the user explicitly wrote would hide an appearance of a bound name from the expected type, and we try to never do that.

The detailed rules that I came up with are found in the code, but for ease of reading I've also [excerpted them into a gist](https://gist.github.com/nikomatsakis/e69252a2b57e6d97d044c2f254c177f1). I am not convinced they are correct and would welcome feedback for alternative approaches.

(As an aside, the way I think I would ultimately *prefer* to think about this is that the conversion from HIR types to internal types could be parameterized by an "expected type" that it uses to guide itself. However, since that would be a pain, I opted *in the code* to first instantiate the supplied types as `Ty<'tcx>` and then "merge" those types with the `Ty<'tcx>` from the expected signature.)

I think we should probably FCP this before landing.

cc @rust-lang/lang
r? @arielb1
@nikomatsakis
Copy link
Contributor

Believed fixed for good by #45072.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-type-system Area: Type system C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants