Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve import resolution #31726

Merged
merged 6 commits into from
Mar 5, 2016
Merged

Conversation

jseyfried
Copy link
Contributor

This PR improves the import resolution algorithm.

First, it records that an import succeeded or failed for one namespace (by calling decrement_outstanding_references_for and try_define_child if successful) even if it is still indeterminate in the other namespace, fixing #31444.

Second, it starts importing bindings from globs as soon as the glob path is determined.
It maintains links from imported modules to their importers so that when a resolution becomes successful in an imported module, a corresponding binding will be added to the importer module.
It also maintains links from importer modules to imported modules so that we can determine if an undefined name is indeterminate or failing by recursively checking this in the imported modules.
This allows, for example:

mod foo {
    pub mod baz {}
    pub use bar::baz::*;
}

mod bar {
    pub use foo::*;
}

It also allows cycles of pub glob imports, although by to the current shadowing rules, the only way for such a cycle to compile is if each participating module defines no names. Incidentally, this PR lays the groundwork for more permissive feature-gated shadowing rules.

Finally, this PR encapsulates almost all implementation details of import resolution in resolve_imports (some of which used to be in lib.rs) and refactors reexport recording, shadowed trait collecting, some duplicate checking, and the private_in_public lint out of the core import resolution algorithm and into a post-processing pass in resolve_imports.

r? @nrc

@jseyfried
Copy link
Contributor Author

cc @nikomatsakis @petrochenkov

@jseyfried jseyfried force-pushed the improve_import_resolution branch 4 times, most recently from 76d980f to 1e7de92 Compare February 17, 2016 11:49
@retep998
Copy link
Member

It also allows glob import cycles, although by to the current shadowing rules, the only way for a cycle of glob imports to compile is if each participating module defines no names.

I am migrating towards this in winapi and it currently compiles:

pub mod foo {
    use super::*;
    pub struct Bar { ... }
}
pub use foo::*;
// Plus a hundred more modules like this

Will this continue to work with these changes?

@jseyfried
Copy link
Contributor Author

@retep998 good point -- yes, this will continue to work.

I should have said that this PR allows cycles of pub glob imports since as your example shows, we already allow cycles of glob imports in general (and the modules don't need to be empty).

@bors
Copy link
Contributor

bors commented Feb 19, 2016

☔ The latest upstream changes (presumably #31716) made this pull request unmergeable. Please resolve the merge conflicts.

@jseyfried
Copy link
Contributor Author

@nrc thanks for the feedback, I'll address these comments in the rebase to keep the history clean.

@jseyfried jseyfried force-pushed the improve_import_resolution branch 5 times, most recently from 398f868 to 5fcbf39 Compare February 20, 2016 06:48
@bors
Copy link
Contributor

bors commented Feb 20, 2016

☔ The latest upstream changes (presumably #31674) made this pull request unmergeable. Please resolve the merge conflicts.

@jseyfried jseyfried force-pushed the improve_import_resolution branch from 5fcbf39 to d5e8a97 Compare February 20, 2016 22:18
@jseyfried
Copy link
Contributor Author

Rebased and addressed the comments

@nikomatsakis
Copy link
Contributor

r? @nikomatsakis

@rust-highfive rust-highfive assigned nikomatsakis and unassigned nrc Feb 23, 2016
@nikomatsakis
Copy link
Contributor

(@nrc is on vacation)

@bors
Copy link
Contributor

bors commented Feb 25, 2016

☔ The latest upstream changes (presumably #31882) made this pull request unmergeable. Please resolve the merge conflicts.

@jseyfried jseyfried force-pushed the improve_import_resolution branch 2 times, most recently from 1e463bb to a7cc1cc Compare February 26, 2016 18:47
@bors
Copy link
Contributor

bors commented Mar 3, 2016

☔ The latest upstream changes (presumably #31824) made this pull request unmergeable. Please resolve the merge conflicts.

@jseyfried jseyfried force-pushed the improve_import_resolution branch 2 times, most recently from 5ab30e9 to 45e2709 Compare March 3, 2016 12:30
@jseyfried
Copy link
Contributor Author

@petrochenkov

I'd prefer prelude and primitive types to not exist at build_reduced_graph/resolve_imports stage at all

Primitive types already don't exist at this stage; I agree about the prelude.

We already do the "two stage lookup" you describe (well, actually an "n stage lookup") in resolve_identifier_in_local_ribs, where the "normal lookup" is in an anonymous module (i.e. a block with items) and the "secondary lookups" are in the anonymous module's parent scopes. I was planning on treating the prelude as the parent scope of all non-anonymous, non-no_implicit_prelude modules.

I'm not too familiar with how primitive types are resolved, but the first two lines of main look troubling to me. If we put primitive types in the prelude, both those uses of u8 would resolve to the module.

@jseyfried
Copy link
Contributor Author

Oh, I see -- you want to do secondary lookup even if the first segment of the path succeeds but the path still fails later on. If we treated secondary lookups in all parent scopes this way, the following would compile (for example):

fn f() {
    mod foo { pub fn f() {} }
    fn g() {
        mod foo { pub fn g() {} }
        foo::f();
    }
}

I'm not sure this is a good idea.

@arielb1
Copy link
Contributor

arielb1 commented Mar 8, 2016

@petrochenkov

\Why would that be different from

mod m {
    use std::vec::Vec;
    mod n {
        type A = super::Vec<u8>;
    }
}

I feel that these should be basically equivalent.

@jseyfried
Copy link
Contributor Author

I want it to be different because I think we should consider the prelude not as something that gets imported into all modules' scopes but instead as its own scope that is the parent of all modules' scopes. @nrc suggests this interpretation in his sets of scopes blog post.

@arielb1
Copy link
Contributor

arielb1 commented Mar 8, 2016

I feel that normal private imports should behave just like prelude imports. BTW, I think that primitive types should behave like normal imports too - I think what we do currently is a crazy backward compatibility thing.

@petrochenkov
Copy link
Contributor

@arielb1
In your example name Vec is intentionally defined in module m, in my example it's more like leaking details of the prelude implementation. Prelude names are supposed to be found during name lookup first of all, they are not supposed to be defined in every module of a crate.

@arielb1
Copy link
Contributor

arielb1 commented Mar 8, 2016

But in practice use gets used like a prelude import - you add it to the file whenever you need it.

@petrochenkov
Copy link
Contributor

@jseyfried

Oh, I see -- you want to do secondary lookup even if the first segment of the path succeeds but the path still fails later on. ... I'm not sure this is a good idea.

It's not a good idea in general, but this mechanism is necessary for primitive types due to backward compatibility, because modules std::u8/std::u16/etc were stabilized in 1.0 (I still see this stabilization as a major f**k up). But we can turn this mistake in our favor by developing a proper "perfect shadowing" mechanism for primitive types and it may be reasonable to apply the same mechanism to prelude names to "harden" them. With such mechanism we can add primitive names (like much requested u128 or f16) and names to the prelude without fear of breaking something.

@arielb1
Copy link
Contributor

arielb1 commented Mar 8, 2016

That "perfect shadowing" is an ugly hack. I prefer to keep it to as small a set of things as possible, and remove it in Rust 2.0 when we release that.

@jseyfried
Copy link
Contributor Author

I agree with @arielb1 on "perfect shadowing" -- if we want to add u128 or f16, I think we should do it in the prelude, even if it ends up not being perfectly analogous to the existing primitive types.

@petrochenkov
Copy link
Contributor

That "perfect shadowing" is an ugly hack. I prefer to keep it to as small a set of things as possible, and remove it in Rust 2.0

Well, my position is "do it properly or don't do it at all", so the "minimize it and remove it in Rust 2.0" variant also works for me.
I wonder how much the scope of the current hack can be reduced while keeping the breakage minimal.

use std::u8;
fn f(arg: u8) {}

should certainly be supported, but I'm not so sure about

let a = u8::max_value(); // <type u8>::max_value
let b = u8::MAX; // <mod u8>::MAX

I'll try to do some experiments.

bors added a commit that referenced this pull request Mar 10, 2016
…sakis

Fix a regression in import resolution

This fixes #32089 (caused by #31726) by deducing that name resolution has failed (as opposed to being determinate) in more cases.

r? @nikomatsakis
bors added a commit that referenced this pull request Mar 13, 2016
…chton

Fix import resolution bug

This fixes #32222, which was introduced in #31726.
@jseyfried jseyfried deleted the improve_import_resolution branch March 25, 2016 22:58
@nikomatsakis
Copy link
Contributor

@jseyfried

Sorry for taking a while to get back to you on this comment, but here is my response. :)

Expanding a macro invocation (taking hygiene into account) a pure refactoring.

There is (I think) a strong and a weak form of this condition. I propose we adopt the weaker form:

  • If the program compiles, then hand-expanding a macro (modulo hygiene) should not stop it from compiling.

In particular, this form allows programs that fail to compile to start compiling after something is hand-expanded.

The algorithm that the prototype currently implements does not satisfy the second condition.

I agree. In fact, I was just talking to @nrc about why this was a problem when he pointed me at your comment, and I realized I had forgotten to come back and digest it. I am not happy with that particular rule (the placeholder rule) in my prototype for a variety of reasons -- and the fact that it is sort of insufficient around macro expansion is part of it.

Don't allow the resolution of a glob-imported name to succeed while there are unexpanded macros in its module since their expansions could end up shadowing the glob-imported name.

I think this is a non-starter because if forbids programs like:

use prelude::*; // imports macro m
m! { .. }

Don't allow a macro-expanded item to shadow a glob-imported name so that the resolution of a glob-imported name can succeed even while there are unexpanded macros.

I think this is a good idea. I'm not sure why I found it unappealing at some point. It seems like a better compromise to me now. The way I envision it, we would walk back over after we've finished expansion and look at every glob. For each glob use path::*, we consider the set of eligible names in path (ignoring whether or not they are shadowed locally). We then check whether there exists some eligible name X where there is also a macro-defined identifier named X that shadows it and report an error. (It is possible to have a case where there are two shadows of the glob, one explicit, and hence legal, and one from a macro, and hence illegal, but that's ok. That's just two separate errors -- or else maybe we suppress one of the two.)

Does that sound good? @nrc had a big chat about it today and seemed to feel it was a good approach. I think he is going to try and write something up, perhaps on the RFC itself.

Don't allow a macro-expanded item to shadow a glob-imported name so that the resolution of a glob-imported name can succeed even while there are unexpanded macros. I

@jseyfried
Copy link
Contributor Author

I propose we adopt the weaker form

Agreed, I think this is the form I had in mind (for both conditions).

Does that sound good?

Yeah. I'd probably do the glob-imported/macro-expanded conflict checking eagerly, but that doesn't really matter.

I think this is a non-starter because it forbids programs like [given example]

Even after disallowing glob-imported/macro-expanded conflicts, we would still have to forbid similar programs like

use prelude::*; // imports macro m
fn f() {
    m! { ... }
}

since m! { ... } could define another macro m, which would shadow the prelude import without being a conflict.
This seems unavoidable though; while it's tempting to allow this and then check for time travel errors, it would similarly break the (weak) second coherence condition (I can provide examples if you'd like).

bors added a commit that referenced this pull request Apr 13, 2016
…nikomatsakis

resolve: Improve duplicate glob detection

This fixes a bug introduced in #31726 in which we erroneously allow multiple imports of the same item under some circumstances.

More specifically, we erroneously allow a module that is in a cycle of glob re-exports to have other re-exports (besides the glob from the cycle).
For example,
```rust
pub fn f() {}
mod foo {
    pub use f; // (1) This defines `foo::f`.
    pub use bar::*; // (3) This also defines `foo::f`, which should be a duplicate error but is currently allowed.
}
mod bar {
    pub use foo::*; // (2) This defines `bar::f`.
}
```

A module in a glob re-export cycle can still have `pub` items after this PR. For example,
```rust
mod foo {
    pub fn f() {}; // (1) This defines `foo::f`.
    pub use bar::*; // (3) This is not a duplicate error since items shadow glob-imported re-exports (cf #31337).
}
mod bar {
    pub use foo::*; // (2) This defines `bar::f`.
}
```
r? @nikomatsakis
@nikomatsakis
Copy link
Contributor

@jseyfried

This seems unavoidable though; while it's tempting to allow this and then check for time travel errors, it would similarly break the (weak) second coherence condition (I can provide examples if you'd like).

Hmm. Please do provide some examples. This affects my opinion. I think it absolutely must work that you can import from the prelude and use it, and it should only be a violation if you actually do shadow something that came from a glob.

In other words, this program snippet:

use prelude::*; // imports macro m
fn f() {
    m! { ... }
}

would only be an error if the expansion of m actually defined a macro m (or some other name that shadows something in the prelude). Making it an error simply because m might define such a name doesn't seem necessary.

@jseyfried
Copy link
Contributor Author

Here is one example -- this would compile:

macro_rules! foo {
    macro_rules! m {}
}
mod bar {
    ::foo! {} // This gets expanded in round 1
}
use bar::*;
fn f() {
    m! {} // After round 1, this will resolve to `f::m`, which shadows the glob-imported `bar::m`
    ::foo! {} // This gets expanded in round 1
}

... but manually expanding the first invocation of foo! would break it:

macro_rules! foo {
    macro_rules! m {}
}
mod bar {
    macro_rules! m {}
}
use bar::*;
fn f() {
    m! {} // This resolves to the glob imported `bar::m` and gets expanded in round 1
    ::foo! {} // This gets expanded in round 1, after which the above is a time travel error
}

@nikomatsakis
Copy link
Contributor

@jseyfried

I guess the question is whether having a macro-defined item that shadows a glob from a parent scope is still an error. I would think that, to go with the spirit of the rule, the answer is probably yes. If that were the case, then your example would never have compiled to begin with, because foo! would expand to a macro m that shadows the macro available from the glob.

@jseyfried
Copy link
Contributor Author

@nikomatsakis

I would think that, to go with the spirit of the rule, the answer is probably yes.

Thinking about this some more, I agree. I amended rule (2) from my comment to reflect this.

I think we should require that any implementation be backwards compatible with (amended) rule (2) (in addition to being coherent) as a way to formalize the requirement that your two examples should work.

While implementing rule (2) itself would probably be OK, the macro-expanded item / glob-import conflicts are annoying and the macro-expanded item / ancestor scope conflicts are troubling since they violate the spirit of lexical scopes.

I came up with two independent improvements on naively implementing rule (2) that would allow the above conflicts except in uncommon, usually fixable cases while still satisfying the requirements. After implementing naive rule (2), we could decide to implement either one or both improvements.

The first improvement is to first resolve and expand as much as we can using rule (1). After hitting fixpoint, we would switch to rule (2) and only disallow additional macro-expanded conflicts with globs or parent scopes.

This would allow such a conflict to be fixed by using a fully qualified path in the macro invocation (if such a path exists, i.e. if the macro is not defined in an anonymous block).
For example, this would be a conflict error:

use upstream_crate::*; // Suppose `foo` was added upstream
macro_rules! m {
    ($i:ident) => { mod $i {} }
}

fn main() {
    m!(foo); // This would expand to `mod foo {}` in the rule (2) phase,
             // which would conflict with the glob-imported `foo` in the parent scope.
}

but this would compile:

use upstream_crate::*;
macro_rules! m {
    ($i:ident) => { mod $i {} }
}

fn main() {
    ::m!(foo); // This would expand during the rule (1) phase, so the conflict wouldn't matter.
}

The second improvement is to allow a macro-expanded item in a module to conflict with a glob-import or an ancestor scope (only) if we never try to resolve the item's name in the module during macro expansion (if we're using the first improvement, only during the rule (2) phase). I'm assuming a recursive resolution algorithm, so that the following would be a conflict error:

macro_rules! m {
    ($i:ident) => {
        mod $i {
            macro_rules! n {}
        }
    }
}
mod foo {}
fn main() {
    {
        foo::n! {} //< resolving `foo` in this scope would recursively resolve `foo` in ...
    }
    m!(foo); //< this parent scope, so that this expanded `foo` would conflict with `::foo`
}

Note the similarity to your "no time-travel" requirement, in which we don't allow the resolution of a name in a module to change (for which it is necessary but not sufficient that a macro-expanded item shadow a glob-import or ancestor scope, modulo standard duplicate errors) unless the resolution is unused during macro expansion. I believe the second improvement is the least strengthening of the time-travel requirement needed to achieve coherence.

@nrc
Copy link
Member

nrc commented Apr 15, 2016

While implementing rule (2) itself would probably be OK, the macro-expanded item / glob-import conflicts are annoying and the macro-expanded item / ancestor scope conflicts are troubling since they violate the spirit of lexical scopes.

This doesn't seem too annoying to me - while I see the conflict with ancestor scope, the rule (to me) exists at a level above the lexical scoping rules and so this makes sense.

I would prefer to have a simple rule that gives more errors, rather than a complicated rule that gives fewer errors. I think the errors will be easier to understand in that case, and I believe they can always be worked around (by deglobbing the glob import).

Your second suggestion seems reasonable - I think the spirit of the proposed name resolution changes is that we allow clashes if they do no harm, and this seems like another case of that. I'm not sure if there should be an error if we try to resolve a conflicting name in the later name resolution phase - this would never have time travel error issues, but it seems odd to have slightly different rules here for macros (and imports) and non-macros.

I believe the second improvement is the least strengthening of the time-travel requirement needed to achieve coherence.

What do you mean by "coherence" here?

@jseyfried
Copy link
Contributor Author

jseyfried commented Apr 15, 2016

My primary motivation for wanting to allow the conflicts is so that user-defined "item constructor" macros, for example:

macro_rules! my_newtype {
    ($name:ident) => {
        struct $name(i32);
        impl $name { ... }
    }
}

are as "first class" as possible (compared to mod, struct, etc., which we could view as the built-in item constructors by analogy).

That being said, I can see that it might not be worth the complexity. I think we should implement and land naive rule (2) first, after which we can assess how much complexity the extensions would add and if the conflicts are an issue in practice.

it seems odd to have slightly different rules here for macros (and imports) and non-macros

Agreed -- we definitely wouldn't want to have a macro-expanded item become a conflict error only after the user imports it.

I originally thought we would exempt imports by resolving them conservatively (i.e. not allowing a glob-imported name to resolve in a module with unexpanded macros). While I suppose this technically violates backwards compatibility with naive rule (2), the only breaking examples I could come up with were unnaturally circular and (imo) very unlikely be an issue in practice.
This way, the conflict rules would only be enforced for the macro-expanded items which are directly used in the path of a macro invocation, which I imagine would be a small fraction of all macro-expanded items in practice.

Another option would be to exempt the value namespace, which is never relevant to macro expansion, while enforcing the conflict rules strictly for macro-expanded types and macros.

What do you mean by "coherence" here?

The second coherence condition from this comment (second bullet point).

@nikomatsakis
Copy link
Contributor

I definitely like @jseyfriend's improvement that only gives errors if the
macro generates an item that is later used to resolve another macro (but
which shadows an item in a glob). That said, it is certainly the case
though that this is one a more complex rule that will require more
explaining -- but it will require explaining far less often. I feel like
we've often found this is a trade-off worth making. Another way to say this
is that it would be very nice if refactoring a bunch of hand-written struct
definitions into a local macro worked more-or-less all the time.

The "absolute path" rule is also somewhat appealing to me. In particular,
it suggests that the workaround would not be "manually expand your glob",
which is a pain, but rather "we had trouble resolve this macro, please
write out an absolute path".

That said, I think a good exercise in both of these cases is to write out
what the error message would be in the end. @nrc and I did this when
formulating the original (conservative) rule, and we imagined something
like:

  • macro generates an item x that is also available through a glob import

With a suggested fix of removing the glob, and presumably an extended help
that explains why this matters (with an example maybe).

Honestly, these error messages ought to be part of the RFC. :) Also, maybe
we should move this conversation to that thread?

Also, to what extent can we phase these rules in and experiment? This feels
like a place it would be nice to take our time a bit.

On Fri, Apr 15, 2016 at 2:23 AM, Jeffrey Seyfried [email protected]
wrote:

My primary motivation for wanting to allow the conflicts is so that
user-defined item constructor macros, for example:

macro_rules! my_newtype {
($name:ident) => {
struct $name(i32);
impl $name { ... }
}
}

would be as "first class" as possible (compared to mod, struct, etc.,
which we could view as the built-in item constructors by analogy).

That being said, I can see that it might not be worth the complexity. I
think we should implement and land naive rule (2) first, after which we can
assess how much complexity the extensions would add and if the conflicts
are an issue in practice.

it seems odd to have slightly different rules here for macros (and
imports) and non-macros

Agreed -- we definitely wouldn't want to have a macro-expanded item become
a conflict error only after the user imports it.

I originally thought we would exempt imports by resolving them
conservatively (i.e. not allowing a glob-imported name to resolve in a
module with unexpanded macros). While I suppose this technically violates
backwards compatibility with naive rule (2), the only breaking examples I
could come up with were unnaturally circular and (imo) very unlikely be an
issue in practice.
This way, the conflict rules would only be enforced for the macro-expanded
items which are directly used in the path of a macro invocation, which I
imagine would be a small fraction of all macro-expanded items in practice.

Another option would be to exempt the value namespace, which is never
relevant to macro expansion, while enforcing the conflict rules strictly
for macro-expanded types and macros.

What do you mean by "coherence" here?

The second coherence condition (second bullet point) from this comment
#31726 (comment).


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#31726 (comment)

@jseyfried
Copy link
Contributor Author

to what extent can we phase these rules in and experiment?

I think we should start as conservatively as possible -- that is, disallow all macro-expanded items from shadowing globs and parent scopes and also when resolving imports, don't allow a glob-imported name to resolve in a module with unexpanded macros. From here, we could backwards-compatibly implement any of the ideas discussed so far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants