-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Ident
s in HIR and remove emulation of hygiene with gensyms
#51492
Conversation
r? @eddyb |
(rust_highfive has picked a reviewer for you, use r? to override) |
Thanks for making progress here, @petrochenkov! Let's do perf-run because this could potentially affect incr. comp. re-use rates. @bors try |
☀️ Test successful - status-travis |
I've queued the perf build. |
} | ||
}) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
src/librustc/hir/mod.rs
Outdated
} | ||
} | ||
|
||
pub fn modern(&self) -> LifetimeName { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked and "modern" doesn't seem to be documented anywhere. Might be a good idea to point everything to Span
(or even SyntaxContext
)'s modern
method, and have that documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will document.
The name itself is terribly generic, but it does a relatively intuitive thing - normalizes an identifier "up to macros 2.0", i.e. identifiers coming from different macro_rules!
macros or from the non-macro context become same, but identifiers coming from different macro
macros still stay different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so this normalizes for "global" or "item" hygiene, as opposed to "local" hygiene?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some ideas how to make this less error-prone (so that .modern()
is never forgotten accidentally) by removing PartialEq
impl for Ident
and introducing a couple of comparable newtypes, but this requires some prerequisite work.
r=me (with or without documentation for |
Perf is not quite done yet (~6 hours), https://perf.rust-lang.org/compare.html?start=18a00bd9855421a74f40a29fcb4dd2e0c8bea59f&end=f006dd490d005f53043b9c3e12b6a5286d882e6b&stat=instructions%3Au is the URL. |
src/librustc/ich/impls_ty.rs
Outdated
@@ -1133,7 +1133,7 @@ impl<'a, 'gcx> HashStable<StableHashingContext<'a>> for ty::CratePredicatesMap<' | |||
|
|||
impl_stable_hash_for!(struct ty::AssociatedItem { | |||
def_id, | |||
name, | |||
ident, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaelwoerister
A suspicious detail: for ty::FieldDef
I had to hash only ident.name
instead of full ident
(in #51072), otherwise I got some linking errors while building standard library crates, but for ty::AssociatedItem
hashing the full ident
was still okay for some reason.
Perf looks quite bad: 50-100% regressions in incremental println compile times. |
@michaelwoerister |
@Mark-Simulacrum But is it a fixed cost, that larger crates do not get hit too badly by? |
Can we make hashing spans faster, perhaps? What could that entail? I wonder if there's a good "concatenative hash" we could use to cache "partial hashes" more often (and never hash hashes, among other things). I'm not sure if we really need a guarantee that EDIIT: I thought about maybe replacing stable hashing where collisions are risky with something similar to what we do for |
It doesn't appear to be fixed cost, I think - the absolute diffs are also nonconstant. |
The problem is not that we hash more spans, the problem is that more data contains spans and therefore changing spans (e.g. when adding a newline somewhere) invalidates more cache entries. (At least I think that's the problem) |
@michaelwoerister I'm not sure what is covered by incremental compilation right now, but if name resolution parts for methods/fields/lifetimes happen before incremental compilation starts working, then incremental compilation can ignore these spans. |
@eddyb: So, you skipped a few steps in your message - the operation you described isn't "reinventing Merkle trees", but it is from the same kind of solution space. Some further information: First - what you described above is known as an "associative" hash. There are currently no constructions for associative hashes with sufficient review to be cryptographically trustworthy. Monahan et. al. have proposed an approach called HashFusion, but it's reasonably likely that its additional structure (left/right inverses and identity hold for the combiner, not merely associativity) renders it insecure. The reason to want an associative hash is because you can hash the smallest units, and then merge them in parallel from any set of points in the sequences. This allows for computing hashes of components, and recomputing only the changed components, regardless of the size of components. (For those who've studied ahead, this basically allows the root hash of a Merkle tree to be independent of chunk size.) The closest anything currently gets to this is Merkle trees. However, Merkle trees have to decide a chunk size up front, necessitating padding for small objects and splitting for large ones. In addition, there are a variety of ways to screw up with Merkle trees, and become vulnerable to attacks that cause distinct sequences of input to produce the same Merkle tree - Bitcoin, among others, has been bitten by this in the past. In addition, none of these approaches would actually fix the issue here - the reason is that a change that inserted or removed bytes would offset the underlying blocks, resulting in every hash afterwards changing. That's where rolling hashes are used currently, but they are not cryptographically strong. If you were to try and make something for the actual problem being encountered, you'd need to do something like find the change boundaries using a rolling hash, and check a full cryptographic hash (or direct equality) for any ranges it claims are equal. However, this too has serious drawbacks:
|
@eternaleye note that in the "EDIT" of #51492 (comment), I mention interning - which doesn't require collision resistance, just rare-enough collisions (for performance), as the keys are always compared, even if the hash matches. The main benefit comes from doing this at the "node level" instead of on entire trees, so the keys you always compare, have a fixed size (for each type of node). |
@bors try |
⌛ Trying commit e1158a897cb796fa4245e8b342237d4d8482cf24 with merge a9f9140f04b5741ff0bdf7090086c1ad567935c2... |
☀️ Test successful - status-travis |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
1 similar comment
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Namely: labels, type parameters, bindings in patterns, parameter names in functions without body. All of these do not need hygiene after lowering to HIR, only span locations.
Remove emulation of hygiene with gensyms
@bors r=eddyb |
📌 Commit d7072b5 has been approved by |
☀️ Test successful - status-appveyor, status-travis |
continuation of #51072, part of #49300
Not all
Name
s in HIR are replaced withIdent
s, only those needed for hygiene or already having attached spans.