Use `Ident`s in HIR and remove emulation of hygiene with gensyms #51492

petrochenkov · 2018-06-11T10:05:13Z

continuation of #51072, part of #49300

Not all Names in HIR are replaced with Idents, only those needed for hygiene or already having attached spans.

petrochenkov · 2018-06-11T10:05:21Z

r? @eddyb

rust-highfive · 2018-06-11T10:05:23Z

r? @michaelwoerister

(rust_highfive has picked a reviewer for you, use r? to override)

michaelwoerister · 2018-06-11T12:00:56Z

Thanks for making progress here, @petrochenkov!

Let's do perf-run because this could potentially affect incr. comp. re-use rates.

@bors try

bors · 2018-06-11T12:01:08Z

⌛ Trying commit 0dcd858 with merge f006dd4...

Use `Ident`s in HIR and remove emulation of hygiene with gensyms continuation of #51072, part of #49300 Not all `Name`s in HIR are replaced with `Ident`s, only those needed for hygiene or already having attached spans.

bors · 2018-06-11T13:39:13Z

☀️ Test successful - status-travis
State: approved= try=True

Mark-Simulacrum · 2018-06-11T13:47:10Z

I've queued the perf build.

eddyb · 2018-06-11T17:51:33Z

src/libsyntax_pos/hygiene.rs

-            }
-        })
-    }
-}


eddyb · 2018-06-11T17:57:31Z

src/librustc/hir/mod.rs

+        }
+    }
+
+    pub fn modern(&self) -> LifetimeName {


I just checked and "modern" doesn't seem to be documented anywhere. Might be a good idea to point everything to Span (or even SyntaxContext)'s modern method, and have that documented.

OK, will document.
The name itself is terribly generic, but it does a relatively intuitive thing - normalizes an identifier "up to macros 2.0", i.e. identifiers coming from different macro_rules! macros or from the non-macro context become same, but identifiers coming from different macro macros still stay different.

Hmm, so this normalizes for "global" or "item" hygiene, as opposed to "local" hygiene?

I have some ideas how to make this less error-prone (so that .modern() is never forgotten accidentally) by removing PartialEq impl for Ident and introducing a couple of comparable newtypes, but this requires some prerequisite work.

eddyb · 2018-06-11T17:58:17Z

r=me (with or without documentation for modern)

Mark-Simulacrum · 2018-06-11T18:25:39Z

Perf is not quite done yet (~6 hours), https://perf.rust-lang.org/compare.html?start=18a00bd9855421a74f40a29fcb4dd2e0c8bea59f&end=f006dd490d005f53043b9c3e12b6a5286d882e6b&stat=instructions%3Au is the URL.

petrochenkov · 2018-06-11T18:43:27Z

src/librustc/ich/impls_ty.rs

@@ -1133,7 +1133,7 @@ impl<'a, 'gcx> HashStable<StableHashingContext<'a>> for ty::CratePredicatesMap<'

 impl_stable_hash_for!(struct ty::AssociatedItem {
    def_id,
-    name,
+    ident,


@michaelwoerister
A suspicious detail: for ty::FieldDef I had to hash only ident.name instead of full ident (in #51072), otherwise I got some linking errors while building standard library crates, but for ty::AssociatedItem hashing the full ident was still okay for some reason.

Mark-Simulacrum · 2018-06-12T17:18:17Z

Perf looks quite bad: 50-100% regressions in incremental println compile times.

petrochenkov · 2018-06-13T07:12:03Z

@michaelwoerister
How to avoid the incremental perf regressions better?
Avoid stable-hashing spans where they didn't previously exist (at least for now)?

eddyb · 2018-06-13T09:37:16Z

@Mark-Simulacrum But is it a fixed cost, that larger crates do not get hit too badly by?

eddyb · 2018-06-13T11:04:00Z

Can we make hashing spans faster, perhaps? What could that entail?

I wonder if there's a good "concatenative hash" we could use to cache "partial hashes" more often (and never hash hashes, among other things). I'm not sure if we really need a guarantee that hash_concat(hash(a), hash(b)) == hash((a, b)) but it mainly should be cheaper to "mix in" a partial hash than having to re-hash it as if it were just some integer value.
cc @eternaleye (who recently linked something similar, but I can't find it)

EDIIT: I thought about maybe replacing stable hashing where collisions are risky with something similar to what we do for Ty, where we always compare by-value as well, but only a single level, not recursing through it, thanks to interning - so what if we interned the HIR?
Also, @edef1c pointed out on IRC that I basically reinvented Merkle trees 😆.

Mark-Simulacrum · 2018-06-13T12:53:17Z

It doesn't appear to be fixed cost, I think - the absolute diffs are also nonconstant.

michaelwoerister · 2018-06-13T14:30:44Z

The problem is not that we hash more spans, the problem is that more data contains spans and therefore changing spans (e.g. when adding a newline somewhere) invalidates more cache entries. (At least I think that's the problem)

petrochenkov · 2018-06-13T14:41:47Z

@michaelwoerister
I assumed that if spans are not hashed by stable hasher then they effectively don't exist from incremental compilation's point of view and don't invalidate anything on change.

I'm not sure what is covered by incremental compilation right now, but if name resolution parts for methods/fields/lifetimes happen before incremental compilation starts working, then incremental compilation can ignore these spans.

eternaleye · 2018-06-13T15:21:34Z

@eddyb: So, you skipped a few steps in your message - the operation you described isn't "reinventing Merkle trees", but it is from the same kind of solution space. Some further information:

First - what you described above is known as an "associative" hash. There are currently no constructions for associative hashes with sufficient review to be cryptographically trustworthy. Monahan et. al. have proposed an approach called HashFusion, but it's reasonably likely that its additional structure (left/right inverses and identity hold for the combiner, not merely associativity) renders it insecure.

The reason to want an associative hash is because you can hash the smallest units, and then merge them in parallel from any set of points in the sequences. This allows for computing hashes of components, and recomputing only the changed components, regardless of the size of components. (For those who've studied ahead, this basically allows the root hash of a Merkle tree to be independent of chunk size.)

The closest anything currently gets to this is Merkle trees. However, Merkle trees have to decide a chunk size up front, necessitating padding for small objects and splitting for large ones. In addition, there are a variety of ways to screw up with Merkle trees, and become vulnerable to attacks that cause distinct sequences of input to produce the same Merkle tree - Bitcoin, among others, has been bitten by this in the past.

In addition, none of these approaches would actually fix the issue here - the reason is that a change that inserted or removed bytes would offset the underlying blocks, resulting in every hash afterwards changing. That's where rolling hashes are used currently, but they are not cryptographically strong.

If you were to try and make something for the actual problem being encountered, you'd need to do something like find the change boundaries using a rolling hash, and check a full cryptographic hash (or direct equality) for any ranges it claims are equal. However, this too has serious drawbacks:

The rolling hash stage has a serious precision/memory tradeoff, in that you need to store one rolling hash per point in the original file you want to be able to count as a boundary.
The boundaries it finds are unpredictable, so you will have to re-read the original file to perform the safety check, which requires keeping the original file around somewhere.

eddyb · 2018-06-14T08:49:21Z

@eternaleye note that in the "EDIT" of #51492 (comment), I mention interning - which doesn't require collision resistance, just rare-enough collisions (for performance), as the keys are always compared, even if the hash matches. The main benefit comes from doing this at the "node level" instead of on entire trees, so the keys you always compare, have a fixed size (for each type of node).

petrochenkov · 2018-06-16T17:33:09Z

@bors try

bors · 2018-06-16T17:33:21Z

⌛ Trying commit e1158a897cb796fa4245e8b342237d4d8482cf24 with merge a9f9140f04b5741ff0bdf7090086c1ad567935c2...

bors · 2018-06-16T19:09:58Z

☀️ Test successful - status-travis
State: approved= try=True

rust-highfive · 2018-06-28T07:10:11Z

The job x86_64-gnu-tools of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

  0     0    0     0    0     0      0      0 --:--:--  0:00:51 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:52 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:53 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:54 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:55 --:--:--     0curl: (6) Could not resolve host: s3-us-west-1.amazonaws.com
[00:59:28] thread 'main' panicked at 'failed to download openssl source: exit code: 6', bootstrap/native.rs:589:17
[00:59:28] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test --no-fail-fast src/doc/book src/doc/nomicon src/doc/reference src/doc/rust-by-example src/tools/rls src/tools/rustfmt src/tools/miri src/tools/clippy
[00:59:28] Build completed unsuccessfully in 0:56:41
[00:59:28] {"book":"test-pass","rust-by-example":"test-pass","nomicon":"test-pass","rls":"test-pass","clippy-driver":"test-pass","reference":"test-pass","miri":"test-pass","rustfmt":"test-pass"}
[00:59:28] Verifying status of book...
---
travis_time:end:007d9fc1:start=1530169797157452703,finish=1530169797164559194,duration=7106491
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:10902f05
$ head -30 ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
head: cannot open ‘./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers’ for reading: No such file or directory
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:17750071
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

rust-highfive · 2018-06-28T07:10:15Z

The job x86_64-gnu-tools of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

  0     0    0     0    0     0      0      0 --:--:--  0:00:51 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:52 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:53 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:54 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:55 --:--:--     0curl: (6) Could not resolve host: s3-us-west-1.amazonaws.com
[00:59:28] thread 'main' panicked at 'failed to download openssl source: exit code: 6', bootstrap/native.rs:589:17
[00:59:28] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test --no-fail-fast src/doc/book src/doc/nomicon src/doc/reference src/doc/rust-by-example src/tools/rls src/tools/rustfmt src/tools/miri src/tools/clippy
[00:59:28] Build completed unsuccessfully in 0:56:41
[00:59:28] {"book":"test-pass","rust-by-example":"test-pass","nomicon":"test-pass","rls":"test-pass","clippy-driver":"test-pass","reference":"test-pass","miri":"test-pass","rustfmt":"test-pass"}
[00:59:28] Verifying status of book...
---
travis_time:end:007d9fc1:start=1530169797157452703,finish=1530169797164559194,duration=7106491
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:10902f05
$ head -30 ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
head: cannot open ‘./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers’ for reading: No such file or directory
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:17750071
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Namely: labels, type parameters, bindings in patterns, parameter names in functions without body. All of these do not need hygiene after lowering to HIR, only span locations.

Remove emulation of hygiene with gensyms

…iously

petrochenkov · 2018-06-28T08:53:26Z

@bors r=eddyb

bors · 2018-06-28T08:53:27Z

📌 Commit d7072b5 has been approved by eddyb

bors · 2018-06-28T09:16:19Z

⌛ Testing commit d7072b5 with merge d84ad59...

Use `Ident`s in HIR and remove emulation of hygiene with gensyms continuation of #51072, part of #49300 Not all `Name`s in HIR are replaced with `Ident`s, only those needed for hygiene or already having attached spans.

bors · 2018-06-28T11:20:36Z

☀️ Test successful - status-appveyor, status-travis
Approved by: eddyb
Pushing d84ad59 to master...

rust-highfive assigned michaelwoerister Jun 11, 2018

rust-highfive assigned eddyb and unassigned michaelwoerister Jun 11, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 11, 2018

eddyb reviewed Jun 11, 2018

View reviewed changes

src/libsyntax_pos/hygiene.rs

}

})

}

}

Copy link

Member

eddyb Jun 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

eddyb reviewed Jun 11, 2018

View reviewed changes

petrochenkov commented Jun 11, 2018

View reviewed changes

petrochenkov force-pushed the hirident branch from 0dcd858 to e1158a8 Compare June 16, 2018 17:31

petrochenkov added 8 commits June 28, 2018 11:04

Use Idents for lifetimes in HIR

c5454c0

Use Idents for associated type bindings in HIR

1fe9b4d

Use Idents for path segments in HIR

e8215a4

Use Idents in a number of structures in HIR

c6ca1e4

Namely: labels, type parameters, bindings in patterns, parameter names in functions without body. All of these do not need hygiene after lowering to HIR, only span locations.

Use Idents for associated item definitions in HIR

f0622df

Remove emulation of hygiene with gensyms

Support delegation in stable hashing macros

d5e9dff

incremental: Do not hash spans for things that didn't have spans prev…

554b428

…iously

Fix rebase

d7072b5

petrochenkov force-pushed the hirident branch from f7eab92 to d7072b5 Compare June 28, 2018 08:37

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 28, 2018

bors mentioned this pull request Jun 28, 2018

Bootstrap from 1.28.0 beta #51717

Merged

bors merged commit d7072b5 into rust-lang:master Jun 28, 2018

This was referenced Jun 28, 2018

Provide existing ref suggestions for more E0308 errors #51822

Merged

Point to lifetime spans on lifetime errors #51862

Merged

Implement Unsized Rvalues #51131

Merged

#48538 Specialization: Intersection Impls #49624

Closed

mati865 mentioned this pull request Jun 28, 2018

Rustup rust-lang/rust-clippy#2880

Merged

dwrensha added a commit to dwrensha/seer that referenced this pull request Jun 30, 2018

rustup for rust-lang/rust#51492

fd8603d

michaelwoerister mentioned this pull request Jul 13, 2018

syntax_pos::symbol::Symbol::gensym() is incompatible with stable hashing. #49300

Closed

petrochenkov mentioned this pull request Nov 30, 2018

Implement RFC 2338, "Type alias enum variants" #56225

Merged

petrochenkov mentioned this pull request May 9, 2019

Tweak Symbol and InternedString #60659

Merged

petrochenkov deleted the hirident branch June 5, 2019 16:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use `Ident`s in HIR and remove emulation of hygiene with gensyms #51492

Use `Ident`s in HIR and remove emulation of hygiene with gensyms #51492

petrochenkov commented Jun 11, 2018

petrochenkov commented Jun 11, 2018

rust-highfive commented Jun 11, 2018

michaelwoerister commented Jun 11, 2018

bors commented Jun 11, 2018

bors commented Jun 11, 2018

Mark-Simulacrum commented Jun 11, 2018

eddyb Jun 11, 2018

eddyb Jun 11, 2018

petrochenkov Jun 11, 2018

eddyb Jun 12, 2018

petrochenkov Jun 12, 2018

petrochenkov Jun 12, 2018

eddyb commented Jun 11, 2018

Mark-Simulacrum commented Jun 11, 2018

petrochenkov Jun 11, 2018 •

edited

Loading

Mark-Simulacrum commented Jun 12, 2018

petrochenkov commented Jun 13, 2018 •

edited

Loading

eddyb commented Jun 13, 2018 •

edited

Loading

eddyb commented Jun 13, 2018 •

edited

Loading

Mark-Simulacrum commented Jun 13, 2018

michaelwoerister commented Jun 13, 2018

petrochenkov commented Jun 13, 2018

eternaleye commented Jun 13, 2018

eddyb commented Jun 14, 2018

petrochenkov commented Jun 16, 2018

bors commented Jun 16, 2018

bors commented Jun 16, 2018

rust-highfive commented Jun 28, 2018

rust-highfive commented Jun 28, 2018

petrochenkov commented Jun 28, 2018

bors commented Jun 28, 2018

bors commented Jun 28, 2018

bors commented Jun 28, 2018

Use Idents in HIR and remove emulation of hygiene with gensyms #51492

Use Idents in HIR and remove emulation of hygiene with gensyms #51492

Conversation

petrochenkov commented Jun 11, 2018

petrochenkov commented Jun 11, 2018

rust-highfive commented Jun 11, 2018

michaelwoerister commented Jun 11, 2018

bors commented Jun 11, 2018

bors commented Jun 11, 2018

Mark-Simulacrum commented Jun 11, 2018

eddyb Jun 11, 2018

Choose a reason for hiding this comment

eddyb Jun 11, 2018

Choose a reason for hiding this comment

petrochenkov Jun 11, 2018

Choose a reason for hiding this comment

eddyb Jun 12, 2018

Choose a reason for hiding this comment

petrochenkov Jun 12, 2018

Choose a reason for hiding this comment

petrochenkov Jun 12, 2018

Choose a reason for hiding this comment

eddyb commented Jun 11, 2018

Mark-Simulacrum commented Jun 11, 2018

petrochenkov Jun 11, 2018 • edited Loading

Choose a reason for hiding this comment

Mark-Simulacrum commented Jun 12, 2018

petrochenkov commented Jun 13, 2018 • edited Loading

eddyb commented Jun 13, 2018 • edited Loading

eddyb commented Jun 13, 2018 • edited Loading

Mark-Simulacrum commented Jun 13, 2018

michaelwoerister commented Jun 13, 2018

petrochenkov commented Jun 13, 2018

eternaleye commented Jun 13, 2018

eddyb commented Jun 14, 2018

petrochenkov commented Jun 16, 2018

bors commented Jun 16, 2018

bors commented Jun 16, 2018

rust-highfive commented Jun 28, 2018

rust-highfive commented Jun 28, 2018

petrochenkov commented Jun 28, 2018

bors commented Jun 28, 2018

bors commented Jun 28, 2018

bors commented Jun 28, 2018

Use `Ident`s in HIR and remove emulation of hygiene with gensyms #51492

Use `Ident`s in HIR and remove emulation of hygiene with gensyms #51492

petrochenkov Jun 11, 2018 •

edited

Loading

petrochenkov commented Jun 13, 2018 •

edited

Loading

eddyb commented Jun 13, 2018 •

edited

Loading

eddyb commented Jun 13, 2018 •

edited

Loading