Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FreezeLock type and use it to store Definitions #115401

Merged
merged 7 commits into from
Sep 6, 2023
Merged

Conversation

Zoxc
Copy link
Contributor

@Zoxc Zoxc commented Aug 31, 2023

This adds a FreezeLock type which allows mutation using a lock until the value is frozen where it can be accessed lock-free. It's used to store Definitions in Untracked instead of a RwLock. Unlike the current scheme of leaking read guards this doesn't deadlock if definitions is written to after no mutation are expected.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 31, 2023
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@oli-obk
Copy link
Contributor

oli-obk commented Aug 31, 2023

Have you benchmarked using the std RwLock unconditionally for Definitions?

@Zoxc
Copy link
Contributor Author

Zoxc commented Aug 31, 2023

RwLock is currently unconditionally parking_lot::RwLock for the parallel compiler. I wouldn't expect the std one to be better.

@oli-obk
Copy link
Contributor

oli-obk commented Aug 31, 2023

I thought this PR used Freeze in the nonparallel compiler, too? I meant trying out using RwLock there, too.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 1, 2023

The new datastructure is fancy enough that I think it should have a test suite with various possible executions, and it should pass miri and loom.

But before we go there, it may be good to actually show that this is going to be faster than a lock or rwlock with some benchmarks. If this is meant to get the perf of the non-parallel compiler, it may be good to try out switching to std::sync::RwLock on the non-parallel compiler and running it through the perf suite.

@Zoxc
Copy link
Contributor Author

Zoxc commented Sep 1, 2023

I'm still a bit confused about your RwLock suggestions.

The main benefit here and in #115418 is the nicer semantics over RwLock. In particular trying to write once frozen does not deadlock, unlike the current scheme of leaking read guards. I plan to apply it to CStore too. That might be hot enough to show a performance benefit on multiple threads.

There does seem to be a bit of lock contention on Definitions before freezing, so this should probably use RwLock inside. I wonder where that contention is, maybe they end up frozen a bit late currently.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 1, 2023

That makes a lot of sense. Please include this information in the main post, or even better the documentation on Freeze, comparing it with the RwLock usage.

Let's see what perf says about it in the single threaded case

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 1, 2023
@bors
Copy link
Contributor

bors commented Sep 1, 2023

⌛ Trying commit 4b90650cfed7714649dd4fce2026911d73dddc56 with merge fbe9d41f54b8ad93a882dc9bb0d841c032925d5c...

@bors
Copy link
Contributor

bors commented Sep 1, 2023

☀️ Try build successful - checks-actions
Build commit: fbe9d41f54b8ad93a882dc9bb0d841c032925d5c (fbe9d41f54b8ad93a882dc9bb0d841c032925d5c)

@rust-timer

This comment has been minimized.

@Zoxc Zoxc changed the title Add Freeze type and use it to store Definitions Add FreezeLock type and use it to store Definitions Sep 1, 2023
@bors
Copy link
Contributor

bors commented Sep 2, 2023

☔ The latest upstream changes (presumably #115422) made this pull request unmergeable. Please resolve the merge conflicts.

@vacuus
Copy link
Contributor

vacuus commented Sep 6, 2023

Out of curiosity, since I'm struggling to understand memory ordering: in the case where read observes that frozen is true, does the Acquire load happen-before the Release store in freeze, which connects with the happens-before sequence of the lock accesses? That would ensure that read returns the proper frozen value after all writes are done even without locking, right?

Thank you very much for your time!

@oli-obk
Copy link
Contributor

oli-obk commented Sep 6, 2023

Using RwLock in non-parallel rustc does have a performance overhead

I tried this out in #115557. Looks like incremental encoding invokes these functions a lot after the data structure gets frozen.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 6, 2023

The new datastructure is fancy enough that I think it should have a test suite with various possible executions, and it should pass miri and loom.

I still believe this, but it's simple enough, so 🤷

r=me after a rebase

@Zoxc
Copy link
Contributor Author

Zoxc commented Sep 6, 2023

I rebased it and changed the guards to pass a reference to the lock to deal with the issue of passing guards as function parameters.

@Zoxc
Copy link
Contributor Author

Zoxc commented Sep 6, 2023

@vacuus The store happens before the load, otherwise that sounds right.

@rust-log-analyzer

This comment has been minimized.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 6, 2023

@bors r+

@bors
Copy link
Contributor

bors commented Sep 6, 2023

📌 Commit 35e8b67 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2023
@bors
Copy link
Contributor

bors commented Sep 6, 2023

⌛ Testing commit 35e8b67 with merge a0c28cd...

@bors
Copy link
Contributor

bors commented Sep 6, 2023

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing a0c28cd to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 6, 2023
@bors bors merged commit a0c28cd into rust-lang:master Sep 6, 2023
@rustbot rustbot added this to the 1.74.0 milestone Sep 6, 2023
@Zoxc Zoxc deleted the freeze branch September 6, 2023 13:50
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (a0c28cd): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.2%, 0.4%] 11
Regressions ❌
(secondary)
0.3% [0.3%, 0.3%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.3% [0.2%, 0.4%] 11

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.1% [2.7%, 3.5%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.2% [1.2%, 1.2%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.2% [1.2%, 1.2%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 627.8s -> 627.776s (-0.00%)
Artifact size: 317.88 MiB -> 317.73 MiB (-0.05%)

@Zoxc
Copy link
Contributor Author

Zoxc commented Sep 7, 2023

There's some incremental regressions, so I tested the parallel compiler in that scenario:

Benchmark with 2 threadsBeforeAfter
TimeTime%
🟣 diesel:check:unchanged2.0623s2.0533s -0.43%
🟣 syn:check:unchanged0.6457s0.6443s -0.22%
Total2.7080s2.6976s -0.38%
Summary1.0000s0.9967s -0.33%
Benchmark with 6 threadsBeforeAfter
TimeTime%
🟣 diesel:check:unchanged2.2563s2.2439s -0.55%
🟣 syn:check:unchanged0.7443s0.7397s -0.62%
Total3.0005s2.9836s -0.56%
Summary1.0000s0.9942s -0.58%

It seems like this more of a serial / parallel compiler tradeoff, performance wise.

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 7, 2023
Use `Freeze` for `SourceFile`

This uses the `Freeze` type in `SourceFile` to let accessing `external_src` and `lines` be lock-free.

Behavior of `add_external_src` is changed to set `ExternalSourceKind::AbsentErr` on a hash mismatch which matches the documentation. `ExternalSourceKind::Unneeded` was removed as it's unused.

Based on rust-lang#115401.
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 8, 2023
Use `Freeze` for `SourceFile`

This uses the `Freeze` type in `SourceFile` to let accessing `external_src` and `lines` be lock-free.

Behavior of `add_external_src` is changed to set `ExternalSourceKind::AbsentErr` on a hash mismatch which matches the documentation. `ExternalSourceKind::Unneeded` was removed as it's unused.

Based on rust-lang#115401.
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Sep 12, 2023
Use `Freeze` for `SourceFile`

This uses the `Freeze` type in `SourceFile` to let accessing `external_src` and `lines` be lock-free.

Behavior of `add_external_src` is changed to set `ExternalSourceKind::AbsentErr` on a hash mismatch which matches the documentation. `ExternalSourceKind::Unneeded` was removed as it's unused.

Based on rust-lang/rust#115401.
@pnkfelix
Copy link
Member

  • The impact here is hypothesized to be due to serial/parallel trade-off; we benchmark the serial case and observe a small regression, while the parallel case is observing an improvement of roughly the same caliber.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Sep 13, 2023
rust-cloud-vms bot pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this pull request Sep 14, 2023
Use `Freeze` for `SourceFile`

This uses the `Freeze` type in `SourceFile` to let accessing `external_src` and `lines` be lock-free.

Behavior of `add_external_src` is changed to set `ExternalSourceKind::AbsentErr` on a hash mismatch which matches the documentation. `ExternalSourceKind::Unneeded` was removed as it's unused.

Based on rust-lang/rust#115401.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants