Replace synchronization primitives with those from parking_lot #1632
Conversation
Just as a first comment, I'm not sure how the parking_lot primitives can be adopted in favor of the current standard library primitives without violating Rust's stability guarantees.
If this can be done in a way that is compatible, I would absolutely love to do this.
As mentioned in the RFC, the parking_lot primitives have the same API as the standard ones, so there is no compatibility issue in replacing one with the other.
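To make the compatibility argument concrete, here is a std-only sketch (the helper name `increment_in_threads` is mine, not from the RFC) of the public surface a drop-in replacement would have to preserve. Note that parking_lot's own `Mutex::lock` returns the guard directly; as described later in this thread, the integration branch adds poisoning back so that code like this keeps compiling unchanged:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// The std-facing API any drop-in replacement must preserve:
// `lock()` returns `LockResult<MutexGuard<T>>` (poisoning included).
fn increment_in_threads(counter: Arc<Mutex<u64>>, n_threads: usize) -> u64 {
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // `lock()` returns a Result because std mutexes can be poisoned.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let v = *counter.lock().unwrap();
    v
}

fn main() {
    let counter = Arc::new(Mutex::new(0));
    println!("{}", increment_in_threads(counter, 8)); // prints 8
}
```

Since only the implementation behind this surface would change, stable user code cannot observe the swap except through performance.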
I expected to see benchmarks on all tier 1 and at least some tier 2 platforms included in the RFC text as grounds for accepting or rejecting the RFC.
@nagisa I added some benchmark results for Linux and Windows. Unfortunately I don't have any other platforms to test on.
I seem to recall from the WebKit article that the parking-lot-style locks are not entirely fair. While I also recall good reasons why this isn't so bad, if this is also true of the Rust crate, I would expect it (as well as any other downsides that may exist, for example, does the crate leak memory as WebKit does?) to be discussed in the RFC.
I personally haven't come across a single case where entirely-fair locks were necessary or even desirable. Is there one? I don't recall us ever guaranteeing this for any locking mechanism in the standard library so far. Either way, from the same WebKit blog post it seems like making locks fair is also an option.
I'm not particularly keen on fair locks either (then again, I'm not sure I'm qualified to judge anyway). I just want all the differences, including the ones that may be irrelevant or drawbacks, documented in the RFC to make it easier to form an opinion without scavenging N articles across the internet. In particular if there are so many minor design decisions where
Neither pthread mutexes nor Windows SRWLOCKs, which is what the standard library primitives are based on, make any fairness guarantees, so there is no difference here.
👍 Needing an allocation for each mutex is awful.
The RFC does not mention that
I have poisoning support for parking_lot in a branch, and this is the version that would be integrated into the standard library. I don't plan on adding poisoning support to the parking_lot crate itself, because poisoning has a significant performance cost and I feel that it makes the API uglier. By the way, lack of poisoning doesn't make anything unsafe; it just means that user code needs to be aware that locked data may be left in an inconsistent (but safe) state after a panic. This is why
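For readers unfamiliar with poisoning, here is a small std-only demonstration of the behavior being discussed (the helper name `poison_and_recover` is mine): a panic while holding the guard marks the mutex poisoned, but the data remains accessible and memory-safe, merely possibly inconsistent.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Returns (was_poisoned, recovered_value): panics while holding the lock,
// then shows the data is still accessible (safe, but possibly inconsistent).
fn poison_and_recover() -> (bool, i32) {
    let m = Arc::new(Mutex::new(0));
    let m2 = Arc::clone(&m);
    // Panic while holding the guard; the guard's Drop marks the mutex poisoned.
    let _ = thread::spawn(move || {
        let mut g = m2.lock().unwrap();
        *g = 1; // partial update, then a panic mid-"transaction"
        panic!("boom");
    })
    .join();
    let poisoned = m.is_poisoned();
    // `lock()` now returns Err, but the data can still be read safely.
    let value = match m.lock() {
        Ok(g) => *g,
        Err(poison_err) => *poison_err.into_inner(),
    };
    (poisoned, value)
}

fn main() {
    let (p, v) = poison_and_recover();
    println!("poisoned={} value={}", p, v); // poisoned=true value=1
}
```

Dropping poisoning would remove the `Result` layer and its runtime bookkeeping, which is exactly the performance and API-ergonomics trade-off described above.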
As an example, on AArch64, adding support for poisoning makes
Is it possible to turn off poisoning when we are linking to the
Abort still runs various handlers before actually quitting the process (to my knowledge, at least), therefore all data behind a mutex that may be accessed there should also be poisoned.
@nagisa It only runs the panic hook, nothing else.
So, it should be possible (abort doesn't call destructors anyway).
You may have mutexes in statics, and the panic hook may run arbitrary code.
@nagisa With panic = abort a mutex can never become poisoned, because that is done in the destructor of `MutexGuard`.
@nagisa The point is that the inner value cannot become poisoned because the destructor won't be called.
One thing I feel is worth noting - the

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.103.8168&rep=rep1&type=pdf
@eternaleye
@Amanieu: Ah, I had missed the backoff. However, I do strongly suggest reading my second link - it shows catastrophic throughput loss for the ticket spinlocks in the Linux kernel of the time, and compares against their MCS lock implementations patched into Linux.
One big assumption that this paper makes is that all "threads" are always running and never scheduled out. This is a perfectly fine assumption to make in the context of a kernel, where a "thread" is a raw CPU. However, this doesn't necessarily apply in the context of a userspace process, where the scheduler will arbitrarily suspend threads for a (relatively) long duration. And this leads to the big issue with queue-based locks: you have no way of knowing whether the next thread in the queue has been scheduled out. If it has, then you have one whole timeslice during which nobody owns the lock. If you instead have threads spinning on the lock word waiting for it to get unlocked, then you are pretty much guaranteed to pass the lock to a thread that is running.

Now, with that said, the spin waiting strategy in parking_lot is a bit ad-hoc and could use some fine-tuning (read: I made up some numbers and it seemed to give good results). In particular I'm considering reducing the number of times I yield the thread in favor of just queuing up the thread and having it sleep until explicitly woken up.
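The spin-then-sleep shape described above can be sketched with a toy lock. This is emphatically not parking_lot's implementation: the spin count is invented, and the "queue up and sleep" phase is replaced by a plain `yield_now`, but it shows the two phases (brief spinning on the lock word, then giving up the timeslice).

```rust
use std::cell::UnsafeCell;
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Toy spin-then-yield lock. The constants are made up for illustration;
// a real implementation like parking_lot spins briefly, then enqueues the
// thread in a global "parking lot" and sleeps until the holder wakes it.
struct ToyLock {
    locked: AtomicBool,
}

impl ToyLock {
    fn new() -> Self {
        ToyLock { locked: AtomicBool::new(false) }
    }

    fn lock(&self) {
        loop {
            // Fast path: try to grab the lock.
            if self
                .locked
                .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
                .is_ok()
            {
                return;
            }
            // Phase 1: briefly spin on the lock word (cheap if the holder
            // is running on another core and releases soon).
            for _ in 0..40 {
                if !self.locked.load(Ordering::Relaxed) {
                    break;
                }
                hint::spin_loop();
            }
            // Phase 2: give up the timeslice. A real implementation would
            // park the thread in a wait queue here instead of yielding.
            thread::yield_now();
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

// Protect a plain (non-atomic) counter with the toy lock.
struct Counter {
    lock: ToyLock,
    value: UnsafeCell<usize>,
}
// Safe to share because all access to `value` goes through the lock.
unsafe impl Sync for Counter {}

fn stress(threads: usize, iters: usize) -> usize {
    let c = Arc::new(Counter { lock: ToyLock::new(), value: UnsafeCell::new(0) });
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&c);
            thread::spawn(move || {
                for _ in 0..iters {
                    c.lock.lock();
                    // Exclusive access is guaranteed while the lock is held.
                    unsafe { *c.value.get() += 1 };
                    c.lock.unlock();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    unsafe { *c.value.get() }
}

fn main() {
    println!("{}", stress(4, 1000)); // 4000
}
```

The trade-off discussed in the comment is visible here: spinning hands the lock to a thread that is definitely running, while a queue can stall for a whole timeslice if the next thread in line was descheduled.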
In order to really discuss this we need benchmarks with the poisoned branch on most tier 1 platforms.
The benchmarks linked in the RFC were done on the poison branch.
There are two reasons why this won't work on
A more conservative change would be to just move the OS-specific implementations of
Additionally, I'd like to see benchmarks against the standard library implementation of
I expect the performance of
As for
Sorry, I meant to write "full cache line" and "cache-line-aligned." AFAICT, it should be really easy to provide both aligned-and-padded and single-byte variants.
@briansmith What you really want to do is align the mutex and the data it is protecting to a cache line. This can easily be done by marking
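A minimal sketch of the alignment trick being suggested, assuming a 64-byte cache line (common on x86-64, but some CPUs use 128) and using the since-stabilized `#[repr(align)]` attribute; the wrapper names are hypothetical:

```rust
use std::mem;
use std::sync::Mutex;

// 64 bytes is an assumed cache line size, not something the RFC specifies.
#[repr(align(64))]
struct CacheAligned<T>(T);

// Each mutex+data pair occupies its own cache line, so two threads
// hammering on neighbouring locks don't false-share.
#[allow(dead_code)]
struct PerThreadSlot {
    slot: CacheAligned<Mutex<u64>>,
}

fn main() {
    assert_eq!(mem::align_of::<CacheAligned<Mutex<u64>>>(), 64);
    // The size is rounded up to the alignment, padding out the line.
    assert!(mem::size_of::<CacheAligned<Mutex<u64>>>() >= 64);
    println!("align = {}", mem::align_of::<CacheAligned<Mutex<u64>>>());
}
```

Because the alignment is applied by the user around the lock and its data, the single-byte mutex itself stays small for the cases where padding would be wasteful, which matches the aligned-and-padded versus single-byte split proposed above.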
This would be a nice thing to optionally enable if we gave
What seems a bit risky with this change is that it not only replaces the implementation but also makes additional promises. This restricts alternative implementations and prevents backing out of the change.
@CodesInChaos The guarantees can always be reduced, and this would still give good improvements on the performance side. Anyway, I feel like this is a great proposal; we just need to wait a few months (let parking_lot prove itself) before really making a decision.
Regarding real-world usage of parking_lot, today has brought us http://plhk.ru/trash/netmap-20160710.pdf.
🔔 This RFC is now entering its week-long final comment period 🔔 The libs team felt that this RFC's discussion has reached a standstill and as a result it's ready to move into FCP. Our feeling matches what I wrote earlier about how
Having an implementation of synchronization primitives directly in standard
... (and many others, not to mention differences between Windows API versions). It seems that at some point it would be a really nice thing to have.
I agree, there are lots of potential benefits to offering these constructs. But it's a pretty big move to make, and for the time being gaining more experience with them externally seems prudent. (In general, it's not 100% clear to me what direction the concurrency story within
The libs team got a chance to talk about this RFC this week at libs triage, and our conclusion remains the same as before in that while
Can this still be included as a fallback mechanism to allow Rust to run not only on shiny new Windows?
Another blog post mentioning parking_lot: https://blog.mozilla.org/nfroyd/2017/03/29/on-mutex-performance-part-1/ It's been about a year since the crate was first published, so at some point it would be worth thinking about a timeline or otherwise mulling over specific criteria for potentially moving forward on this. Rather than trying to go into the stdlib directly, maybe a more reasonable first step would be to get this into rust-lang-nursery and see where it goes from there.
I can easily imagine adding some Cargo features to make this opt-in (c.f. allocator choice).
It has been a long time now. Should this pull request be reopened?
The work of the up-and-coming portability team this year will hopefully allow this for free.
Work on this has been going on for quite a while over at rust-lang/rust#56410.
Rendered