Investigate snapshot restore performance on host kernel >= 5.4 #2129
This is not related to AMD. It seems to be related to the host kernel version: restore latency is noticeably worse on newer host kernels (5.4 and above) than on older ones.
The entire overhead comes from running within the jailer cgroup. When running within the jailer cgroup, the restore is measurably slower than when running without cgroups.
For the moment I have traced the overhead down to a specific call in the host kernel.
I traced the overhead further down the host kernel call stack.
Looks like the overhead was introduced by this kernel patch, more specifically by two of its commits, one being sched/core: Prevent race condition between cpuset and __sched_setscheduler().
Actually, to be even more specific, it looks like the overhead was introduced by only one of those commits.
The issue is reproducible on both Intel and AMD.
Just a quick update. I stumbled upon some documentation; it looks like this is simply how the kernel mechanism in question is supposed to behave.
I managed to reproduce the issue with a simple Rust executable and a small script that emulates what the jailer does. Note that if we don't add the extra sleep, the issue doesn't reproduce.
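As a rough illustration of this kind of repro (my own sketch, not the actual executable and script from this comment), the program below folds the jailer-like cgroup setup and the timing into one Rust binary. It assumes the latency comes from write-side operations on a cgroup v1 cpuset, which, on kernels where the cpuset lock became an RCU-backed percpu rw-semaphore, have to wait for a grace period unless another writer ran very recently. The cgroup path and the one-second sleep are arbitrary; it needs root and a mounted v1 cpuset hierarchy.

```rust
use std::fs;
use std::thread::sleep;
use std::time::{Duration, Instant};

// Hypothetical cgroup name; assumes the cgroup v1 cpuset hierarchy is
// mounted at /sys/fs/cgroup/cpuset and that we run as root.
const CG: &str = "/sys/fs/cgroup/cpuset/restore-latency-demo";

fn timed_write(path: String, value: &str) -> std::io::Result<Duration> {
    let start = Instant::now();
    fs::write(path, value)?;
    Ok(start.elapsed())
}

fn main() -> std::io::Result<()> {
    // Create the cpuset cgroup, roughly what the jailer does at startup.
    fs::create_dir_all(CG)?;
    fs::write(format!("{}/cpuset.mems", CG), "0")?;

    // Write-side cpuset operations done back to back are expected to be
    // cheap, because the kernel keeps the cpuset lock in its writer-side
    // mode for a short while after each writer.
    println!("1st write:         {:?}", timed_write(format!("{}/cpuset.cpus", CG), "0")?);
    println!("2nd write:         {:?}", timed_write(format!("{}/cpuset.cpus", CG), "0")?);

    // After an idle period (the "extra sleep"), the next write-side
    // operation has to wait for a full RCU grace period again, which is
    // the kind of delay that would show up as restore overhead.
    sleep(Duration::from_secs(1));
    println!("write after sleep: {:?}", timed_write(format!("{}/cpuset.cpus", CG), "0")?);

    Ok(())
}
```

On an affected host kernel, the expectation is that the write after the sleep is dramatically slower than the back-to-back one.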
When a writer requests access to some RCU-protected shared data, RCU schedules it on a queue and then waits for at least one so-called "grace period" to pass. A grace period is a period in which RCU waits for all previously acquired read-side locks to be released; it ends after all CPUs have gone through a quiescent state. Grace periods can be quite long. Whether we pay for a full grace period depends on when the process is started relative to the other cgroup operations, which is why the extra sleep in the repro above matters. This is the desired RCU behavior, so I don't think there's any easy way around this issue.
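As an aside (not something from this thread): one indirect way to get a feel for how long a normal grace period takes on a given host is to time the membarrier() syscall's MEMBARRIER_CMD_GLOBAL command, which on multi-CPU hosts is implemented with synchronize_rcu(). A small sketch using the libc crate:

```rust
use std::time::Instant;

// MEMBARRIER_CMD_GLOBAL == 1 in the kernel UAPI; on multi-CPU hosts this
// command waits for an RCU grace period via synchronize_rcu().
const MEMBARRIER_CMD_GLOBAL: libc::c_long = 1;

fn main() {
    for i in 0..5 {
        let start = Instant::now();
        let ret = unsafe { libc::syscall(libc::SYS_membarrier, MEMBARRIER_CMD_GLOBAL, 0) };
        assert!(ret == 0, "membarrier() failed: {}", std::io::Error::last_os_error());
        println!("grace period #{}: ~{:?}", i, start.elapsed());
    }
}
```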
We have tracked down the root cause of this issue to the cgroups v1 implementation on 5.x kernels where x > 4. We were able to replicate the findings from the investigation above and confirmed that the latency impact stems from the cgroup v1 code path identified there.
Another problem we notice in the cgroups v1 measurements is that the results also vary a lot from run to run. Since this issue originates in the kernel's design, we are currently recommending that users who need the snapshot functionality on kernels newer than 5.4 run on hosts with cgroups v2 enabled. The snapshot resume latency results for the 5.10 kernel on the currently supported x86 platforms can be found here:
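For anyone acting on this recommendation, a quick check (illustrative, not prescribed in this thread) for whether a host is running the unified cgroups v2 hierarchy is to look for cgroup.controllers at the cgroup mount point:

```rust
use std::path::Path;

fn main() {
    // On a host booted with the unified (v2-only) hierarchy, the cgroup2
    // filesystem is mounted at /sys/fs/cgroup and exposes this file.
    // On hybrid setups it typically lives under /sys/fs/cgroup/unified.
    if Path::new("/sys/fs/cgroup/cgroup.controllers").exists() {
        println!("cgroups v2 (unified hierarchy) is in use");
    } else {
        println!("cgroups v1 (or hybrid) hierarchy is in use");
    }
}
```

On systemd-based distributions, the host can usually be switched to the unified hierarchy by booting with systemd.unified_cgroup_hierarchy=1 on the kernel command line.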
Original issue description:
The snapshot restore operation is slower on AMD than on Intel. We should investigate why and either fix the root cause or change the test_snapshot_resume_latency test in order to check different values for AMD and Intel.