K8s jobs slow due to hard cpu limits #352
Comments
Some of the configuration: spark.eventLog.dir hdfs://10.196.132.104:49000/spark/eventlog, spark.shuffle.service.enabled true, spark.kubernetes.executor.memoryOverhead 5000 |
Updated the logs. Task 0 takes 10s on k8s, while on YARN it takes only 4s. The kubelet and NM are on the same machine and use the same JVM parameters. Please skip the logs added for debugging. |
Found the main reason: in k8s the cpu limit is the same as the cpu request, in this case 1 core, while in YARN an executor can exceed its cpu request. Added a pull request that adds "spark.kubernetes.executor.limit.cores" to specify the cpu limit of an executor. |
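For illustration, a minimal sketch of how the proposed option might sit next to the existing core setting when building spark-submit arguments. Only `spark.kubernetes.executor.limit.cores` comes from this thread and `spark.executor.cores` is a standard Spark setting; the helper names `executor_conf` and `to_submit_args` are hypothetical, and the value format accepted by the new option is an assumption.

```python
from typing import Dict, List, Optional


def executor_conf(cores: int, limit_cores: Optional[str] = None) -> Dict[str, str]:
    """Build executor CPU settings (hypothetical helper, not part of Spark)."""
    conf = {"spark.executor.cores": str(cores)}
    # Only emit the limit when the user asked for one, so the default
    # stays "no hard CPU limit".
    if limit_cores is not None:
        conf["spark.kubernetes.executor.limit.cores"] = limit_cores
    return conf


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a conf dict as repeated `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(to_submit_args(executor_conf(1)))
    print(to_submit_args(executor_conf(1, limit_cores="2")))
```

Leaving `limit_cores` unset produces no limit entry at all, which matches the idea of treating the hard limit as opt-in.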
Great debugging @sandflee ! What you're proposing around cpu limits is definitely a plausible explanation for some of the discrepancy in performance between YARN and k8s. Looking at these YARN docs it appears that YARN doesn't do cpu limiting of a YARN container at the OS level without configuring cgroups, and because it's not on by default probably most YARN installations therefore don't have cpu limiting. So I suspect when a YARN scheduler allocates vcores to an application the number of cores is actually just an un-enforced request to the application, which may exceed the specified core usage either intentionally or unintentionally. So in your testing you were probably comparing a strictly 1core executor in k8s against a 1core-but-actually-unlimited executor in YARN. Indeed those performance results should be different! Bottom line is, making this core limit configurable makes sense to me. I suspect in many of my own deployments I would want to emulate the YARN behavior of allowing the pod to use an unlimited amount of CPU on the kubelet, especially when no other pods are on the kubelet. Do you (or @foxish) know a way to set no cpu limit, or unlimited cpu? |
Setting a request and no limit would be the way to make sure that there is only a minimum guarantee and no upper bound on usage. Making that the default makes sense to me. We should have the limit be optional using the option that @sandflee wrote in his PR. |
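As a rough sketch of "a request and no limit", the snippet below builds the `resources` section of a container spec as a plain dict. The `requests`/`limits` layout mirrors the Kubernetes container spec; the function name `container_resources` is hypothetical, and this is not the code from the PR.

```python
from typing import Dict, Optional


def container_resources(cpu_request: str, cpu_limit: Optional[str] = None) -> Dict[str, Dict[str, str]]:
    """Return a container `resources` dict; the CPU limit is optional."""
    resources = {"requests": {"cpu": cpu_request}}
    # Omitting the `limits` key entirely leaves CPU usage unbounded above
    # the request; the request remains the minimum guarantee.
    if cpu_limit is not None:
        resources["limits"] = {"cpu": cpu_limit}
    return resources


if __name__ == "__main__":
    print(container_resources("1"))       # request only, no upper bound
    print(container_resources("1", "2"))  # request plus an explicit hard limit
```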
CPU is considered a "compressible" resource (unlike memory), so, there is no harm in making it unbounded. If the system has insufficient cpu, the executor's CPU usage will be throttled (down to the request of 1 CPU), but the executor will not be killed. See also: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/resource-qos.md#compressible-resource-guarantees |
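The throttling behaviour described above can be pictured with a deliberately simplified toy model. It assumes a container always gets its request, may borrow whatever is idle on the node, and is capped by the limit if one is set. The function `effective_cpus` and all of the numbers are illustrative assumptions, not measurements from this issue and not how the kernel scheduler actually computes shares.

```python
from typing import Optional


def effective_cpus(demand: float, node_idle: float, request: float,
                   limit: Optional[float] = None) -> float:
    """Toy model of the CPUs a container can actually use."""
    # Guaranteed request plus any idle capacity it can borrow.
    available = request + node_idle
    usable = min(demand, available)
    # A hard limit caps usage even when the node has idle CPU.
    if limit is not None:
        usable = min(usable, limit)
    return usable


if __name__ == "__main__":
    # Executor wants 2.5 CPUs, node has 3 idle CPUs, request is 1.
    print(effective_cpus(2.5, 3, 1, limit=1))  # hard limit equal to request
    print(effective_cpus(2.5, 3, 1))           # no limit: can burst
    print(effective_cpus(2.5, 0, 1))           # busy node: back to the request
```

In this model the unlimited executor is only squeezed back to its request when the node is busy, and it is never killed for using CPU.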
Spark already has |
I don't think we can rely on always using And the arithmetic But because cpu is a compressible resource, I think a better default would be unlimited cpu. |
My bad, it should be |
@liyinan926 Thanks for debugging this executor core limit issue. This is quite interesting.
I am curious how this would work with dynamic allocation when the number of executors varies as the job goes. Is this suggested only for static allocation? Then, do we need another flag anyway if we want to limit max cores per executor for dynamic allocation? |
@kimoonkim That's a good point. Changing the number of executors will cause the limit per executor to change as well if the value of |
@sandflee Have you tested the result with your new patch? If I remember correctly, the |
Yes, I have tested the patch. Setting a hard limit is a user option, with no limit by default. Users can limit cpu usage if needed, for example when co-running an online service and a Spark job. |
Closed by #356 |
Thanks again for debugging this @sandflee ! Happy to discuss any more performance discrepancies you find vs YARN or any other improvements you might find in future issues. |
@sandflee I checked the patch and thanks! |
Ran a simple wordCount app and compared the running time on YARN and k8s. Jobs on k8s are much slower.