Instrumentation of SpillBuffer #7351
Not sure if I understand this one. The only case where a key doesn't end up in fast when you write to the SpillBuffer is when it's individually larger than the target threshold.
I thought there was some logic that would put a key into slow if that key would push us over the limit. If that's not the case, ignore this; I don't think data shards larger than the limit are a common enough problem to build instrumentation for.
No, if inserting a key pushes us over the limit, the least recently used keys are pushed out. The latest inserted one is on top of the LRU and is the only one guaranteed to be in fast at the end of the insertion.
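For illustration, here's a minimal sketch of that LRU behaviour. This is not the actual SpillBuffer (which is built on zict); the class, key names, and sizes are made up:

```python
from collections import OrderedDict


class TinyLRUBuffer:
    """Toy model of the fast/slow split; not distributed's SpillBuffer."""

    def __init__(self, target):
        self.target = target  # max total bytes kept in fast (in memory)
        self.fast = OrderedDict()  # in-memory values, in LRU order
        self.slow = {}  # stand-in for on-disk storage

    def __setitem__(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)  # newest key sits on top of the LRU
        # Push least recently used keys out until we're back under target.
        # The just-inserted key is evicted last, so it is the only one
        # guaranteed to remain in fast (unless it alone exceeds target).
        while sum(map(len, self.fast.values())) > self.target:
            old_key, old_value = self.fast.popitem(last=False)  # oldest first
            self.slow[old_key] = old_value


buf = TinyLRUBuffer(target=10)
buf["a"] = b"12345"
buf["b"] = b"123456789"  # total 14 > 10: "a" is spilled, "b" stays in fast
assert "a" in buf.slow and "b" in buf.fast
```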
The thing I was hoping to differentiate with this comment is the spilling that happens as a result of setitem vs. the spilling that happens when the memory_manager is evicting.
Both will evict the same keys, in the same order. The only difference is that the memory_manager kicks in when there's substantial unmanaged memory (but on the flip side it's less responsive).
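A hedged sketch of how those two triggers could be counted separately (hypothetical names; the real SpillBuffer and worker API differ):

```python
from collections import OrderedDict


class CountingBuffer:
    """Toy buffer that attributes each spill to what triggered it."""

    def __init__(self, target):
        self.target = target
        self.fast = OrderedDict()  # in-memory, LRU order
        self.slow = {}  # stand-in for disk
        self.spilled_by_setitem = 0  # spills caused by inserting a key
        self.spilled_by_monitor = 0  # spills requested by the memory monitor

    def _spill_oldest(self):
        key, value = self.fast.popitem(last=False)
        self.slow[key] = value

    def __setitem__(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        while sum(map(len, self.fast.values())) > self.target:
            self._spill_oldest()
            self.spilled_by_setitem += 1

    def evict(self):
        # What the memory monitor would call when unmanaged memory grows.
        # Same keys, same LRU order as setitem-driven spilling.
        if self.fast:
            self._spill_oldest()
            self.spilled_by_monitor += 1


buf = CountingBuffer(target=10)
buf["a"] = b"123456"
buf["b"] = b"123456789"  # insertion pushes "a" out
buf.evict()              # the memory monitor pushes "b" out too
assert buf.spilled_by_setitem == 1 and buf.spilled_by_monitor == 1
```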
Closed by
A modified copy of the Coiled Grafana dashboard is now available at https://grafana.dev-sandbox.coiledhq.com/d/eU1bT-nVw. The plots above were run on
@ntabris could you please review the modified grafana dashboard and, if you're happy with it, merge the new plots into the main one? (Note that the PRs producing the new data have not been merged yet.)
This is great work; having this sort of information is extremely helpful both operationally when using dask, and for prioritizing what to improve next.

It seems like the gist of what you're saying here is "we have to un-spill in order to execute tasks a lot more often than we un-spill to transfer keys". I'm curious how well that generalizes, or how specific it is to the scheduling and transfer patterns of this workload.

For #5996, I think the metric we need to assess its importance is not "how much time is spent un-spilling keys to transfer them", but "how much extra memory is used by keys that were un-spilled for transfer which otherwise could have remained spilled". Presumably, async disk access would address the time component for un-spilling, whether due to execute or transfer. The purpose of sendfile would be to reduce the extra memory used.
Not a trivial thing to answer, because the same key may also be requested by task execution shortly afterwards. In that case, #5996 would actually double the amount of disk I/O and only slightly delay memory usage. The plot on the top right suggests that unspilling a key for get-data not shortly after the same key has been unspilled for execute is a fairly uncommon event. This makes me infer that the opposite may also be true: needing a key for get-data shortly after the same key has been unspilled for execute is a fairly common event.
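One way to make that inference testable, as a hedged sketch (the hook and all names are hypothetical; nothing like this exists in the codebase): for every key un-spilled for get-data, record whether the same key was also un-spilled for execute within some window.

```python
import time

WINDOW = 30.0  # seconds; arbitrary choice for this sketch

last_execute_unspill = {}  # key -> monotonic timestamp of last execute unspill
getdata_unspills = 0
getdata_unspills_near_execute = 0


def on_unspill(key, cause):
    """Hypothetical hook, called whenever a key is read back from disk."""
    global getdata_unspills, getdata_unspills_near_execute
    now = time.monotonic()
    if cause == "execute":
        last_execute_unspill[key] = now
    elif cause == "get-data":
        getdata_unspills += 1
        if now - last_execute_unspill.get(key, float("-inf")) < WINDOW:
            getdata_unspills_near_execute += 1
```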
Curious whether disabling compression was explored in that experiment?
Agreed, that would be a separate task to figure out how to instrument it (but it does seem like something worth instrumenting).
I'm not following how to infer that from the graph? Use of a key that's already in memory simply wouldn't show up on the graph. I'm seeing yellow (spill for transfer) go up a little, but green (un-spill for execute) doesn't go down by the same amount after (in fact, it usually spikes too). To me, that could even imply that plenty of keys which are un-spilled for transfer aren't immediately used for execute; otherwise we'd see green go down more after yellow. But I think all of this is very speculative, since the chart doesn't show cache hits. If we could look at the percentage of SpillBuffer accesses that touched disk alongside this, that might tell more of the story.
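A hedged sketch of that cache-hit metric (toy read path, hypothetical names): count reads served from memory vs. reads that had to touch disk.

```python
from collections import OrderedDict


class HitCountingBuffer:
    """Toy read path that tracks how many accesses touched disk."""

    def __init__(self):
        self.fast = OrderedDict()  # in memory
        self.slow = {}  # stand-in for disk
        self.memory_hits = 0
        self.disk_hits = 0

    def __getitem__(self, key):
        if key in self.fast:
            self.memory_hits += 1
            return self.fast[key]
        self.disk_hits += 1  # cache miss: un-spill from disk into memory
        value = self.slow.pop(key)
        self.fast[key] = value
        return value

    @property
    def disk_access_ratio(self):
        total = self.memory_hits + self.disk_hits
        return self.disk_hits / total if total else 0.0


buf = HitCountingBuffer()
buf.fast["x"] = b"in-memory"
buf.slow["y"] = b"on-disk"
buf["x"]; buf["y"]
assert buf.disk_access_ratio == 0.5
```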
Good job @crusaderky. This is very interesting. Looking forward to seeing this for other kinds of workloads. Another question this raises is whether LRU is a good policy for picking the to-be-spilled keys. disk-read-execute is strongly coupled to assigned priorities, and I guess a priority-based system would perform better than LRU and would reduce the total amount of spilling. I don't think we can estimate the impact of this easily from the provided measurements.
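A hedged sketch of that alternative (purely illustrative; the priorities are made-up numbers where a lower value means "needed sooner"): spill the key whose priority is furthest in the future rather than the least recently used one.

```python
import heapq


class PrioritySpillPolicy:
    """Toy policy: spill the key scheduled to be needed last."""

    def __init__(self):
        # max-heap via negated priority: pop returns the key with the
        # highest priority number, i.e. the one needed furthest away
        self._heap = []

    def add(self, key, priority):
        heapq.heappush(self._heap, (-priority, key))

    def pick_key_to_spill(self):
        _, key = heapq.heappop(self._heap)
        return key


policy = PrioritySpillPolicy()
policy.add("x", priority=1)  # needed soon
policy.add("y", priority=9)  # needed much later
assert policy.pick_key_to_spill() == "y"  # spill what we won't need for a while
```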
It wasn't. I don't think there will be much of a difference here, because the test case runs on uniformly distributed random floats, i.e. incompressible data.
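(For reference, a sketch of how one could test that by disabling compression via dask's config. Whether this setting governs the spill path depends on the distributed version, so treat that as an assumption to verify.)

```python
import dask

# Assumption to verify: spilled data goes through the same serialization
# path as comms, so disabling comm compression should also produce
# uncompressed spill files.
dask.config.set({"distributed.comm.compression": None})
```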
Exactly; the yellow part of the graph shows only keys that are requested by other workers and were neither produced nor consumed by execute recently.
Another insight:
The only way we currently have to observe disk access is the `startstops` we measure whenever we load/store data. However, with our `SpillBuffer` we have the possibility to introduce many instrumentation hooks to get much better insights into what's going on. For instance:

These metrics should not be tracked on `TaskState` level.
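As a hedged sketch of what such buffer-level hooks might look like (all names hypothetical, not the actual implementation): aggregate counters and timers owned by the spill buffer itself, deliberately not attached to any `TaskState`.

```python
import time
from dataclasses import dataclass


@dataclass
class SpillMetrics:
    """Aggregate, buffer-level counters; deliberately not per-TaskState."""

    spill_count: int = 0
    unspill_count: int = 0
    bytes_spilled: int = 0
    bytes_unspilled: int = 0
    seconds_spilling: float = 0.0    # wall time spent writing to disk
    seconds_unspilling: float = 0.0  # wall time spent reading from disk


metrics = SpillMetrics()


def timed_spill(nbytes, write):
    """Wrap a disk write and account for it in the aggregate metrics."""
    start = time.perf_counter()
    write()
    metrics.seconds_spilling += time.perf_counter() - start
    metrics.spill_count += 1
    metrics.bytes_spilled += nbytes
```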