Added Scheduler and Worker memory tracking #2847

TomAugspurger · 2019-07-16T16:20:23Z

Part 1 of #2602.

This adds

Scheduler.task_net_nbytes
Worker.net_nbytes

modeled after task_duration / durations (with the primary difference that Worker.net_nbytes is tracked at the prefix level rather than the task level).

The basic idea is to learn a per-prefix measure of the net memory usage of running a task. We already know:

the memory usage of a task's output
the memory usage of each dependency

Now we just do a bit of arithmetic when we complete a task and when we release dependencies.
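To make the arithmetic concrete, here is a minimal sketch of the bookkeeping. It is not the code in this PR: the `NetBytesTracker` class and its method names are hypothetical, and it assumes the caller already knows which prefix caused a dependency to be released.

```python
from collections import defaultdict


class NetBytesTracker:
    """Hypothetical helper: running mean of net bytes per task prefix."""

    def __init__(self):
        self._total = defaultdict(int)  # net bytes accumulated per prefix
        self._count = defaultdict(int)  # completed tasks seen per prefix

    def task_finished(self, prefix, output_nbytes):
        # Completing a task adds the size of its output.
        self._total[prefix] += output_nbytes
        self._count[prefix] += 1

    def dependency_released(self, prefix, dep_nbytes):
        # Releasing a dependency is credited to the prefix that finished.
        self._total[prefix] -= dep_nbytes

    def net_nbytes(self, prefix):
        # Mean net bytes per completed task, or None if the prefix is unseen.
        n = self._count.get(prefix, 0)
        return self._total[prefix] / n if n else None


tracker = NetBytesTracker()
tracker.task_finished("sum", output_nbytes=8)
tracker.dependency_released("sum", dep_nbytes=800)
print(tracker.net_nbytes("sum"))
```

A negative value means tasks with that prefix tend to free more memory than they produce.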

This information is passed to workers to aid with scheduling, but that part will be done in a separate PR.

mrocklin · 2019-07-16T18:28:23Z

Thinking about this from a diagnostics perspective, I wonder if the thing to track might instead be the expected size in bytes of each key prefix rather than the delta. My guess is that this will be more generally useful for a variety of applications.

We might then derive the delta information from this information on the fly for the specific use case of memory-aware scheduling.
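A rough sketch of what that could look like, under stated assumptions: `PrefixSizes` and `naive_delta` are hypothetical names, not existing scheduler attributes, and the delta here naively assumes that every dependency is freed once the task completes.

```python
from collections import defaultdict


class PrefixSizes:
    """Hypothetical store of the mean output size in bytes per key prefix."""

    def __init__(self):
        self._sum = defaultdict(int)
        self._n = defaultdict(int)

    def observe(self, prefix, nbytes):
        # Record the measured output size of one finished task.
        self._sum[prefix] += nbytes
        self._n[prefix] += 1

    def expected(self, prefix, default=0.0):
        # Mean observed size, or `default` for a prefix never seen.
        n = self._n.get(prefix, 0)
        return self._sum[prefix] / n if n else default


def naive_delta(sizes, prefix, dep_prefixes):
    """Expected output size minus expected dependency sizes.

    Assumes (optimistically) that all dependencies are released.
    """
    return sizes.expected(prefix) - sum(sizes.expected(p) for p in dep_prefixes)


sizes = PrefixSizes()
sizes.observe("load", 100)
sizes.observe("load", 300)
sizes.observe("sum", 8)
print(sizes.expected("load"), naive_delta(sizes, "sum", ["load"]))
```

The per-prefix sizes are what a dashboard would show; the delta is computed only when the scheduler needs it.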

Sometimes when thinking about scheduler state I find myself also being motivated by what would look good in a dashboard. This is a rather shallow objective on its own, but the information for good automated task scheduling can look surprisingly like the information for good human consumption.

Thoughts @TomAugspurger ?

TomAugspurger · 2019-07-16T18:39:39Z

I think tracking prefix-level memory usage, and then deriving the delta, was my initial path. It's been a little while, but IIRC that ran into a problem with knowing whether completing a task actually freed the memory of that task's dependencies.

It's not clear (to me) how we would know that completing b isn't responsible for freeing a. We could probably track some kind of task_frees: Dict[Tuple[prefix_1, prefix_2], int] that keeps a count of how often completing a task with prefix_1 caused the release of a task with prefix_2? I'm not sure it's clearer (though it would likely be more useful for diagnostics).
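For illustration only, the counter could be kept roughly like this. `record_release` is a hypothetical hook, and the sketch assumes the scheduler can tell which finished task triggered each release.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (finished prefix, released prefix) -> number of times this was observed
task_frees: Dict[Tuple[str, str], int] = Counter()


def record_release(finished_prefix: str, released_prefixes: Iterable[str]) -> None:
    """Count each dependency prefix released by completing `finished_prefix`."""
    for released in released_prefixes:
        task_frees[(finished_prefix, released)] += 1


# Completing a "sum" task released two "load" results, then one more later.
record_release("sum", ["load", "load"])
record_release("sum", ["load"])
print(task_frees[("sum", "load")])
```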

mrocklin · 2019-07-16T19:08:58Z

I think I mentioned this earlier and have since forgotten what the response was, but for any particular task we could probably include a fractional hit for every dependency based on the number of its dependents.

So in the case above b would be half responsible for freeing a (assuming top-to-bottom execution) and so when we compute if a task is memory-freeing or not it would get half of the weight of a.
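A small sketch of that weighting, assuming we know each dependency's size and how many dependents it has. `weighted_delta` is a hypothetical function name, not something in the scheduler today.

```python
def weighted_delta(output_nbytes, dependencies):
    """Estimate the net memory change of running one task.

    `dependencies` is an iterable of (dep_nbytes, n_dependents) pairs.
    Each dependency contributes dep_nbytes / n_dependents to the memory
    the task is credited with freeing.
    """
    freed = sum(nbytes / n for nbytes, n in dependencies if n > 0)
    return output_nbytes - freed


# a is 100 bytes and has two dependents (b and c); b produces 10 bytes.
# b gets credit for freeing half of a.
print(weighted_delta(10, [(100, 2)]))
```

A task counts as memory-freeing when the result is negative.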

TomAugspurger · 2020-09-11T14:04:40Z

I'm not actively working on this at the moment. Closing to clear the backlog.

Added Scheduler and Worker.net_bytes for memory tracking

264a475

Part 1 of dask#2602

TomAugspurger mentioned this pull request Jul 16, 2019

an example that shows the need for memory backpressure #2602

Closed

TomAugspurger added 2 commits July 16, 2019 12:11

handle nullable nbytes

e64716a

remove print

06a5d0e

TomAugspurger added 2 commits July 16, 2019 14:12

remove worker asserts

d2101cf

fix docs

7babd85

TomAugspurger closed this Sep 11, 2020

mrocklin mentioned this pull request Jun 8, 2021

Deprioritize/pause tasks that consume memory #4891

Open
