Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add methods Client.forward_logging() and Client.unforward_logging(). #7276

Merged
merged 10 commits into from
Apr 5, 2023

Conversation

maxbane
Copy link
Contributor

@maxbane maxbane commented Nov 9, 2022

New client method forward_logging() allows worker tasks to seamlessly forward LogRecords handled by arbitrary loggers in their process back to the client process, where they may be emitted in the user's session. In other words, this is a simple, built-in solution for log aggregation that is independent of the user's cluster manager (if any).

The implementation works by installing a new ForwardLoggingPlugin on the worker-side, which in turn adds a custom logging Handler to the desired logger. This Handler (_ForwardLoggingHandler) publishes serialized LogRecords objects to a topic that the client subscribes to with a handler that deserializes the LogRecord objects and sends them off to the appropriate logger in the client process.

Care should be taken not to forward very chatty loggers at granular levels (e.g. DEBUG or TRACE) because the large number of resulting messages will put performance pressure on the dask scheduler. Very granular log aggregations are still best handled by an external aggregator, but the implementation here provides a convenient, out-of-the-box solution for simple debugging tasks and aggregation of sparse logging output from custom tasks.

Adds test test_client:test_forward_logging.

Relates to issues #2033, #5217.
Passes pre-commits.

These allow worker tasks to seamlessly forward LogRecords handled by
arbitrary loggers in their process back to the client process, where
they may be emitted in the user's session. In other words, this is a
simple, built-in solution for log aggregation that is independent of the
user's cluster manager (if any).

The implementation works by installing a new `ForwardLoggingPlugin` on
the worker-side, which in turn adds a custom logging Handler to the
desired logger. This Handler (`_ForwardLoggingHandler`) publishes
serialized LogRecords objects to a topic that the client subscribes to
with a handler that deserializes the LogRecord objects and sends them
off to the appropriate logger in the client process.

Care should be taken not to forward very chatty loggers at granular
levels (e.g. DEBUG or TRACE) because the large number of resulting
messages will put performance pressure on the dask scheduler. Very
granular log aggregations are still best handled by an external
aggregator, but the implementation here provides a convenient,
out-of-the-box solution for simple debugging tasks and aggregation of
sparse logging output from custom tasks.
@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 9, 2022

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       25 files   -        1         25 suites   - 1   13h 37m 40s ⏱️ - 35m 24s
  3 547 tests +       1    3 436 ✔️ ±       0     106 💤 +  1  5 ±0 
43 571 runs   - 1 287  41 365 ✔️  - 1 256  2 201 💤  - 31  5 ±0 

For more details on these failures, see this check.

Results for commit baa15e5. ± Comparison against base commit b8dfda1.

This pull request skips 1 test.
distributed.protocol.tests.test_protocol ‑ test_large_messages

♻️ This comment has been updated with latest results.

@maxbane
Copy link
Contributor Author

maxbane commented Nov 9, 2022

Test failures appear to be random and unrelated to this PR, coming from test_semaphore.py and test_steal.py.

Copy link
Member

@hendrikmakait hendrikmakait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxbane, thanks for your work on this PR!

My main concern with the current implementation is that by sending all forwarded log messages to a global topic, users could run into log duplication. Registering a worker plugin with the same name is idempotent, but if one user forwards some logger foo and another decides to forward all log records via the root logger, this will install two plugins, both writing foo logs onto the same topic. As a result, each foo log will be logged twice on both clients.

As a possible solution, I would suggest creating dedicated topics for the different plugins with a common prefix. This will make sure that the logs are only written once per plugin. As a bonus, clients can unsubscribe from a topic once they unforward.

There's the additional problem of multiple clients forwarding the same logs where the first one leaving would drop the plugin. I don't consider this much of an issue, but it might be worth adding a small warning not to do that in the docstrings, what do you think? [EDIT]: Maybe we could emit a log record when removing the worker plugin that would inform other users that log forwarding has stopped?

distributed/client.py Outdated Show resolved Hide resolved
distributed/client.py Outdated Show resolved Hide resolved
distributed/client.py Outdated Show resolved Hide resolved
maxbane and others added 2 commits November 10, 2022 10:43
@maxbane
Copy link
Contributor Author

maxbane commented Nov 10, 2022

Thanks for the review, @hendrikmakait! Good point that I hadn't considered. Without even involving the root logger, I think if someone forwards logger "foo" and then (possibly someone else in another client) also forwards "foo.bar", then messages to "foo.bar" may get forwarded twice, even in both clients. I'll run some tests. Your proposed fix makes sense... if someone forwards both loggers in their own client, they will still get duplicate messages (which I think is fine; in a sense it's what they asked for), but at least it won't spill across clients.

@hendrikmakait
Copy link
Member

if someone forwards both loggers in their own client, they will still get duplicate messages (which I think is fine; in a sense it's what they asked for), but at least it won't spill across clients.

From what I recall, that's also what would happen if they installed two hierarchical log handlers on a local machine, so yeah, they kinda asked for it.

If fixing the issue turns out to become more complicated for some reason, you could also just add a warning or two about that in the documentation and leave a more polished solution for future work (should someone need it).

Implement @hendrikmakait's suggestion of a separate topic per plugin
instance, so that multiple clients don't step on each other. This also
allows clients to unsubscribe from the topic safely when unforwarding.
@maxbane
Copy link
Contributor Author

maxbane commented Dec 9, 2022

Hey @hendrikmakait, sorry for the long delay, but I've implemented your suggestion. Each plugin instance now forwards to a topic derived from a common prefix plus the plugin instance's name, which itself includes the name of the forwarded logger as the user provided it to forward_logging(). This should prevent your scenario of two different clients who request overlapping logger specs (e.g. "foo" and "foo.bar") from seeing each other's log records in a surprising way. It also, as you point out, allows the client to unsubscribe cleanly in unfoward_logging(), since we now have a unique topic per plugin.

Thanks for the feedback earlier, and let me know if this looks good to merge!

@maxbane maxbane requested a review from hendrikmakait December 9, 2022 19:41
@maxbane
Copy link
Contributor Author

maxbane commented Jan 17, 2023

Hey, just wanted to ping this PR. We were so close to merging it back in November :)

cc @hendrikmakait @jrbourbeau

Copy link
Member

@hendrikmakait hendrikmakait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxbane: Sorry for the late reaction, I lost track of this before the winter break. The code looks good to me, thanks for implementing the suggestions!

For additional context, I know there's some contention among maintainers about adding more methods to the client, but in my opinion, the new ones are fine and could even be removed if we added support for a ClientPlugin.

Could a committer please have another look and merge? cc @fjetter, @jrbourbeau

@fjetter
Copy link
Member

fjetter commented Apr 5, 2023

haven't reviewed the code but I'm fine with adding this as long as it's off by default which I believe is the case. @hendrikmakait I'll trust your judgmenet.
@maxbane thanks for the contribution and sorry for the long delay

@hendrikmakait
Copy link
Member

@maxbane: I've merged the latest main into your branch to verify that this PR is still compatible. I'll merge it once CI is green.

@hendrikmakait hendrikmakait merged commit 078d06f into dask:main Apr 5, 2023
@hendrikmakait
Copy link
Member

Thanks, @maxbase, for your contribution, and sorry for taking so long to get this merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants