-
-
Notifications
You must be signed in to change notification settings - Fork 722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add methods Client.forward_logging()
and Client.unforward_logging()
.
#7276
Conversation
These allow worker tasks to seamlessly forward LogRecords handled by arbitrary loggers in their process back to the client process, where they may be emitted in the user's session. In other words, this is a simple, built-in solution for log aggregation that is independent of the user's cluster manager (if any). The implementation works by installing a new `ForwardLoggingPlugin` on the worker-side, which in turn adds a custom logging Handler to the desired logger. This Handler (`_ForwardLoggingHandler`) publishes serialized LogRecords objects to a topic that the client subscribes to with a handler that deserializes the LogRecord objects and sends them off to the appropriate logger in the client process. Care should be taken not to forward very chatty loggers at granular levels (e.g. DEBUG or TRACE) because the large number of resulting messages will put performance pressure on the dask scheduler. Very granular log aggregations are still best handled by an external aggregator, but the implementation here provides a convenient, out-of-the-box solution for simple debugging tasks and aggregation of sparse logging output from custom tasks.
Can one of the admins verify this patch? Admins can comment |
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 25 files - 1 25 suites - 1 13h 37m 40s ⏱️ - 35m 24s For more details on these failures, see this check. Results for commit baa15e5. ± Comparison against base commit b8dfda1. This pull request skips 1 test.
♻️ This comment has been updated with latest results. |
Test failures appear to be random and unrelated to this PR, coming from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maxbane, thanks for your work on this PR!
My main concern with the current implementation is that by sending all forwarded log messages to a global topic, users could run into log duplication. Registering a worker plugin with the same name is idempotent, but if one user forwards some logger foo
and another decides to forward all log records via the root logger, this will install two plugins, both writing foo
logs onto the same topic. As a result, each foo
log will be logged twice on both clients.
As a possible solution, I would suggest creating dedicated topics for the different plugins with a common prefix. This will make sure that the logs are only written once per plugin. As a bonus, clients can unsubscribe from a topic once they unforward
.
There's the additional problem of multiple clients forwarding the same logs where the first one leaving would drop the plugin. I don't consider this much of an issue, but it might be worth adding a small warning not to do that in the docstrings, what do you think? [EDIT]: Maybe we could emit a log record when removing the worker plugin that would inform other users that log forwarding has stopped?
Co-authored-by: Hendrik Makait <[email protected]>
Co-authored-by: Hendrik Makait <[email protected]>
Thanks for the review, @hendrikmakait! Good point that I hadn't considered. Without even involving the root logger, I think if someone forwards logger |
From what I recall, that's also what would happen if they installed two hierarchical log handlers on a local machine, so yeah, they kinda asked for it. If fixing the issue turns out to become more complicated for some reason, you could also just add a warning or two about that in the documentation and leave a more polished solution for future work (should someone need it). |
Implement @hendrikmakait's suggestion of a separate topic per plugin instance, so that multiple clients don't step on each other. This also allows clients to unsubscribe from the topic safely when unforwarding.
Hey @hendrikmakait, sorry for the long delay, but I've implemented your suggestion. Each plugin instance now forwards to a topic derived from a common prefix plus the plugin instance's name, which itself includes the name of the forwarded logger as the user provided it to Thanks for the feedback earlier, and let me know if this looks good to merge! |
Hey, just wanted to ping this PR. We were so close to merging it back in November :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maxbane: Sorry for the late reaction, I lost track of this before the winter break. The code looks good to me, thanks for implementing the suggestions!
For additional context, I know there's some contention among maintainers about adding more methods to the client, but in my opinion, the new ones are fine and could even be removed if we added support for a ClientPlugin
.
Could a committer please have another look and merge? cc @fjetter, @jrbourbeau
haven't reviewed the code but I'm fine with adding this as long as it's off by default which I believe is the case. @hendrikmakait I'll trust your judgmenet. |
@maxbane: I've merged the latest |
Thanks, @maxbase, for your contribution, and sorry for taking so long to get this merged. |
New client method
forward_logging()
allows worker tasks to seamlessly forward LogRecords handled by arbitrary loggers in their process back to the client process, where they may be emitted in the user's session. In other words, this is a simple, built-in solution for log aggregation that is independent of the user's cluster manager (if any).The implementation works by installing a new
ForwardLoggingPlugin
on the worker-side, which in turn adds a custom logging Handler to the desired logger. This Handler (_ForwardLoggingHandler
) publishes serialized LogRecords objects to a topic that the client subscribes to with a handler that deserializes the LogRecord objects and sends them off to the appropriate logger in the client process.Care should be taken not to forward very chatty loggers at granular levels (e.g. DEBUG or TRACE) because the large number of resulting messages will put performance pressure on the dask scheduler. Very granular log aggregations are still best handled by an external aggregator, but the implementation here provides a convenient, out-of-the-box solution for simple debugging tasks and aggregation of sparse logging output from custom tasks.
Adds test
test_client:test_forward_logging
.Relates to issues #2033, #5217.
Passes pre-commits.