Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Better Logger #6901

Closed
wants to merge 10 commits into from
Closed

[WIP] Better Logger #6901

wants to merge 10 commits into from

Conversation

ezrast
Copy link

@ezrast ezrast commented Oct 2, 2018

See #5874

Still needs specs and documentation, but see https://gist.github.com/ezrast/37c3377c172cd225efe0ed40f4da1e59 for example usage.

@RX14
Copy link
Contributor

RX14 commented Oct 2, 2018

I like the interface but the implementation is a little on the complex side. I'd rather have a singleton root logger instead of extend Base. I don't think Logger.log should work either, you should need to .get a logger or use Logger::BASE (not recommended).

The HierarchyFilter could work on longest-substring match, which avoids assuming that :: is used as a separator. This can be done now by creating an array of the filter rules, ordered by match string length, and then iterating that. Later, replace with a radix tree.

This is a really good starting point though!

@RX14
Copy link
Contributor

RX14 commented Oct 2, 2018

Perhaps (not sure) also just have the Filter on the Emitter in this design, with the global Logger::ROOT (BASE was a bad name) controlling which emitter is used, and the emitter handling filtering.

So perhaps:

class Logger
  getter component : String
  property backend : Logger | Emitter
end

class Emitter
  property filter : Filter
end

This design allows having a "root logger" which isn't Logger::BASE, meaning there can be many trees of loggers, but usually only one with the root being Logger::BASE. The root of the logger tree references the emitter which controls where the logs go, how they're formatted, and if they're filtered.

I'd prefer to have filtering in a single place for understandability purposes, and this is the best way I can think to do it.


def message : String
if (msg = @message).is_a? Proc(String)
return @message = msg.call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant return, ditto below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let's focus on the general design before going into such details.

@ezrast
Copy link
Author

ezrast commented Oct 2, 2018

The HierarchyFilter could work on longest-substring match, which avoids assuming that :: is used as a separator. This can be done now by creating an array of the filter rules, ordered by match string length, and then iterating that. Later, replace with a radix tree.

If we don't assume a separator then Animals::Anteater will inherit from Animals::Ant. I think we want to avoid that?

Perhaps (not sure) also just have the Filter on the Emitter in this design, with the global Logger::ROOT (BASE was a bad name) controlling which emitter is used, and the emitter handling filtering.

So perhaps:

class Logger
  getter component : String
  property backend : Logger | Emitter
end

class Emitter
  property filter : Filter
end

The issue with this is that it makes the filter difficult to access from the outside - changing the default log level becomes Logger::ROOT.backend.as?(Emitter).try &.filter= Logger::DEBUG, which isn't great. If you want to avoid having nil filters on all the forwarders, I think the way to do it is to make Logger.get return a Forwarder object that doesn't inherit from Logger at all. Then #debug and friends live on Forwarder instead of, or in addition to, on Logger. I'm not sure whether that actually simplifies anything, but I'll play with it.

This design allows having a "root logger" which isn't Logger::BASE, meaning there can be many trees of loggers, but usually only one with the root being Logger::BASE. The root of the logger tree references the emitter which controls where the logs go, how they're formatted, and if they're filtered.

Whoops, I had meant to move get from a class method on Logger to an instance method on Logger::Base. Then one can create forwarders for any arbitrary logger.

@RX14
Copy link
Contributor

RX14 commented Oct 2, 2018

@ezrast if you add a filter= on Logger which asserts that this is the root logger and then sets the filter then it's just Logger::ROOT.filter =

@j8r
Copy link
Contributor

j8r commented Oct 2, 2018

@ezrast Note you haven't set the a mail linked with your github's account in your .gitconfig

@ezrast
Copy link
Author

ezrast commented Oct 2, 2018

@j8r Good catch! That explains why Github thinks I've never committed before.

@RX14 That still seems like fighting the type system. If we don't want components to have their own filters we should give them a type that doesn't have filters. Does this change do what you want in that regard? https://github.com/ezrast/crystal/pull/3/files

@RX14
Copy link
Contributor

RX14 commented Oct 2, 2018

@ezrast if you go that route then i'd rather just have Logger and RootLogger < Logger and only RootLogger has the emitter and filter.

But I'll wait for other people on the core team to chime in since they'll have their own opinions.

@straight-shoota
Copy link
Member

This Component type is what I already had in mind in #5874 (comment) where the directly used object would just be a shallow wrapper with a component path and a pointer to the logger instance. Since it's immutable, it could even be a struct and make calling Logger.get pretty cheap.

However, this implies a very centralized architecture with every log message routing through one main logger instance. It can get pretty complex to apply filters to specific component paths.

So, I'm not sure that's the best solution.

Looking at Java's logging API, every Logger instance has a filter and a list of handlers as well as an optional reference to a parent logger. Log messages are published to the local handlers and the handlers of all parent loggers (this propagation can be disabled).

@RX14
Copy link
Contributor

RX14 commented Oct 3, 2018

I tend to think that copying what java does is the best. There have been about 5-10 java logging libraries over the years, and they're very well optimized by now for production usecases.

@RX14
Copy link
Contributor

RX14 commented Oct 3, 2018

I've checked through way too many levels of abstraction to confirm that in log4j2, each logger has a filter, a set of appenders (handlers) and then forwards to it's parent. I'd probably just copy that for now, although it does require a registry of logger instances to look up parents, but that's required anyway to reconfigure loggers at runtime.

Copy link
Member

@bcardiff bcardiff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the interface. The implementation is a bit hard to follow in some cases.
There are some specific comments. As long as we can be clear on the smallest public interface the user should expect I'm positive to move forward in this overall direction.

I definitely would like to see a module the user can be referenced to emit events and a factory method. And decouple the story of filtering in another module/interface. That should help the description to the user.

src/logger.cr Show resolved Hide resolved
require "./severity"

class Logger
class Entry
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entry should be a struct like Message was before. So we avoid allocations for each entry.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entry mutates itself so that message strings are evaluated at most once in the case of calls like logger.debug{ do_expensive_thing.to_s }. IMO this is a good semantic to preserve.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's worth it. Log messages are usually not very expensive and then it's much better to avoid allocations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @straight-shoota . It is tempting the idea of lazy building the logger message, but not at the cost of stressing the GC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think lazy evaluation will save allocations in real-world scenarios, but putting them back in should be a non-breaking change, so I've removed them and will re-raise the issue in a later ticket if I can generate some meaningful benchmarks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, lazy evaluation itself really isn't the issue. It's memoizing the result...

src/logger.cr Outdated
when Severity
return unless entry.severity >= ff
end
emitter.try &.call(entry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was hard to follow how the Logger.class that is a Logger::Base ends up chained as the default emitter.

@RX14
Copy link
Contributor

RX14 commented Oct 3, 2018

I think we still have architecture issues to discuss before nitpicking this PR.

@ezrast
Copy link
Author

ezrast commented Oct 3, 2018

@RX14, as a very general response to your Java comment: in my experience Java things tend to be a pain to deal with operationally, and one of the many reasons is that their components write things to lots of different log files, including duplicating entries between log files, all loosely controlled by a configuration file according to a scheme that is largely inscrutable to non-Java programmers. So I'm in favor of pushing configuration into the root logger instance. I don't want to encourage components to write to their own log files "just in case" and then also push logs downstream to be written again somewhere else.

That said, changing Logger to hold an array of Emitters seems like a benign change, and after that (plus making get an instance method) what we'll have isn't that different from what you and @straight-shoota describe: calling Logger#get will return another logger that forwards to its parent by default but can have a filter and arbitrary emitters added to it.

The main thing missing is a registry, which I don't understand the use case for. Global configuration should go in the root logger; why make every component's logger be available globally?

@ezrast
Copy link
Author

ezrast commented Oct 3, 2018

I definitely would like to see a module the user can be referenced to emit events and a factory method. And decouple the story of filtering in another module/interface. That should help the description to the user.

@bcardiff, Can you clarify what you mean by this? Emitter and Filter are already their own modules that will be documented individually.

@RX14
Copy link
Contributor

RX14 commented Oct 3, 2018

@ezrast okay, yes, root logger with an array of Emitters, each emitter receiving every log and having separate filtering settings is fine. No registry or parenting arrangement required.

Also the naming is iffy. I'd propose:

  • Logger::Manager - Holds the array of Collectors
  • Logger::Collector - Handles the core logic of filtering and serializing logs
  • Logger::Filter
  • Logger itself is a struct containing a String context and a reference to the Logger::Manager instance
  • Logger::DEFAULT_MANAGER, a default Manager
  • Logger::Manager#get(String), returns a Logger
  • Logger.get - forwards to Logger::DEFAULT_MANAGER.get

Thoughts?

@straight-shoota
Copy link
Member

@RX14 I'm not sure if we have agreed on using thin wrappers. Honestly, I'm not even sure myself if this is a great idea.
What speak against having a fat Logger instance everywhere logging is needed? After all, you typically don't need a lot of them and they stay around for a long time, so it's not much overhead to allocate. But it's easier to manage filters, because they only need to be applied to the specific instance, instead of needing to distinguish individual paths in the main logger.

@RX14
Copy link
Contributor

RX14 commented Oct 3, 2018

@straight-shoota because if loggers have state, they need to be stored in a registry so that they can be configured using a central API. That complicates the whole design.

@straight-shoota
Copy link
Member

Hm, yeah. But now Logger::Manager is kind of a registry. It at least needs to associate filters to component paths. I'm not favouring any of both approaches, just thinking about it...

@ezrast
Copy link
Author

ezrast commented Oct 3, 2018

The current PR has no registry, distinguishes between components hierarchically in the root logger if you want it to (for which the entire filter class is 23 lines of code and not complex at all), and still has fat loggers at the leaves just in case you want to do something wacky. I'm skeptical that it could get much less complicated.

@RX14
Copy link
Contributor

RX14 commented Oct 3, 2018

@straight-shoota except in this design the "registry" (really just parsing the component) is handled by Collectors, and collectors can be implemented by shards. This moves almost all the complexity outside the stdlib and allows flexibility. The stdlib will still provide good solutions for deployments with only a few rules (see the API for Logger::HierarchyFilter and compare that to a registry)

src/logger.cr Outdated
when Filter, Proc(Entry, Bool)
return unless ff.call(entry)
when Severity
return unless entry.severity >= ff
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean return if entry.severity < ff

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The expressions are semantically equal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if X is inferior than Y" is simpler to understand than "unless X is superior or equal than Y", my opinion.

@ezrast
Copy link
Author

ezrast commented Oct 10, 2018

I made some of the less-controversial (I think; feel free to prove me wrong) requested changes and updated https://gist.github.com/ezrast/37c3377c172cd225efe0ed40f4da1e59 accordingly.

Outstanding discussion topics include:

  • Make Logger#get return a thin forwarder instance (removes a footgun but also some flexibility; makes the implementation slightly more complex) or keep it as a fat logger that we just pretend is thin (simplest solution but potentially confusing)?
  • Filter-per-emitter or filter-per-logger?
  • What to name things? (should be deferred until the architecture is nailed down)

Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's getting into really good shape, I think!

src/logger.cr Outdated
emitters.each &.call(entry)
property component : String
property filter : FilterType?
property emitters : Array(EmitterType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the ivar accessors became properties? I don't think they need to be mutable. getter should be enough.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made filter a property since Logger::ROOT.filter = ... needs to be possible, and then brought the rest along for consistency. But I agree immutability would be better in other cases. I think that's an argument in favor of making Logger.get return a class that is thinner than Logger.

component can be made immutable, in any case. emitters doesn't really matter since the array will always be mutable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I see the last commit. Yeah, when we're not using thin wrappers, filter should probably be mutable. But component and emitters do not.

@straight-shoota
Copy link
Member

  • I'm still not sure about the thin wrappers, but I figure we can leave it as is for now, as it is a sound solution. Once we get some experience with real world usage of the loggers, and determine that thin wrappers would provide an improvement, this can always be revisited.
  • I'd say per logger. That makes it easy to share the same emitters across different logger instances. If you need different filter settings, you can just create another logger instance and forward messages to it. An alternative would be a FilterEmitter which wraps an Emitter and filters messages before relaying them. But it's not much different from another Logger plus Forwarder.
  • The current naming actually looks quite nice.

Copy link
Contributor

@RX14 RX14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logger needs to be stateless and immutable, because otherwise you need to keep a registry of loggers to make filters and appenders configurable by the application.

I much prefer the simple and understandable way of forwarding all logs to a central manager, which distributes those logs between several consumers which can optionally filter logs. The design I described before has the same flexibility as this design, while massively reducing the amount of mutable state.

src/logger.cr Outdated
require "logger/entry"
require "logger/filter"
require "logger/emitter"
require "logger/severity"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger/*?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you prefer, sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO individual includes are preferable. They show exactly what gets included and where. And you can apply an order which can be of significance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@straight-shoota I agree, but for now I'll be consistent with the rest of the codebase. Removing wildcard requires from the language is a fight for tomorrow. ;)

@RX14
Copy link
Contributor

RX14 commented Oct 10, 2018

Ultimately, what i'm trying to say is that the data flow graph for my proposal is this:

 Logger --\                      /--> Logger::Collector -------> ...                                            
          |                      |           ^                                                                  
          |                      |           |                                                                  
 Logger --|--> Logger::Manager --|           |                                                                  
          |                      |           v                                                                  
          |                      |    Logger::Filter                                                            
 Logger --/                      |                                                                            
                                 |                                                                          
                                 \--> Logger::Collector -------> ...                                                                  
                                             ^                                                                                       
                                             |                                                                                       
                                             |                                                                                       
                                             v                                                                                       
                                      Logger::Filter                                                                                 

whereas, the same graph for this PR is spaghetti

@ezrast
Copy link
Author

ezrast commented Oct 10, 2018

Logger needs to be stateless and immutable, because otherwise you need to keep a registry of loggers to make filters and appenders configurable by the application.

@RX14, a registry isn't in the question at this point. It's "thin loggers; central configuration" vs "fat loggers; central configuration". In practice, the current PR's data flow is like this --

 Logger (leaf) --> Logger::Forwarder --\                    /--> Logger::Emitter --> ...
                                       |                    |
                                       |                    |
 Logger (leaf) --> Logger::Forwarder --|--> Logger (root) --|--> Logger::Emitter --> ...
                                       |          ^         |
                                       |          |         |
 Logger (leaf) --> Logger::Forwarder --/          |         \--> Logger::Emitter --> ...
                                                  v
                                            Logger::Filter

-- which has the potential to turn into spaghetti if people go adding filters and emitters to the leaf loggers, but the docs should make it clear that that's not the intended use case. Otherwise, the only effective difference between your graph and mine is where the filters go, which is a separate question.

I am not actually against the idea of giving leaf loggers their own immutable class, I just think you're overstating the difference between the two setups.

@straight-shoota
Copy link
Member

One important difference between both data flow diagrams isn't shown. In RX14's design, Logger.get("Foo") will always return essentially and identical logger whereas in the current state it will always return a new instance and when you modify it, with say a new filter, other instances will not be affected. This makes it the duty of the application to ensure it shares its logger.

@ezrast
Copy link
Author

ezrast commented Oct 10, 2018

Yeah, that just occurred to me as well and is definitely pushing me in favor of immutable loggers.

@RX14
Copy link
Contributor

RX14 commented Oct 10, 2018

@ezrast that's what i meant by "Logger needs to be stateless and immutable, because otherwise you need to keep a registry of loggers". If Loggers hold configuration data, applications then need to be able to access the loggers of their libraries through .get, which requires a registry. If loggers are immutable, then the config is entirely in the Logger::Manager and the Collectors/Filters - which are easily accessible through the manager.

Also, if you give people the ability to add filters and emitters to leaf loggers, they will add filters and emitters to leaf loggers. If you add filters and emitters to leaf loggers, the data flow diagram does turn into spaghetti.

@ezrast
Copy link
Author

ezrast commented Oct 21, 2018

^ This splits out the guts of Logger into Logger::Dispatcher ("manager" is a very non-descriptive name imo). Logger itself is now a thin struct. What was Logger::ROOT is now Logger.default_dispatcher. Gist updated: https://gist.github.com/ezrast/37c3377c172cd225efe0ed40f4da1e59/revisions

I think @straight-shoota has a good point about putting filters on the Emitter. Emitters may do globally-stateful things like rate limiting or database locking so it makes sense to want to use the same Emitter in two different places that may not have the same log level.

Dispatcher includes Emitter so you can chain Dispatchers together if you want some Emitters to use a separate log level:

# This MyEmitter instance will respond only to warnings and above
Logger.default_dispatcher.emitters.push(
  Logger::Dispatcher.new(Logger::WARN, MyEmitter.new)
)

@RX14
Copy link
Contributor

RX14 commented Oct 21, 2018

Looks good. I'll wait for @bcardiff to comment on the design before doing a full review.

@z64
Copy link
Contributor

z64 commented Feb 18, 2019

Could this get a review :)? 🏓 This is at least blocking crystal-db from getting out of the box logging which is sorely needed, and the rest of the driver ecosystem would immediately benefit from. (crystal-lang/crystal-db#64 (comment)).

It would be great if the design here could be approved so that this PR can continue, or the work can be picked up and completed in another.

@ezrast
Copy link
Author

ezrast commented Feb 19, 2019

Thanks for the bump, @z64.

If it would help this get reviewed I can find time to add in documentation soon. Otherwise I'm holding off on docs/tests until the design gets approval.

@z64 z64 mentioned this pull request Mar 4, 2020
@Blacksmoke16
Copy link
Member

Prob can be closed now due to #8847.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants