Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using logging #690

Closed
tomschr opened this issue Dec 14, 2020 · 8 comments · Fixed by #614
Closed

Consider using logging #690

tomschr opened this issue Dec 14, 2020 · 8 comments · Fixed by #614

Comments

@tomschr
Copy link

tomschr commented Dec 14, 2020

Situation

This is more of a long term idea than an actual bug report.

After looking into the code, it seems that Python's logging module isn't used at all. There is a crmsh.msg package which tries to emulate(?) logging. It looks like it replicates the work what logging already does, so I fear, this is just reinventing the wheel. 😉

Suggested idea

Maybe there is a reason why this is the case. If feasible, I would suggest use logging instead of ErrorBuffer class as a long term strategy. The logging module has the following benefits:

  • Provides several ways to configure it: for example, you can save logging information into files or output it to the standard error channel.
  • Supports several levels of information, from simple debug messages, information, to warnings and error.
  • Makes it possible to configure logging hierarchies. For example, you can distinguish between corosync logging, UI logging, or anything else (if you want that).
  • Helps to dissect or filter information.
  • Allows to change the output of the logging messages.

For more information

@tomschr tomschr changed the title Conside using logging Consider using logging Dec 14, 2020
@liangxin1300
Copy link
Collaborator

Hi @tomschr ,

About logging, I totally agree with you since it does have many benefits.
I've already created a related PR here: #614
When higher priorities jobs got finished, I will push this PR reviewed and merged

Thank you!

@tomschr
Copy link
Author

tomschr commented Dec 15, 2020

Thanks @liangxin1300,
I've tried to search existing issues, but it seems my search missed that. 😄

If you think this is a duplicate, you can close this issue. :-) All fine, it's just an idea, no hurry.

@zzhou1 zzhou1 linked a pull request Dec 31, 2020 that will close this issue
@zzhou1
Copy link
Contributor

zzhou1 commented Jan 4, 2021

I second the request to improve loggin facility in general. Here is an example I encountered. Two findings:

1). When crm exits abnornally or even success for crmsh cluster remove. It SHOULD print the stdout message to guide user to check /var/log/crmsh/ha-cluster-boostrap.log. Well, this is printed for a successful operation, eg. crmsh cluster init|join ...

2). Error messages are treated differently, not aligned. Example below, the first Error message only exists in ha-cluster-boostrap.log, not in the stdout.

2020-12-31 11:21:53+08:00 /usr/sbin/crm cluster remove sle15sp2-2
----------------------------------------------------------------
# Stopping the hawk.service
+ ssh -o StrictHostKeyChecking=no root@sle15sp2-2 "bash -c \"rm -f /etc/sysconfig/sbd /etc/csync2/csync2.cfg /etc/corosync/corosync.conf /etc/csync2/key_hagroup /etc/corosync/authkey /var/lib/heartbeat/crm/* /var/lib/pacemaker/cib/*\""
# Removing the node sle15sp2-2
+ crm node delete sle15sp2-2
ERROR: according to crm_node, node sle15sp2-2 is still active
ERROR: Failed to remove sle15sp2-2
+ systemctl reload rsyslog.service

@tomschr
Copy link
Author

tomschr commented Jan 4, 2021

[...]
1). When crm exits abnornally or even success for crmsh cluster remove. It SHOULD print the stdout message [...]

and

2). Error messages are treated differently, not aligned. Example below, the first Error message only exists in ha-cluster-boostrap.log, not in the stdout.

Both issues can be tackled with Python's logging module. It's just a matter of configuration. For example, you could define your configuration in a way that it will log all events into a file and to some degree to stdout/stderr.

Depending on your wishes, you could output different information for different output handlers. You could even sent a mail to a specific server.

Here is an short example to demonstrate the basic principle:
https://gist.github.com/tomschr/10837c2f7286b13d3ebc59e40d943188

I would highly recommend to NOT use the root logger but create your own logger instance. Especially if someone wants to import the crmsh as a library (is that possible?) you have a dedicated logger.

@tomschr
Copy link
Author

tomschr commented Jan 4, 2021

Okay, some other questions to think about:

  1. Do you want to provide the configuration through a config file?
    The link from my last message provides the configuration inside the script itself. However, it is also possible to read the logging configuration from a file (INI or Yaml format).
    This would give the customer (and maybe support engineers?) a more flexible way. However, it could also mean, it's more error-prone. Not sure if you want to go down to this path.
  2. Do you want to log to a file and to stdout/stderr?
    Both is possible as I mentioned in my last message. It's just a matter of what information you want to log where.
  3. Do you want to create a hierarchical loggers?
    It's possible for the logging module to create child loggers. You can configure it as they react in the same way as their parent logger. The interesting part is when it comes to filtering.
    With hierarchical loggers you can, for example, dissect and filter your logging output. That means, you can just filter fencing events or maintenance.

@liangxin1300
Copy link
Collaborator

I second the request to improve loggin facility in general. Here is an example I encountered. Two findings:

1). When crm exits abnornally or even success for crmsh cluster remove. It SHOULD print the stdout message to guide user to check /var/log/crmsh/ha-cluster-boostrap.log. Well, this is printed for a successful operation, eg. crmsh cluster init|join ...

Added the same finish statement when finished removing node

2). Error messages are treated differently, not aligned. Example below, the first Error message only exists in ha-cluster-boostrap.log, not in the stdout.

2020-12-31 11:21:53+08:00 /usr/sbin/crm cluster remove sle15sp2-2
----------------------------------------------------------------
# Stopping the hawk.service
+ ssh -o StrictHostKeyChecking=no root@sle15sp2-2 "bash -c \"rm -f /etc/sysconfig/sbd /etc/csync2/csync2.cfg /etc/corosync/corosync.conf /etc/csync2/key_hagroup /etc/corosync/authkey /var/lib/heartbeat/crm/* /var/lib/pacemaker/cib/*\""
# Removing the node sle15sp2-2
+ crm node delete sle15sp2-2
ERROR: according to crm_node, node sle15sp2-2 is still active
ERROR: Failed to remove sle15sp2-2
+ systemctl reload rsyslog.service

Changed as fatal("Failed to remove {}: {}".format(node, err)), will also in the stderr
BTW, remove process might exist issue, I will create a new PR to fix that

@liangxin1300
Copy link
Collaborator

@tomschr ,

Okay, some other questions to think about:

  1. Do you want to provide the configuration through a config file?
    The link from my last message provides the configuration inside the script itself. However, it is also possible to read the logging configuration from a file (INI or Yaml format).
    This would give the customer (and maybe support engineers?) a more flexible way. However, it could also mean, it's more error-prone. Not sure if you want to go down to this path.

I think be in dictionary way will be good enough:) log.LOGGING_CFG includes some custom classes, it's more error-prone to change when in the file way

  1. Do you want to log to a file and to stdout/stderr?
    Both is possible as I mentioned in my last message. It's just a matter of what information you want to log where.

Yes, both in console and /var/log/crmsh/crmsh.log

  1. Do you want to create a hierarchical loggers?
    It's possible for the logging module to create child loggers. You can configure it as they react in the same way as their parent logger. The interesting part is when it comes to filtering.
    With hierarchical loggers you can, for example, dissect and filter your logging output. That means, you can just filter fencing events or maintenance.

Yes, I have implemented hierarchical loggers

Thanks for your nice suggestions!

@tomschr
Copy link
Author

tomschr commented Aug 26, 2021

@liangxin1300 That looks really great! 👍 Congrats for your implementation, well done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants