Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Support outputting structured logs in addition to standard logs #8607

Merged
merged 23 commits into from
Oct 29, 2020

Conversation

clokep
Copy link
Member

@clokep clokep commented Oct 20, 2020

This supports outputting structured logs at the same time as "standard" logs by reworking how the structured logging system works to be configurable as part of the standard Python logging config. This unfortunately got very large, but I've been unable to come up with a reasonable plan to split it up. It should be reviewable commit-by-commit, although a couple are still rather large! I can walk through it with someone if that's easiest!

The first commit removes the custom configuration code (synapse.logging._structured) for structured logging, which is no longer necessary.

The main commit converts the JSON log observers into a Formatter and the TCP log observer into a Handler. This is essentially converting from Twisted logging infrastructure to the Python standard library infrastructure. Due to this, the terse JSON code gets simplified since a LogRecord is simpler than the Twisted log events.

After doing that we need to add back support for passing the server_name through to the output via a Filter.

Finally, we update synmark to use the new code, which is fairly invasive. I ended up modifying it to avoid creating a homeserver and only instantiating the logging framework, this should be more realistic for benchmarking.

There's also a couple of clean-up commits, but hopefully those are self-explanatory.

This builds on #8587

Fixes #8588

@clokep clokep force-pushed the clokep/struct-logging-via-stdlib branch from d3d59ab to 1d0d149 Compare October 21, 2020 11:09
@clokep clokep force-pushed the clokep/struct-logging-via-stdlib branch 2 times, most recently from 1c56b1c to 22e7b22 Compare October 21, 2020 18:14
This dramatically simplifies the configuration necessary for running
the synmark logging suite by avoiding creating a homeserver object
we can avoid creating a database and more quickly run the testing
suite.
@clokep clokep force-pushed the clokep/struct-logging-via-stdlib branch from 22e7b22 to 4462758 Compare October 22, 2020 12:53
@clokep clokep marked this pull request as ready for review October 22, 2020 13:22
@clokep clokep requested a review from a team October 22, 2020 13:23
@clokep
Copy link
Member Author

clokep commented Oct 22, 2020

I should add something to UPGRADE.rst about this as it is a breaking change in the config file format.

synapse/logging/__init__.py Outdated Show resolved Hide resolved
tests/logging/test_remote_handler.py Outdated Show resolved Hide resolved
synapse/logging/_terse_json.py Outdated Show resolved Hide resolved
synapse/logging/_terse_json.py Outdated Show resolved Hide resolved
synapse/logging/filter.py Show resolved Hide resolved
@erikjohnston
Copy link
Member

I should add something to UPGRADE.rst about this as it is a breaking change in the config file format.

We probably should also throw a helpful exception if we see structured: true in the config?

@erikjohnston
Copy link
Member

@clokep What is your intention for producing structured logs? I think the reason for using the twisted logger in the first place was that it has native support structured logging? I guess I'd kinda like to see an example of how we'd actually attach extra metadata to the log lines?

Co-authored-by: Erik Johnston <[email protected]>
@clokep
Copy link
Member Author

clokep commented Oct 26, 2020

@clokep What is your intention for producing structured logs? I think the reason for using the twisted logger in the first place was that it has native support structured logging? I guess I'd kinda like to see an example of how we'd actually attach extra metadata to the log lines?

@erikjohnston and I had some conversation about this out of band, it seems the intention of the original code was to switch to using the Twisted logging system in places that we want structured logs to make them something like:

log.info("Stuff {stuff} and {things}", stuff=stuff, things=things, extra_data=extra_data)

The Python docs on structured logging don't seem super pretty to use, however.

I think I need to:

  1. Investigate how additional info for particular log entries could be included.
  2. Include an example in this PR (or have a follow-up ready) where we actually use structured logging.

Erik also asked:

Also, ooi do you know how the stdlib TCP logging handler compares to our custom one?

The code in this PR uses Twisted for the TCP mechanism which should help with ensuring that logging doesn't block the reactor. This is expanded a bit on the issue: #8588 (comment)

@clokep
Copy link
Member Author

clokep commented Oct 26, 2020

I should add something to UPGRADE.rst about this as it is a breaking change in the config file format.

We probably should also throw a helpful exception if we see structured: true in the config?

I added this. 👍

@clokep
Copy link
Member Author

clokep commented Oct 27, 2020

I guess I'd kinda like to see an example of how we'd actually attach extra metadata to the log lines?

7fb5505 (#8607) now includes a test that shows how additional metadata can be attached to log lines. The tl;dr is that the standard extra keyword is used:

The fourth keyword argument is extra which can be used to pass a dictionary which is used to populate the __dict__ of the LogRecord created for the logging event with user-defined attributes. These custom attributes can then be used as you like. For example, they could be incorporated into logged messages.

The example using the Twisted logging above was:

log.info("Stuff {stuff} and {things}", stuff=stuff, things=things, extra_data=extra_data)

This would be become:

logger.info("Stuff %s and %s", stuff, things, extra={"stuff": stuff, "things": things})

Which is a bit more manual, but seems like a reasonable way to get started?

@clokep
Copy link
Member Author

clokep commented Oct 27, 2020

As a summary (since quite a few changes were made):

  • 24ab2df was the first commit after review
  • Some trivial formatting / style changes were made
  • Specific comments were handled and the comment threads resolved above.
  • An error is now raised is structured: true is still in the logging config.
  • The sample configuration was updated to point to the structured logging docs.
  • Tests were reworked a bit to abstract code.
  • An additional test for including additional structured data was included.

Open questions:

  • Any unresolved comment threads (seems to only be one about including the logging time).
  • Whether using extra to include additional information is OK or not.

@clokep clokep requested a review from erikjohnston October 27, 2020 11:48
@clokep clokep self-assigned this Oct 27, 2020
@erikjohnston
Copy link
Member

This broadly looks good. @clokep shall I wait until we've heard back from ops, or do you want to just land this?

@clokep
Copy link
Member Author

clokep commented Oct 28, 2020

This broadly looks good. @clokep shall I wait until we've heard back from ops, or do you want to just land this?

I think I'd prefer to land this as is, I don't think it makes the output structured logs worse than they are now and it does allow running both standard and structured logging at the same time, which is a definite improvement. Additionally, adding more into this PR seems like a bad idea.

I think work left to do includes:

  • Coming up with a way to not have to repeat information for the structured logs (e.g. right now it requires the extra dict).
  • Switching some of our logging over to include better / more information.
  • Adding additional context (from the request?) onto logged messages (besides just the request ID).

Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to remember to yell loudly in the changelog

@clokep
Copy link
Member Author

clokep commented Oct 28, 2020

We need to remember to yell loudly in the changelog

Does the info added to UPGRADE.rst count? 😄

@clokep
Copy link
Member Author

clokep commented Oct 28, 2020

From our conversation off GitHub about making this backwards compatible the following changes were enacted:

  • Instead of raising an error if structured: true is seen this kicks off a translation of the config to a compatible config.
  • This adds a JsonFormatter, which is the TerseJsonFormatter without the time. (Why is the "terse" formatter less-terse?)
  • Updates the upgrade notes and documentation about these changes (including a guide for how to upgrade your config).

The diff is a little funky since I brought back the synapse.logging._structured file which had some code I wanted.

@clokep clokep requested a review from erikjohnston October 28, 2020 19:22
@clokep clokep merged commit 00b24aa into develop Oct 29, 2020
@clokep clokep deleted the clokep/struct-logging-via-stdlib branch October 29, 2020 11:27
@richvdh richvdh mentioned this pull request Oct 31, 2020
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Nov 18, 2020
Synapse 1.23.0 (2020-11-18)
===========================

This release changes the way structured logging is configured. See the [upgrade notes](UPGRADE.rst#upgrading-to-v1230) for details.

**Note**: We are aware of a trivially exploitable denial of service vulnerability in versions of Synapse prior to 1.20.0. Complete details will be disclosed on Monday, November 23rd. If you have not upgraded recently, please do so.

Bugfixes
--------

- Fix a dependency versioning bug in the Dockerfile that prevented Synapse from starting. ([\#8767](matrix-org/synapse#8767))


Synapse 1.23.0rc1 (2020-11-13)
==============================

Features
--------

- Add a push rule that highlights when a jitsi conference is created in a room. ([\#8286](matrix-org/synapse#8286))
- Add an admin api to delete a single file or files that were not used for a defined time from server. Contributed by @dklimpel. ([\#8519](matrix-org/synapse#8519))
- Split admin API for reported events (`GET /_synapse/admin/v1/event_reports`) into detail and list endpoints. This is a breaking change to #8217 which was introduced in Synapse v1.21.0. Those who already use this API should check their scripts. Contributed by @dklimpel. ([\#8539](matrix-org/synapse#8539))
- Support generating structured logs via the standard logging configuration. ([\#8607](matrix-org/synapse#8607), [\#8685](matrix-org/synapse#8685))
- Add an admin API to allow server admins to list users' pushers. Contributed by @dklimpel. ([\#8610](matrix-org/synapse#8610), [\#8689](matrix-org/synapse#8689))
- Add an admin API `GET /_synapse/admin/v1/users/<user_id>/media` to get information about uploaded media. Contributed by @dklimpel. ([\#8647](matrix-org/synapse#8647))
- Add an admin API for local user media statistics. Contributed by @dklimpel. ([\#8700](matrix-org/synapse#8700))
- Add `displayname` to Shared-Secret Registration for admins. ([\#8722](matrix-org/synapse#8722))


Bugfixes
--------

- Fix fetching of E2E cross signing keys over federation when only one of the master key and device signing key is cached already. ([\#8455](matrix-org/synapse#8455))
- Fix a bug where Synapse would blindly forward bad responses from federation to clients when retrieving profile information. ([\#8580](matrix-org/synapse#8580))
- Fix a bug where the account validity endpoint would silently fail if the user ID did not have an expiration time. It now returns a 400 error. ([\#8620](matrix-org/synapse#8620))
- Fix email notifications for invites without local state. ([\#8627](matrix-org/synapse#8627))
- Fix handling of invalid group IDs to return a 400 rather than log an exception and return a 500. ([\#8628](matrix-org/synapse#8628))
- Fix handling of User-Agent headers that are invalid UTF-8, which caused user agents of users to not get correctly recorded. ([\#8632](matrix-org/synapse#8632))
- Fix a bug in the `joined_rooms` admin API if the user has never joined any rooms. The bug was introduced, along with the API, in v1.21.0. ([\#8643](matrix-org/synapse#8643))
- Fix exception during handling multiple concurrent requests for remote media when using multiple media repositories. ([\#8682](matrix-org/synapse#8682))
- Fix bug that prevented Synapse from recovering after losing connection to the database. ([\#8726](matrix-org/synapse#8726))
- Fix bug where the `/_synapse/admin/v1/send_server_notice` API could send notices to non-notice rooms. ([\#8728](matrix-org/synapse#8728))
- Fix PostgreSQL port script fails when DB has no backfilled events. Broke in v1.21.0. ([\#8729](matrix-org/synapse#8729))
- Fix PostgreSQL port script to correctly handle foreign key constraints. Broke in v1.21.0. ([\#8730](matrix-org/synapse#8730))
- Fix PostgreSQL port script so that it can be run again after a failure. Broke in v1.21.0. ([\#8755](matrix-org/synapse#8755))


Improved Documentation
----------------------

- Instructions for Azure AD in the OpenID Connect documentation. Contributed by peterk. ([\#8582](matrix-org/synapse#8582))
- Improve the sample configuration for single sign-on providers. ([\#8635](matrix-org/synapse#8635))
- Fix the filepath of Dex's example config and the link to Dex's Getting Started guide in the OpenID Connect docs. ([\#8657](matrix-org/synapse#8657))
- Note support for Python 3.9. ([\#8665](matrix-org/synapse#8665))
- Minor updates to docs on running tests. ([\#8666](matrix-org/synapse#8666))
- Interlink prometheus/grafana documentation. ([\#8667](matrix-org/synapse#8667))
- Notes on SSO logins and media_repository worker. ([\#8701](matrix-org/synapse#8701))
- Document experimental support for running multiple event persisters. ([\#8706](matrix-org/synapse#8706))
- Add information regarding the various sources of, and expected contributions to, Synapse's documentation to `CONTRIBUTING.md`. ([\#8714](matrix-org/synapse#8714))
- Migrate documentation `docs/admin_api/event_reports` to markdown. ([\#8742](matrix-org/synapse#8742))
- Add some helpful hints to the README for new Synapse developers. Contributed by @chagai95. ([\#8746](matrix-org/synapse#8746))


Internal Changes
----------------

- Optimise `/createRoom` with multiple invited users. ([\#8559](matrix-org/synapse#8559))
- Implement and use an `@lru_cache` decorator. ([\#8595](matrix-org/synapse#8595))
- Don't instansiate Requester directly. ([\#8614](matrix-org/synapse#8614))
- Type hints for `RegistrationStore`. ([\#8615](matrix-org/synapse#8615))
- Change schema to support access tokens belonging to one user but granting access to another. ([\#8616](matrix-org/synapse#8616))
- Remove unused OPTIONS handlers. ([\#8621](matrix-org/synapse#8621))
- Run `mypy` as part of the lint.sh script. ([\#8633](matrix-org/synapse#8633))
- Correct Synapse's PyPI package name in the OpenID Connect installation instructions. ([\#8634](matrix-org/synapse#8634))
- Catch exceptions during initialization of `password_providers`. Contributed by Nicolai Søborg. ([\#8636](matrix-org/synapse#8636))
- Fix typos and spelling errors in the code. ([\#8639](matrix-org/synapse#8639))
- Reduce number of OpenTracing spans started. ([\#8640](matrix-org/synapse#8640), [\#8668](matrix-org/synapse#8668), [\#8670](matrix-org/synapse#8670))
- Add field `total` to device list in admin API. ([\#8644](matrix-org/synapse#8644))
- Add more type hints to the application services code. ([\#8655](matrix-org/synapse#8655), [\#8693](matrix-org/synapse#8693))
- Tell Black to format code for Python 3.5. ([\#8664](matrix-org/synapse#8664))
- Don't pull event from DB when handling replication traffic. ([\#8669](matrix-org/synapse#8669))
- Abstract some invite-related code in preparation for landing knocking. ([\#8671](matrix-org/synapse#8671), [\#8688](matrix-org/synapse#8688))
- Clarify representation of events in logfiles. ([\#8679](matrix-org/synapse#8679))
- Don't require `hiredis` package to be installed to run unit tests. ([\#8680](matrix-org/synapse#8680))
- Fix typing info on cache call signature to accept `on_invalidate`. ([\#8684](matrix-org/synapse#8684))
- Fail tests if they do not await coroutines. ([\#8690](matrix-org/synapse#8690))
- Improve start time by adding an index to `e2e_cross_signing_keys.stream_id`. ([\#8694](matrix-org/synapse#8694))
- Re-organize the structured logging code to separate the TCP transport handling from the JSON formatting. ([\#8697](matrix-org/synapse#8697))
- Use Python 3.8 in Docker images by default. ([\#8698](matrix-org/synapse#8698))
- Remove the "draft" status of the Room Details Admin API. ([\#8702](matrix-org/synapse#8702))
- Improve the error returned when a non-string displayname or avatar_url is used when updating a user's profile. ([\#8705](matrix-org/synapse#8705))
- Block attempts by clients to send server ACLs, or redactions of server ACLs, that would result in the local server being blocked from the room. ([\#8708](matrix-org/synapse#8708))
- Add metrics the allow the local sysadmin to track 3PID `/requestToken` requests. ([\#8712](matrix-org/synapse#8712))
- Consolidate duplicated lists of purged tables that are checked in tests. ([\#8713](matrix-org/synapse#8713))
- Add some `mdui:UIInfo` element examples for `saml2_config` in the homeserver config. ([\#8718](matrix-org/synapse#8718))
- Improve the error message returned when a remote server incorrectly sets the `Content-Type` header in response to a JSON request. ([\#8719](matrix-org/synapse#8719))
- Speed up repeated state resolutions on the same room by caching event ID to auth event ID lookups. ([\#8752](matrix-org/synapse#8752))


Synapse 1.22.1 (2020-10-30)
===========================

Bugfixes
--------

- Fix a bug where an appservice may not be forwarded events for a room it was recently invited to. Broke in v1.22.0. ([\#8676](matrix-org/synapse#8676))
- Fix `Object of type frozendict is not JSON serializable` exceptions when using third-party event rules. Broke in v1.22.0. ([\#8678](matrix-org/synapse#8678))
clokep added a commit that referenced this pull request Dec 14, 2020
This code is never called as of #8607.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow simultaneously enabling structured logging and "standard" logging
2 participants