Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add instrumentation for postgresql (pg gem) #659

Merged
merged 17 commits into from
Apr 15, 2021
Merged

feat: Add instrumentation for postgresql (pg gem) #659

merged 17 commits into from
Apr 15, 2021

Conversation

ahayworth
Copy link
Contributor

@ahayworth ahayworth commented Mar 16, 2021

This commit adds tracing for postgresql, via the pg gem.

The following methods are traced:

  • exec / query / sync_exec / async_exec
  • exec_params / sync_exec_params / async_exec_params
  • prepare / sync_prepare / async_prepare
  • exec_prepared / sync_exec_prepared / async_exec_prepared

We instrument all of these methods via meta-programming, since otherwise
it'd be incredibly repetitive. The methods can actually be broken down
into three groups - "exec-ish", "prepare-ish", and "exec prepared-ish"
methods. Every method within a group takes the same arguments, and so
we can trace each method in a group in an identical fashion.

Note that the 'sync' and 'async' variants require no special handling,
because they are not async interfaces per se - rather, they are
non-blocking. See the pg docs for more info: https://www.rubydoc.info/gems/pg/PG%2FConnection:exec

We've implemented SQL sanitization the same way that the mysql2 PR did
it - which seems to be done by copying NewRelic. We should consider
extracting that from the NewRelic agent and making it a common shared
library here.

Possibly worth discussion is how we've handled the span name. We've
chosen the format OPERATION_NAME DATABASE_NAME, taking the first word
in the SQL string as the operation if it matches a known good list we
pulled from the Postgres docs. This is similar to how the python and
javascript postgres implementations behave, but is slightly different
than how the ruby mysql instrumentation does it. When falling back to
safe value, we also choose to simply use DATABASE_NAME, which seems to
be in line with current semantic conventions.

Definitely worth discussion is the LRU cache added in d119f8a - this tracks
the 50 most recently-prepared statements that we've traced; so that we can
attach it to the span as db.statement. Otherwise, if you're using a framework
like Rails that will aggressively prepare statements when the database supports
it, then you just see a lot of 'EXECUTE' calls with no real information attached.

While tests pass and we think this is safe, it should be noted that we
haven't actually put this into production testing yet - there could be
horrible bugs still. :)

Fixes #523

This commit adds tracing for postgresql, via the `pg` gem.

The following methods are traced:
- exec / query / sync_exec / async_exec
- exec_params / sync_exec_params / async_exec_params
- prepare / sync_prepare / async_prepare
- exec_prepared / sync_exec_prepared / async_exec_prepared

We instrument all of these methods via meta-programming, since otherwise
it'd be incredibly repetitive. The methods can actually be broken down
into three groups - "exec-ish", "prepare-ish", and "exec prepared-ish"
methods. Every method within a group takes the same arguments, and so
we can trace each method in a group in an identical fashion.

Note that the 'sync' and 'async' variants require no special handling,
because they are not async interfaces _per se_ - rather, they are
non-blocking. See the `pg` docs for more info: https://www.rubydoc.info/gems/pg/PG%2FConnection:exec

We've implemented SQL sanitization the same way that the Mysql2 gem did
it - which seems to be done by copying NewRelic. We should consider
extracting that from the NewRelic agent and making it a common shared
library here.

Possibly worth discussion is how we've handled the span name. We've
chosen the format `OPERATION_NAME DATABASE_NAME`, taking the first word
in the SQL string as the operation if it matches a known good list we
pulled from the Postgres docs. This is similar to how the python and
javascript postgres implementations behave, but is slightly different
than how the ruby mysql instrumentation does it. When falling back to
safe value, we also choose to simply use `DATABASE_NAME`, which seems to
be in line with current semantic conventions.

While tests pass and we think this is safe, it should be noted that we
haven't actually put this into production testing yet - there could be
horrible bugs still. :)

Fixes #523
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Mar 16, 2021

CLA Signed

The committers are authorized under a signed CLA.

@ahayworth

This comment has been minimized.

@ahayworth

This comment has been minimized.

@fbogsany
Copy link
Contributor

We've implemented SQL sanitization the same way that the mysql2 PR did
it - which seems to be done by copying NewRelic. We should consider
extracting that from the NewRelic agent and making it a common shared
library here.

👍 we can do that now or when we hit the next use-case. I asked that we initially tailor it specifically to MySQL to start with, and generalize it later. While we are not required to by the license, it is probably a nice gesture to ask NewRelic if they want to donate that code to OpenTelemetry, if we're going to copy it wholesale.

This uses an LRU cache to store the 50 most recently-prepared database
statements. The goal is to provide something more descriptive than
'EXECUTE a1', which is basically what you get when Rails aggressively
prepares and executes statements.

50 seems to me like a reasonable trade-off between usefulness and
capping memory growth. However, it is technically unbounded if you are
not sanitizing SQL - although we could decide to truncate unconditionally.
@ahayworth ahayworth marked this pull request as ready for review March 17, 2021 21:37
@ahayworth
Copy link
Contributor Author

👍 we can do that now or when we hit the next use-case. I asked that we initially tailor it specifically to MySQL to start with, and generalize it later. While we are not required to by the license, it is probably a nice gesture to ask NewRelic if they want to donate that code to OpenTelemetry, if we're going to copy it wholesale.

Agreed! Do you know anyone there that might be interested in weighing in? Otherwise I can just go open an issue and see how they feel.

@ahayworth ahayworth changed the title [WIP] feat: Add instrumentation for postgresql (pg gem) feat: Add instrumentation for postgresql (pg gem) Mar 17, 2021

# From:
# https://github.com/newrelic/newrelic-ruby-agent/blob/9787095d4b5b2d8fcaf2fdbd964ed07c731a8b6b/lib/new_relic/agent/database/obfuscator.rb
# https://github.com/newrelic/newrelic-ruby-agent/blob/9787095d4b5b2d8fcaf2fdbd964ed07c731a8b6b/lib/new_relic/agent/database/obfuscation_helpers.rb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 Hey @mwlang hope all is well. At the SIG today we were chatting on how useful this obfuscation code is for orm/db instrumentation in ruby. It would be wonderful to be able to re-use the Obfuscation Module in opentelemetry-ruby directly. Would you be open to making a contribution to add this if you have bandwidth? And if not, any feedback or guidance on this approach?

We implement the bare minimum of an LRU cache ourselves, instead.
@ahayworth
Copy link
Contributor Author

Update: I've removed the dependency on lru_redux, and implemented our own minimal LRU cache instead.

@johnnyshields
Copy link
Contributor

FYI using this in production, looking very good thanks.

@ahayworth
Copy link
Contributor Author

@johnnyshields Do you have any thoughts on the number of spans its creating? We're not just instrumenting common things like SELECT, UPDATE, etc - I'm curious if you have any thoughts on if it's perhaps too noisy.

Also, yay! I'm glad it's working well for you!

@johnnyshields
Copy link
Contributor

johnnyshields commented Mar 27, 2021

@ahayworth having extra items like BEGIN, COMMIT etc. are very useful to see. In fact, having these extra spans helped us debug an issue yesterday. In the below example you can see clearly that when using transactions the latency doesn't happen on the INSERT step but on the COMMIT.

image

@ahayworth
Copy link
Contributor Author

@johnnyshields Thanks - that's good feedback, I appreciate it!

@renchap
Copy link

renchap commented Apr 3, 2021

Would it make sense to report a transaction as a whole span (starting with a BEGIN (or START TRANSACTION) and ending with a COMMIT or ROLLBACK, rather than having two different spans?
BEGIN is never supposed to block AFAIK, so we could have a transaction span, and children spans for the individual queries + the final COMMIT statement.

@ahayworth
Copy link
Contributor Author

Would it make sense to report a transaction as a whole span (starting with a BEGIN (or START TRANSACTION) and ending with a COMMIT or ROLLBACK, rather than having two different spans?

The only real downside I see there is that we'd need to do much more parsing of the SQL statements we're getting in the library (they're just strings at this point before they get handed off to the lower-level C extensions in pg). That would complicate the PR quite a bit, so I think I'd want to leave that out for now.

@ahayworth
Copy link
Contributor Author

Update: I've been running this in production now for awhile, and have seen no problems from it thus far.

@johnnyshields
Copy link
Contributor

johnnyshields commented Apr 13, 2021

Same here, have had it in prod for 2 weeks at TableCheck with no problem and lots of volume. Let's merge!

Copy link
Member

@mwear mwear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We have multiple people running this in prod with good results.

@mwear
Copy link
Member

mwear commented Apr 14, 2021

Any objections to me merging this? If not, I will, but this branch needs to have main merged in.

Copy link
Contributor

@robertlaurin robertlaurin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good me. Thanks a bunch for this contribution :)

@ahayworth
Copy link
Contributor Author

@mwear main is merged and the PR is updated to deal with the new opentelemetry-instrumentation-base extraction. 😄

@mwear
Copy link
Member

mwear commented Apr 15, 2021

🚀 Thanks @ahayworth!

@mwear mwear merged commit 463c14c into open-telemetry:main Apr 15, 2021
@johnnyshields
Copy link
Contributor

Great job everyone, really appreciate this. Please cut a release soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for Postgresql
7 participants