-
Notifications
You must be signed in to change notification settings - Fork 255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add instrumentation for postgresql (pg gem) #659
Conversation
This commit adds tracing for postgresql, via the `pg` gem. The following methods are traced: - exec / query / sync_exec / async_exec - exec_params / sync_exec_params / async_exec_params - prepare / sync_prepare / async_prepare - exec_prepared / sync_exec_prepared / async_exec_prepared We instrument all of these methods via meta-programming, since otherwise it'd be incredibly repetitive. The methods can actually be broken down into three groups - "exec-ish", "prepare-ish", and "exec prepared-ish" methods. Every method within a group takes the same arguments, and so we can trace each method in a group in an identical fashion. Note that the 'sync' and 'async' variants require no special handling, because they are not async interfaces _per se_ - rather, they are non-blocking. See the `pg` docs for more info: https://www.rubydoc.info/gems/pg/PG%2FConnection:exec We've implemented SQL sanitization the same way that the Mysql2 gem did it - which seems to be done by copying NewRelic. We should consider extracting that from the NewRelic agent and making it a common shared library here. Possibly worth discussion is how we've handled the span name. We've chosen the format `OPERATION_NAME DATABASE_NAME`, taking the first word in the SQL string as the operation if it matches a known good list we pulled from the Postgres docs. This is similar to how the python and javascript postgres implementations behave, but is slightly different than how the ruby mysql instrumentation does it. When falling back to safe value, we also choose to simply use `DATABASE_NAME`, which seems to be in line with current semantic conventions. While tests pass and we think this is safe, it should be noted that we haven't actually put this into production testing yet - there could be horrible bugs still. :) Fixes #523
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
👍 we can do that now or when we hit the next use-case. I asked that we initially tailor it specifically to MySQL to start with, and generalize it later. While we are not required to by the license, it is probably a nice gesture to ask NewRelic if they want to donate that code to OpenTelemetry, if we're going to copy it wholesale. |
This uses an LRU cache to store the 50 most recently-prepared database statements. The goal is to provide something more descriptive than 'EXECUTE a1', which is basically what you get when Rails aggressively prepares and executes statements. 50 seems to me like a reasonable trade-off between usefulness and capping memory growth. However, it is technically unbounded if you are not sanitizing SQL - although we could decide to truncate unconditionally.
Agreed! Do you know anyone there that might be interested in weighing in? Otherwise I can just go open an issue and see how they feel. |
instrumentation/pg/lib/opentelemetry/instrumentation/pg/patches/connection.rb
Show resolved
Hide resolved
|
||
# From: | ||
# https://github.com/newrelic/newrelic-ruby-agent/blob/9787095d4b5b2d8fcaf2fdbd964ed07c731a8b6b/lib/new_relic/agent/database/obfuscator.rb | ||
# https://github.com/newrelic/newrelic-ruby-agent/blob/9787095d4b5b2d8fcaf2fdbd964ed07c731a8b6b/lib/new_relic/agent/database/obfuscation_helpers.rb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👋 Hey @mwlang hope all is well. At the SIG today we were chatting on how useful this obfuscation code is for orm/db instrumentation in ruby. It would be wonderful to be able to re-use the Obfuscation Module in opentelemetry-ruby directly. Would you be open to making a contribution to add this if you have bandwidth? And if not, any feedback or guidance on this approach?
We implement the bare minimum of an LRU cache ourselves, instead.
Update: I've removed the dependency on |
FYI using this in production, looking very good thanks. |
@johnnyshields Do you have any thoughts on the number of spans its creating? We're not just instrumenting common things like Also, yay! I'm glad it's working well for you! |
@ahayworth having extra items like |
@johnnyshields Thanks - that's good feedback, I appreciate it! |
Would it make sense to report a transaction as a whole span (starting with a |
The only real downside I see there is that we'd need to do much more parsing of the SQL statements we're getting in the library (they're just strings at this point before they get handed off to the lower-level C extensions in |
Update: I've been running this in production now for awhile, and have seen no problems from it thus far. |
Same here, have had it in prod for 2 weeks at TableCheck with no problem and lots of volume. Let's merge! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. We have multiple people running this in prod with good results.
Any objections to me merging this? If not, I will, but this branch needs to have main merged in. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good me. Thanks a bunch for this contribution :)
instrumentation/pg/lib/opentelemetry/instrumentation/pg/version.rb
Outdated
Show resolved
Hide resolved
@mwear |
🚀 Thanks @ahayworth! |
Great job everyone, really appreciate this. Please cut a release soon. |
This commit adds tracing for postgresql, via the
pg
gem.The following methods are traced:
We instrument all of these methods via meta-programming, since otherwise
it'd be incredibly repetitive. The methods can actually be broken down
into three groups - "exec-ish", "prepare-ish", and "exec prepared-ish"
methods. Every method within a group takes the same arguments, and so
we can trace each method in a group in an identical fashion.
Note that the 'sync' and 'async' variants require no special handling,
because they are not async interfaces per se - rather, they are
non-blocking. See the
pg
docs for more info: https://www.rubydoc.info/gems/pg/PG%2FConnection:execWe've implemented SQL sanitization the same way that the mysql2 PR did
it - which seems to be done by copying NewRelic. We should consider
extracting that from the NewRelic agent and making it a common shared
library here.
Possibly worth discussion is how we've handled the span name. We've
chosen the format
OPERATION_NAME DATABASE_NAME
, taking the first wordin the SQL string as the operation if it matches a known good list we
pulled from the Postgres docs. This is similar to how the python and
javascript postgres implementations behave, but is slightly different
than how the ruby mysql instrumentation does it. When falling back to
safe value, we also choose to simply use
DATABASE_NAME
, which seems tobe in line with current semantic conventions.
Definitely worth discussion is the LRU cache added in d119f8a - this tracks
the 50 most recently-prepared statements that we've traced; so that we can
attach it to the span as
db.statement
. Otherwise, if you're using a frameworklike Rails that will aggressively prepare statements when the database supports
it, then you just see a lot of 'EXECUTE' calls with no real information attached.
While tests pass and we think this is safe, it should be noted that we
haven't actually put this into production testing yet - there could be
horrible bugs still. :)
Fixes #523