Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws-sdk v2 & v3 integration #40

Merged
merged 11 commits into from
Nov 4, 2019
Merged

aws-sdk v2 & v3 integration #40

merged 11 commits into from
Nov 4, 2019

Conversation

ajvondrak
Copy link
Contributor

screenshot

This might technically be compatible with aws-sdk v2 as well, since v3 is supposed to be API-compatible. But I'm also not excited at supporting old software. 😅 Amazon is still apparently supporting v2 of the gem, but it seems to largely be automated maintenance (when they update the files that define the APIs/docs).

This started out by largely cribbing from the implementations of Aws::Plugins::ClientMetricsPlugin and Aws::Plugins::ClientSendMetricsPlugin. At first I thought that maybe we could just configure client metrics with a custom publisher that could feed spans to Honeycomb, but decided against that:

  • The client metrics publisher batches and sends events all at once. We wouldn't easily be able to set the libhoney event timestamps + durations after-the-fact without exposing the guts of the Honeycomb::Span. I figured I didn't want to change the beeline internals that much.

  • Hooking into the client metrics plugin means our available data would be dictated by the data in Aws::ClientSideMonitoring::RequestMetrics. As I started building out the Honeycomb spans, it became clearer to me that I'd want more control over the fields being added. E.g., instead of sending a write-time synthesized field for final_error_retryable, we're free to decorate the Honeycomb span with raw configuration values for number of retries, the retry limit, and the final error type.

  • Hand-rolling a new plugin + handlers just gives us more control to add a lot of rich metadata. Notably, the aws-api error handling is better than in Aws::Plugins::ClientMetricsSendPlugin, since I went through the effort of parsing response errors. But even just adding the aws.params field is already way more useful for diagnosing slow requests. The user can always configure a presend_hook to scrub fields they don't want. And if we want to add more information, we don't have to open aws-sdk-ruby pull requests or monkey-patch the CSM implementation.

The only big question marks in my mind are:

  • Naming of the fields. I can't find a whole lot of consistency between different beelines, but I'm still not sure if the names I picked are running afoul of any strong conventions or anything.

  • The development dependency was easy to test locally, but the appraisals + circleci configuration were added mostly blindly to parrot the other setups. Not sure if they'll work or if I missed anything.

@findchris
Copy link

We are looking to use aws-sdk v3 exclusively going forward, and this seems to check all of the boxes. @martin308 - Can you have a look?

The development dependency was added to facilitate an upcoming aws-sdk
integration. The appraisals + circleci configuration were added mostly
blindly to parrot the other setups, assuming this will be needed for the
sake of continuous integration. The .ruby-version checked into the repo
points at Ruby 2.6 as of this writing, whereas the gemspec declares
support for Ruby >= 2.2.0, so it seems prudent for *someone* to be
testing these versions. Might as well be Circle.

We might technically try to include tests for aws-sdk v2 as well, since
v3 is supposed to be API-compatible [1]. I'm thinking that's probably
more effort than it's worth, though. Amazon is still apparently
supporting v2 of the gem [2], but it seems to largely be automated
maintenance (when they update the files that define the APIs). I'll
worry about developing against v3 first and foremost before even
entertaining the thought of backporting to v2.

References:

[1] https://github.com/aws/aws-sdk-ruby/blob/master/V3_UPGRADING_GUIDE.md

[2] https://github.com/aws/aws-sdk-ruby/commits/version-2
This largely cribs from the implementations of
`Aws::Plugins::ClientMetricsPlugin` [1] and
`Aws::Plugins::ClientSendMetricsPlugin` [2]. At first I thought that
maybe we could just configure client metrics with a custom publisher [3]
that could feed spans to Honeycomb, but decided against that:

- The client metrics publisher batches and sends events all at once. We
  wouldn't easily be able to set the libhoney event timestamps +
  durations after-the-fact without exposing the guts of the
  `Honeycomb::Span`. I figured I didn't want to change the beeline
  internals that much.

- Hooking into the client metrics plugin means our available data would
  be dictated by the data in `Aws::ClientSideMonitoring::RequestMetrics`
  [4]. As I started building out the Honeycomb spans, it became clearer
  to me that I'd want more control over the fields being added. E.g.,
  instead of sending a write-time synthesized field for
  `final_error_retryable` [5], we're free to decorate the Honeycomb span
  with raw configuration values for number of retries, the retry limit,
  and the final error type.

- Hand-rolling a new plugin + handlers just gives us more control to add
  a lot of rich metadata. Notably, the aws-api error handling is better
  than in `Aws::Plugins::ClientMetricsSendPlugin` [6], since I went
  through the effort of parsing response errors. But even just adding
  the `aws.params` field is already way more useful for diagnosing slow
  requests. The user can always configure a `presend_hook` to scrub
  fields they don't want. And if we want to add more information, we
  don't have to open aws-sdk-ruby pull requests or monkey-patch the CSM
  implementation.

The only big question mark in my mind is the naming of the fields. I
can't find a whole lot of consistency between different beelines, but
I'm still not sure if the names I picked are running afoul of any strong
conventions or anything.

References:

[1] https://github.com/aws/aws-sdk-ruby/blob/d0c5f6e5a3e83eeda2d1c81f5dd80e5ac562a6dc/gems/aws-sdk-core/lib/aws-sdk-core/plugins/client_metrics_plugin.rb

[2] https://github.com/aws/aws-sdk-ruby/blob/master/gems/aws-sdk-core/lib/aws-sdk-core/plugins/client_metrics_send_plugin.rb

[3] https://github.com/aws/aws-sdk-ruby/blob/d0c5f6e5a3e83eeda2d1c81f5dd80e5ac562a6dc/gems/aws-sdk-core/lib/aws-sdk-core/client_side_monitoring/publisher.rb

[4] https://github.com/aws/aws-sdk-ruby/blob/d0c5f6e5a3e83eeda2d1c81f5dd80e5ac562a6dc/gems/aws-sdk-core/lib/aws-sdk-core/client_side_monitoring/request_metrics.rb

[5] https://github.com/aws/aws-sdk-ruby/blob/d0c5f6e5a3e83eeda2d1c81f5dd80e5ac562a6dc/gems/aws-sdk-core/lib/aws-sdk-core/plugins/client_metrics_plugin.rb#L144-L150

[6] https://github.com/aws/aws-sdk-ruby/blob/d0c5f6e5a3e83eeda2d1c81f5dd80e5ac562a6dc/gems/aws-sdk-core/lib/aws-sdk-core/plugins/client_metrics_send_plugin.rb#L63-L68
For my work on the aws integration. :)
https://circleci.com/gh/honeycombio/beeline-ruby/1133 was failing
because the Errno::ETIMEDOUT message apparently differs between
operating systems. The actual error message isn't important, so I've
just replaced it with a wildcard matcher.
Copy link
Member

@martin308 martin308 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajvondrak wow thanks so much for this! It looks pretty good to me. The naming looks good to me. Just one thing in the gem spec that needs a change. I would be interested to chuck in testing of v2 of the aws-sdk into appraisal and circle and see if anything breaks. According to this the majority of downloads are still of v2. If all the tests pass then 💯 otherwise we can say that we just support v3 for now

@@ -40,6 +40,7 @@ Gem::Specification.new do |spec|
spec.add_dependency "libhoney", "~> 1.8"

spec.add_development_dependency "appraisal"
spec.add_development_dependency "aws-sdk", "~> 3"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't add the framework dependencies to the gem spec here, adding it to appraisal is all that's needed

@isnotajoke
Copy link

This looks really cool!

So, let's say I have a specific SDK call in my codebase, and I know that one of the parameters of that call is sensitive and shouldn't be sent outside of the application (maybe it's a password or a credit card number or something like that). It looks like this helpfully puts parameters and some lower-level HTTP details on the span. It seems like there's a potential for sensitive parameters to be sent by that logic, but I might be missing something. If that actually is a legitimate concern, is there to exclude certain operations from this reporting (at a specific callsite, for a particular parameter name, etc)?

@ajvondrak
Copy link
Contributor Author

According to this the majority of downloads are still of v2.

Heyyy, how's that for dog food? So cool! 😀

@martin308
Copy link
Member

martin308 commented Nov 1, 2019

@isnotajoke any sensitive parameters added to your spans can be scrubbed using a presend_hook. Does that look like it will resolve your concerns?

@ajvondrak it's a really cool dataset! I've used it a bunch to make decisions about ruby versions and other library versions to support 😄

As it turns out, aws-sdk v2 tests break circa this commit. However, the
failures are narrow enough that we should be able to provide backwards
compatibility. This is important because aws-sdk v2 still remains very
widely used, according to rubygems.org data.
The gem name + version aren't present in the aws-sdk v2 context, since
it was all just one huge gem at that point. So in v2, the meta.package
should be "aws-sdk" and the meta.package_version will be ~> 2.
These tests were failing under the aws-2 appraisal because the context
propagation was only added circa v3 for the sake of the client-side
metrics plugin.

I also enhanced the tests by providing separate cases for getting the
redirect region from the headers vs body, since otherwise we wouldn't be
covering the freshly-duplicated logic.
In patching the S3 region redirects, I realized it was worth codifying
the behavior around run-of-the-mill 307 redirects. Mercifully, we
needn't add any new code to handle this, though: all the right fields
get added to the resultant aws-api spans regardless.
While the STS stub data worked in the aws-3 appraisal, it was breaking
under the aws-2 appraisal with the error

  NoMethodError: undefined method `set?' for #<Aws::STS::Types::GetSessionTokenResponse:0x00007fcdf72d06f0>
  0: aws-sdk-core-2.11.388/lib/aws-sdk-core/plugins/request_signer.rb:110:in `missing_credentials?'
  1: aws-sdk-core-2.11.388/lib/aws-sdk-core/plugins/request_signer.rb:102:in `require_credentials'
  2: aws-sdk-core-2.11.388/lib/aws-sdk-core/plugins/s3_request_signer.rb:14:in `call'

Apparently some interface changed along the way so that the STS data
Just Works in v3. But a data type that works across both v2 *and* v3 is
the plain old Aws::Credentials class, which we can easily use to stub a
session token (just to make sure it appears in the spans when set).
The way this description read was bugging me, and I only have myself to
blame. 😅
@ajvondrak ajvondrak requested a review from martin308 November 4, 2019 17:39
@ajvondrak
Copy link
Contributor Author

ajvondrak commented Nov 4, 2019

OK, added the appraisal for v2. Broke some of the tests, but they were pretty easy to fix, so I went ahead and patched them. Now we support v2 and v3! 🎉

@ajvondrak ajvondrak changed the title aws-sdk v3 integration aws-sdk v2 & v3 integration Nov 4, 2019
Copy link
Member

@martin308 martin308 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajvondrak This looks great, well done on adding v2 support!

@martin308 martin308 merged commit 2d56e3c into honeycombio:master Nov 4, 2019
@ajvondrak ajvondrak deleted the aws branch November 4, 2019 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants