First-pass M1 AWS backend #49

Closed
wants to merge 15 commits into from

@danotorrey danotorrey commented May 30, 2019

This PR was closed and #115 was created as a draft to prevent merging for now.

Please review and merge #92 and #93 before reviewing this PR.

Please don't merge this branch right now.

We were planning to keep all of the backend code on a sub-branch, then review from there, but we integrated to the aws branch sooner to allow frontend integration to begin. So, we don't actually want to merge this branch to master right now. I guess we can just close this PR without merging once we finish this review.

Backend code for the AWS integration. Currently contains most of the needed setup API endpoints in https://github.com/Graylog2/graylog-plugin-integrations/blob/aws/src/main/java/org/graylog/integrations/aws/resources/AWSResource.java.

We're mostly just looking for overall feedback on structure and anything obvious you see. We've done initial testing on API endpoints, and will continue to test them when integrating with the UI.

More backend code is currently in development in other branches, but the plan is to not merge them until the review has been completed. We'll do more regular review of each future bigger incremental backend PR.

Dan Torrey and others added 12 commits April 26, 2019 14:31
Explicitly include commons-codec in this POM, since the AWS SDK v2 internally remaps commons-codec to another internal package. This makes the commons-codec from graylog-project-parent unavailable.
See https://aws.amazon.com/blogs/developer/java-sdk-bundle
* Simple Clickthrough without API

* Clickable Skeleton All Steps

* Feedback

* Lint
Add initial backend API calls for CloudWatch integration: getRegions, getLogGroups, getStreams, retrieveKinesisMessages, healthCheck

* AWS Cloud Watch services and resources (#24)

Adds the beginnings of an API endpoint and Kinesis/CloudWatch services

Includes a structure that we will continue to build from.

* Rework organization of classes for unified structure

The goal is to establish some structure that we can implement AWS API calls within. There's now one resource (for API calls), one service (for business logic), and one AWSClient (for all AWS/API SDK interactions).
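
The three-layer split described above can be sketched roughly as follows. This is only an illustration in plain Java: every class and method name here (`SketchAwsClient`, `SketchAwsService`, `SketchAwsResource`, `sortedLogGroupNames`) is hypothetical, and the real classes would use the AWS SDK and REST annotations rather than these stand-ins.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical names throughout; not the plugin's actual classes.

/** Lowest layer: the only place that would talk to the AWS SDK. */
interface SketchAwsClient {
    List<String> listLogGroupNames(String region);
}

/** Middle layer: business logic only, no HTTP and no SDK types. */
class SketchAwsService {
    private final SketchAwsClient client;

    SketchAwsService(SketchAwsClient client) {
        this.client = client;
    }

    List<String> sortedLogGroupNames(String region) {
        // Copy before sorting so the client's list is never mutated.
        List<String> names = new ArrayList<>(client.listLogGroupNames(region));
        Collections.sort(names);
        return names;
    }
}

/** Top layer: stands in for the REST resource; it only delegates to the service. */
class SketchAwsResource {
    private final SketchAwsService service;

    SketchAwsResource(SketchAwsService service) {
        this.service = service;
    }

    List<String> getLogGroups(String region) {
        return service.sortedLogGroupNames(region);
    }
}
```

Because the client is an interface with a single method, a test can pass a lambda in place of a real AWS connection and exercise the resource and service layers on their own.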

* Cleanup and add comments

* Cleanup code around log message auto-detection

* Aws cloudwatch client (#31)

* Add CloudWatchService class

* Add AWSConfigSettings class

* Add UserCredentials class

* Add temporary Main class

* Include the latest version of the Apache Http Client

* Improve HTTP Client dependency comment

* Update AWSConfigSettings class

* Update CloudWatchService class

* Update temporary Main class

* Fix inject error in AWSResource

* Fix merge conflicts

* Restructure Resources and Service classes

* Add getRegion into AWSResource (#43)

* Add getRegion into AWSResource

* Update api paths

* Increase max retry limit for stream get to 1000

The limit guards against runaway looping. Stopping at 100 is probably too small.
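
A capped paging loop of this kind might look like the sketch below. The `Page` type, the `fetchAll` helper, and the choice to throw once the cap is reached are assumptions made for illustration; the commit only tells us that the cap was raised to 1000.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class BoundedPaging {
    // Upper bound on page requests, so a token that never ends cannot loop forever.
    static final int MAX_ITERATIONS = 1000;

    /** One page of results plus the token for the next page (null when done). */
    static class Page {
        final List<String> items;
        final String nextToken;

        Page(List<String> items, String nextToken) {
            this.items = items;
            this.nextToken = nextToken;
        }
    }

    /** Calls fetchPage with the previous token until no token is returned or the cap is hit. */
    static List<String> fetchAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null; // a null token requests the first page
        for (int i = 0; i < MAX_ITERATIONS; i++) {
            Page page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextToken;
            if (token == null) {
                return all;
            }
        }
        // Assumption: fail loudly. Returning the partial list would be another option.
        throw new IllegalStateException("Gave up after " + MAX_ITERATIONS + " pages");
    }
}
```

Passing the page fetcher in as a function keeps the loop testable without any AWS client.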

* Use underscores instead of camel case for json names

It is a general project standard to use underscores.
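
To make the naming convention concrete, here is a small helper that converts a camelCase name to its underscore form. It is purely illustrative: the plugin most likely sets JSON names through annotations rather than by converting strings, and `SnakeCaseNames.toSnakeCase` is a hypothetical name.

```java
class SnakeCaseNames {
    /** Converts a simple camelCase name, such as "logGroupName", to "log_group_name". */
    static String toSnakeCase(String camelCase) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < camelCase.length(); i++) {
            char c = camelCase.charAt(i);
            if (Character.isUpperCase(c)) {
                // Start a new word at each capital, except at the very beginning.
                if (i > 0) {
                    out.append('_');
                }
                out.append(Character.toLowerCase(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

The helper treats every capital letter as a word boundary, so runs of capitals (acronyms) would be split letter by letter. That is acceptable for a sketch but not for general use.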

* Add paging functionality to getLogGroupNames (#53)

* Add paging functionality to getLogGroupNames

* Add unit test for CloudWatch log groups

* Add comments in log group name unit test

* Code clean up and remove unneeded code

* Revert unintended change from unit test commit

* Add retrievelogs for Kinesis Healthcheck (#76)

* Refactor getKinesisStreams in KinesisService

* Add validCredentials method in AWSService class

* Temporary Main class added to test putting records into a Kinesis stream

* Add retrieveKinesisLogs in KinesisService

* Update temporary Main class

* Update retrieveKinesisLogs to loop through shard iterators

* Update Main class

* Update pom file

* Update validateCredentials in AWSService class

* Add createKinesisClient method in KinesisService class

* Add testGetStreamCredentials and update testGetStreams

* Healthcheck merge (#81)

* empty commit to push branch

* Add Kinesis Healthcheck (#45)

* Improve organization for Flow Log message detection

* Improve Flow Log test

It now tests for a message with too many and too few spaces.
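
A space-count check of the kind this test exercises could be sketched as below. It assumes the default (version 2) VPC Flow Log format, which has 14 space-separated fields; `FlowLogDetector` and `looksLikeFlowLog` are hypothetical names, and the plugin's actual detection logic may differ.

```java
class FlowLogDetector {
    // Default (version 2) VPC Flow Log records carry 14 space-separated fields.
    static final int FLOW_LOG_FIELD_COUNT = 14;

    /** Returns true only when the message has exactly the expected number of fields. */
    static boolean looksLikeFlowLog(String message) {
        if (message == null) {
            return false;
        }
        // A limit of -1 keeps trailing empty strings, so a trailing space counts as an extra field.
        return message.split(" ", -1).length == FLOW_LOG_FIELD_COUNT;
    }
}
```

A message with one field removed or one appended fails the check, which corresponds to the "too few" and "too many" spaces cases.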

* Add TODOs for healthCheck method logic

* Add beginnings of Kinesis healthChecker

This will establish a Kinesis subscription and pull a single message from a Kinesis stream.

* Fix failing unit test

* Continue developing KinesisHealthCheck

- Remove unneeded metric tracking
- Remove extra parsing logic (this object should just hand back payload and not do any parsing)
- Improve application name handling
- Add comments

* Add detection logic for raw vs. CloudWatch logs

* Remove KinesisHealthCheck class

The KinesisConsumer does not work well for the health check (it is designed for real-time processing, takes a long time to start, cannot detect an empty stream, and is really hard to use in a quick API request). Now, we're planning to retrieve the messages directly using the Kinesis client. This is the most straightforward approach. We might revisit this later.

* Fix JSON parsing of Kinesis CloudWatch subscription record

Parse the record just as was done in the existing AWS plugin. The logic now includes autodetection of compressed vs. uncompressed records. A mock Kinesis CloudWatch subscription record is included for testing purposes.
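
One common way to autodetect compression is to check for the GZIP magic bytes before decoding, since CloudWatch subscription data arrives in Kinesis gzip-compressed. The sketch below uses only the JDK; `PayloadDecoder` is a hypothetical name, and the existing plugin's detection may work differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

class PayloadDecoder {
    /** GZIP streams start with the two magic bytes 0x1f 0x8b. */
    static boolean isGzipped(byte[] payload) {
        return payload != null
                && payload.length >= 2
                && (payload[0] & 0xff) == 0x1f
                && (payload[1] & 0xff) == 0x8b;
    }

    /** Returns the payload as UTF-8 text, decompressing first when it looks gzipped. */
    static String decode(byte[] payload) {
        if (!isGzipped(payload)) {
            return new String(payload, StandardCharsets.UTF_8);
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(payload));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Wrapped so callers do not need a checked-exception signature.
            throw new UncheckedIOException(e);
        }
    }
}
```

The same `decode` call then works for both raw Kinesis payloads and compressed CloudWatch subscription records.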

* Add CloudWatch logs codec and tests from existing AWS plugin

* Parse Flow Log message into object

* Load appropriate codec during healthCheck process

When the message type is detected, load the respective codec for that message type.

* Parse message with appropriate codec

Once the log message type is detected, then the codec is looked up. The message is then parsed with the codec.
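
The detect, look up, then parse flow can be pictured with an enum-keyed map of codecs. Everything in this sketch is a stand-in: `CodecLookup`, the `MessageType` values, the 14-field detection rule, and the string-returning "codecs" are hypothetical simplifications of the real codec classes.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Function;

class CodecLookup {
    /** Hypothetical message types; the real enum lives in the plugin. */
    enum MessageType { FLOW_LOG, RAW }

    // Each "codec" is reduced to a function from raw text to a parsed summary.
    private static final Map<MessageType, Function<String, String>> CODECS =
            new EnumMap<>(MessageType.class);

    static {
        // In the default flow log format, the action is the 13th field (index 12).
        CODECS.put(MessageType.FLOW_LOG, raw -> "flow_log action=" + raw.split(" ")[12]);
        CODECS.put(MessageType.RAW, raw -> "raw " + raw);
    }

    /** Step 1: detect the type (here, by the default flow log field count of 14). */
    static MessageType detect(String raw) {
        return raw.split(" ", -1).length == 14 ? MessageType.FLOW_LOG : MessageType.RAW;
    }

    /** Steps 2 and 3: look up the codec for the detected type, then parse with it. */
    static String parse(String raw) {
        return CODECS.get(detect(raw)).apply(raw);
    }
}
```

Keeping detection separate from the codec table means a new message type needs only one enum value and one map entry.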

* Supply log group name with the response

* Improve comments, logging, and error checking

The log group name is now also included in the response.

* Add Flow Log codec test

* Use AutoValue for CloudWatchLogEntry class

* Use AutoValue for all remaining CloudWatch value classes

* Cleanup merge conflicts after rebasing

* Specify constants for all JsonProperty annotations

* Delete unneeded KinesisDTO

All data will be stored in the input

* Establish a base AWSRequest JSON class

* Fix Guice injection error for KinesisService

* Add sample cURL command for healthCheck method

A similar cURL command will be used for other methods, so that it is clear how the UI will use them.

* Remove unneeded Kinesis Client 1.x dependency

* Add formatted message summary in the Health Check response

* Cleanup formatting and TODOs

* Minor cleanup after rebasing and merging

* Fix failing unit tests
* Fix incorrect pass of AWS key instead of secret

Also improve comments for fake message retrieval with TODOs.

* Update and connect retrieveRecords

* Add handleCompressedMessages method

* Delete temporary main class

* Update retrieveRecords to only return sample size

* Update KinesisService for healthCheck to function properly

* Add unit test for selecting random record

* Add unit test for retrieveRecords
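
The sample-size limit and random record selection mentioned in this commit could be sketched as follows. `RecordSampler`, `limit`, and `pickRandom` are hypothetical names; passing in the `Random` is a design assumption that makes the selection easy to unit test.

```java
import java.util.List;
import java.util.Random;

class RecordSampler {
    /** Returns at most sampleSize records from the front of the list (sampleSize must be >= 0). */
    static <T> List<T> limit(List<T> records, int sampleSize) {
        return records.subList(0, Math.min(sampleSize, records.size()));
    }

    /** Picks one record uniformly at random; the caller supplies the Random instance. */
    static <T> T pickRandom(List<T> records, Random random) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("no records to choose from");
        }
        return records.get(random.nextInt(records.size()));
    }
}
```

A test can pass a seeded `Random` and assert only that the chosen record is one of the inputs, which avoids depending on a particular random sequence.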
* Resolves #50: Add Available Services API call

* Add a test

* Add missing spaces, change Amazon > AWS
@danotorrey danotorrey changed the title AWS Cloudwatch First-pass M1 AWS backend Jun 19, 2019
@danotorrey danotorrey requested a review from bernd June 19, 2019 19:09
Dan Torrey and others added 3 commits June 25, 2019 12:20
* Require POST object containing region and credentials for all requests

Specifically adds a POST body requirement for the getKinesisStreams and getLogGroupNames methods.
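
A request body carrying the region and credentials might be modeled along these lines. The class name `SketchAwsRequest`, its field names, and the validation rules are all assumptions for illustration; the real request class would also carry JSON annotations.

```java
class SketchAwsRequest {
    // Hypothetical field names; the plugin's JSON property names may differ.
    final String region;
    final String accessKeyId;
    final String secretAccessKey;

    private SketchAwsRequest(String region, String accessKeyId, String secretAccessKey) {
        this.region = region;
        this.accessKeyId = accessKeyId;
        this.secretAccessKey = secretAccessKey;
    }

    /** Builds a request, rejecting missing or blank values up front. */
    static SketchAwsRequest of(String region, String accessKeyId, String secretAccessKey) {
        return new SketchAwsRequest(
                requireText("region", region),
                requireText("access key id", accessKeyId),
                requireText("secret access key", secretAccessKey));
    }

    private static String requireText(String name, String value) {
        if (value == null || value.trim().isEmpty()) {
            // Only the field name is reported, never the value, to keep secrets out of messages.
            throw new IllegalArgumentException(name + " is required");
        }
        return value;
    }
}
```

Validating in one factory method means every endpoint that accepts the request gets the same checks.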

* Use snake_case for paths

* Update region api call (#110)

* Migrate Regions request from a list to a full response object with total

* Update Region API call to include label and value

* Reformat code
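
The region change in this commit (a plain list replaced by a response object with a total, with each region carrying a label and a value) can be pictured with the sketch below. The names `RegionsResponseSketch` and `Region` are hypothetical, and the exact label text the plugin returns is not stated in this PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class RegionsResponseSketch {
    /** One selectable region: a human-readable label and the value sent back to the API. */
    static class Region {
        final String label;
        final String value;

        Region(String label, String value) {
            this.label = label;
            this.value = value;
        }
    }

    final List<Region> regions;
    final int total;

    RegionsResponseSketch(List<Region> regions) {
        // Defensive copy so the response cannot change after it is built.
        this.regions = Collections.unmodifiableList(new ArrayList<>(regions));
        this.total = this.regions.size();
    }
}
```

A label and value pair maps directly onto a dropdown option in a UI, which is presumably why both are returned.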
* Add AWSPermissions class and update AWSResource with permission checks

* Rename and remove permissions in AWSPermissions

* Register AWSPermissions in IntegrationsModule

* Remove space between methods
* First-pass structure for saving AWS input

* Add more structure for general AWS input

- Add a type enum to differentiate the various types of log messages that are possible.
- Add metacodec that will eventually differentiate between the types of log messages.
- Fill configuration values when saving the input.

* Resolve merge conflicts after rebasing over latest aws branch

* Consolidate log type detection and input type identification

There's no longer a need to use two enums for this. Also added healthCheck tests covering all message types: flow log, raw CloudWatch, and raw Kinesis.

* Clean up saveInput request parameters and handling

* Fix invalid type specification that was preventing input save

* Add unit test for saving input

* Fix incorrectly specified arguments

* Minor cleanup

* Cleanup for PR review

* Remove unneeded log statements
* Make save AWS input path and description more specific

* Indicate that the save request is specifically for Kinesis

In the future, each type of AWS input will likely require its own request object and endpoint, because each will probably need unique fields.

* AWSMessageType cleanup

* Remove unneeded isFlowLog, isRaw methods.
* Remove invalid AWSMessageType.Source.CLOUD_WATCH enum value. Messages are always read from Kinesis, and therefore the source is always Kinesis. Source is meant to differentiate messages from Kinesis and S3 for example.
* Improve comments for AWSMessageType.Source enum class and method.
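
The relationship between message type and source described in these cleanup notes could look like the sketch below. The enum and constant names are hypothetical; only the idea that every current type is read from Kinesis, with the source kept to distinguish Kinesis from something like S3, comes from the commit message.

```java
enum SketchMessageType {
    // Hypothetical constants for the three message kinds covered by the healthCheck tests.
    KINESIS_FLOW_LOGS(Source.KINESIS),
    KINESIS_CLOUDWATCH_RAW(Source.KINESIS),
    KINESIS_RAW(Source.KINESIS);

    /** Where the message was read from, not which AWS service produced it. */
    enum Source { KINESIS, S3 }

    private final Source source;

    SketchMessageType(Source source) {
        this.source = source;
    }

    Source source() {
        return source;
    }
}
```

CloudWatch does not appear as a source here because, as the commit notes, CloudWatch messages are still read from Kinesis.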

* Remove typo

* Generify the create AWS integration endpoint

The naming, description, and comment now reflect that a generic AWS input is being created.

* Return InputSummary response entity for AWS input creation request

Also remove unneeded AWSResourceTest

* More cleanup of healthCheck after input creation changes

- Remove unneeded log_group field for health_check request. Resolves #108
- Add Kinesis stream name as a field in both raw and CloudWatch messages

This change is lumped in with the other changes related to saving the input, since lots of healthCheck changes were already made there. This fixes some problems, so might as well have these improvements included with the review.

* Clarify that AWSMetaCodec is a general AWS codec

This class no longer erroneously extends AbstractKinesisCodec, which was only intended for Kinesis-specific codecs.

* Use DateTime instead of long in KinesisLogEntry

It turns out that Kinesis Record objects do have an arrival time Instant timestamp. This is now used instead of the date/time when the message was read by Graylog.
@danotorrey danotorrey closed this Jun 27, 2019