Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DynamoDB: Issue/1682: Adds retries on DynamoDB failed requests as per the DynamoDB documentation #1712

Closed
wants to merge 25 commits into from

Conversation

mavericksid
Copy link
Contributor

@mavericksid mavericksid commented May 21, 2019

Pull Request Checklist

Purpose

To add retry mechanism for failed requests in DynamoDB as per the following document:
Error Handling

References

References #1682

Changes

  • Retry settings to be configured via retry.maximum-attempts, retry.initial-timeout and retry.strategy
  • Failures are recovered using a similar pattern as followed in Flow.recoverWithRetries but with the addition of TimerGraphStageLogic. Not really sure if this is even a sane solution since it's pretty new to me, will be looking forward to any suggestions :)

Background Context

It seemed to be a clean solution without making a lot of changes to the existing codebase.

* Made retries and its strategy configurable from conf
* Added retries using a similar pattern as available on
Flow.recoverWithRetries but with TimerGraphStageLogic
@mavericksid
Copy link
Contributor Author

Hi @ennru

Please review the PR once you can spare some time :)

@mavericksid
Copy link
Contributor Author

Hi @2m

Build is failing on some Cassandra step which does not concern this PR, how to rerun the build?

@2m
Copy link
Contributor

2m commented May 23, 2019

Thanks for the PR. We will take a look.

It is not a problem with unrelated test failures. We will still merge it, if the failure is of an unrelated test.

@mavericksid
Copy link
Contributor Author

mavericksid commented May 28, 2019

Hi @2m

The build failure was fixed from the last commit.

Copy link
Contributor

@2m 2m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think supporting retries for failed requests is really important for DynamoDb connector, as requests can fail quite sporadically with quota exceeded error. Backoff retries really help in that case.

This PR is a great start, and we should focus to have the retry logic decoupled from a particular connector.

import scala.compat.java8.OptionConverters._

@deprecated("Use AwsDynamoSettings instead", "")
final class DynamoSettings private (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this deprecated in favor of AwsDynamoSettings?

You can add the retrySettings to this class instead.

case _ @(_: InternalServerErrorException | _: ItemCollectionSizeLimitExceededException |
_: LimitExceededException | _: ProvisionedThroughputExceededException |
_: RequestLimitExceededException) =>
Source.single(op).via(opFlow)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to improve the retry logic here. As it currently stands, if the stage fails, the retry will happen with the op that was passed into the flowOp factory at the moment o the flow creation. So it could happen that quite a few operations went through that flow, and then one failed. The retry will happen with the operation that was initially given during the creation of the flow instead of the one, that failed.

I think we should adapt DynamoDb connector flows to be composable with a retry flow in akka-stream-contrib: https://github.com/akka/akka-stream-contrib/blob/master/src/main/scala/akka/stream/contrib/Retry.scala#L42 instead.

settings.retrySettings.backoffStrategy, {
case _ @(_: InternalServerErrorException | _: ItemCollectionSizeLimitExceededException |
_: LimitExceededException | _: ProvisionedThroughputExceededException |
_: RequestLimitExceededException) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to set supervision strategy on a stream operator as a stream attribute. We should use that instead so users can customize supervision strategies: https://doc.akka.io/docs/akka/current/stream/stream-error.html#supervision-strategies

@2m
Copy link
Contributor

2m commented Aug 20, 2019

Thank you for the initial PR. This has sparked a discussion which has materialized a general RetryFlow stage.

This PR has been superseded by #1896 where DynamoDb connector is changed to compose with a general RetryFlow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants