KPL goes into a continuous retry storm if the stream is deleted and re-created #318

Merged: 2 commits, Dec 10, 2020

@matchav matchav commented Nov 2, 2020

Issue #, if available:
The issue occurs when a customer scales a stream multiple times, deletes it, and then re-creates it with the same number of shards. The KPL producer goes into a continuous retry storm with the error "Wrong Shard ErrorMessages: [Record did not end up in expected shard ... ", but the "Record went to shard x instead of shard y.." log message never appears.
This means KPL never tries to update the shard map, which is done at line 208 in retrier.cc.

This is a bug: KPL assumes that a record can land in a different shard only when the same stream has been scaled, and it updates the shard map only in that case. This PR therefore removes the check ("The actual destination shard is newer than the predicted shard.").

Description of changes:
Remove the extra check "The actual destination shard is newer than the predicted shard." that guards the shard map update. The shard map is now updated every time a record is invalidated for landing in an incorrect shard.
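
To illustrate why the removed check can block the refresh, here is a minimal sketch. It is not the code in retrier.cc: `ShardMapSketch`, `shard_number`, `on_wrong_shard_old`, and `on_wrong_shard_new` are hypothetical names, and the sketch assumes that "newer" means a higher numeric suffix in the shard id, and that a re-created stream starts again from low shard numbers.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the producer's cached shard map (not the KPL class).
struct ShardMapSketch {
    bool valid = true;
    int invalidations = 0;
    // Mark the cached map stale so that it would be fetched again.
    void invalidate() { valid = false; ++invalidations; }
};

// Parse the numeric suffix of an id such as "shardId-000000000007".
inline std::uint64_t shard_number(const std::string& shard_id) {
    return std::stoull(shard_id.substr(shard_id.rfind('-') + 1));
}

// Behaviour before the change, as described above: refresh only when the
// actual shard looks newer than the predicted one (assumed here to mean a
// higher shard number).
inline void on_wrong_shard_old(ShardMapSketch& map,
                               const std::string& predicted,
                               const std::string& actual) {
    if (shard_number(actual) > shard_number(predicted)) {
        map.invalidate();
    }
}

// Behaviour after the change: any wrong-shard result invalidates the map.
inline void on_wrong_shard_new(ShardMapSketch& map,
                               const std::string& /*predicted*/,
                               const std::string& /*actual*/) {
    map.invalidate();
}
```

Under these assumptions, a producer whose stale map predicts `shardId-000000000007` while the re-created stream routes the record to `shardId-000000000001` never invalidates its map with the old check, so every retry fails the same way. With the check removed, the first wrong-shard result triggers a refresh.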

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@yatins47 yatins47 closed this Nov 24, 2020
@yatins47 yatins47 deleted the branch awslabs:master November 24, 2020 15:20
@yatins47 yatins47 reopened this Nov 25, 2020

@isurues isurues left a comment

Change looks good to me. This happens only if the stream is scaled > deleted > re-created with the same number of shards.

@isurues isurues merged commit 1d2f475 into awslabs:master Dec 10, 2020
3 participants