Consumer group offset can reset during rebalance if underreplicated #1181
Comments
@muirrn isn't the consumer reading only up to the high watermark? That would mean the data is available on the replicas too.
It isn't in our server config, so it should be false by default for our version of Kafka. I think the issue is that the consumer group reads the group's offset from the group's coordinator broker, but |
Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. |
Fixed by #2252 and the |
Versions
Sarama Version: 7479983
Kafka Version: 1.0, 2.0
Go Version: 1.11
Problem Description
If a partition is underreplicated for whatever reason (e.g. publishing messages with sarama.WaitForLocal instead of WaitForAll), a rebalance can end up resetting a consumer group's partition offset back to the initial position. This happens when you consume up to, say, offset 100 on replica A, but replica B only has data up to offset 99 due to temporary underreplication. When rebalancing to replica B, the client gets ErrOffsetOutOfRange when trying to subscribe to offset 100, and that causes the consumer to reset to the initial offset.
Is there a reason ErrOffsetOutOfRange is not propagated up to the user? Are there any cases where the user would want to silently reset the consumer offset?
Note that we experienced this behavior using sarama-cluster. I am not able to reproduce the error consistently and have not reproduced it with the sarama consumer yet. However, the code seems to behave the same as sarama-cluster in this regard.