[bitnami/redis] Sentinel cluster doesn't elect new master after master pod deletion #6165
Comments
Hi @wilsoniya, I have not been able to reproduce the issue. Maybe it is related to this helm issue: helm/helm#7997. Could you check it?
Thanks for your reply, @Mauraza :) I've never had problems with helm, so I'd be surprised if my issue was related. Thanks again!
Hi @wilsoniya, I was digging a little more and I think your issue may be related to #3700. Could you confirm that?
@Mauraza Thank you for continuing to work with me on this. I think this comment by @dustinrue is pretty similar to what I'm seeing: #3700 (comment)
However, their message implies that eventually a new master is elected by the remaining pods, and that the restarted pod is eventually able to rejoin the cluster. This isn't the behavior I'm seeing. Instead, the remaining two pods never elect a new master and keep treating the IP of the old (deleted) master as if it still exists and is valid. This causes the replacement pod to fail to start, because it resolves the IP of the deleted pod as the master from the two remaining pods.
Hi, could it be because of quorum issues? I see that there are only two pods doing the election.
@javsalgar thanks for the reply. I don't know enough about the workings of sentinel to answer, tbh, though that sounds plausible. I believe I have quorum set at 2; wouldn't that be sufficient?
Hi @wilsoniya, there is a new major version of the chart. Could you try it?
@Mauraza thanks for letting me know about the new major version. The upgrade seems to be a major improvement, and I wasn't able to reproduce the issue by deleting the master pod or rolling the statefulset. However, I was able to reproduce the issue by ungracefully deleting the master pod:
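Something along the lines of a force delete, for example (the pod name redis-node-0 here is just an illustration):

```bash
# Skip graceful termination entirely, so the prestop hook may never get a chance to run
kubectl delete pod redis-node-0 --force --grace-period=0
```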
This resulted in the remaining sentinels continuing to think the IP of the deleted pod was the master, thus preventing the new pod from discovering a functioning master. While this seems to be an improvement, I think in general we can't depend on master pods shutting down gracefully. For example, what happens if the k8s node serving the master pod suddenly disappears?
Hi @wilsoniya, thanks for trying the new version. I was able to reproduce it.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Hey @Mauraza, do you know of any updates on this issue? Are there any other issues tracking the work to fix the underlying problem? Thanks!
Hi @wilsoniya, sorry, it is still a work in progress. When we have more information we will update the issue.
Hi @wilsoniya! We've finally got time to investigate this issue; here's our best guess: it still seems closely related to #3700 (comment), a race condition involving the redis master sentinel. Once the pod containing the redis master server and sentinel (1) is forcefully deleted, the period until a new master is elected is about 1min 18sec with the default configuration.
The replacement pod comes back well before that: when it asks the remaining sentinels for the current master, it gets the IP of (1), since a new master has not been elected yet, and it enters a CrashLoopBackOff. Pod restarts have an exponential back-off delay, so eventually this delay grows bigger than the 1min 18sec, and that's when the sentinels try to elect a new master. However, when trying to elect the master they run into the race condition again.
Mitigation: we've been able to mitigate this issue by reducing the sentinel timeouts that make up that window. We are aware this is somewhat nondeterministic, but avoiding these race conditions programmatically seems non-trivial, unfortunately. I'm going to mention #6320 and #6484 so they are also aware of this, and if any of you folks can come up with a solution we would be happy to review your contributions!
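For reference, the corresponding chart values would look something like the snippet below. The exact options and numbers used are not captured above, so this is only an illustrative override; it assumes the two values in question are sentinel.downAfterMilliseconds and sentinel.failoverTimeout, whose chart defaults (60000 ms and 18000 ms) add up to the 1min 18sec mentioned.

```yaml
sentinel:
  enabled: true
  # Detect a dead master much faster than the 60000 ms default
  downAfterMilliseconds: 2000
  # Abort a stuck failover sooner than the 18000 ms default
  failoverTimeout: 2000
```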
Looks like this works. I was able to work around the issue by setting these two config options to lower values.
Sorry for my misunderstanding, but what exactly do you mean by "pod restart delay" in the comment:
Do you mean the termination grace period? Or the retry backoff on the original master that will not come back up? Also thank you for the configuration example:
I will try these for now, but I was just wondering if they could be tweaked higher and still keep things working? Hence the question above. Thanks all! Edit: could the sentinel reset command be causing an in-progress failover to abort? I am sure it's there for a good reason though...
Hi @rlees85, sorry, by pod restart delay I meant the time needed for the creation of the new pod after the original master dies. If the other replicas are able to elect the new master before that, then the new pod will be added as a replica of the newly elected master. Hope that clears things up!
Can confirm this issue running the Redis chart with sentinel in Azure AKS. Sometimes I can see this in the logs of the redis container:
I experienced a similar issue where deleting node-0 caused a crash loop and no new master was elected. I tried several recommendations mentioned here and in other issues, but still hit the same problem. In my case I debugged the sentinel log contents and found that the slaves were all registering with the same IP, which was the IP of the node where the current master is located. We do not have similar problems in other pods, so I have no idea why the master reports the node IP for all the slaves. Due to the above IP reporting, the slave registration produced some inconsistencies: the redis-cli "info" command reported "slaves=1,sentinels=3" instead of "slaves=2". This also caused several troubles for sentinel when I deleted a slave pod or the master one, which tried to reconnect to its old master IP; no master was re-elected and the whole cluster went down. I have also applied the fix mentioned in #4082, based on "replica-announce-ip":

    replica:
      persistence:
        enabled: false
      preExecCmds: |
        echo "" >> /opt/bitnami/redis/etc/replica.conf
        echo "replica-announce-ip $POD_IP" >> /opt/bitnami/redis/etc/replica.conf
      extraEnvVars:
        - name: "POD_IP"
          valueFrom:
            fieldRef:
              fieldPath: status.podIP

The above fix caused each replica to correctly report its own IP and slaves=2 were registered; now the cluster recovers correctly after the master deletion. I hope this fixes someone else's problem.
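(In case it helps, such an override would be applied like any other chart values file; the release name and file name below are just placeholders.)

```bash
# Apply the replica-announce-ip override to an existing bitnami/redis release
helm upgrade my-redis bitnami/redis -f replica-announce-values.yaml
```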
Thanks for sharing it @Jacq! @mblaschke did any of the above suggestions fix your problem?
I think the problem lies in the prestop-sentinel.sh script (charts/bitnami/redis/templates/scripts-configmap.yaml, lines 303 to 307 at commit 309c7c6):

    # If there are more than one IP, use the first IPv4 address
    if [[ "$myip" = *" "* ]]; then
        myip=$(echo $myip | awk '{if ( match($0,/([0-9]+\.)([0-9]+\.)([0-9]+\.)[0-9]+/) ) { print substr($0,RSTART,RLENGTH); } }')
    fi

I suspect that is where things go wrong; I tried to verify this by replacing that logic.
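Just to illustrate what that snippet does when more than one address comes back (the addresses below are made up):

```bash
# Simulate $myip containing both an IPv6 and an IPv4 address, separated by a space
myip="fd00:10:244::5 10.244.1.5"
# If there are more than one IP, use the first IPv4 address
if [[ "$myip" = *" "* ]]; then
    myip=$(echo $myip | awk '{if ( match($0,/([0-9]+\.)([0-9]+\.)([0-9]+\.)[0-9]+/) ) { print substr($0,RSTART,RLENGTH); } }')
fi
echo "$myip"   # prints 10.244.1.5
```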
Thanks for that, I'll look into it and report back what I find.
Hi, you have a syntax error here
Thank you @rjasper-frohraum for your comment. We have noticed the same problem today. We fixed the issue by adding this to the top of the script.
Just wanted to note that the problem I described is most likely not the same as the OP's; it just has similar symptoms. To my knowledge, the
Just wanted to drop this here: I was also facing the same issue, where killing the master pod causes a race condition and the cluster is not able to elect a new master. By applying the settings mentioned above, I was able to fix the issue, since a new master is now elected before the (old) master pod is resurrected, and the new pod joins as a replica. I didn't need to make the
Think this also fixed my issue. (Chart version 14.1.1)
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Fixed by #7835.
Which chart:
Chart: bitnami/redis
Version: 13.0.1
Describe the bug
When a master pod is manually deleted, occasionally the remaining replicas appear to continue re-electing the nonexistent master. When the replacement pod reappears, it's unable to connect to the existing master as reported by the remaining replicas, which corresponds to the IP of the now nonexistent previous master pod.
To Reproduce
I'm not able to deterministically reproduce the behavior described above. I'd say the errant behavior occurs ~20% of the time.
Steps to reproduce the behavior:
Expected behavior
When a pod is deleted the cluster members should elect a new master among themselves and the replacement pod should be able to connect to the elected master when the replacement comes online.
Version of Helm and Kubernetes:
helm version:
kubectl version:
Additional context
values
installation command
cluster log output
The output below occurs on an otherwise healthy sentinel cluster after I run kubectl delete pod redis-node-2 (please note: the logging is collected via stern, which I believe explains the "unexpected error: stream error: stream ID 19; INTERNAL_ERROR" occurrences).