[bitnami/redis] Sentinel cluster doesn't elect new master after master pod deletion #6165

Closed
wilsoniya opened this issue Apr 20, 2021 · 29 comments

@wilsoniya

Which chart:
Chart: bitnami/redis
Version: 13.0.1

Describe the bug
When the master pod is manually deleted, the remaining replicas occasionally appear to keep re-electing the now-nonexistent master. When the replacement pod comes up, it is unable to connect to the master reported by the remaining replicas, since that address is the IP of the deleted master pod.

To Reproduce
I'm not able to deterministically reproduce the behavior described above. I'd say the errant behavior occurs ~20% of the time.

Steps to reproduce the behavior:

  1. Create a sentinel cluster with the values below and wait for it to come online
  2. Determine which pod is master and delete it (see the sketch after these steps)
  3. (with some probability) the replacement pod can't start Redis: the remaining sentinels still report the IP of the now-deleted pod as master, so the replacement can't connect to a working master
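
For reference, a minimal sketch of step 2, assuming the release/namespace used elsewhere in this report (redis, redis-test), the chart's sentinel container name, the default master set name mymaster, and that the password is available as $REDIS_PASSWORD inside the container; adjust for your own deployment:

# Ask any sentinel which address it currently considers master
kubectl exec -n redis-test redis-node-0 -c sentinel -- \
  redis-cli -p 26379 -a "$REDIS_PASSWORD" sentinel get-master-addr-by-name mymaster
# Match that IP to a pod, then delete it
kubectl get pods -n redis-test -o wide
kubectl delete pod -n redis-test <name-of-the-master-pod>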

Expected behavior
When a pod is deleted, the remaining cluster members should elect a new master among themselves, and the replacement pod should be able to connect to the elected master when it comes online.

Version of Helm and Kubernetes:

  • Output of helm version:
version.BuildInfo{Version:"v3.4.0", GitCommit:"7090a89efc8a18f3d8178bf47d2462450349a004", GitTreeState:"clean", GoVersion:"go1.14.10"}
  • Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.11", GitCommit:"d94a81c724ea8e1ccc9002d89b7fe81d58f89ede", GitTreeState:"clean", BuildDate:"2020-03-12T21:08:59Z", GoVersion:"go1.12.17", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.15", GitCommit:"73dd5c840662bb066a146d0871216333181f4b64", GitTreeState:"clean", BuildDate:"2021-01-13T13:14:05Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Additional context

values
## Bitnami Redis(TM) image version
## ref: https://hub.docker.com/r/bitnami/redis/tags/
##
image:
  registry: docker.io
  repository: bitnami/redis
  ## Bitnami Redis(TM) image tag
  ## ref: https://github.com/bitnami/bitnami-docker-redis#supported-tags-and-respective-dockerfile-links
  ##
  tag: "6.2.1-debian-10-r36"
  ## Specify an imagePullPolicy
  ## Defaults to 'Always' if image tag is 'latest', else set to 'IfNotPresent'
  ## ref: http://kubernetes.io/docs/user-guide/images/#pre-pulling-images
  ##
  pullPolicy: IfNotPresent

## Cluster settings
##
cluster:
  enabled: true
  slaveCount: 3

## Use redis sentinel in the redis pod. This will disable the master and slave services and
## create one redis service with ports to the sentinel and the redis instances
##
sentinel:
  enabled: true
  ## Require password authentication on the sentinel itself
  ## ref: https://redis.io/topics/sentinel
  ##
  usePassword: true
  ## Bitnami Redis(TM) Sentinel image version
  ## ref: https://hub.docker.com/r/bitnami/redis-sentinel/tags/
  ##
  image:
    registry: docker.io
    repository: bitnami/redis-sentinel
    ## Bitnami Redis(TM) image tag
    ## ref: https://github.com/bitnami/bitnami-docker-redis-sentinel#supported-tags-and-respective-dockerfile-links
    ##
    tag: "6.2.1-debian-10-r35"
    ## Specify an imagePullPolicy
    ## Defaults to 'Always' if image tag is 'latest', else set to 'IfNotPresent'
    ## ref: http://kubernetes.io/docs/user-guide/images/#pre-pulling-images
    ##
    pullPolicy: IfNotPresent

## Use password authentication
##
usePassword: true
## Redis(TM) password (both master and slave)
## Defaults to a random 10-character alphanumeric string if not set and usePassword is true
## ref: https://github.com/bitnami/bitnami-docker-redis#setting-the-server-password-on-first-run
##
password: "password"

##
## Redis(TM) Master parameters
##
master:
  ## Comma-separated list of Redis(TM) commands to disable
  ##
  ## Can be used to disable Redis(TM) commands for security reasons.
  ## Commands will be completely disabled by renaming each to an empty string.
  ## ref: https://redis.io/topics/security#disabling-of-specific-commands
  ##
  disableCommands:
  # - FLUSHDB
  # - FLUSHALL

  ## Redis(TM) Master additional pod labels and annotations
  ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
  ##
  podLabels: {}
  podAnnotations:
    # Datadog redis metrics autodiscovery
    # See: https://docs.datadoghq.com/agent/kubernetes/integrations/?tab=kubernetes#datadog-redis-integration
    ad.datadoghq.com/redis.check_names: '["redisdb"]'
    ad.datadoghq.com/redis.init_configs: '[{}]'
    ad.datadoghq.com/redis.instances: |
      [
        {
          "host": "%%host%%",
          "port":"6379",
          "password":"{{ .Values.secrets.rms.cache.backend_config.password }}"
        }
      ]

  ## Redis(TM) Master resource requests and limits
  ## ref: http://kubernetes.io/docs/user-guide/compute-resources/
  resources:
    requests:
      memory: 512Mi
      cpu: 300m
    limits:
      memory: 1024Mi
      cpu: 600m

  ## Enable persistence using Persistent Volume Claims
  ## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
  ##
  persistence:
    enabled: false

##
## Redis(TM) Slave properties
## Note: service.type is a mandatory parameter
## The rest of the parameters are either optional or, if undefined, will inherit those declared in Redis(TM) Master
##
slave:
  ## List of Redis(TM) commands to disable
  ##
  disableCommands:
  # - FLUSHDB
  # - FLUSHALL

  ## Redis(TM) slave Resource
  resources:
    requests:
      memory: 512Mi
      cpu: 300m
    limits:
      memory: 1024Mi
      cpu: 600m

  podAnnotations:
    # Datadog redis metrics autodiscovery
    # See: https://docs.datadoghq.com/agent/kubernetes/integrations/?tab=kubernetes#datadog-redis-integration
    ad.datadoghq.com/redis.check_names: '["redisdb"]'
    ad.datadoghq.com/redis.init_configs: '[{}]'
    ad.datadoghq.com/redis.instances: |
      [
        {
          "host": "%%host%%",
          "port":"6379",
          "password":"{{ .Values.secrets.rms.cache.backend_config.password }}"
        }
      ]

  ## Enable persistence using Persistent Volume Claims
  ## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
  ##
  persistence:
    enabled: false

## Sysctl InitContainer
## used to perform sysctl operation to modify Kernel settings (needed sometimes to avoid warnings)
##
sysctlImage:
  enabled: true
  command:
    - /bin/sh
    - -c
    - |-
      sysctl -w net.core.somaxconn=10000
      echo never > /host-sys/kernel/mm/transparent_hugepage/enabled
  registry: docker.io
  repository: bitnami/bitnami-shell
  tag: "10"
  pullPolicy: Always
  mountHostSys: true

installation command

helm install redis . -f custom-values.yaml --atomic --namespace redis-test

cluster log output

The output below occurs on an otherwise healthy sentinel cluster after I run kubectl delete pod redis-node-2 (please note: the logs are collected via stern, which I believe explains the unexpected error: stream error: stream ID 19; INTERNAL_ERROR occurrences).

redis-node-2 redis 1:signal-handler (1618955501) Received SIGTERM scheduling shutdown...
redis-node-2 redis 1:M 20 Apr 2021 21:51:41.945 # User requested shutdown...
redis-node-2 redis 1:M 20 Apr 2021 21:51:41.945 * Calling fsync() on the AOF file.
redis-node-2 redis 1:M 20 Apr 2021 21:51:41.945 # Redis is now ready to exit, bye bye...
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:41.949 # Executing user requested FAILOVER of 'mymaster'
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:41.949 # +new-epoch 6
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:41.949 # +try-failover master mymaster 10.42.12.213 6379
redis-node-1 redis 1:S 20 Apr 2021 21:51:41.954 # Connection with master lost.
redis-node-1 redis 1:S 20 Apr 2021 21:51:41.954 * Caching the disconnected master state.
redis-node-1 redis 1:S 20 Apr 2021 21:51:41.954 * Reconnecting to MASTER 10.42.12.213:6379
redis-node-1 redis 1:S 20 Apr 2021 21:51:41.954 * MASTER <-> REPLICA sync started
redis-node-1 redis 1:S 20 Apr 2021 21:51:41.955 # Error condition on socket for SYNC: Connection refused
redis-node-0 redis 1:S 20 Apr 2021 21:51:41.952 # Connection with master lost.
redis-node-0 redis 1:S 20 Apr 2021 21:51:41.952 * Caching the disconnected master state.
redis-node-0 redis 1:S 20 Apr 2021 21:51:41.952 * Reconnecting to MASTER 10.42.12.213:6379
redis-node-0 redis 1:S 20 Apr 2021 21:51:41.952 * MASTER <-> REPLICA sync started
redis-node-0 redis 1:S 20 Apr 2021 21:51:41.953 # Error condition on socket for SYNC: Connection refused
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:41.993 # +vote-for-leader bc33c65f6d573da2c50da570ccf4dc629a32426d 6
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:41.993 # +elected-leader master mymaster 10.42.12.213 6379
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:41.993 # +failover-state-select-slave master mymaster 10.42.12.213 6379
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:42.054 # +selected-slave slave 10.42.9.18:6379 10.42.9.18 6379 @ mymaster 10.42.12.213 6379
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:42.054 * +failover-state-send-slaveof-noone slave 10.42.9.18:6379 10.42.9.18 6379 @ mymaster 10.42.12.213 6379
redis-node-2 sentinel 1:signal-handler (1618955502) Received SIGTERM scheduling shutdown...
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:42.121 # User requested shutdown...
redis-node-2 sentinel 1:X 20 Apr 2021 21:51:42.121 # Sentinel is now ready to exit, bye bye...
redis-node-0 redis 1:S 20 Apr 2021 21:51:42.175 * Connecting to MASTER 10.42.12.213:6379
redis-node-0 redis 1:S 20 Apr 2021 21:51:42.175 * MASTER <-> REPLICA sync started
redis-node-0 redis 1:S 20 Apr 2021 21:51:42.177 # Error condition on socket for SYNC: Connection refused
redis-node-1 redis 1:S 20 Apr 2021 21:51:42.310 * Connecting to MASTER 10.42.12.213:6379
redis-node-1 redis 1:S 20 Apr 2021 21:51:42.310 * MASTER <-> REPLICA sync started
redis-node-1 redis 1:S 20 Apr 2021 21:51:42.312 # Error condition on socket for SYNC: Connection refused
- redis-node-2 › redis
- redis-node-2 › sentinel
redis-node-0 redis 1:S 20 Apr 2021 21:51:43.185 * Connecting to MASTER 10.42.12.213:6379
redis-node-0 redis 1:S 20 Apr 2021 21:51:43.185 * MASTER <-> REPLICA sync started
redis-node-1 redis 1:S 20 Apr 2021 21:51:43.328 * Connecting to MASTER 10.42.12.213:6379
redis-node-1 redis 1:S 20 Apr 2021 21:51:43.328 * MASTER <-> REPLICA sync started
redis-node-1 sentinel 1:X 20 Apr 2021 21:51:52.310 # +reset-master master mymaster 10.42.12.213 6379
+ redis-node-2 › sentinel
+ redis-node-2 › redis
redis-node-2 sentinel  21:51:52.24 INFO  ==> redis-headless.redis-test.svc.cluster.local has my IP: 10.42.12.214
redis-node-2 sentinel  21:51:52.29 INFO  ==> Cleaning sentinels in sentinel node: 10.42.9.18
redis-node-2 sentinel Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-node-2 sentinel 1
redis-node-2 redis  21:51:51.92 INFO  ==> redis-headless.redis-test.svc.cluster.local has my IP: 10.42.12.214
redis-node-2 redis Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-node-2 redis Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-node-1 sentinel 1:X 20 Apr 2021 21:51:53.211 * +sentinel sentinel b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 10.42.16.216 26379 @ mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:51:57.322 # +reset-master master mymaster 10.42.12.213 6379
redis-node-2 sentinel  21:51:57.31 INFO  ==> Cleaning sentinels in sentinel node: 10.42.16.216
redis-node-2 sentinel Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-node-2 sentinel 1
redis-node-0 sentinel 1:X 20 Apr 2021 21:51:59.379 * +sentinel sentinel 11f8f53ef3e904a0cfe2822709d6d6ca611daaf6 10.42.9.18 26379 @ mymaster 10.42.12.213 6379
redis-node-2 sentinel  21:52:02.32 INFO  ==> Sentinels clean up done
redis-node-2 sentinel Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-node-2 sentinel Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-node-1 sentinel 1:X 20 Apr 2021 21:52:12.350 # +sdown master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.333 # +sdown master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.388 # +odown master mymaster 10.42.12.213 6379 #quorum 2/2
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.388 # +new-epoch 6
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.388 # +try-failover master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.397 # +vote-for-leader b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 6
redis-node-1 sentinel 1:X 20 Apr 2021 21:52:17.407 # +new-epoch 6
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.420 # 11f8f53ef3e904a0cfe2822709d6d6ca611daaf6 voted for b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 6
redis-node-1 sentinel 1:X 20 Apr 2021 21:52:17.422 # +vote-for-leader b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 6
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.480 # +elected-leader master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.480 # +failover-state-select-slave master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.556 # -failover-abort-no-good-slave master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:17.623 # Next failover delay: I will not start a failover before Tue Apr 20 21:52:54 2021
redis-node-1 sentinel 1:X 20 Apr 2021 21:52:17.716 # +odown master mymaster 10.42.12.213 6379 #quorum 2/2
redis-node-1 sentinel 1:X 20 Apr 2021 21:52:17.716 # Next failover delay: I will not start a failover before Tue Apr 20 21:52:53 2021
unexpected error: stream error: stream ID 19; INTERNAL_ERROR
unexpected error: stream error: stream ID 29; INTERNAL_ERROR
redis-node-1 sentinel 1:X 20 Apr 2021 21:52:49.303 # +reset-master master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:49.700 # -odown master mymaster 10.42.12.213 6379
redis-node-1 sentinel 1:X 20 Apr 2021 21:52:50.232 * +sentinel sentinel b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 10.42.16.216 26379 @ mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:54.314 # +reset-master master mymaster 10.42.12.213 6379
unexpected error: stream error: stream ID 33; INTERNAL_ERROR
redis-node-0 sentinel 1:X 20 Apr 2021 21:52:54.329 * +sentinel sentinel 11f8f53ef3e904a0cfe2822709d6d6ca611daaf6 10.42.9.18 26379 @ mymaster 10.42.12.213 6379
redis-node-1 sentinel 1:X 20 Apr 2021 21:53:09.384 # +sdown master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.328 # +sdown master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.411 # +odown master mymaster 10.42.12.213 6379 #quorum 2/2
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.411 # +new-epoch 7
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.411 # +try-failover master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.422 # +vote-for-leader b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 7
redis-node-1 sentinel 1:X 20 Apr 2021 21:53:14.437 # +new-epoch 7
redis-node-1 sentinel 1:X 20 Apr 2021 21:53:14.450 # +vote-for-leader b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 7
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.448 # 11f8f53ef3e904a0cfe2822709d6d6ca611daaf6 voted for b942a249aa6aaca842ead4ff6ad2fd01cdd6797b 7
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.488 # +elected-leader master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.488 # +failover-state-select-slave master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.550 # -failover-abort-no-good-slave master mymaster 10.42.12.213 6379
redis-node-0 sentinel 1:X 20 Apr 2021 21:53:14.640 # Next failover delay: I will not start a failover before Tue Apr 20 21:53:51 2021
redis-node-1 sentinel 1:X 20 Apr 2021 21:53:14.695 # +odown master mymaster 10.42.12.213 6379 #quorum 2/2
redis-node-1 sentinel 1:X 20 Apr 2021 21:53:14.695 # Next failover delay: I will not start a failover before Tue Apr 20 21:53:51 2021
- redis-node-2 › redis
- redis-node-2 › sentinel
+ redis-node-2 › sentinel
+ redis-node-2 › redis

@Mauraza
Contributor

Mauraza commented Apr 21, 2021

Hi @wilsoniya,

I have not been able to reproduce the issue. It may be related to this Helm issue: helm/helm#7997. Could you check it?

@wilsoniya
Author

Thanks for your reply, @Mauraza :)

I've never had problems with helm install or helm list like those mentioned in the issue you referenced. That is, I never see helm commands return errors mentioning Context timeouts, or which take a long time. The redis charts I install always result in a healthy sentinel cluster. It's only after deleting the master pod that I sometimes see my issue occur, and deleting a single pod only involves a kubectl command, not helm.

So I'd be surprised if my issue was related. Thanks again!

@Mauraza
Copy link
Contributor

Mauraza commented Apr 22, 2021

Hi @wilsoniya,

I was digging a little more; I think your issue may be related to #3700. Could you confirm that?

@wilsoniya
Copy link
Author

@Mauraza Thank you for continuing to work with me on this.

I think this comment by @dustinrue is pretty similar to what I'm seeing: #3700 (comment). Quoting it:

After more digging I discovered that the new pod is getting the old master info back because the remaining pods haven't yet selected a new master. The new pod then gets stuck, unable to determine what to connect to in order to move forward. I put in a PR that just causes the liveness check to fail and force the sentinel container to restart. Hopefully once this has happened the remaining pods have selected a new master.

However, their message implies that eventually a new master is elected by the remaining pods, and eventually the restarted pod is able to rejoin the cluster.

This isn't the behavior I'm seeing. Instead, the remaining two pods never elect a new master; they keep treating the IP of the old (deleted) master as if it still exists and is valid. This causes the replacement pod to fail to start, because it resolves the deleted pod's IP as the master from the two remaining pods.

@javsalgar
Contributor

javsalgar commented Apr 23, 2021

Hi,

Could it be because of quorum issues? I see that there are only two pods doing the election.

@wilsoniya
Author

@javsalgar thanks for the reply.

I don't know enough about the workings of sentinel to answer, tbh, though that sounds plausible. I believe I have quorum set at 2; wouldn't that be sufficient?
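
For what it's worth, one hedged way to check the configured quorum and whether the sentinels can currently reach it (assuming the pod/container names from the logs above and that $REDIS_PASSWORD is set in the container):

kubectl exec -n redis-test redis-node-0 -c sentinel -- \
  redis-cli -p 26379 -a "$REDIS_PASSWORD" sentinel master mymaster | grep -A1 quorum
kubectl exec -n redis-test redis-node-0 -c sentinel -- \
  redis-cli -p 26379 -a "$REDIS_PASSWORD" sentinel ckquorum mymaster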

@Mauraza
Contributor

Mauraza commented Apr 26, 2021

Hi @wilsoniya,

There is a new major version of the chart, could you try it?

@wilsoniya
Author

@Mauraza thanks for letting me know about the new major version.

The upgrade seems to be a major improvement, and I wasn't able to reproduce the issue by deleting the master pod or rolling the statefulset.

However, I was able to reproduce the issue by ungracefully deleting the master pod:

kubectl delete pod redis-node-0 --force --grace-period=0 

This resulted in the remaining sentinels continuing to think the IP of the deleted pod was master, thus preventing the new pod from discovering a functioning master.

While this seems to be an improvement, I think in general we can't depend on master pods shutting down gracefully. For example, what happens if the k8s node serving the master pod suddenly disappears?

@Mauraza
Contributor

Mauraza commented Apr 30, 2021

Hi @wilsoniya,

thanks for trying the new version; I was also able to reproduce it.
I'm going to create an internal task to investigate this. We will update this thread when we have more information.

@github-actions

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@github-actions github-actions bot added the stale 15 days without activity label May 16, 2021
@wilsoniya
Author

Hey @Mauraza, do you know of any updates on this issue? Are there any other issues which might represent the work to fix the underlying issue?

Thanks!

@Mauraza Mauraza added on-hold Issues or Pull Requests with this label will never be considered stale and removed stale 15 days without activity labels May 17, 2021
@Mauraza
Contributor

Mauraza commented May 18, 2021

Hi @wilsoniya,

sorry, this is still a work in progress; when we have more information we will update the issue.

@pablogalegoc
Contributor

Hi @wilsoniya! We've finally had time to investigate this issue; here's our best guess:

It still seems closely related to #3700 (comment): a race condition involving the sentinel on the redis master. Once the pod containing the redis master server and its sentinel (1) is forcefully deleted, the period until a new master is elected is sentinel.downAfterMilliseconds + sentinel.failoverTimeout (which with the defaults used here is 60000ms + 18000ms). That is 1min 18sec during which the sentinels of the replicas (pods 2 and 3) think the killed master (1) is still alive, so when a new pod (4) comes up and its sentinel

  1. issues a SENTINEL RESET to all the other sentinels, then
  2. queries 2 and 3 for the master

it gets the IP of 1, since a new master has not been elected, and enters a CrashLoopBackOff. Pod restarts have an exponential backoff delay, so eventually this delay grows beyond the 1min 18sec, and that is when the sentinels try to elect a new master. However, when trying to elect the master they enter the -failover-abort-no-good-slave loop. So it then becomes a question of “is there a good candidate to be master?” (redis/redis#7825). Unfortunately, we have not been able to debug the specific reason why replicas 2 and 3 end up not being suitable for promotion to master.
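
For anyone debugging the -failover-abort-no-good-slave part further, a hedged way to inspect how the sentinels currently see the replicas (reusing the pod/container names from the logs in this issue, with the password in $REDIS_PASSWORD):

# Shows, for each replica known to this sentinel, its flags (e.g. s_down,
# disconnected), slave-priority and slave-repl-offset
kubectl exec -n redis-test redis-node-0 -c sentinel -- \
  redis-cli -p 26379 -a "$REDIS_PASSWORD" sentinel replicas mymaster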

Mitigation: We've been able to mitigate this issue by reducing sentinel.downAfterMilliseconds and sentinel.failoverTimeout to beat the pod restart delay.

We are aware this is somewhat nondeterministic, but avoiding these race conditions programmatically does not seem trivial, unfortunately. I'm going to mention #6320 and #6484 so they are also aware of this, and if any of you folks can come up with a solution we would be happy to review your contributions!

@pablogalegoc pablogalegoc removed the on-hold Issues or Pull Requests with this label will never be considered stale label Jun 2, 2021
@bluecrabs007

Mitigation: We've been able to mitigate this issue by reducing sentinel.downAfterMilliseconds and sentinel.failoverTimeout to beat the pod restart delay.

Looks like this works; I was able to work around the issue by setting these two config options to:

-  downAfterMilliseconds: 60000
-  failoverTimeout: 18000
+  downAfterMilliseconds: 4000
+  failoverTimeout: 2000 
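
For context, these options live under the sentinel block of the chart values, so the override looks roughly like the sketch below (the numbers are simply the ones quoted above, not a general recommendation):

sentinel:
  enabled: true
  downAfterMilliseconds: 4000
  failoverTimeout: 2000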

@rlees85

rlees85 commented Jun 24, 2021

Sorry for my misunderstanding, but what exactly do you mean by "pod restart delay" in the comment:

Mitigation: We've been able to mitigate this issue by reducing sentinel.downAfterMilliseconds and sentinel.failoverTimeout to beat the pod restart delay.

Do you mean the termination grace period, or the retry backoff on the original master that will not come back up?

Also thank you for the configuration example:

-  downAfterMilliseconds: 60000
-  failoverTimeout: 18000
+  downAfterMilliseconds: 4000
+  failoverTimeout: 2000 

I will try these for now, but I was wondering whether they could be tweaked higher and still keep things working; hence the question above.

Thanks all!

edit: could the sentinel reset command be causing an in-progress failover to abort? I am sure it's there for a good reason though...

@pablogalegoc
Contributor

Hi @rlees85,

Sorry, by pod restart delay I meant the time needed for the creation of the new pod after the original master dies. If the other replicas are able to elect the new master before that, then the new pod will be added as a replica of the newly elected master. Hope that clears things up!

@mblaschke
Contributor

mblaschke commented Jun 29, 2021

Can confirm this issue when running the Redis chart with sentinel on Azure AKS.

Sometimes I see this in the logs:

redis container:

Could not connect to Redis at redis.XXXXX.svc.cluster.local:26379: Connection timed out            
Could not connect to Redis at -p:6379: Name or service not known

@Jacq

Jacq commented Jun 30, 2021

I experienced a similar issue where deleting node-0 caused a crash loop and no new master was elected. I tried several recommendations mentioned here and in other issues, but the problem persisted.

In my case I debugged the sentinel log contents and found that the slaves were all registering with the same IP, which was the IP of the Kubernetes node where the current master was located. We do not have similar problems in other pods, so I have no idea why the master reports the node IP for all the slaves.

Due to the above IP reporting, the slave registration produced some inconsistencies: the redis-cli "info" command reported "slaves=1,sentinels=3" instead of "slaves=2". This also caused several problems for sentinel when I deleted a slave pod or the master one: it tried to reconnect to its old master IP, no master was re-elected, and the whole cluster went down.
I applied @bluecrabs007's delay reduction, but it still did not fix the problem.

I have also applied the fix mentioned in #4082 based on the "replica-announce-ip":

replica:
  persistence:
    enabled: false
  preExecCmds:  |
    echo "" >>  #/opt/bitnami/redis/etc/replica.conf
    echo "replica-announce-ip $POD_IP" >> /opt/bitnami/redis/etc/replica.conf
  extraEnvVars:
    - name: "POD_IP"
      valueFrom:
        fieldRef:
          fieldPath: status.podIP

With the above fix, each replica reports its own IP correctly and slaves=2 is registered; the cluster now recovers correctly after the master is deleted. I hope this fixes someone else's problem.
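
A quick way to verify the registration after applying this (a sketch, reusing the pod/container names from earlier in the thread) is to check the sentinel's INFO output and confirm it now reports slaves=2:

kubectl exec -n redis-test redis-node-0 -c sentinel -- \
  redis-cli -p 26379 -a "$REDIS_PASSWORD" info sentinel | grep master0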
Cheers

@pablogalegoc
Contributor

Thanks for sharing it @Jacq!

@mblaschke did any of the above suggestions fixed your problem?

@rjasper-frohraum

rjasper-frohraum commented Jul 7, 2021

I think the problem lies in the prestop-sentinel.sh script:

failover_finished() {
    REDIS_SENTINEL_INFO=($(run_sentinel_command get-master-addr-by-name "{{ .Values.sentinel.masterSet }}"))
    REDIS_MASTER_HOST="${REDIS_SENTINEL_INFO[0]}"
    [[ "$REDIS_MASTER_HOST" != "${myip}" ]]
}

I suspect $myip is not set at that point.

I tried to verify this by replacing ${myip} with $(hostname -i), which worked for me. Nevertheless, I think a proper solution should use something like the snippet below, similar to the other scripts:

    # If there are more than one IP, use the first IPv4 address
    if [[ "$myip" = *" "* ]]; then
        myip=$(echo $myip | awk '{if ( match($0,/([0-9]+\.)([0-9]+\.)([0-9]+\.)[0-9]+/) ) { print substr($0,RSTART,RLENGTH); } }')
    fi

@pablogalegoc
Contributor

Hi @rjasper-frohraum!

Thanks for that, I'll look into it and report back what I find.

@Vanosz

Vanosz commented Jul 13, 2021

I experienced a similar issue where deleting node-0 caused a crash loop and no new master was elected. I tried several recommendations mentioned here and in other issues, but the problem persisted.

In my case I debugged the sentinel log contents and found that the slaves were all registering with the same IP, which was the IP of the Kubernetes node where the current master was located. We do not have similar problems in other pods, so I have no idea why the master reports the node IP for all the slaves.

Due to the above IP reporting, the slave registration produced some inconsistencies: the redis-cli "info" command reported "slaves=1,sentinels=3" instead of "slaves=2". This also caused several problems for sentinel when I deleted a slave pod or the master one: it tried to reconnect to its old master IP, no master was re-elected, and the whole cluster went down.
I applied @bluecrabs007's delay reduction, but it still did not fix the problem.

I have also applied the fix mentioned in #4082 based on the "replica-announce-ip":

replica:
  persistence:
    enabled: false
  preExecCmds:  |
    echo "" >>  #/opt/bitnami/redis/etc/replica.conf
    echo "replica-announce-ip $POD_IP" >> /opt/bitnami/redis/etc/replica.conf
  extraEnvVars:
    - name: "POD_IP"
      valueFrom:
        fieldRef:
          fieldPath: status.podIP

With the above fix, each replica reports its own IP correctly and slaves=2 is registered; the cluster now recovers correctly after the master is deleted. I hope this fixes someone else's problem.
Cheers

Hi, you have a syntax error here:
echo "" >> #/opt/bitnami/redis/etc/replica.conf
As I see it, it should be without the '#'.
I've tried your method and had good results (in old points); I haven't tested it in all cases yet, but I will post updated info soon.

@ThWoywod

Thank you @rjasper-frohraum for your comment. We have noticed the same problem today.
In my opinion, the real problem here is that myip is never set in the "prestop-sentinel.sh" script. (https://github.com/bitnami/charts/blob/master/bitnami/redis/templates/scripts-configmap.yaml#L290)

We fixed the issue by adding this to the top of the script:

    myip=$(hostname -i)
    
    # If there are more than one IP, use the first IPv4 address
    if [[ "$myip" = *" "* ]]; then
        myip=$(echo $myip | awk '{if ( match($0,/([0-9]+\.){3}[0-9]+/) ) { print substr($0,RSTART,RLENGTH); } }')
    fi

@rjasper-frohraum

Just wanted to note that the problem I described is most likely not the same as the OP's; it just has similar symptoms. To my knowledge, the $myip bug was introduced in chart version 14.2.0 (the OP's is 13.0.1).

@rush-skills

Just wanted to drop this here: I was also facing the same issue, where killing the master pod causes a race condition and the cluster is not able to elect a new master. By applying

sentinel:
  downAfterMilliseconds: 10000 
  failoverTimeout: 5000 
  livenessProbe:
    enabled: true
    initialDelaySeconds: 120

I was able to fix the issue, since a new master is now elected before the (old) master pod is resurrected, and the new pod joins as a replica. I didn't need to make the myip changes, however (chart version 14.6.2).

@manisha-tanwar

Just wanted to drop this here: I was also facing the same issue, where killing the master pod causes a race condition and the cluster is not able to elect a new master. By applying

sentinel:
  downAfterMilliseconds: 10000 
  failoverTimeout: 5000 
  livenessProbe:
    enabled: true
    initialDelaySeconds: 120

I was able to fix the issue, since a new master is now elected before the (old) master pod is resurrected, and the new pod joins as a replica. I didn't need to make the myip changes, however (chart version 14.6.2).

Think this also fixed my issue. (Chart version 14.1.1)

f-w added a commit to bcgov/NotifyBC that referenced this issue Jul 28, 2021
@github-actions

github-actions bot commented Aug 5, 2021

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@github-actions github-actions bot added the stale 15 days without activity label Aug 5, 2021
@github-actions

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

@igorwwwwwwwwwwwwwwwwwwww
Contributor

Fixed by #7835.

mhaswell-bcgov pushed a commit to bcgov/des-notifybc-helmonly that referenced this issue Nov 2, 2023