etcdHighNumberOfFailedGRPCRequests alert spam from kube-prometheus-stack #239
Comments
I can confirm that rolling out etcd 3.5 greatly reduces the # of
So that etcd error code fix combined with the etcd 3.5 monitoring rule improvements should be effective at reducing the alert spam.
Will keep using a custom
Underlying kube-prometheus-stack alerts are now updated.
This issue has been discussed in many different places, e.g. etcd-io/etcd#13147. Basically, the etcdHighNumberOfFailedGRPCRequests rule matches canceled etcd RPCs, so client-side cancellations are counted as failures.
https://github.com/etcd-io/etcd/blob/release-3.5/contrib/mixin/mixin.libsonnet currently has a fix for this; however, it is not straightforward to include in kube-prometheus-stack due to prometheus-community/helm-charts#1155.
For now, I've silenced etcdHighNumberOfFailedGRPCRequests and am going with a custom alert instead, from ce56fbb.
(This issue tracks cleanup for my specific cluster while waiting for a proper upstream fix to kube-prometheus-stack.)
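
To illustrate why matching canceled RPCs inflates the alert, here is a minimal Python sketch of the ratio the rule is effectively computing. It is not the actual alert expression or the custom alert from ce56fbb; the function name `failure_ratio`, the `ignored` parameter, and the sample counts are all hypothetical, and the only code excluded is `Canceled`, since that is the one named above.

```python
from typing import Dict, FrozenSet


def failure_ratio(counts: Dict[str, int],
                  ignored: FrozenSet[str] = frozenset()) -> float:
    """Return the share of RPCs counted as failed.

    counts  -- handled RPCs per gRPC status code (hypothetical sample data)
    ignored -- status codes that should not count as failures
               (assumption: client-driven codes such as "Canceled")
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    failed = sum(
        n for code, n in counts.items()
        if code != "OK" and code not in ignored
    )
    return failed / total


# Hypothetical counts over some window: most non-OK results are
# cancellations rather than real server-side errors.
sample = {"OK": 900, "Canceled": 90, "Unavailable": 10}

naive = failure_ratio(sample)                              # every non-OK code counts
filtered = failure_ratio(sample, frozenset({"Canceled"}))  # cancellations excluded

print(f"naive={naive:.3f} filtered={filtered:.3f}")
```

With these made-up numbers, the naive ratio is ten times the filtered one, which is the kind of gap that turns routine cancellations into alert spam.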