Many Prometheus metrics disappeared after upgrade #3053
Comments
This works as expected. The Prometheus metrics are not persistent, which means that after you upgrade the version there are no stats. You need traffic to get data into Prometheus.
But I do have traffic. This NGINX ingress controller is used in production and is working fine. The upgrade was over 24 hours ago and there are still no metrics. Are you sure there is no regression in the code? Note that I have …
The metrics work with or without dynamic mode.
I was able to reproduce this in the dev environment. Downgrade to 0.18 -> it works. Go back to 0.19 -> metrics gone.
I hit this as well. I was running 0.17.1 and upgraded to 0.19.0, and several metrics apparently stopped being reported by the metrics endpoint. As mentioned by @gjcarneiro, downgrading to 0.18.0 also restored the lost metrics on my end (I used the latest …). Metrics dump for both versions (some labels were redacted): … I checked the changelog between 0.18.0 and 0.19.0 but could not find any change that would explain this, so any help is greatly appreciated.
I have the same problem. After upgrading to 0.19.0 I lost many metrics.
Same here. New installation on 0.19.0. I've been struggling through all the examples of metrics online without seeing them in my nginx-ingress controller, and now I've found this ticket.
I used a custom template configuration.
I'm not using custom templates in my configuration. In my case, as I said before, I tried the exact same configuration with the exact same …
I've just hit this too. NGINX Ingress: 0.20.0. Any workaround? Edit: using standard templates, but with custom snippets.
I have the same problem in our production environment. Prometheus can't collect many metrics. Also …
I'm using the default setting for …
Same issue here on …
It works fine on …
I have the same issue with 0.19.0 and 0.20.0; no problem with 0.18.0.
Is everyone missing only the NGINX metrics, with no issue with the controller metrics? If not, is there a pattern to which metrics are missing? Are you using a custom template? Do you see one or more of the following messages in the logs? Do you see any other NGINX errors in the logs? Can you strace an NGINX worker and see whether it's writing to …
Can you also strace the controller process and see whether it's reading from the same socket?
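For anyone following along, a rough sketch of the kind of strace check being asked for here; the namespace, pod name, and PIDs are placeholders (not taken from this thread), and strace may need to be installed inside the container first:

```sh
# Open a shell in the controller pod (namespace/pod name are placeholders)
kubectl -n ingress-nginx exec -it <controller-pod> -- sh

# Inside the pod: find an nginx worker PID and the controller PID
ps aux | grep 'nginx: worker'
ps aux | grep nginx-ingress-controller

# Trace write-side syscalls from an nginx worker and look for socket activity
strace -f -e trace=connect,sendto,write -p <worker-pid> 2>&1 | grep -i sock

# Trace read-side syscalls from the controller process on the same socket
strace -f -e trace=accept,recvfrom,read -p <controller-pid>
```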
I'm hitting the same issue on version 0.20.0, with …
@ElvinEfendi I think I can provide some more context. We noticed this after upgrading from chart version 0.17.2 to 0.29.1. We do indeed have … (I don't think v0.17.2 or 0.29.1 is particularly important; it's just what we had deployed. As others have noted, this problem seems to have first appeared in chart version 0.19, and only with dynamic configuration disabled.) So I took a snapshot of …
I then did a diff on the metric names (a sketch of how such a diff can be produced follows this comment). So, to answer your first set of questions, we're definitely getting some metrics. It also appears the naming scheme changed a bit between these releases. We used to be able to, say, get the total number of responses by …
But I don't see any way to accomplish this given the new set of metrics. As to your second set of questions, we did not see any instance of … I haven't spent too much time digging into the running containers yet, but with chart 0.17.2 there actually is no …
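As an aside for later readers, a minimal sketch of how a metric-name diff like the one described above could be produced; the pod IPs are placeholders, and port 10254 is assumed to be the controller's default metrics port:

```sh
# Dump metrics from a pod running each version (IPs are placeholders)
curl -sS http://10.0.0.1:10254/metrics > metrics-old.txt
curl -sS http://10.0.0.2:10254/metrics > metrics-new.txt

# Strip comments and labels/values, keeping only the metric names, then diff
grep -v '^#' metrics-old.txt | sed 's/[{ ].*//' | sort -u > names-old.txt
grep -v '^#' metrics-new.txt | sed 's/[{ ].*//' | sort -u > names-new.txt
diff names-old.txt names-new.txt
```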
We experience the same issue here with version 0.19.0. With a custom nginx.tmpl we need that …
From the changelog, it seems that … To be honest, I've had a poor experience with dynamic configuration enabled, so I have a lot of misgivings about this development path. So, I guess if …
@gjcarneiro I know it can be frustrating when things don't work as expected. We are doing our best to make ingress-nginx better. There were valid reasons to switch to dynamic mode, and many users benefit from it. Supporting both modes going forward is not feasible for the few maintainers we have. I see that you've referenced a previous version's changelog, but have you tried the latest version, https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.21.0 (dynamic mode only)? We have fixed several bugs in that release.
Instead of taking a step backwards and going to non-dynamic mode, can you try the latest version (…)?
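A sketch of how one could check whether dynamic mode is disabled and then try the newer release; the deployment name, namespace, and container name here are assumptions and should be adjusted to the actual manifests:

```sh
# See whether the controller is started with the dynamic-configuration flag
kubectl -n ingress-nginx get deploy nginx-ingress-controller -o yaml \
  | grep -i dynamic-configuration

# Try the newer release by bumping the controller image
kubectl -n ingress-nginx set image deploy/nginx-ingress-controller \
  nginx-ingress-controller=quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.21.0
```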
I am facing the same issue on nginx-ingress-controller:0.20.0. All the major upstream-related metrics and nginx metrics are gone from the scraped service endpoints. Has this issue been addressed in the latest version (0.21.0)? Any permanent solutions?
We have different k8s clusters with ingress-nginx 0.21.0 (Kubernetes 1.9.10). The problem is that keys … I am just fetching /metrics from ingress-nginx via the Kubernetes API (with kubectl proxy); a sketch follows this comment. We run nginx-ingress-controller with parameters like: …
The configuration for the different clusters is almost the same (except domains/certificates, etc.). The main difference I can see between clusters is errors like: …
for the cluster where … If somebody has any suggestions, I'll be happy to hear them.
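For reference, a sketch of fetching /metrics through the Kubernetes API proxy as described in the comment above; the namespace, pod name, and metrics port 10254 are placeholders:

```sh
# Start a local proxy to the API server
kubectl proxy --port=8001 &

# Fetch the controller pod's metrics endpoint through the API proxy
curl -sS "http://127.0.0.1:8001/api/v1/namespaces/ingress-nginx/pods/<controller-pod>:10254/proxy/metrics" \
  | grep -c '^nginx_'
```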
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Can somebody share the resolution?
Well, meanwhile I upgraded to the latest version, and all metrics are there. I'm pretty sure this is fixed now. No idea which exact version fixed it.
@gjcarneiro which version do you currently use? [redacted] redacted@ip-redacted:$ curl -sS 192.169.192.75:10254/metrics | wc -l — one deployment with 2 replicas; one of the pods is missing the ssl expiry metrics, which I am interested in. This happens regardless of how many times I recreate the pod.
Also using 0.26.1. I don't know if all the metrics have been preserved, but they're essentially there. For the case of ssl expiry, we have metrics: …
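A quick sketch of comparing the SSL expiry metric across the two replicas; the pod IPs are placeholders, and the metric name shown (nginx_ingress_controller_ssl_expire_time_seconds) should be verified against your own /metrics output:

```sh
# Count SSL-expiry samples on each replica (IPs are placeholders)
for ip in 10.0.0.11 10.0.0.12; do
  echo "== $ip =="
  curl -sS "http://$ip:10254/metrics" | grep -c 'nginx_ingress_controller_ssl_expire_time_seconds'
done
```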
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.): no
What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.): grafana, prometheus
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
NGINX Ingress controller version: 0.19
Kubernetes version (use kubectl version): 1.11.0
Environment:
- Kernel (e.g. uname -a): …
- Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.19.0
What happened:
I was on 0.17 and had a nice-looking Grafana dashboard. When I upgraded to 0.19, half of the panels have no data.
What you expected to happen:
Metrics shouldn't disappear on upgrade.
How to reproduce it (as minimally and precisely as possible):
Scrape the Prometheus metrics endpoint:
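The exact command was not preserved in this copy of the report; a representative check looks something like the following, where the grepped metric name is only an example:

```sh
# Fetch the controller's metrics endpoint and look for upstream metrics
# (the metric name is illustrative, not necessarily the one from the report)
curl -sS http://<controller-pod-ip>:10254/metrics | grep nginx_upstream
```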
The grep returns empty. I have many metrics, but many are also missing. Here are the metrics it is returning now:
metrics.txt
Anything else we need to know:
Container arguments:
And configmap: