Flagger constantly updates service port causing LB flapping #513
Comments
The Kubernetes service should be generated by Flagger and no other controller should mutate it. Flagger detects changes in the service definition and tries to reconcile it; that's why you see all those logs.
This is how it could work with ALB:
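(The snippet from this comment is not preserved in this thread; below is a hedged sketch of the kind of setup being described: drop the user-managed type LoadBalancer Service, let Flagger generate ClusterIP services from the Canary spec, and point an ALB Ingress at the generated apex service. Names and annotation values are illustrative.)

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: hello-world
  namespace: jtolsma
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: hello-world   # ClusterIP service generated by Flagger
          servicePort: 8080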
No other controllers are changing the service; I have been watching the service log and only Flagger is changing the service, to reflect a port change. Are you saying it's because there was already a service beforehand, and that I should only do a deployment and let Flagger create the service? But I cannot create a Canary object without the service definition.
What I'm saying is the opposite: do not create a Kubernetes service on your own; let Flagger create it based on the service definition in the Canary spec.
Hmmm, I see that. We heavily use type: LoadBalancer today; is there no way to continue to use that with Flagger? It seems to work fine after setup, other than the nodePort: 31192 constantly changing. We have seen that before when something applies a service with no nodePort defined and Kubernetes just keeps assigning new ones that haven't been used.
What happens if you disable auto discovery, as in:
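(The original snippet is not preserved here; presumably it referred to the portDiscovery field in the Canary service section, something like:)

  service:
    name: hello-world
    port: 8080
    portDiscovery: false
    targetPort: 8080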
Does Flagger update the service every time or not?
Yes, I did a capture between two updates and you can see the only thing changing is the node port. For some reason it thinks something needs to change, but doesn't actually change anything but the node port.
Looking in the code, I am guessing this is why: because Services of type NodePort and LoadBalancer have a nodePort set, portsDiff will never match. And then when Flagger applies the update, the existing nodePort is lost because the ports are overwritten. It would be a great feature not to have the nodePort overridden every time; that would allow Flagger to be dropped in easily for the type LoadBalancer and NodePort applications that exist today.
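(As a hedged illustration of the request above, using the nodePort from the initial deploy output further down: today the reconciled ports omit the nodePort, so Kubernetes allocates a new one on every apply; preserving the value already allocated on the live Service would keep it stable.)

  # ports as applied today (nodePort omitted, so a new one is allocated)
  ports:
  - name: http
    port: 8080
    targetPort: 8080
    protocol: TCP

  # ports with the live nodePort preserved (nothing changes on apply)
  ports:
  - name: http
    port: 8080
    targetPort: 8080
    nodePort: 31517
    protocol: TCP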
@jtolsma can you please try
@stefanprodan I tested a few different scenarios and it looks like Flagger is no longer changing the node port, and I am not getting the messages about constantly wanting to update the service. Looks good! Does this need to be applied to other providers besides kubernetes? I will be looking to do Istio in the future and we will still have some type LoadBalancer services there.
No. Since all providers are using the underlying Kubernetes service code, fixing this fixes it for all providers.
Ah! Great! Thanks for looking at this.
I have been playing with Flagger and I noticed that when I roll out a canary (basic kubernetes provider), Flagger constantly updates the service, causing a change in port. Combine this with the ALB ingress controller and target groups, and craziness happens: timeouts and constantly shifting target groups. I tested this with the recently released rc.2 and the previous 3 versions, and they all show the same behavior with both a regular LB and ALB. What am I missing?
Steps to reproduce:
Create the hello-world deployment/service from here: https://kubernetes.io/docs/tutorials/stateless-application/expose-external-ip-address/, adding an app label of course so that Flagger can pick it up (a sketch of the manifests is below).
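For reference, a minimal version of those manifests (the image is taken from the linked tutorial; exact values are assumptions) would look something like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
  namespace: jtolsma
  labels:
    app: hello-world
spec:
  replicas: 5
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: gcr.io/google-samples/node-hello:1.0   # image used by the linked tutorial
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world
  namespace: jtolsma
spec:
  type: LoadBalancer
  selector:
    app: hello-world
  ports:
  - name: http
    port: 8080
    targetPort: 8080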
Initial deploy:
kubectl get service,deployment hello-world
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/hello-world LoadBalancer 100.67.12.162 a9836931169f811ea884d1262646600e-400633381.us-east-1.elb.amazonaws.com 8080:31517/TCP 10m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.extensions/hello-world 5/5 5 5 23m
Deploy a simple canary:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: hello-world
  namespace: jtolsma
spec:
  analysis:
    interval: 15s
    iterations: 1
    threshold: 1
    webhooks: []
  progressDeadlineSeconds: 300
  provider: kubernetes
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-world
  service:
    name: hello-world
    port: 8080
    portDiscovery: true
    portName: http
    targetPort: 8080
Watch the Flagger logs:
{"level":"info","ts":"2020-03-19T15:58:56.133Z","caller":"flagger/main.go:110","msg":"Starting flagger version 1.0.0-rc.2 revision 52d9951 mesh provider istio"}
{"level":"info","ts":"2020-03-19T15:58:56.165Z","caller":"flagger/main.go:381","msg":"Connected to Kubernetes API v1.13.11"}
{"level":"info","ts":"2020-03-19T15:58:56.165Z","caller":"flagger/main.go:237","msg":"Waiting for canary informer cache to sync"}
{"level":"info","ts":"2020-03-19T15:58:56.265Z","caller":"flagger/main.go:244","msg":"Waiting for metric template informer cache to sync"}
{"level":"info","ts":"2020-03-19T15:58:56.365Z","caller":"flagger/main.go:251","msg":"Waiting for alert provider informer cache to sync"}
{"level":"info","ts":"2020-03-19T15:58:56.465Z","caller":"flagger/main.go:151","msg":"Watching namespace jtolsma"}
{"level":"info","ts":"2020-03-19T15:58:56.542Z","caller":"flagger/main.go:161","msg":"Connected to metrics server http://prometheus.istio-system:9090"}
{"level":"info","ts":"2020-03-19T15:58:56.542Z","caller":"flagger/main.go:326","msg":"Notifications enabled for https://hooks.slack.com/servic"}
{"level":"debug","ts":"2020-03-19T15:58:56.542Z","caller":"controller/controller.go:76","msg":"Creating event broadcaster"}
{"level":"info","ts":"2020-03-19T15:58:56.542Z","caller":"server/server.go:29","msg":"Starting HTTP server on port 8080"}
{"level":"info","ts":"2020-03-19T15:58:56.543Z","caller":"controller/controller.go:150","msg":"Starting operator"}
{"level":"info","ts":"2020-03-19T15:58:56.543Z","caller":"controller/controller.go:159","msg":"Started operator workers"}
{"level":"info","ts":"2020-03-19T16:00:34.812Z","caller":"controller/controller.go:236","msg":"Synced jtolsma/hello-world"}
{"level":"info","ts":"2020-03-19T16:00:36.585Z","caller":"router/kubernetes_default.go:137","msg":"Service hello-world-canary.jtolsma created","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:00:36.608Z","caller":"router/kubernetes_default.go:137","msg":"Service hello-world-primary.jtolsma created","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:00:36.640Z","caller":"canary/deployment_controller.go:281","msg":"Deployment hello-world-primary.jtolsma created","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:00:36.645Z","caller":"controller/events.go:27","msg":"IsPrimaryReady failed: primary daemonset hello-world-primary.jtolsma not ready: waiting for rollout to finish: observed deployment generation less then desired generation","canary":"hello-world.jtolsma"}
{"level":"debug","ts":"2020-03-19T16:00:36.646Z","logger":"event-broadcaster","caller":"record/event.go:281","msg":"Event(v1.ObjectReference{Kind:"Canary", Namespace:"jtolsma", Name:"hello-world", UID:"c4817bad-69fa-11ea-884d-1262646600e6", APIVersion:"flagger.app/v1beta1", ResourceVersion:"37092574", FieldPath:""}): type: 'Warning' reason: 'Synced' IsPrimaryReady failed: primary daemonset hello-world-primary.jtolsma not ready: waiting for rollout to finish: observed deployment generation less then desired generation"}
{"level":"info","ts":"2020-03-19T16:00:51.654Z","caller":"canary/deployment_controller.go:45","msg":"Scaling down Deployment hello-world.jtolsma","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:00:51.700Z","caller":"router/kubernetes_default.go:159","msg":"Service hello-world updated","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:00:51.732Z","caller":"controller/events.go:15","msg":"Initialization done! hello-world.jtolsma","canary":"hello-world.jtolsma"}
{"level":"debug","ts":"2020-03-19T16:00:51.732Z","logger":"event-broadcaster","caller":"record/event.go:281","msg":"Event(v1.ObjectReference{Kind:"Canary", Namespace:"jtolsma", Name:"hello-world", UID:"c4817bad-69fa-11ea-884d-1262646600e6", APIVersion:"flagger.app/v1beta1", ResourceVersion:"37092574", FieldPath:""}): type: 'Normal' reason: 'Synced' Initialization done! hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:01:06.669Z","caller":"router/kubernetes_default.go:159","msg":"Service hello-world updated","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:01:21.607Z","caller":"router/kubernetes_default.go:159","msg":"Service hello-world updated","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:01:36.609Z","caller":"router/kubernetes_default.go:159","msg":"Service hello-world updated","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:01:51.678Z","caller":"router/kubernetes_default.go:159","msg":"Service hello-world updated","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:02:06.640Z","caller":"router/kubernetes_default.go:159","msg":"Service hello-world updated","canary":"hello-world.jtolsma"}
{"level":"info","ts":"2020-03-19T16:02:21.606Z","caller":"router/kubernetes_default.go:159","msg":"Service hello-world updated","canary":"hello-world.jtolsma"}
Every time a "Service hello-world updated" message is printed, the service's node port changes, causing an LB update.
while true ; do date ; kubectl get service hello-world ; sleep 15 ; done
Thu Mar 19 12:07:54 EDT 2020
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-world LoadBalancer 100.67.12.162 a9836931169f811ea884d1262646600e-400633381.us-east-1.elb.amazonaws.com 8080:31262/TCP 22m
Thu Mar 19 12:08:09 EDT 2020
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-world LoadBalancer 100.67.12.162 a9836931169f811ea884d1262646600e-400633381.us-east-1.elb.amazonaws.com 8080:30813/TCP 23m
Thu Mar 19 12:08:24 EDT 2020
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-world LoadBalancer 100.67.12.162 a9836931169f811ea884d1262646600e-400633381.us-east-1.elb.amazonaws.com 8080:30972/TCP 23m