-
-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return 503 on empty backend #1748
Return 503 on empty backend #1748
Conversation
3cccc1d
to
3577218
Compare
@marco-jantke If I look at the Kubernetes provider (which we kind of implicitly used as the canonical implementation for a proper 404/503 distinction), what we do there is start off with a non-empty backend as soon as we have a legitimate Ingress, remove the entire frontend if the Service turns out to be missing, and populate the backend servers if we have found proper Endpoints. With this approach, the Kubernetes provider produces So overall, this should work. @dtomcej WDYT? |
3577218
to
5bb3e69
Compare
I agree with the scenario @timoreimann proposed. |
Ok, when I understand it correctly, than this PR should actually deliver the needs we have for the 503 responses. 👍 for the quick responses, thanks a lot! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One test design suggestion, but otherwise :+1
server/server_test.go
Outdated
}, | ||
}, | ||
} | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a parameterizable helper function or (maybe even better) apply the builder pattern to construct the frontends and backends for test purposes?
We have tests like the ones in provider/kubernetes/kubernetes_test.go
which are using inline structs so extensively, it has become fairly hard to understand and extend the tests. I'd like to make sure we're not doing the same for newish tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-jantke see the excellent article from @vdemeester http://vincent.demeester.fr/posts/2017-01-01-go-testing-functionnal-builders/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the nice article recommendation. I applied the pattern there and I think the readability of the tests improved quite a lot. Furthermore I tried to keep the builder generic, so that they can easily be reused and extended by other tests in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big improvement indeed! Me ❤️ it too!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@timoreimann what about adding logs ? an "empty backend" is not a nominal state, and should throw a WARNING, just like events "no route to host" or "connect timeout" |
@djalal the events preceeding a potential 503 should eventually be logged in each provider. The Kubernetes one does this already, others should follow. I think this should be the first priority as you likely want to have a logging notification before messages actually get routed incorrectly. I saw you created a dedicated issue too, so I'll place the second part of my answer there. :-) |
We still have to complete the design review :)
@emilevauge what part of the design review is still missing from your perspective? We seem to have discussed the topic to great length on Slack. |
Is there anything I can do to bring this PR forward? |
@marco-jantke Should we return 503 each time |
@emilevauge I checked the implementation of the load balancers in detail now. The case I am describing holds true for the RoundRobin LB as well as for the Rebalancer LB implementation. They call the error handler you pass them basically in two cases:
I think for case number 2 even the former implementation is not complete. I can't see anything that would invalidate the cookie again at some point and clients with that cookie won't be able to connect to a new Server, when the server list of the Backend changes. AFAICS we should just remove the error handling in the case this happens and try to connect to a new Server in the backend list, for the following two places. When it can select a new Server it will set the cookie again, effectively overriding the old cookie and in case no Server could be found we want to still have our 503: WDYT about it? |
@marco-jantke I'm sorry but I don't agree to use |
I get your point. I actually did not perceive using the LoadBalancers As additional point, WDYT about the "potential problem" I described above regarding sticky sessions and a change of the Server list?
I had no time yet to proof this point and I am not sure whether I will have time in the near future as sticky sessions are not on my list of required features, but maybe it is worth opening an issue for this to enable further investigation in the future or by others? |
Indeed, that's a simple solution. The only issue is that healthchecks can add/remove servers from a lb dynamically. So this solution can't be that simple :) On the sticky session thing, couldn't we just make the cookie expire in this case? |
From what I can see, the sticky session implementation does reroute to a different server if the one associated with the cookie is knowingly gone. Here's the flow how I see it:
|
975dbf0
to
b04da6d
Compare
@timoreimann thanks for the clarification on the cookie issue! I in fact misinterpreted the behaviour and apart of the already open issues there seems to nothing wrong. To clarify a bit more, it is right to use the |
@marco-jantke agreed to moving the cookie discussion to a different issue. Final comment: |
@marco-jantke can we also agree to change the design, and directly use the number of servers in backend (being careful to health-checks) ? |
Yes, I had a look at the implementation and this sounds reasonable to me. So I would write a middleware that gets a I will update this PR accordingly. |
ab87dad
to
fcf7761
Compare
@emilevauge @timoreimann PTAL again. I refactored the implementation now to a middleware that knows about the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit-pick left.
I really like the middleware-approach. ❤️ Middleware all the things!
handler := NewEmptyBackendHandler(&healthCheckLoadBalancer{test.amountServer}, nextHandler) | ||
|
||
recorder := httptest.NewRecorder() | ||
req := httptest.NewRequest("GET", "http://localhost", nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/"GET"/http.MethodGet/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always.. So thanks for reminding me constantly Timo. At some point I will also learn it, don't give up the hope :D 🐼
Oops, missed that design review is still outstanding. Withdrawing my approval for formal reasons, even though I like the PR from both design and code review perspectives.
Design LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reinstating my code review LGTM. 👍
2ca3ff7
to
f1cad9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐯
f1cad9d
to
0c2a899
Compare
This PR changes the response code Traefik sends to 503, given it has an empty backend.
This should resolve the issue #1688. In the issue it is furthermore mentioned that a 503 should also be sent when a backend is completely missing. At the moment though, a frontend is not loaded in the
loadConfig
method, given there is no dedicated backend for it. See here. This looks to me that this case is actually not intentioned to happen. @dtomcej and @timoreimann you had quite some discussions about this topic, WDYT about that case?Fixes #1688