Return 503 on empty backend #1748

m3co-code · 2017-06-14T10:12:15Z

This PR changes the response code Traefik sends to 503 when a backend is empty.

This should resolve issue #1688. The issue furthermore mentions that a 503 should also be sent when a backend is missing completely. At the moment, though, a frontend is not loaded in the loadConfig method if there is no dedicated backend for it. See here. To me this looks like a case that is not actually intended to happen. @dtomcej and @timoreimann, you had quite some discussions about this topic; WDYT about that case?

Fixes #1688

timoreimann · 2017-06-14T10:57:37Z

@marco-jantke If I look at the Kubernetes provider (which we kind of implicitly used as the canonical implementation for a proper 404/503 distinction), what we do there is start off with a non-empty backend as soon as we have a legitimate Ingress, remove the entire frontend if the Service turns out to be missing, and populate the backend servers if we have found proper Endpoints. With this approach, the Kubernetes provider produces nil backends only if the Ingress doesn't include any paths (because this loop will never enter its body), and otherwise creates backends that fit the specification.

So overall, this should work. nil backends not causing a frontend to be built (and, in turn, leading to a 404) sounds like a reasonable assumption since it's easy to end up with an error. Reaching a 503 should require some (minimum) effort with each provider.

@dtomcej WDYT?

dtomcej · 2017-06-14T15:44:53Z

I agree with the scenario @timoreimann proposed.

m3co-code · 2017-06-14T15:49:05Z

OK, if I understand it correctly, then this PR should actually deliver what we need for the 503 responses. 👍 for the quick responses, thanks a lot!

timoreimann

One test design suggestion, but otherwise :+1:

timoreimann · 2017-06-14T16:45:10Z

server/server_test.go

+						},
+					},
+				}
+			},


Should we add a parameterizable helper function or (maybe even better) apply the builder pattern to construct the frontends and backends for test purposes?

We have tests like the ones in provider/kubernetes/kubernetes_test.go which are using inline structs so extensively, it has become fairly hard to understand and extend the tests. I'd like to make sure we're not doing the same for newish tests.

@marco-jantke see the excellent article from @vdemeester http://vincent.demeester.fr/posts/2017-01-01-go-testing-functionnal-builders/

Thanks for the nice article recommendation. I applied the pattern there and I think the readability of the tests improved quite a lot. Furthermore I tried to keep the builder generic, so that they can easily be reused and extended by other tests in the future.

Big improvement indeed! Me ❤️ it too!
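
For readers unfamiliar with the functional builder pattern discussed in this review thread, here is a minimal sketch of what such test helpers could look like. The types `Server`, `Backend` and `Frontend` and the helpers `buildBackend`, `withServer`, `buildFrontend`, `withBackend` and `withRoute` are simplified, hypothetical stand-ins chosen for illustration; they are not necessarily the names or shapes used in this PR.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the configuration types.
type Server struct{ URL string }

type Backend struct{ Servers map[string]Server }

type Frontend struct {
	Backend string
	Routes  []string
}

// buildBackend starts from an empty backend and applies each option in order.
func buildBackend(opts ...func(*Backend)) *Backend {
	b := &Backend{Servers: map[string]Server{}}
	for _, opt := range opts {
		opt(b)
	}
	return b
}

// withServer returns an option that registers one server on the backend.
func withServer(name, url string) func(*Backend) {
	return func(b *Backend) { b.Servers[name] = Server{URL: url} }
}

// buildFrontend applies each option to a zero-value frontend.
func buildFrontend(opts ...func(*Frontend)) *Frontend {
	f := &Frontend{}
	for _, opt := range opts {
		opt(f)
	}
	return f
}

// withBackend returns an option that links the frontend to a backend name.
func withBackend(name string) func(*Frontend) {
	return func(f *Frontend) { f.Backend = name }
}

// withRoute returns an option that appends one routing rule.
func withRoute(rule string) func(*Frontend) {
	return func(f *Frontend) { f.Routes = append(f.Routes, rule) }
}

func main() {
	empty := buildBackend() // the "empty backend" case needs no options
	full := buildBackend(withServer("s1", "http://localhost:8080"))
	fe := buildFrontend(withBackend("backend"), withRoute("Path:/test"))
	fmt.Println(len(empty.Servers), len(full.Servers), fe.Backend, fe.Routes)
}
```

Compared with inline struct literals, each test case then only states what differs from the default, which is the readability gain the thread is after.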

timoreimann

LGTM.

djalal · 2017-06-16T10:27:10Z

@timoreimann what about adding logs?

An "empty backend" is not a nominal state and should log a WARNING, just like the events "no route to host" or "connect timeout".

timoreimann · 2017-06-17T07:15:35Z

@djalal the events preceding a potential 503 should eventually be logged in each provider. The Kubernetes one does this already, others should follow. I think this should be the first priority as you likely want to have a logging notification before messages actually get routed incorrectly.

I saw you created a dedicated issue too, so I'll place the second part of my answer there. :-)

We still have to complete the design review :)

timoreimann · 2017-06-22T13:16:06Z

@emilevauge what part of the design review is still missing from your perspective?

We seem to have discussed the topic at great length on Slack.

m3co-code · 2017-07-03T07:07:59Z

Is there anything I can do to bring this PR forward?

emilevauge · 2017-07-04T21:59:35Z

@marco-jantke Should we return 503 each time RebalancerErrorHandler is called? Does RebalancerErrorHandler really fit our needs here?

m3co-code · 2017-07-05T12:47:04Z

@emilevauge I checked the implementation of the load balancers in detail now.

The case I am describing holds true for the RoundRobin LB as well as for the Rebalancer LB implementation. They call the error handler you pass them basically in two cases:

1. nextServer returns an error. This is because they cannot find any valid server in the list they can route to. For that case it is correct to return the 503.
2. Sticky sessions are enabled and the server in the cookie is no longer available. For that case we currently also serve a 503, whereas before we served a 500 because of the default error handler implementation that was used.

I think for case number 2 even the former implementation is not complete. I can't see anything that would invalidate the cookie again at some point and clients with that cookie won't be able to connect to a new Server, when the server list of the Backend changes. AFAICS we should just remove the error handling in the case this happens and try to connect to a new Server in the backend list, for the following two places. When it can select a new Server it will set the cookie again, effectively overriding the old cookie and in case no Server could be found we want to still have our 503:

WDYT about it?

emilevauge · 2017-07-05T13:41:22Z

@marco-jantke I'm sorry but I don't agree to use RebalancerErrorHandler / ErrorHandler as a workaround to know if a backend is empty or not. We should not assume why an error is set in the errHandler by the load balancer. It could even change in the future.
Couldn't you use the number of servers in a backend directly?

m3co-code · 2017-07-05T16:59:23Z

I get your point. I actually did not perceive using the LoadBalancers ErrorHandler as a workaround, as they are kind of the canonical place to keep track of the service place. Anyway I think we can just directly check when building the Backend whether it has no Servers and use an http.Handler that directly returns 503 without the need of even setting up LoadBalancers or the Forwarder. Does this sound like an approach you imagine? I would give it a shot, should be easy to verify as my tests should not change.

As an additional point, WDYT about the "potential problem" I described above regarding sticky sessions and a change of the Server list?

I think for case number 2 even the former implementation is not complete. I can't see anything that would invalidate the cookie again at some point and clients with that cookie won't be able to connect to a new Server, when the server list of the Backend changes. AFAICS we should just remove the error handling in the case this happens and try to connect to a new Server in the backend list, for the following two places. When it can select a new Server it will set the cookie again, effectively overriding the old cookie and in case no Server could be found we want to still have our 503. Rebalancer LoadBalancer

I have not had time yet to prove this point, and I am not sure whether I will have time in the near future, as sticky sessions are not on my list of required features. But maybe it is worth opening an issue for this to enable further investigation in the future or by others?

emilevauge · 2017-07-06T13:49:49Z

Anyway I think we can just directly check when building the Backend whether it has no Servers and use an http.Handler that directly returns 503 without the need of even setting up LoadBalancers or the Forwarder. Does this sound like an approach you imagine? I would give it a shot, should be easy to verify as my tests should not change.

Indeed, that's a simple solution. The only issue is that healthchecks can add/remove servers from a lb dynamically. So this solution can't be that simple :)
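
To illustrate the concern about health checks, here is a small sketch under stated assumptions: `backend`, `buildStatic`, `status` and `staleAfterRecovery` are hypothetical names invented for this example, and the health check is simulated by appending to a slice. It shows that a handler chosen once at build time keeps answering 503 even after a server has been added back.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// backend is a hypothetical stand-in; a health check may change servers at runtime.
type backend struct{ servers []string }

// buildStatic decides once, at build time, which handler serves the backend.
func buildStatic(b *backend, next http.Handler) http.Handler {
	if len(b.servers) == 0 {
		return http.HandlerFunc(func(rw http.ResponseWriter, _ *http.Request) {
			rw.WriteHeader(http.StatusServiceUnavailable)
		})
	}
	return next
}

// status sends one GET request through h and returns the response code.
func status(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "http://localhost", nil))
	return rec.Code
}

// staleAfterRecovery builds the handler for an empty backend, then simulates
// a health check adding a server, and returns the status that is still served.
func staleAfterRecovery() int {
	b := &backend{}
	ok := http.HandlerFunc(func(rw http.ResponseWriter, _ *http.Request) {
		rw.WriteHeader(http.StatusOK)
	})
	h := buildStatic(b, ok)
	b.servers = append(b.servers, "http://10.0.0.1:80") // server becomes healthy
	return status(h)
}

func main() {
	fmt.Println(staleAfterRecovery()) // prints: 503
}
```

The decision therefore has to be made per request, against the live server list, rather than once when the backend is built.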

On the sticky session thing, couldn't we just make the cookie expire in this case?

timoreimann · 2017-07-06T15:40:16Z

From what I can see, the sticky session implementation does reroute to a different server if the one associated with the cookie is knowingly gone.

Here's the flow how I see it:

1. The ServeHTTP method is invoked.
2. GetBackend is invoked with the list of currently known servers (which should also be the one that the health check logic manipulates).
3. For a known cookie, GetBackend checks if the recorded server is still alive. This is the case if the URL from the cookie can be found within the list of servers (called haystack in the code).
4. If the server cannot be found, we try another one unless we have depleted them all. Only in the latter case do we seem to call the error handler.
5. While we do update the sticky cookie with the new URL, we unfortunately call http.SetCookie(). Contrary to what the name of the function may imply, it only adds another cookie header (of which there can be multiple) but does not set/overwrite an existing one.
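
The header behavior of http.SetCookie described above can be checked with a short sketch. The helper `setCookieHeaders` and the cookie name `backend` are assumptions made for this example; the sketch only looks at the response headers written on the server side and says nothing about how a client stores the cookies it receives.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// setCookieHeaders calls http.SetCookie once per value, always with the same
// cookie name and path, and returns the resulting Set-Cookie header values.
func setCookieHeaders(name string, values ...string) []string {
	rec := httptest.NewRecorder()
	for _, v := range values {
		http.SetCookie(rec, &http.Cookie{Name: name, Value: v, Path: "/"})
	}
	return rec.Header()["Set-Cookie"]
}

func main() {
	// Two calls with the same cookie name yield two header lines, not one.
	for _, h := range setCookieHeaders("backend", "http://10.0.0.1:80", "http://10.0.0.2:80") {
		fmt.Println(h)
	}
}
```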

m3co-code · 2017-07-07T09:54:17Z

@timoreimann thanks for the clarification on the cookie issue! I in fact misinterpreted the behaviour, and apart from the already open issues there seems to be nothing wrong. To clarify a bit more, it is right to use the Set-Cookie header in the Server's response. In fact there is no other way to set the cookie from the server side AFAIK. In issue #1744 the only problem is that multiple different paths are specified and therefore the cookie seemingly gets re-created. A cookie is considered unique and will be overwritten in case the cookie_name, domain and path parts are equal. That said, I think further communication about the cookie issue should be moved to the dedicated issue.

timoreimann · 2017-07-07T13:06:11Z

@marco-jantke agreed to moving the cookie discussion to a different issue.

Final comment: SetCookie does not seem to overwrite identical (per your definition) cookies.

emilevauge · 2017-07-07T14:33:40Z

@marco-jantke can we also agree to change the design and directly use the number of servers in the backend (being careful about health checks)?

m3co-code · 2017-07-08T09:49:04Z

Yes, I had a look at the implementation and this sounds reasonable to me. So I would write a middleware that gets a healthcheck.LoadBalancer and on each request checks whether there are active Servers. If this is not the case it will directly serve a 503 response. Does this sound reasonable to you?

I will update this PR accordingly.

m3co-code · 2017-07-10T10:21:36Z

@emilevauge @timoreimann PTAL again. I refactored the implementation now to a middleware that knows about the healthcheck.LoadBalancer implementation and responds with a 503, given a Backend is empty.
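
A minimal sketch of the middleware idea described here, under stated assumptions: `serverLister` is a hypothetical, narrowed-down interface standing in for healthcheck.LoadBalancer (only a `Servers()` method is assumed), and `emptyBackendHandler`, `fixedServers` and `statusFor` are illustrative names, not the code merged in this PR.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// serverLister is a hypothetical, narrowed-down view of the load balancer:
// all the middleware needs is the list of currently active servers.
type serverLister interface {
	Servers() []*url.URL
}

// emptyBackendHandler answers 503 when no server is active and otherwise
// passes the request on to the next handler.
type emptyBackendHandler struct {
	lb   serverLister
	next http.Handler
}

func (h *emptyBackendHandler) ServeHTTP(rw http.ResponseWriter, r *http.Request) {
	// Checked on every request, so servers added or removed at runtime count.
	if len(h.lb.Servers()) == 0 {
		http.Error(rw, http.StatusText(http.StatusServiceUnavailable), http.StatusServiceUnavailable)
		return
	}
	h.next.ServeHTTP(rw, r)
}

// fixedServers is a test double that reports a fixed number of servers.
type fixedServers struct{ n int }

func (f fixedServers) Servers() []*url.URL { return make([]*url.URL, f.n) }

// statusFor sends one GET request through the middleware and returns the code.
func statusFor(n int) int {
	next := http.HandlerFunc(func(rw http.ResponseWriter, _ *http.Request) {
		rw.WriteHeader(http.StatusOK)
	})
	h := &emptyBackendHandler{lb: fixedServers{n}, next: next}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "http://localhost", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(0), statusFor(2)) // prints: 503 200
}
```

Because the check runs per request, the middleware does not need to know why the server list is empty, which keeps it independent of the load balancer's error handling.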

timoreimann

One nit-pick left.

I really like the middleware-approach. ❤️ Middleware all the things!

timoreimann · 2017-07-10T10:31:09Z

middlewares/empty_backend_handler_test.go

+			handler := NewEmptyBackendHandler(&healthCheckLoadBalancer{test.amountServer}, nextHandler)
+
+			recorder := httptest.NewRecorder()
+			req := httptest.NewRequest("GET", "http://localhost", nil)


s/"GET"/http.MethodGet/

Always.. So thanks for reminding me constantly Timo. At some point I will also learn it, don't give up the hope :D 🐼

Oops, missed that design review is still outstanding. Withdrawing my approval for formal reasons, even though I like the PR from both design and code review perspectives.

emilevauge · 2017-07-14T15:58:49Z

Design LGTM
Thanks @marco-jantke :)

timoreimann

Reinstating my code review LGTM. 👍

ldez

LGTM

dtomcej

LGTM

vdemeester

LGTM 🐯

m3co-code force-pushed the 1688-respond-503-on-missing-or-empty-backend branch from 3cccc1d to 3577218 Compare June 14, 2017 10:12

ldez added the kind/enhancement and status/1-needs-design-review labels Jun 14, 2017

m3co-code force-pushed the 1688-respond-503-on-missing-or-empty-backend branch from 3577218 to 5bb3e69 Compare June 14, 2017 13:59

timoreimann reviewed Jun 14, 2017

timoreimann previously approved these changes Jun 15, 2017

timoreimann mentioned this pull request Jun 17, 2017

feature request: add a log on not matched frontend rule #1762

Closed

m3co-code force-pushed the 1688-respond-503-on-missing-or-empty-backend branch from 975dbf0 to b04da6d Compare July 7, 2017 08:47

m3co-code force-pushed the 1688-respond-503-on-missing-or-empty-backend branch 3 times, most recently from ab87dad to fcf7761 Compare July 10, 2017 10:20

timoreimann previously approved these changes Jul 10, 2017

emilevauge added status/2-needs-review and removed status/1-needs-design-review labels Jul 14, 2017

ldez added the area/middleware label Jul 14, 2017

timoreimann approved these changes Jul 17, 2017

m3co-code force-pushed the 1688-respond-503-on-missing-or-empty-backend branch from 2ca3ff7 to f1cad9d Compare July 18, 2017 09:54

ldez approved these changes Jul 18, 2017

dtomcej approved these changes Jul 19, 2017

ldez added status/3-needs-merge and removed status/2-needs-review labels Jul 19, 2017

vdemeester approved these changes Jul 19, 2017

respond with 503 on empty backend

0c2a899

ldez force-pushed the 1688-respond-503-on-missing-or-empty-backend branch from f1cad9d to 0c2a899 Compare July 19, 2017 17:11

ldez removed the status/3-needs-merge label Jul 19, 2017

ldez merged commit 074b31b into traefik:master Jul 19, 2017

ldez changed the title ~~return 503 on empty backend~~ Return 503 on empty backend Aug 14, 2017

ldez added this to the 1.4 milestone Oct 1, 2017

ngnix mentioned this pull request May 18, 2021

Traefik always returns a 404 HTTP code even if no backend is currently configured #8141

Closed

sfudeus mentioned this pull request Aug 10, 2022

Traefik return 404 in K8s when Ingress is configured, just no endpoint is healthy #9242

Closed

2 tasks
