
[Feature] add ability to serve custom error responses #1178

Closed
Tracked by #320
skippy opened this issue Jun 27, 2017 · 22 comments
Labels: area/http, enhancement (Feature requests. Not bugs or questions.), help wanted (Needs help!)

Comments

skippy commented Jun 27, 2017

For Envoy to serve as an external proxy, I believe it needs the ability to return custom error responses. For example, instead of returning a generic Service Unavailable HTTP body, I could have it return an error message consistent with the formatting of my public-facing API, such as:

{
  "error": {
    "type": "service_error",
    "status_code": 503,
    "message": "The service is temporarily unavailable."
  }
}

To do this in haproxy, one adds something like errorfile 503 /etc/haproxy/503.json to the config. Can we add similar functionality to Envoy? (To use Envoy right now for my public-facing API, I actually have to put haproxy in front of it to serve custom error files based on Envoy's response codes.)

As for the best place to put it: since it is an HTTP attribute, perhaps it belongs in an HTTP filter?

Related to #378 (for what it's worth, I don't think Envoy should serve general static files).

@mattklein123 mattklein123 added the enhancement (Feature requests. Not bugs or questions.) label Jun 27, 2017
@mattklein123 (Member)

At a high level this feature makes sense to me. It will actually require some thought, because we need to differentiate between a "local origin" reply and a routed reply, so I don't think a filter is the way to go here. It probably makes sense to allow configuration at the route/vhost level for message overrides on a per-response-code basis, and then to beef up the code in Envoy that is used to send "local origin" replies.
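To make the shape of that concrete, here is a purely hypothetical sketch of a per-virtual-host, per-response-code override; none of these field names exist in Envoy's route configuration, they only illustrate the kind of API being discussed (what eventually shipped, the HTTP connection manager's local_reply_config, is referenced near the end of this thread):

```yaml
# Hypothetical illustration only -- none of these fields exist in Envoy's route API;
# they just sketch the shape of a per-vhost, per-status-code override.
virtual_hosts:
- name: public_api
  domains: ["*"]
  local_reply_overrides:                    # hypothetical field
  - status_code: 503
    content_type: application/json
    body_file: /etc/envoy/errors/503.json   # hypothetical: file served as the body
  routes:
  - match: { prefix: "/" }
    route: { cluster: backend }
```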

@mattklein123 (Member)

@junr03 can you take a look at this in the context of the conversation we had today about increasing the fidelity of Envoy error responses in certain cases? We should put together a small design that covers this and the common cases.

For everyone else: at Lyft we would like the ability to have better control over what Envoy returns in error cases. For example, we could return not only a 503 but also JSON that carries additional information about what happened, e.g. that a circuit breaker was hit. This would allow apps to have much better error messages and to take appropriate action.

I think we can potentially provide some built-in options as well as provide additional customizations if we do this right.

@shubhaat

At Cloud Foundry we have also seen operators and app developers asking for more control over the error returned to the downstream client.

The specific feature we were looking to build out in the near term was the ability to distinguish between the case where Envoy is aware of a route but the backend is misbehaving (503 error) and the case where Envoy is not aware of the route (404 error). It's been requested by users in our community, ref here.


stale bot commented Jun 28, 2018

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

@stale stale bot added the stale (stalebot believes this issue/PR has not been touched recently) label Jun 28, 2018
@mattklein123 mattklein123 added the help wanted (Needs help!) label Jun 28, 2018
@stale stale bot removed the stale (stalebot believes this issue/PR has not been touched recently) label Jun 28, 2018
@htuch htuch mentioned this issue Jul 4, 2018

wjessop commented Aug 6, 2018

I've just hit this apparent limitation of Envoy. We need to serve a custom 503 page during maintenance, but it seems like the only way to do this simply is to serve it from the web servers, and to special-case the Envoy health checks so they don't 503 while the maintenance page is up (otherwise the web servers would be removed from rotation, causing Envoy to serve its generic single-line error message).

@mattklein123 mattklein123 modified the milestones: 1.8.0, 1.9.0 Sep 21, 2018
rshriram pushed a commit to rshriram/envoy that referenced this issue Oct 30, 2018
Automatic merge from submit-queue.

circleci: update bazel to 0.11, install clang

**What this PR does / why we need it**:

envoyproxy#1176 needs bazel 0.11 (0.10.1 or later)
envoyproxy#1124 needs clang

@qiannawang (Contributor)

@mattklein123 is there a timeline for adding this feature? Being able to override the response of a routed request has become a higher priority for the product my team is working on.

@mattklein123 (Member)

@qiannawang can you sync up with @junr03 next week? I think his work in this area got de-prioritized but I'm not sure. If so, perhaps you can pick this up.

@qiannawang (Contributor)

Hmm, it looks like we can call StreamDecoderFilterCallbacks::sendLocalReply to override the entire HTTP response from the upstream endpoint. At least, my local experiment shows that this override works.

It is somewhat unexpected to me to call the decoder callbacks in the response path, e.g. from encodeHeaders. WDYT, @mattklein123? Would the decoder callbacks be destructed once the request is routed (or forwarded) to the upstream endpoint?

@mattklein123 (Member)

@qiannawang can you describe a bit more about what your exact use case is? I'm a little confused. (It might be true that sendLocalReply() can override a response, but that is not what this issue is tracking, which is to override the responses that Envoy sends for locally originated replies such as 503, 404, etc.)

@qiannawang (Contributor)

We use plugged-in filters in the order X, then Y. The envoy.router filter then forwards the request to the upstream endpoint, which might respond with, for example, a 200 or a 500.

In our case, we would like encoder filter X to override the response with a 503, regardless of the response returned by other filters or by the upstream endpoint. It seems that StreamDecoderFilterCallbacks::sendLocalReply does achieve this. I am wondering whether it is reasonable to invoke the decoder filter callbacks in the encoder path.

@mattklein123 (Member)

@qiannawang you should be able to do what you are looking for with the existing filter interface, e.g. turning a response into headers only, adding trailers, changing/removing the body, etc. Can you describe what you can't do? It's possible that sendLocalReply() "works", but that's likely accidental for your use case. As I said already, this issue tracks a different feature request, which is to allow modifying the responses that Envoy sends itself. If you have further questions, can you please open a new issue?

@euroelessar (Contributor)

Is it fair at this point to assume that all replies generated by Envoy itself go through StreamDecoderFilterCallbacks::sendLocalReply?
That would make it a reasonable injection point for overriding the output (status code, headers, body) based on configuration.

@mattklein123 (Member)

@euroelessar yes, agreed. See also the convo in #7537


bmgoau commented Sep 6, 2019

I know this won't suit a lot of the use cases mentioned in this thread, but at least for the complete failure of a set of endpoints being actively health checked by Envoy, we did the following:

1. Set up health checks on the endpoints for a given cluster:

   ```yaml
   - interval: 5s
     no_traffic_interval: 45s
     timeout: 5s
     unhealthy_threshold: 3
     healthy_threshold: 3
     reuse_connection: yes
     http_health_check:
       path: /healthcheck
   ```

2. Add a fallback endpoint to the cluster at the lowest preference (highest priority value) that points back at Envoy itself on a dedicated listener:

   ```yaml
   - lb_endpoints:
     - endpoint:
         address:
           socket_address:
             address: host.docker.internal
             port_value: 9001
     priority: 128
   ```

3. Point that endpoint at a listener that only serves the custom error response page/content with a 200 OK (so that the health check for this endpoint succeeds), no matter what the path or HTTP method is:

   ```yaml
   listeners:
   - address:
       socket_address:
         address: 0.0.0.0
         port_value: 9001
     filter_chains:
     - filters:
       - name: envoy.http_connection_manager
         config:
           codec_type: auto
           stat_prefix: ingress_http
           route_config: {}
           http_filters:
           - name: envoy.lua
             config:
               inline_code: |
                 local failurecontent = require("lib.envoy.lua.failurecontent")
                 function envoy_on_request(request_handle)
                   request_handle:respond(
                     {[":status"] = "200",
                      ["envoy-fallback"] = "true"},
                     failurecontent.htmlcontent())
                 end
           - name: envoy.gzip
           - name: envoy.router
   ```

4. Finally, on your main listener, add a bit of Lua that catches the envoy-fallback response header and turns the response into a 500 for clients:

   ```yaml
   - address:
       socket_address:
         address: 0.0.0.0
         port_value: 80
     filter_chains:
     - filters:
       - name: envoy.http_connection_manager
         config:
           codec_type: auto
           route_config:
             name: local_route
             virtual_hosts:
             - name: backend
               domains:
               - "*"
               routes:
               - match:
                   prefix: "/service/1"
                 route:
                   cluster: service1
               - match:
                   prefix: "/service/2"
                 route:
                   cluster: service2
           http_filters:
           - name: envoy.lua
             config:
               inline_code: |
                 function envoy_on_response(response_handle)
                   if response_handle:headers():get("envoy-fallback") == "true" then
                     response_handle:headers():replace(":status", "500")
                     response_handle:headers():remove("envoy-fallback")
                   end
                 end
           - name: envoy.gzip
           - name: envoy.router
   ```

It's messy but it works well.

@daninthewoods

Please deliver this as soon as possible. We want to move from haproxy to Envoy for many reasons, but the lack of custom error pages (even without per-content-type variants) is preventing us from moving forward.

@zimmertr

This would be relevant for my team as well. Thanks for the work on Envoy everyone!


cudneys commented Feb 19, 2020

This is a big problem for us as well.

@mergeconflict mergeconflict removed their assignment Feb 19, 2020
@junr03 junr03 removed their assignment Feb 19, 2020
@mattklein123 mattklein123 modified the milestones: 1.14.0, 1.15.0 Mar 10, 2020
@mattklein123 mattklein123 self-assigned this Apr 1, 2020

vemod commented Apr 16, 2020

We have another use case for this, which is blocking us from going with Envoy as our public API gateway:

Our services respond with JSON for both error and success responses. For error responses we use Zalando's problem format, so when a user is not authorized to access a resource, they get a JSON response with HTTP code 401 from the backend service.
We could simply proxy that response through; that's fine. But we want to use Envoy's JWT validation feature so that invalid requests don't reach the backend services at all, and in that case the response from Envoy does not match the expected 401 response the backend service would have served. It would be perfect if we could set a static error response that matches our backend service responses.

Ideally this would respect the Accept header, so that both XML and JSON responses are possible; otherwise a request with Accept: application/xml gets JSON as the response.

Thanks for making Envoy the best proxy on the market!

@mattklein123 (Member)

Fixed by #11007. Please open new issues with specific requests.
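For reference, #11007 added local reply modification on the HTTP connection manager. Below is a minimal sketch, assuming the local_reply_config API as documented for recent Envoy versions and reusing the JSON shape from the original request; the runtime key name is arbitrary, and the exact field set should be verified against the docs for your Envoy version:

```yaml
# Sketch of local reply modification (added by envoyproxy/envoy#11007).
# Verify field names against the HTTP connection manager docs for your Envoy version.
local_reply_config:
  mappers:
  - filter:
      status_code_filter:            # match locally generated 503s
        comparison:
          op: EQ
          value:
            default_value: 503
            runtime_key: local_reply.override_code  # arbitrary runtime key name
    # Flat keys shown here; the nested "error" wrapper from the original request
    # may require a version with nested json_format support.
    body_format_override:
      json_format:
        type: "service_error"
        status_code: "%RESPONSE_CODE%"
        message: "%LOCAL_REPLY_BODY%"
```

The same mapper mechanism should also be able to match other locally generated codes (for example the 401s produced by the JWT filter mentioned above) and rewrite them into a backend-compatible body.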

@prakasa-tkpd

Those PRs are very awesome, but I want to ask something. In nginx, we can route to another service when the main service/upstream is down, adding any additional data we need.

After checking the documentation here, I did not find any option to route traffic to another service on error with additional data. Or am I missing something in some other place?

jpsim pushed a commit that referenced this issue Nov 28, 2022
Description: Also includes iOS wiring for on-error invocation path.
Risk Level: Moderate
Testing: Integration and unit tests, CI and local

Signed-off-by: Mike Schore <[email protected]>
Signed-off-by: JP Simard <[email protected]>
jpsim pushed a commit that referenced this issue Nov 29, 2022
Description: Also includes iOS wiring for on-error invocation path.
Risk Level: Moderate
Testing: Integration and unit tests, CI and local

Signed-off-by: Mike Schore <[email protected]>
Signed-off-by: JP Simard <[email protected]>