Set / document default envoy circuit_breakers #1574

Open
devinrsmith opened this issue Nov 15, 2021 · 1 comment

Comments

devinrsmith (Member) commented Nov 15, 2021

Old title, TLDR: Leaking server-side StreamObservers leads to "overflow" failures

While testing #1440 with a work-in-progress Java client, I noticed that I was not getting server-side responses to my input table requests. It turns out there is a server-side bug where the StreamObserver is never completed (to be fixed in #1565), so the server effectively "leaks" StreamObservers.

To keep testing my Java client without waiting for the server-side fix, I started ignoring the response from the server.

Things seemed to work just fine until I noticed that only the first 1024 requests "worked" (I was able to see the input table changes in the UI); later requests had no effect. I was able to reproduce this 100% of the time by restarting the client.

I then instrumented my client to not ignore errors, and got back:

UNAVAILABLE: upstream connect error or disconnect/reset before headers. reset reason: overflow

after 1024 requests.

Either of the following "fixed" the issue:

  • Fixing the server side leak
  • Bypassing envoy, and connecting directly to grpc-api server

When running multiple clients at the same time, the overflow was reached once the total number of leaked observers across all clients hit 1024, which suggests that this is a global envoy setting rather than a per-client limit.

@niloc132 researched, and found https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/circuit_breaking:

Circuit breakers are enabled by default and have modest default values, e.g. 1024 connections per cluster. To disable circuit breakers, set the thresholds to the highest allowed values.

Given our push-based subscription model, this likely means that, by default, our envoy configuration would start failing once a server had 1024 subscriptions, and likely sooner due to other concurrent requests.
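
To make the failure mode concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not Envoy's implementation: `ClusterBreaker`, `Overflow`, and `send_requests` are hypothetical names, and it assumes that every call whose StreamObserver is never completed keeps holding one slot against a single per-cluster limit of 1024 (the default quoted above).

```python
from dataclasses import dataclass

# Default threshold quoted from the Envoy circuit breaking docs.
DEFAULT_LIMIT = 1024


class Overflow(Exception):
    """Stands in for the 'reset reason: overflow' error the client saw."""


@dataclass
class ClusterBreaker:
    """Toy model of one cluster-wide counter (hypothetical, not Envoy code)."""

    limit: int = DEFAULT_LIMIT
    active: int = 0

    def open_stream(self) -> None:
        # Reject new work once the number of in-flight streams hits the limit.
        if self.active >= self.limit:
            raise Overflow("overflow")
        self.active += 1

    def complete_stream(self) -> None:
        # Only happens if the server completes its StreamObserver.
        self.active -= 1


def send_requests(breaker: ClusterBreaker, count: int, server_completes: bool) -> int:
    """Send `count` requests through the breaker; return how many were admitted."""
    admitted = 0
    for _ in range(count):
        try:
            breaker.open_stream()
        except Overflow:
            continue  # request rejected by the breaker
        admitted += 1
        if server_completes:
            breaker.complete_stream()
    return admitted


# Leaking server: slots are never released, so only the first 1024 get through.
leaky = ClusterBreaker()
leaked_admitted = send_requests(leaky, 2000, server_completes=False)

# Fixed server: every stream completes, so the limit is never reached.
fixed = ClusterBreaker()
fixed_admitted = send_requests(fixed, 2000, server_completes=True)

# Two clients share the same cluster counter, so the limit applies to their sum.
shared = ClusterBreaker()
client_a = send_requests(shared, 600, server_completes=False)
client_b = send_requests(shared, 600, server_completes=False)
```

Under these assumptions the leaky case admits exactly 1024 requests, the fixed case admits all 2000, and the two clients sharing one counter are cut off once their combined total reaches 1024. Long-lived subscriptions would hold slots the same way leaked observers do, which is why the default limit matters even after the leak is fixed.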

This patch:

index 0cc44fa2c..383e0ca53 100644
--- a/envoy/contents/envoy.yaml
+++ b/envoy/contents/envoy.yaml
@@ -88,6 +88,11 @@ static_resources:
                     socket_address:
                       address: grpc-api # here we assume the name of the websocket proxy
                       port_value: 8080
+      circuit_breakers:
+        thresholds:
+          max_connections: 1000000
+          max_pending_requests: 1000000
+          max_requests: 1000000
     - name: web
       connect_timeout: 10s
       type: LOGICAL_DNS

pushed that limit up to 1 million.

We'll likely want some documentation on the proper envoy circuit breaker limit depending on deployment needs.

@devinrsmith devinrsmith added bug Something isn't working triage labels Nov 15, 2021
@devinrsmith devinrsmith added this to the Backlog milestone Nov 15, 2021
@devinrsmith devinrsmith changed the title Leaking server-side StreamObservers leads to "overflow" failures Document envoy circuit_breakers Nov 15, 2021
@devinrsmith devinrsmith changed the title Document envoy circuit_breakers Set / document default envoy circuit_breakers Nov 15, 2021
@devinrsmith (Member, Author)

Proposal:

mofojed pushed a commit that referenced this issue Oct 25, 2023
Release notes https://github.com/deephaven/web-client-ui/releases/tag/v0.51.0

# [0.51.0](deephaven/web-client-ui@v0.50.0...v0.51.0) (2023-10-24)


### Bug Fixes

* Adjusted Monaco "white" colors ([#1594](deephaven/web-client-ui#1594)) ([c736708](deephaven/web-client-ui@c736708)), closes [#1592](deephaven/web-client-ui#1592)
* cap width of columns with long names ([#1574](deephaven/web-client-ui#1574)) ([876a6ac](deephaven/web-client-ui@876a6ac)), closes [#1276](deephaven/web-client-ui#1276)
* Enabled pointer capabilities for Firefox in Playwright ([#1589](deephaven/web-client-ui#1589)) ([f440a38](deephaven/web-client-ui@f440a38)), closes [#1588](deephaven/web-client-ui#1588)
* Remove @deephaven/app-utils from @deephaven/dashboard-core-plugins dependency list ([#1596](deephaven/web-client-ui#1596)) ([7b59763](deephaven/web-client-ui@7b59763)), closes [#1593](deephaven/web-client-ui#1593)
* Tab in console input triggers autocomplete instead of indent ([#1591](deephaven/web-client-ui#1591)) ([fbe1e70](deephaven/web-client-ui@fbe1e70))


### Features

* Theming - Spectrum Provider ([#1582](deephaven/web-client-ui#1582)) ([a4013c0](deephaven/web-client-ui@a4013c0)), closes [#1543](deephaven/web-client-ui#1543)
* Theming Iris Grid ([#1568](deephaven/web-client-ui#1568)) ([ed8f4b7](deephaven/web-client-ui@ed8f4b7))
* web-client-ui changes required for deephaven.ui ([#1567](deephaven/web-client-ui#1567)) ([94ab25c](deephaven/web-client-ui@94ab25c))
* Widget plugins ([#1564](deephaven/web-client-ui#1564)) ([94cc82c](deephaven/web-client-ui@94cc82c)), closes [#1455](deephaven/web-client-ui#1455) [#1167](deephaven/web-client-ui#1167)


### BREAKING CHANGES

- `usePlugins` and `PluginsContext` were moved from
`@deephaven/app-utils` to `@deephaven/plugin`.
- `useLoadTablePlugin` was moved from `@deephaven/app-utils` to
`@deephaven/dashboard-core-plugins`.
- `useConnection` and `ConnectionContext` were moved from
`@deephaven/app-utils` to `@deephaven/jsapi-components`.
- `DeephavenPluginModuleMap` was removed from `@deephaven/redux`. Use
`PluginModuleMap` from `@deephaven/plugin` instead.
- Enterprise will need ThemeProvider for the CSS variables to be available

Co-authored-by: deephaven-internal <[email protected]>