Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making //test/integration:hds_integration_test less flaky #4149

Merged
merged 1 commit into from
Aug 15, 2018

Conversation

markatou
Copy link
Contributor

@markatou markatou commented Aug 14, 2018

In #4135, //test/integration:hds_integration_test failed 7/100 times. I changed it such that the upstreams are allowed to unexpectedly disconnect, and the timeouts are longer.

Testing:
I run the test locally, and it passed 1000/1000 times.

Risk Level:
Low

This work is for #1310.

Signed-off-by: Lilika Markatou <[email protected]>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing unexpected disconnects from HDS makes sense as I think we do with all fake xDS, however the increase timeouts still seem a bit scary. Is there any way to wait on things such that the timeouts aren't needed? Or should we make the timeouts impossibly large?

@htuch
Copy link
Member

htuch commented Aug 15, 2018

@mattklein123 we can go over the timeout robustness tomorrow, should we merge this to reduce test flakes meanwhile?

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. SGTM. Happy to see a follow up to potentially extend the timeouts or do something a bit less fragile.

@mattklein123 mattklein123 merged commit 6c08fb4 into envoyproxy:master Aug 15, 2018
@markatou markatou deleted the flaky branch August 15, 2018 14:08
htuch pushed a commit that referenced this pull request Aug 22, 2018
…obust (#4164)

Follow up on #4149.

I increased any unused timers to 100 seconds. Additionally, instead of waiting on a timer to go off, all but one tests wait on cluster stats to change.
I decreased the timer wait times, reducing the test runtime from 25 seconds to 5.

Exceptions:

While TCP health checking, if an endpoint responds with a wrong phrase, the cluster failure stats do not change. So, the one test that examines this case still depends on timers.
There are 2 tests that test hosts timing out. I set those timeouts to 0.1 seconds (vs 100). However, both tests still wait on the cluster stats to change.
Risk Level:
Low

Testing:
I run the test locally, and it passed 3000/3000 times.

This work is for #1310.

Signed-off-by: markatou <[email protected]>
rshriram pushed a commit to rshriram/envoy that referenced this pull request Oct 30, 2018
…#1938)

This is far from finished, but it reduces memory usage by ~10%.

Pulling the following changes from github.com/envoyproxy/envoy:

c1cc68d stats: refactoring MetricImpl without strings (envoyproxy#4190)
36809d8 fuzz: coverage profile generation instructions. (envoyproxy#4193)
ba40cc9 upstream: rebuild cluster when health check config is changed (envoyproxy#4075)
05c0d52 build: use clang-6.0. (envoyproxy#4168)
01f403e thrift_proxy: introduce header transport (envoyproxy#4082)
564d256 tcp: allow connection pool callers to store protocol state (envoyproxy#4131)
3e1d643 configs: match latest API changes (envoyproxy#4185)
bc6a10c Fix wrong mock function name. (envoyproxy#4187)
e994c1c Bump yaml-cpp so it builds with Visual Studio 15.8 (envoyproxy#4182)
3d1325e Converting envoy configs to V2 (envoyproxy#2957)
8d0680f Add timestamp to HealthCheckEvent definition (envoyproxy#4119)
497efb9 server: handle non-EnvoyExceptions safely if thrown in constructor. (envoyproxy#4173)
6d6fafd config: strengthen validation for gRPC config sources. (envoyproxy#4171)
132302c fuzz: reduce log level when running under fuzz engine. (envoyproxy#4161)
7c04ac2 test: fix V6EmptyOptions in coverage with IPv6 enable (envoyproxy#4155)
1b2219b ci: remove deprecated bazel --batch option (envoyproxy#4166)
2db6a4c ci: update clang to version 6.0 in the Ubuntu build image. (envoyproxy#4157)
71152b7 ratelimit: Add ratelimit custom response headers (envoyproxy#4015)
3062874 ssl: make Ssl::Connection const everywhere (envoyproxy#4179)
706e262 Handle validation of non expiring tokens in jwt_authn filter (envoyproxy#4007)
f06e958 fuzz: tag trivial fuzzers with no_fuzz. (envoyproxy#4156)
27fb1d3 thrift_proxy: add service name matching to router implementation (envoyproxy#4130)
8c189a5 Make over provisioning factor configurable (envoyproxy#4003)
6c08fb4 Making test less flaky (envoyproxy#4149)
592775b fuzz: bare bones HCM fuzzer. (envoyproxy#4118)

For istio/istio#7912.

Signed-off-by: Piotr Sikora <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants