-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: run network fault tests with proxy #9081
Conversation
@hexfusion @spzala Can you take a look when you have time? The plan is merge this after 3.3 release (early February), so not urgent. Thanks! |
@gyuho thanks and sure, in mid of understanding couple other issues related work but will be reviewing afterwards considering no urgency here. |
Also cc @jpbetz. Can you take a look? This is first step to make functional-tester deploy easier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for getting this done! A couple suggestions.
hack/scripts-dev/Makefile
Outdated
@@ -23,11 +23,32 @@ clean: | |||
rm -f ./clientv3/integration/127.0.0.1:* ./clientv3/integration/localhost:* | |||
rm -f ./clientv3/ordering/127.0.0.1:* ./clientv3/ordering/localhost:* | |||
|
|||
|
|||
|
|||
_GO_VERSION = 1.9.2 | |||
ifdef GO_VERSION | |||
_GO_VERSION = $(GO_VERSION) | |||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In make, just GO_VERSION?=1.9.2
should be all you need to have a configurable env var with a default. The ifdef
and _*
var appear unnecessary here. Same for a few other of the below vars.
hack/scripts-dev/Makefile
Outdated
build-docker-functional-tester: | ||
$(info GO_VERSION: $(_GO_VERSION)) | ||
$(info ETCD_VERSION: $(_ETCD_VERSION)) | ||
@cat ./Dockerfile-functional-tester | sed s/REPLACE_ME_GO_VERSION/$(_GO_VERSION)/ \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI, sed -i.bak 's|TEMPLATE_VAR|$(REPLACEMENT)|g' filename
also works and is slightly more compact. Not sure if we care.
tools/functional-tester/README.md
Outdated
Or | ||
|
||
```bash | ||
$ rm -rf `pwd`/agent-1 && mkdir -p `pwd`/agent-1 && docker run \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a bit of script-style code. Should we maintain it in .sh files somewhere instead of in the README and reference the scripts from the README? Same for below.
var httpPort int | ||
var verbose bool | ||
|
||
func main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care about TLS? Maybe future work?
tools/etcd-test-proxy/main.go
Outdated
flag.StringVar(&to, "to", "localhost:2379", "Address URL to forward.") | ||
flag.IntVar(&httpPort, "http-port", 2378, "Port to serve etcd-test-proxy API.") | ||
flag.BoolVar(&verbose, "verbose", false, "'true' to run proxy in verbose mode.") | ||
flag.Parse() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a short program description that is output with help
?
2e370f0
to
0915c00
Compare
@jpbetz All fixed. PTAL. Thanks! |
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
Signed-off-by: Gyuho Lee <[email protected]>
0915c00
to
c61b660
Compare
@gyuho tons of changes :) ..looks good to me but I am gonna try to test using the readme you have provided, if that's helpful. Thanks! |
@spzala Easiest way to run is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thanks!
@gyuho just fyi, I have run the test successfully in my env. Thanks!
|
Problem
Current functional-tester depends on Linux utilities:
iptables
,iproute2
, andtc
. Which complicates and slows down tester deployments (require root access, impossible to test locally, hard to containerize, etc.). Also we never test real network partitions in CIs. We should simulate common network conditions to simplify test workflow (ideally package everything in container image).Solution (what this PR does)
proxy/transport.Proxy
that simulates network faults, also useful forclientv3
integration network partition tests(Test clientv3 balancer under network partitions, other failures #8711) with the futureembed
package migration(use embed in integration package #8416).tools/etcd-test-proxy/README.md
).iptables
).Other Changes
When network faults are injected, we expect cluster disrupts after election timeouts. Currently there's no waiting before recovering injected failures e.g. add latency and remove right away; nothing was being tested.
failureDelay
wrapper was there but never used. This PR adds delays to trigger leader elections, and ensures cluster continue to operate under such circumstances.Future Work