functional-tests: Provide more advanced network failure injections #5614

Closed
chancez opened this issue Jun 9, 2016 · 6 comments

Comments

@chancez
Contributor

chancez commented Jun 9, 2016

Someone in the Kubernetes cluster-sig-ops group wrote the following set of scripts to do black-box fault injection against etcd: https://github.com/jdumars/etcdeath

I see we have https://github.com/coreos/etcd/blob/master/pkg/netutil/isolate_linux.go today, which covers total packet loss and latency, but etcdeath provides some interesting additional failures that would be good to add to our functional tests.

In particular the following:

packet corruption:

        sudo tc qdisc add dev eth0 root handle 1:1 netem corrupt 15%
        sudo tc qdisc add dev eth0 parent 1:1 handle 10:1 netem corrupt 15%

partial packet loss (instead of complete packet loss):

        sudo tc qdisc add dev eth0 root netem loss 15%

packet reordering:

        sudo tc qdisc add dev eth0 root handle 1:1 netem delay 10ms reorder 25% 50%
        sudo tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 10ms reorder 25% 50%

partitioning:

        sudo tc qdisc add dev eth0 root netem delay 30000ms 20ms distribution normal

Those are just the ones I see we currently don't have, but they seem to be genuinely good tests to run, so I'm just suggesting we add these as alternative ways to stress etcd.
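
As a rough illustration of how these faults could be parameterized in a test harness, here is a minimal sketch that only builds the `tc` command lines instead of running them, so it can be tried without root. The function names `netem_add_cmd` and `netem_clear_cmd` are hypothetical and are not part of etcd or etcdeath; the `tc qdisc add ... root netem ...` and `tc qdisc del ... root` forms are the standard ones used above.

```shell
#!/bin/sh
# Sketch only: print tc/netem commands for the faults listed above.
# netem_add_cmd and netem_clear_cmd are hypothetical helper names.

netem_add_cmd() {
    # $1 = interface; remaining args = netem options, e.g. "loss 15%"
    iface="$1"; shift
    printf '%s\n' "tc qdisc add dev $iface root netem $*"
}

netem_clear_cmd() {
    # Deleting the root qdisc restores the interface's default behavior.
    printf '%s\n' "tc qdisc del dev $1 root"
}

netem_add_cmd eth0 corrupt 15%
netem_add_cmd eth0 loss 15%
netem_add_cmd eth0 delay 10ms reorder 25% 50%
netem_clear_cmd eth0
```

To actually inject a fault, the printed command would need to be run with root privileges (for example via `sudo`), and the clear command run afterwards, since only one root qdisc can be attached to an interface at a time.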

@xiang90
Contributor

xiang90 commented Jun 9, 2016

@chancez For the ordering, is it actually testing the TCP+HTTP stack?

@heyitsanthony
Contributor

@xiang90 I think ordering makes sense among multiple connections. For example, the bridge could buffer data on each connection for a bit, then control the order that it drains the buffers into the receiver.
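
A toy sketch of that idea, assuming each connection's buffered data is held in its own file: the order within a connection is preserved, while the order across connections is chosen by whoever drains the buffers. The `drain_buffers` helper and the file layout are hypothetical and only illustrate the concept, not how etcd's bridge is implemented.

```shell
#!/bin/sh
# Hypothetical sketch: one buffer file per connection, drained in a chosen order.

drain_buffers() {
    # $1 = directory holding one buffer file per connection
    # remaining args = connection names, in the order to drain them
    dir="$1"; shift
    for conn in "$@"; do
        cat "$dir/$conn"
    done
}

buf=$(mktemp -d)
printf 'a1\na2\n' > "$buf/connA"   # connection A wrote first
printf 'b1\n' > "$buf/connB"       # connection B wrote later
drain_buffers "$buf" connB connA   # receiver sees B's data before A's
rm -rf "$buf"
```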

@xiang90
Contributor

xiang90 commented Jun 9, 2016

@heyitsanthony Oh. Sure. You are right.

@xiang90
Contributor

xiang90 commented Nov 3, 2016

/cc @fanminshi You might be interested in this one.

@fanminshi
Member

@xiang90 cool!

@xiang90 xiang90 added this to the v3.2.0 milestone Nov 9, 2016
fanminshi added a commit to fanminshi/etcd that referenced this issue Dec 1, 2016
add more network failures such as packet corruption, reordering, loss, and network partition.

resolve etcd-io#5614
@xiang90 xiang90 modified the milestones: v3.3.0, v3.2.0 Apr 5, 2017
@gyuho gyuho modified the milestones: v3.4.0, v3.3.0 Aug 14, 2017
@gyuho gyuho modified the milestones: etcd-v3.4, etcd-v3.5 Aug 5, 2019
@stale

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 6, 2020
@stale stale bot closed this as completed Apr 28, 2020
5 participants