Implement Connection Drop fault #154
Possible implementations

There seem to be multiple ways to implement this; perhaps the most prominent candidates are described below. (Some other solutions were also considered that wouldn't work well for this.)
Solution A (plain iptables)

This would be the easiest solution, as it only requires setting a netfilter rule. However, it is also the least flexible: we are limited to terminating connections using only the logic that iptables itself exposes.

Solution B (capture packets, forge RST)

This is what tools like tcpkill and tcpbutcher do. The first caveat is that there are some nuances related to packet injection with libpcap (which tcpbutcher uses, but tcpkill doesn't), which in some experiments caused RSTs to not be correctly sent to the local end of the connection, so only the remote end was terminated. I haven't researched this deeply and it might be a solvable issue.

The second caveat is that packet capture and forging occur concurrently with the normal connection flow. This matters because, for forged RSTs to work, their SEQ number needs to land in the current TCP window. If the application is processing a large amount of traffic in a short amount of time, it is possible for the window to move past the forged RST's SEQ before it gets sent, rendering it useless.
The strategy tcpkill uses here (https://github.com/ggreer/dsniff/blob/2598e49ab1272873e4ea71d9b3163ef7edcc40ea/tcpkill.c#L70-L71) is an improvement, but still not guaranteed to work.

Solution C (nfqueue)

The caveat above of injection being concurrent with the normal flow of packets can be removed by replacing packet capture with NFQUEUE: instead of asynchronously capturing traffic, we force netfilter to send every packet to us and wait for us to come to a decision. This way, we can guarantee that our packet is sent before the window moves. This, however, comes at the cost of performance: our userspace code can become a bottleneck, as every potential packet will need to flow through it. We will need to do some experimentation to assess how fast we can process packets.

Solution C1 would use libpcap for injecting the RST packet. Solution C2 would, instead, put a flag on the packet so a subsequent iptables rule can act on it.

Terminating connections

Independently of the technical solution, it might also be worth discussing how we want to model connection termination. Apart from some matching criteria (e.g. a given destination port), we will want to specify how many connections to kill.

Option 1: Pure random percentage

The simplest, non-fair approach would be for the user to specify a percentage of connections to be terminated. For each packet matching the criteria, we check whether a random number in [1, 100] is smaller than the percentage and, if it is, we terminate the connection. This would most likely not be a good solution, as connections with high traffic would have a higher chance of being terminated than connections with lower traffic.

Option 2: Percentage using 4-tuple hash

The simplest approach that is fair could be for the user to specify a percentage of connections to be terminated (e.g. 10%), and then compare it with the modulus of the 4-tuple hash for the connection: for each packet, we hash the connection's 4-tuple and terminate the connection if the hash modulo 100 falls below the percentage. By checking the 4-tuple, which is constant for a given connection, instead of a per-packet random number, we keep the same chance of termination regardless of throughput. The downside, however, is that a given connection would either be killed instantly or never killed. This does not map very well to real-world connection-dropping cases and, on top of that, over time it will converge to a set of connections that are naturally selected to never be killed.

Option 3: Percentage using 4-tuple hash and truncated time

To work around the issues of option 2, we can integrate a truncated timestamp into the 4-tuple hash. This way, for each packet, we compute the hash of the 4-tuple together with the current timestamp truncated to some resolution. This strategy keeps the decision constant for a given connection within a time window (so it remains independent of throughput), while re-rolling the dice for every connection each time the window rolls over.
To the user, this would be exposed simply as a percentage and a time (our resolution), and documented as "Terminate % of the connections every T seconds".
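For illustration, the Option 3 check could look roughly like the following Go sketch. Everything here (the function shape, the use of FNV-1a, how the window is applied) is an assumption made to make the idea concrete, not an existing implementation:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"time"
)

// shouldTerminate decides, for a packet belonging to the connection identified
// by the 4-tuple, whether that connection should be terminated. The decision is
// stable for a whole time window and is re-rolled when the window changes.
func shouldTerminate(srcIP, dstIP string, srcPort, dstPort uint16, percentage uint64, window time.Duration) bool {
	h := fnv.New64a()

	// The 4-tuple is constant for the lifetime of a connection, so every
	// packet of the same connection produces the same hash within a window.
	h.Write([]byte(srcIP))
	h.Write([]byte(dstIP))
	binary.Write(h, binary.BigEndian, srcPort)
	binary.Write(h, binary.BigEndian, dstPort)

	// The truncated timestamp changes once per window, re-rolling the dice
	// for every connection every `window`.
	binary.Write(h, binary.BigEndian, time.Now().Unix()/int64(window.Seconds()))

	return h.Sum64()%100 < percentage
}

func main() {
	// Example: terminate ~10% of the connections towards port 3306,
	// re-evaluating the decision every 10 seconds.
	fmt.Println(shouldTerminate("10.0.0.5", "10.0.0.9", 43210, 3306, 10, 10*time.Second))
}
```

Any stable hash would do here; the only property we rely on is that the result changes only when the connection or the truncated timestamp changes.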
Thanks for the detailed explanation of the alternatives and their tradeoffs. First of all, I think it is important to contextualize this discussion around the API we want to offer to developers. The way I see it, the fault injection API for dropping connections would be as shown below: drop a percentage of the connections towards a target port, for a duration specified in the injection call:

fault = {
port: <target port>,
rate: <percentage of connections to drop>
}
disruptor.injectDropConnectionFault(fault, '10s')

Based on this requirement and your analysis, I would lean toward exploring the use of NFQUEUE. As for the performance concerns, I have not investigated this in detail, but I think we can optimize this by using session marks: for example, instruct iptables to forward only non-marked packets to our program and, once we process a packet for a session (regardless of the decision), mark the session to prevent the forwarding of further packets. But definitely, we need to evaluate the overhead.

Regarding the mechanism used for deciding to terminate a connection, I think we should only consider two parameters: the source IP and port, as we are disrupting connections towards a fixed destination (IP and port).

I'm not sure about the difference between options 2 and 3 (considering the timestamp). I'm not sure we need to reconsider the decision periodically, because we are making this decision on each packet: if for one packet we decide not to drop the connection, we will reevaluate that decision for the next packet. Could you elaborate on the scenario where you think this periodic re-evaluation is needed?
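To make the session-mark idea a bit more concrete, here is a rough sketch of the kind of iptables ruleset it implies, issued from Go. The port, queue number and mark value are arbitrary placeholders, and whether NFQUEUE verdict marks combined with CONNMARK behave exactly as hoped would still need to be verified:

```go
package main

import (
	"log"
	"os/exec"
)

func main() {
	// Sketch only: (1) restore any connection mark onto incoming packets,
	// (2) queue still-unmarked packets to our userspace program, and
	// (3) save the mark set by the userspace verdict back onto the connection,
	// so later packets of the same session bypass the queue entirely.
	rules := [][]string{
		{"-t", "mangle", "-A", "PREROUTING", "-p", "tcp", "--dport", "3306",
			"-j", "CONNMARK", "--restore-mark"},
		{"-t", "mangle", "-A", "PREROUTING", "-p", "tcp", "--dport", "3306",
			"-m", "mark", "--mark", "0", "-j", "NFQUEUE", "--queue-num", "100"},
		{"-t", "mangle", "-A", "PREROUTING", "-p", "tcp", "--dport", "3306",
			"-m", "mark", "!", "--mark", "0", "-j", "CONNMARK", "--save-mark"},
	}

	for _, args := range rules {
		if out, err := exec.Command("iptables", args...).CombinedOutput(); err != nil {
			log.Fatalf("iptables %v failed: %v: %s", args, err, out)
		}
	}
}
```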
I agree, I think NFQUEUE + marking is the most interesting path to explore.
This is very interesting; I didn't know iptables could "keep track" of per-session marks. However, I think this would have the disadvantage mentioned in option 2 (more below).
We can do a dice roll per packet, but I'm not sure we should: if we do, high-throughput connections will get terminated way faster than low-throughput ones, simply because the former will roll the dice many more times than the latter. If we want to simulate a scenario where a server drops connections, I think we want a behavior that is not sensitive to throughput. Option 2 aims to solve that by making the result of the dice roll the same for a given connection (4-tuple), so it doesn't matter how many times you roll it. Option 3 improves on that by adding a timeframe to ensure that every N seconds, the dice are rolled again for each connection.
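To put rough illustrative numbers on that throughput sensitivity: with a per-packet termination probability of 0.1%, a connection that sends 1000 packets during the fault gets terminated with probability 1 − 0.999^1000 ≈ 63%, while one that sends 10 packets has only about a 1% chance, so the outcome is dominated by throughput rather than by the configured rate.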
I don't follow you here. The idea is to make the drop decision once per session and then stop intercepting packets for that session. This is particularly important for the sessions we decide not to drop, as we then stop adding overhead to them.
I agree, we should make the decision per session, not per packet. However, if I understood correctly, option 2 already fixes the problem of unfairness, so I still don't understand why we want to re-evaluate that decision. What concerns me is that we can end up dropping more sessions than requested: instead of dropping the requested percentage of sessions once, we would drop that percentage again on every re-evaluation.

What we could do is use the requested duration for truncating the timestamp, to ensure we don't drop more than the requested percentage over the duration of the fault injection, but I still need to simulate this scenario in my head. The main problem I see is that, for a given pool of sessions, it is very likely that only a small number of them are active, and we won't see any traffic for the rest of them. Therefore, it can happen that we don't drop as many sessions as requested just because we don't see them. As I said before, we probably should simulate the different options before making any decision.
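As a rough illustration of how that compounds (assuming, for example, a 10% drop rate re-evaluated every 10s during a 60s fault): a connection that stays open for the whole fault survives 6 independent rolls, so its survival probability is about 0.9^6 ≈ 0.53, meaning roughly 47% of long-lived connections would eventually be dropped instead of the nominal 10%.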
This is true, and might not be entirely expected. However, I think the alternative is not ideal either: let's consider a scenario where an application has a pool of 10 connections and the test defines a 20% connection drop. Statistically, 8 connections will be left untouched and 2 will be terminated. However, as the application re-opens those connections, the 2 new connections will again have a 20% chance of being terminated, so the most likely outcome is that both survive. We now have 10 healthy connections that will never get terminated, even if the duration of the test is several minutes. I'm starting to think both scenarios (dropping a share of the connections once, and dropping them recurrently) can be valid.
As the difference in implementation between these two proposals is very small (either adding or not adding a truncated timestamp to the hash), I suggest we start with the simplest one (not adding the truncated timestamp to the hash) and see what users think. Adding the second scenario or changing the behavior should be pretty easy.
Yes, I think this distinction is important, and also that both are potentially valid.
This is good, but I'm more concerned about the developer experience. Can we define them (and their differences) easily? Can we select one or the other via an option in the fault?
Which of the above two scenarios does this correspond to? The second one?
I think that with some documentation effort and careful wording we should be able to differentiate them. For example, we can describe the first scenario as:

const networkFault = {
    dropRate: 0.1,
};
As for the recurrent case, we could model it as:

const networkFault = {
    dropRate: 0.1,
    dropEvery: "10s",
};
This description is probably not perfect (in particular I do not like the …).
I would start with the first one, which I called above non-recurrent.
A common fault that affects applications is the drop of open connections to services such as databases. The drops may be caused by network issues or by saturation on the server side. In any case, the application should be prepared to handle these drops and reestablish the connections. This process can be particularly complex when connection pools are used, because the health of the available connections in the pool must be updated.
In Kubernetes, such services are deployed as Pods (for example, as a StatefulSet); therefore, this fault should be supported by the PodDisruptor.