Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TCP out of order or duplicate segments sampler via BPF #255

Merged
merged 7 commits into from
Nov 30, 2021

Conversation

rittme
Copy link
Contributor

@rittme rittme commented Nov 22, 2021

Problem

Add two new TCP probes:

  • A probe to measure duplicate TCP segments. This allow us to detect ACK are not being received.
  • A probe to measure out of order TCP segments. Out of order segments causes retransmissions and can affect latency.

Solution

Added two BPF kernel probes to count du and RTO events.

Result

Two new metrics introduced to TCP sampler:

  • tcp/receive/duplicate
  • tcp/receive/out_of_order

binary_path: None,
sub_system: None,
};
let tcp_ooo_probe = Probe {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this probe is identical as the above one. We can just introduce a probe called

tcp_validate_incoming_probe and attach it to both telemetries

Self::Duplicate => vec![tcp_validate_incoming_probe],
Self::OutOfOrder => vec![tcp_validate_incoming_probe],

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I was wondering if I could do that :)


// Segment sequence before the expected one
// which means this was a duplicated segment
if ((end_seq - tp->rcv_wup) < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the starting sequence number TCP_SKB_CB(skb)->seq and the expected sequence number is tp->rcv_nxt is more straightforward to compare

if TCP_SKB_CB(skb)->seq is smaller than tp->rcv_nxt, it's confirmed a duplicate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'rcv_wup' records the rcv_nxt on last window update sent, comparing it to the end_seq of this segment does not seem correct to me.

Copy link
Contributor Author

@rittme rittme Nov 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm basing these on this function that is used to validate the segment sequence on the kernel:
https://elixir.bootlin.com/linux/v4.2/source/net/ipv4/tcp_input.c#L3902

From the comments it states:

Also, controls (RST is main one) are accepted using RCV.WUP instead 
of RCV.NXT. Peer still did not advance his SND.UNA when we 
delayed ACK, so that hisSND.UNA<=ourRCV.WUP.

So to validate it seems better if we use RCV.WUP. But maybe for probing duplicates we don't care about this and should use RCV.NXT anyways?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what

!before(end_seq, tp->rcv_wup) &&
 !after(seq, tp->rcv_nxt + tcp_receive_window(tp))

checks eventually is that whether the sender's segment [seq, end_seq] has any overlays with the expected window at all.
The first check finds out if the end_seq is even before the previous rcv_nxt sent, the second check finds out if the start seq is even after the largest possible seq (which is rcv_nxt + window size).

For us we simply wants to check if the segment is a duplicate, and checking end_seq - tp->rcv_wup can't help, for example, when (end_seq - tp->rcv_wup) <0, we can confirm that it's a duplicate, but when (end_seq - tp->rcv_wup) > 0, it can still be a duplicate, as long as seq < tp->rcv_nxt.

So basically to check duplicates, we should be looking seq and tp->rct_nxt. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this makes sense. I made the changes to use RCV.NXT instead.

// Segment sequence after the expected receive window
// which means this segment was received out of order
u32 window_end = tp->rcv_nxt + tcp_receive_window(tp);
if ((window_end - seq) < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, I think we can simply check whether if seq > tp->rcv_nxt here to see if the segment arrived out of order.
window_end < seq only tell that the current segment is totally out of the window, but when there's an overlay, it can't capture.

@brayniac
Copy link
Contributor

brayniac commented Nov 24, 2021

This LGTM from a Rust perspective. I think it can be marked as ready for review at this point?

@WUMUXIAN - please give a shipit for the BCC portion when you're happy with that and I'll give it another look before merging.

Copy link
Contributor

@WUMUXIAN WUMUXIAN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, I think we can upgrade from draft status and get it merged.

@rittme rittme marked this pull request as ready for review November 25, 2021 02:19
@rittme
Copy link
Contributor Author

rittme commented Nov 30, 2021

@brayniac feel free to merge and close this PR when possible. Thanks!

@brayniac brayniac merged commit cb92b00 into twitter:master Nov 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants