Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ndt5: emulate Web100 variables required by OONI's UI #161

Open
bassosimone opened this issue Aug 2, 2019 · 11 comments
Open

ndt5: emulate Web100 variables required by OONI's UI #161

bassosimone opened this issue Aug 2, 2019 · 11 comments

Comments

@bassosimone
Copy link
Contributor

bassosimone commented Aug 2, 2019

I noticed that ndt5 only sets MinRTT and zeroes all other variables. There is a bunch of variables that we use in OONI. This issue tries to document what we'll be missing in the new platform and suggest ways to compute these values. The discussion is from the point of view of the variables that need to be computed (and possibly added) to the ndt5 code. All the variables that are listed in this issue cause users to see incorrect values. See #163 for the less critical variables.

Implemented

  1. MinRTT: already computed but I suggest to check the tcpi_min_rtt field of tcp_info (not mapped in the Go code ndt5 is using but definitely in the kernels). I suggest to use that tcp_info field rather than the minimum value found when sampling the RTT.

Figured out 🤞

  1. SumRTT and CountRTT: MK/OONI uses them to compute the average RTT. I suggest to fill these two fields as part of the loop currently used to compute MinRTT.

  2. MaxRTT: I suggest to add the MaxRTT field (it's not one of the MUST-have fields) and to compute it using the loop currently used to compute the MinRTT.

  3. PktsOut: this can be computed from the tcpi_segs_out field [make sure this does not also count the pure ACKs]

  4. CurMSS: use tcpi_snd_mss field of tcp_info

Pessimistic, no clue

  1. CongestionSignals: I time boxed myself and could not find a reliable way of estimating the value of this field from the data in tcp_info. Also, I am pessimistic about this. [help needed]

  2. AckPktsIn: like CongestionSignals (but less pessimistic) [help needed]

  3. DupAcksIn: like *CongestionSignals (less pessimistic) [help needed]

  4. Timeouts: like *CongestionSignals (pessimistic) [help needed]


Discovered by me and discussed with @stephen-soltesz. Added as assignees also @mattmathis and @gfr10598 hoping that they may correct me, if wrong, and help where I am stuck.

@bassosimone bassosimone changed the title ndt5: emulate more Web100 variables ndt5: emulate more Web100 variables (step 1) Aug 2, 2019
@bassosimone bassosimone changed the title ndt5: emulate more Web100 variables (step 1) ndt5: emulate Web100 variables required by OONI's UI Aug 2, 2019
@bassosimone bassosimone added the P1 label Aug 2, 2019
@gfr10598
Copy link
Contributor

gfr10598 commented Aug 5, 2019

Have you also looked at the tcpinfo variables to see whether they could be used? Is ndt5 getting a tcpinfo snapshot already?

Ah - I see that you are looking at the tcpinfo.

We should also work toward reducing redundancy between experiments and tcpinfo, but probably won't be able to do that until late Q3.

@stephen-soltesz
Copy link
Contributor

related to #159

@pboothe
Copy link
Contributor

pboothe commented Sep 9, 2019

@critzo Please put the metrics CIRA cares about here too

@stephen-soltesz
Copy link
Contributor

#163 (comment)

Per the email thread with CIRA, they needs the following metrics (and whatever fields make up the metrics):

  • Jitter (if possible - @mattmathis needs to clarify)
  • Packet loss
  • Upload speed
  • Download speed

@critzo
Copy link

critzo commented Oct 3, 2019

Additionally, here is an updated message from CIRA. In summary, the metrics @stephen-soltesz posted in the previous comment should be the only ones CIRA needs, and if available they may also show TCP_INFO variables as a replacement for web100 ones in the current client.


CIRA SAYS:

If we don’t map (or recreate) web100 values for the NDT test from TCP_INFO, then let’s not show them.

For CIRA (I can’t say about other NDT users), we are not wedded to the NDT web100 variables. We show them. If we instead showed TCP_INFO variables, that works equally well for us.

For CIRA, our main concern is Speed, Latency and Jitter. Our secondary concern is to have the most accurate test for both very low and relatively high speeds. Ie satellite. Our tertiary concern are other TCP issues (ie bufferbloat, etc). If the new NDT 7 would help on any of those. We are also interested.

Our current problem is we have nobody working on IPT at the moment. So other than me trying to keep an eye on things. We have not made a commitment to make a lot of changes to the IPT. But I want to make sure that the new versions of NDT will still meet our needs, so that when I can get a few resources I can make the changes and things are hunky dory._

@pboothe
Copy link
Contributor

pboothe commented Oct 4, 2019

The web100 variables the new server sends are required to be sent by the server as part of the NDT protocol. That said, many of them are no longer applicable in a BBR world. This is a protocol bug. CIRA's client is encouraged to not show them, but the server is not permitted to not send them.

Speed they have (and have always had), latency they have (in the form of MinRTT). "Jitter" is a term they have to define for us - like I have said many times before: "jitter" is an ambiguous term.

The new version of NDT will meet their needs much better than the old one (in terms of improved measurement accuracy, better metrics, reduced error rates, increased ease of integration, and more). They will have to do development work to use it. It sounds like they don't actually need all the old Web100 metrics filled in, which is good.

@pboothe
Copy link
Contributor

pboothe commented Oct 4, 2019

Packet loss is desired by lots of people. I still need to figure out how and whether it is sent.

@pboothe
Copy link
Contributor

pboothe commented Oct 4, 2019

In summary: it sounds like CIRA is flexible enough that I am not longer worried about NDT5 and NDT7 meeting their needs as long as it meets the needs of others.

@bassosimone
Copy link
Contributor Author

Packet loss is desired by lots of people.

For ndt7, I was leaning towards telling people about %-age of bytes retransmitted from TCPInfo. It is not the packet loss rate, but it is something probably close enough. What do you think?

@pboothe
Copy link
Contributor

pboothe commented Oct 21, 2019

Because NDT pretty much only ever sends full packets, the percentage of bytes retransmitted should closely match the percentage of packets retransmitted. This sounds good to me.

@bassosimone
Copy link
Contributor Author

As mentioned in another issue, OONI will switch to ndt7 soon, therefore it is becoming less relevant to us to focus on ndt5. I am again not sure about closing, maybe it's still relevant in other contexts?

@bassosimone bassosimone removed their assignment Feb 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants