-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gnrc_sock_tcp: add gnrc sock tcp #16494
gnrc_sock_tcp: add gnrc sock tcp #16494
Conversation
a732ede
to
4e8063c
Compare
4e8063c
to
43d57ab
Compare
Now this also needs a rebase 😉 You want to fix the last commit message while you are at it:
|
646bb6b
to
134180c
Compare
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! If you use the existing types intended for implementing this you should be able to get rid of this casting show 😉
b1d0cef
to
8f62149
Compare
ba1588b
to
fe77e03
Compare
tests/gnrc_sock_tcp/main.c
Outdated
if (err) { | ||
printf("%s: returns %s\n", name, strerror(err)); | ||
} else { | ||
printf("%s: returns %d\n", name, err); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this. strerror(0)
returns "Success"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not on every platform ;). And the python code calling the functions matches for the specific result. Thats why I handle 0 manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW.: You added an implementation returning OK on 0, a few hours ago. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arg sorry for that - feel free to change to return value to "Success" so you can get rid of this adapter function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to keep it as it is. I am not sure if the returned string is always "Success" on all platforms. For any error code the return value should always be textual representation of the enum value but for 0 different strings are possible. Since this is test code I think keeping the tiny function is okay.
dump_args(argc, argv); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dump_args(argc, argv); | |
if (argc < 3) { | |
printf("usage: %s <host> <port>\n", argv[0]); | |
return -1; | |
} |
let's be a bit more user friendly. Right now if you just call gnrc_tcp_open
expecting to get usage information, the test will simply crash on you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also when I try
> gnrc_tcp_open fe80::90a7:a6ff:fe4b:2e32 12345
gnrc_tcp_open: argc=3, argv[0] = gnrc_tcp_open, argv[1] = fe80::90a7:a6ff:fe4b:2e32, argv[2] = 12345
gnrc_tcp_open: returns -EAFNOSUPPORT
> gnrc_tcp_open fe80::90a7:a6ff:fe4b:2e32%5 12345
gnrc_tcp_open: argc=3, argv[0] = gnrc_tcp_open, argv[1] = fe80::90a7:a6ff:fe4b:2e32%5, argv[2] = 12345
gnrc_tcp_open: returns -EAFNOSUPPORT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if user-friendlyness should be is the design goal for a test API only called from python code.
For the -EAFNOSUPPORT thing I need to take a deeper look. This test should only expose sock_connect. Did you run this agains the gnrc_tcp tests instead of the gnrc_sock_tests that are the subject of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry I ran tests/gnrc_tcp
instead of tests/gnrc_sock_tcp
😅
I also figured out the format
> sock_tcp_connect [fe80::90a7:a6ff:fe4b:2e32]:12345 6666
sock_tcp_connect: argc=3, argv[0] = sock_tcp_connect, argv[1] = [fe80::90a7:a6ff:fe4b:2e32]:12345, argv[2] = 6666
sock_tcp_connect: returns Unknown error -110
I wanted to run this against netcat since I expected it to behave very much like netcat, just with more manual steps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I have to specify the interface as well
> sock_tcp_connect [fe80::90a7:a6ff:fe4b:2e32%5]:12345 7412
sock_tcp_connect: argc=3, argv[0] = sock_tcp_connect, argv[1] = [fe80::90a7:a6ff:fe4b:2e32%5]:12345, argv[2] = 7412
sock_tcp_connect: returns 0
Have you considered using netutils_get_ipv6()
in gnrc_tcp_ep_from_str()
😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm since we use strerror, most of the used errorcodes seem to be Unknown :(
I have not considered netutils_get_ipv6(), but I would argue that this is not in scope of this PR. Cleaning this up should happen as a seperate change. This PR is for the interface between gnrc and sock, not changes in gnrc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm since we use strerror, most of the used errorcodes seem to be Unknown :(
That's because strerror
expects positive error codes
With this
--- a/tests/gnrc_sock_tcp/main.c
+++ b/tests/gnrc_sock_tcp/main.c
@@ -36,7 +36,7 @@ void dump_args(int argc, char **argv)
void print_result(const char *name, int err)
{
if (err) {
- printf("%s: returns %s\n", name, strerror(err));
+ printf("%s: returns %s\n", name, strerror(-err));
} else {
printf("%s: returns %d\n", name, err);
}
I get
sock_tcp_read: returns Resource temporarily unavailable
instead of
sock_tcp_read: returns Unknown error -11
I have not considered netutils_get_ipv6(), but I would argue that this is not in scope of this PR. Cleaning this up should happen as a seperate change.
I agree. But we still should make the CLI interface of the test a bit nicer (print usage instead of crashing if argument count does not match, maybe even use default timeout / local port if the argument was omitted).
There is value in being able to test out individual bits of the API in an exploitative way, it shouldn't be a frustrating experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix it
Yes it blocks. Internally it tries to close the connection. If the other side doesn't send anymore it still has to wait two minutes before it is unblocked. This has something todo with slow roundtrip times from the early internet. So if you try to open a new connection before sock_tcp_disconnect finished it must fail. The entire process is quite complicated to explain in a code review, if you wanna know details, the TCP RFC is your friend. For your test: just print after your disconnect and then call nc again. In addition: you can tweak all TCP Timouts or the number of preallocated buffers via kconfig |
The TCP sock API actually contains an example echo server. I would expect this to work with |
I haven't tried it but after the changes related to SOCK_NO_TIMEOUT, we should test it. |
please squash! (&rebase) |
b979807
to
6fba0d9
Compare
Done. I fixed the NO_TIMOUT mapping and the asserts as well. |
Hmm the echo server example still does not behave as one would expect :/ main.c
Makefile
|
I managed to figure out whats happening here. In short: The example is broken. Here are more details: We see here interesting behavior: On entering Ctrl+C netcat actually tries to teardown the connection normally by sending a FIN that gets an ACK in reply. Receiving a FIN before sending our own FIN puts a TCP connection into a state called CLOSED_WAIT. This state means basically, our peer has sent all its data and excepts from us to send a FIN in return (aka. call close). The zero bytes read returns mean here: You have read all data and from this connection no more data will come. This behavior is not a bug in gnrc_tcp. Infact there was a bug (see #12367) a while ago that this behavior was not implemented this way. If you change:
to
It works as intended. |
That's good news (or maybe bad news as we have a bug in the example code 😕) I moved the example code to a proper example (#16739) and noticed something else: The example client will not disconnect when the example server disconnects: client
server
Maybe there is a bug in the client code too? |
Not necessarily. Depending on the timing in connection teardown, disconnect can take up to two minutes for its return. |
Hm I thought this was only the case if the other side does not properly disconnect? |
Retransmissions of FIN Packets in cases where the ACK sent in response was lost on the way. A TCP connection is practically dead if after two times the maximum segment lifetime no packet was transmitted. Linux handles this this internally and return fast after close and allocates a new TCB for the next connection. RIOT can't do this since everything is statically allocated. Thats why close/disconnect has to wait until the used TCB is reusable again. If you want more details look for TIME-WAIT in the TCP RFC. TCP connection teardown is pretty complicated |
But I'm testing with I don't see this issue with |
Okay. Let me explain the Issue in detail: At some point later B sends a FIN to A. Now A knows that B wants to terminate its side of the connection.
At this Point A never knows if B received the ACK because in either case B would not answer immediately. So if the so called LAST WAIT timer times out and B didn't send anything within the last to minutes -> A knows that B received the ACK before and closes its connection. If A receives a FIN from B triggered by its re-transmit mechanism, during its waiting period, A sends a new ACK, resets the timer and waits again. That's why you observe this waiting period regardless of the underlying network stability. TCP must assume the most unstable underlying connection (TCP was invented around 1981, just assume the network quality and roundtrip times back then ;)). I don't know how LWIP is handling this (maybe they use some hack like sending reset packages on connection teardown, who knows). I thought about how I implement it differently but all solutions I came up with, would enable easily exploitable memory leakage in the packet buffer that's why I didn't try this. You could of course shorten the maximum segment lifetime constant by defines. The gnrc_tcp tests do this to speedup the connection teardown process. |
A possible solution might be to calculate a better fitting and much shorter maximum segment lifetime dynamically based on to the mean round-trip time used by the re-transmission mechanism. This could speed things up, but this is definitely non-standard behavior and out of scope of this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really happy yet as the explanation doesn't make the unexpected behavior go away or give a way for an application developer to mitigate that (other than always using timeouts).
However, this is indeed unrelated to the SOCK API adoption that now really just is a thin wrapper. So ACK!
@benpicco - I did some digging regarding that timeout. Current Linux and BSD Systems don't use the 2 Minute MSL anymore. They use one minute instead. We can to this too. This eases the pain a little bit. |
Interestingly enough this has been done already. The current MSL value is 30s. I think I'll take a lock into the dynamic MSL calculation thing. This might be a good but experimental optional solution. |
Contribution description
This PR adds gnrc_sock_tcp, finally adding a sock_tcp implementation based on gnrc_tcp.
Testing procedure
To test this PR a simplified version of the gnrc_tcp testsuite is supplied under tests/gnrc_sock_tcp.
To execute it, just follow the instructions there.
Issues/PRs references
Depends on PR #16459
Depends on PR #16461
Depends on PR #16493
Closes #16493 - TCP Sockets can not be used / built