Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding new TCP socket APIs #85

Merged

Conversation

ryan-summers
Copy link
Member

Fixes #52 by exposing a more robust API for TCP sockets.

I recently hit an issue with minimq where the is_connected() interface was determined to be insufficiently expressive to properly managing the TCP socket state.

@ryan-summers
Copy link
Member Author

ryan-summers commented May 10, 2023

CC @MathiasKoch as I think you're the other user of TCP traits? Also @chrysn

@chrysn
Copy link
Contributor

chrysn commented May 11, 2023

I'm having a hard time following the trouble in the MQTT implementation. What is the decision for which "is this socket in the CONNECTED state" does not suffice?

Do the "may send" / "may receive" / "is open" methods suffice to also cover similar cases, or might we wind up needing to express all the FSM states? Can we get them out of a POSIX socket? Can we even get the ones introduced here on a POSIX system? (I'm generally happy to say "then POSIX sockets are a bad API", but I don't want to do that without due consideration).

@MathiasKoch
Copy link
Collaborator

I have to say that personally I am currently in the process of migrating towards the async version, and i think that the general API of the async version is MUCH more rust-like, and should be implemented for the blocking version as well if you ask me.

@ryan-summers
Copy link
Member Author

I'm back with some findings after trying to use the embedded-nal-async approach - thanks for the advice you two :)

I have to say that personally I am currently in the process of migrating towards the async version, and i think that the general API of the async version is MUCH more rust-like, and should be implemented for the blocking version as well if you ask me.

The main issue that I'm running into is that a struct may not own a Tcp::Connection and the TcpStack simultaneously.

  • This happens because the Connection refers to the lifetime of TcpStack, which is equivalent to Self. Rust does not support self-referential structs

In my use case (MQTT), the MQTT client needs to close and open TCP sockets in response to adverse network events (i.e. link loss, etc.), which implies that it has to own the network stack (in order to establish new connections as required). Because of this, the Connection-based API used by the async version doesn't seem to be the right approach for more complex stack requirements.


I'm having a hard time following the trouble in the MQTT implementation. What is the decision for which "is this socket in the CONNECTED state" does not suffice?

The specific case is differntiating between "attempting to connect", "connected", and "no-longer-connected". All 3 of these cases have different required behaviors, but the is_connected() only allows us to detect two states.

  1. "attempting to connect" => We should keep waiting and not do anything until we are connected
  2. "connected" => We are connected to our peer and can actively rx and tx, send our packets now
  3. "no-longer-connected" => We need to close the TCP socket and open a new one

Note that the actions in (1) and (3) are polar opposites. In one case, we actively want to do nothing (i.e. wait for connection). In the other case, we need to close the socket and re-attempt connections.

Do the "may send" / "may receive" / "is open" methods suffice to also cover similar cases, or might we wind up needing to express all the FSM states?

Honestly, the more I think about it, the more it seems like maybe we should just expose the FSM directly.

Can we get them out of a POSIX socket? Can we even get the ones introduced here on a POSIX system? (I'm generally happy to say "then POSIX sockets are a bad API", but I don't want to do that without due consideration).

I think POSIX sockets expose this information primarily through returned error codes. If the TX pipe is closed when you try to send, it will error. Similarly with the RX pipe. Maybe this would be a better approach... Let me give it a shot.

@chrysn
Copy link
Contributor

chrysn commented Jun 12, 2023

I think POSIX sockets expose this information primarily through returned error codes. If the TX pipe is closed when you try to send, it will error. Similarly with the RX pipe. Maybe this would be a better approach... Let me give it a shot.

That sounds like a EAFP instead of LBYL approach, which I appreciate, so I'm looking forward to seeing what comes out of it.

own a Tcp::Connection and the TcpStack simultaneously.

IIRC going towards more many-owners-of-shared-stack was a necessity of the async API because multiple components can have pending operations (say, reads) that overlap in time, which would conflict with the necessity to hold an exclusive reference to the stack in the async model. In a blocking model, where one would either read only from one socket or put all one's sockets into some multiplexer (which'd do a read given a single reference to the stack), that's not necessary. I'd hope that many of the nice properties of the async API can be carried over to the sync world, even if the "main argument" is a (&mut socket, &mut stack) instead of a (&mut socket-including-shared-stack).

@ryan-summers
Copy link
Member Author

ryan-summers commented Jun 12, 2023

I'd hope that many of the nice properties of the async API can be carried over to the sync world, even if the "main argument" is a (&mut socket, &mut stack) instead of a (&mut socket-including-shared-stack).

The issue isn't actually mutability, but rather lifetime issues. If you try the following:

struct MyStruct<T: TcpStack> {
    connection: T::Connection<'?>,
    stack: T,
}

What is the lifetime of connection? It's supposed to be that of stack, but that's not possible to specify because it would be referencing the lifetime of Self for MyStruct

@ryan-summers
Copy link
Member Author

@chrysn Updates are now available on the PR. I opted for a base error type that allows for library users to detect pipe errors that require TCP to be reconnected. It's working well on my local setups and also resolves the original MQTT I was having. Let me know what you think.

If we want to move towards a more socket-directed API, I'd like to keep that off for a new PR. There's a lot of work involved in that

@ryan-summers
Copy link
Member Author

@rust-embedded-community/embedded-nal Anyone available to help with review on this one? :)

@thejpster
Copy link
Member

Seems OK to me, but... Does anyone have a public example showing how they have implemented the new error codes? Can it be done on all the existing stacks, or are you going to break some of them?

@ryan-summers
Copy link
Member Author

Seems OK to me, but... Does anyone have a public example showing how they have implemented the new error codes?

The above-linked minimq PR is a client side impl (quartiq/minimq#126) and the smoltcp-nal (quartiq/smoltcp-nal#42) above is a stack implementation, both working on hardware with stabilizer.

Can it be done on all the existing stacks, or are you going to break some of them?

I'd be surprised if a network stack wasn't capable of telling you the socket is closed when trying to read or write to TCP. In any case, a stack doesn't have to implement these error Codes and can mark everything as other if they have to.

Copy link
Contributor

@chrysn chrysn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

(Not sure if my approval is helpful, but I trust the automation rules know my role here :-) ).

src/stack/tcp.rs Outdated Show resolved Hide resolved
@ryan-summers ryan-summers merged commit 85e025e into rust-embedded-community:master Jun 15, 2023
@ryan-summers ryan-summers deleted the feature/tcp-updates branch June 15, 2023 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TCP socket state/error handling insufficiently specified
4 participants