IP over libp2p #626

Open
vyzo opened this issue Aug 14, 2024 · 11 comments
Comments

@vyzo
Contributor

vyzo commented Aug 14, 2024

I want to gauge interest for developing protocols/specs for implementing IP over libp2p tunneling.

Motivation

There are use cases where private overlay networks are implemented using libp2p, yet there is a need to provide an IP network abstraction (e.g. using a tun interface) to processes running in the overlay network's userland.

For example, in the nunet network the substrate is built using libp2p, but the applications running on top of this substrate are not aware of libp2p and instead rely on the (usual) TCP/IP interface abstractions.
Unfortunately, with the current suite of libp2p protocols it is quite hard to get acceptable performance, because IP packets have to go over a reliable transport with multiplexing.
See https://gitlab.com/nunet/device-management-service/-/merge_requests/362#note_2040757665 for a performance analysis of the PoC in nunet. In short, it is, to put it politely, not great: at best "only" 2x slower, and quite possibly 10x slower in many cases.

Call For Action

So what do we need to do? I want to specifically implement and specify protocol integrations for IP over libp2p.
I am pretty sure other projects are interested in this too, but let's get the discussion started.

I think the main difficulty is the lack of packet transports in libp2p; we end up sending IP packets over a reliable transport, with multiplexing, which completely defeats the purpose, as the protocols running on top in userspace are prepared to handle packet loss and expect an unreliable network mechanism underneath.

So in order to resolve the problem of "IP over libp2p" we also need to develop appropriate packet transports that let you send unreliable datagrams over a connection. QUIC already supports such functionality, so in a sense it is a matter of exposing it. This would also allow us to use a plain UDP-based protocol without QUIC complexity, but we would still need to resolve encryption and authentication, possibly with DTLS.

@vyzo
Contributor Author

vyzo commented Aug 14, 2024

Related Work: RFC 9484: Proxying IP in HTTP

@marten-seemann
Contributor

> Related Work: RFC 9484: Proxying IP in HTTP

To add a little bit more context, CONNECT-IP is (conceptually and implementation-wise) very similar to CONNECT-UDP (RFC 9298), which is deployed on a massive scale by iCloud Private Relay (for example). I have an implementation of CONNECT-UDP in masque-go.

> This would also allow us to use a plain UDP-based protocol without QUIC complexity, but we would still need to resolve encryption and authentication, possibly with DTLS.

The problem is that in addition to an unreliable way of sending data (for UDP or IP packets), you usually also want to have a (reliable) control channel. CONNECT-(IP/UDP) uses (reliable) HTTP streams to communicate where and how packets should be proxied, and then uses (unreliable) HTTP DATAGRAMs for the actual data transfer. If you use DTLS, you'd probably have to build some kind of reliability mechanism for that control channel.

@derrandz

There is kcp, which offers reliability over UDP using some form of ARQ protocol, plus packet-level anonymization through encryption.

I attempted an initial implementation of kcp for libp2p in https://github.com/libp2p/go-libp2p/pull/2672/files if we want to revisit that.

@vyzo
Contributor Author

vyzo commented Aug 14, 2024

Well, reliability is actually a misfeature here; userspace expects datagrams to be delivered unreliably, as best effort, and has its own mechanisms for end-to-end reliability where appropriate.

@MarcoPolo
Contributor

Thanks for opening this vyzo. I too am interested in seeing the interest here. I've never been opposed to providing a packet transport, but I've been hesitant to do so without sufficient interest and actual use cases.

Linking RFC 9484 is a good call. There are some subtleties here (e.g. IPv6 requires a minimum MTU of 1280 bytes, but we'll have some overhead in our encapsulation...), and it's nice to reference a well thought out document that's already walked this path.
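To put rough numbers on the MTU subtlety, here is a back-of-the-envelope sketch in Go. The header sizes for IP, UDP, and the AEAD tag are standard; everything else is an assumption chosen for illustration (an 8-byte QUIC connection ID, a worst-case 4-byte packet number, a DATAGRAM frame without a length field, and 1-byte varints for the HTTP/3 quarter stream ID and the CONNECT-IP context ID). The type and function names are hypothetical and not part of any libp2p API.

```go
package main

import "fmt"

// Overhead holds per-packet encapsulation costs in bytes.
type Overhead struct {
	OuterIP         int // outer IP header: 40 for IPv6, 20 for IPv4 without options
	UDP             int // UDP header, always 8
	QUICShortHeader int // 1 flags byte + connection ID + packet number
	AEADTag         int // 16-byte authentication tag
	DatagramFrame   int // DATAGRAM frame type byte, no length field
	QuarterStreamID int // HTTP/3 datagram prefix (varint, 1 to 8 bytes)
	ContextID       int // CONNECT-IP context ID (varint, 1 byte for context 0)
}

// AssumedOverhead returns illustrative values. The connection ID length (8)
// and packet number length (4) are assumptions, not fixed by any spec.
func AssumedOverhead(outerIPv6 bool) Overhead {
	ip := 20
	if outerIPv6 {
		ip = 40
	}
	return Overhead{
		OuterIP:         ip,
		UDP:             8,
		QUICShortHeader: 1 + 8 + 4,
		AEADTag:         16,
		DatagramFrame:   1,
		QuarterStreamID: 1,
		ContextID:       1,
	}
}

// Total sums all encapsulation costs.
func (o Overhead) Total() int {
	return o.OuterIP + o.UDP + o.QUICShortHeader + o.AEADTag +
		o.DatagramFrame + o.QuarterStreamID + o.ContextID
}

// RequiredPathMTU is the outer path MTU needed to carry inner packets of innerMTU bytes.
func RequiredPathMTU(innerMTU int, o Overhead) int { return innerMTU + o.Total() }

// InnerMTU is the largest inner packet that fits in one datagram on the given path.
func InnerMTU(pathMTU int, o Overhead) int { return pathMTU - o.Total() }

func main() {
	v6 := AssumedOverhead(true)
	fmt.Println("overhead over IPv6:", v6.Total())
	fmt.Println("path MTU needed for a 1280-byte inner packet:", RequiredPathMTU(1280, v6))
	fmt.Println("inner MTU on a 1500-byte path:", InnerMTU(1500, v6))
}
```

Under these assumptions the encapsulation costs 80 bytes over an IPv6 path, so carrying the 1280-byte IPv6 minimum inside the tunnel needs an outer path MTU of at least 1360 bytes, and a 1500-byte path leaves 1420 bytes for inner packets. Different connection ID lengths or stream IDs shift these numbers slightly.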

I wonder if we should just use RFC 9484. We can easily run an h3 server anywhere we have QUIC deployed. If needed, I think we could even support HTTP/2 with our TCP+TLS stack; however, I'd push back on that since it doesn't avoid the nested congestion control problem you get with doing IP over a reliable transport. Are there any use cases that wouldn't work if we did this per RFC 9484?

> This would also allow us to use a plain UDP-based protocol without QUIC complexity, but we would still need to resolve encryption and authentication, possibly with DTLS.

I disagree with the premise that QUIC introduces needless complexity. We should not reinvent the wheel here. To echo Marten's statement, QUIC gives us exactly what we need here: unreliable datagrams and reliable streams.

@vyzo
Contributor Author

vyzo commented Aug 14, 2024

Agreed.

After consideration, I think the right way forward is QUIC unreliable datagrams + the suite of protocols from RFC 9484.

I am not opposed to HTTP/3 all that much, but I would prefer to do it purely with QUIC.

@marten-seemann
Contributor

For datagrams, HTTP/3 just provides a super thin (one integer thin, that is) wrapper around QUIC datagrams, see https://www.rfc-editor.org/rfc/rfc9297.html#name-http-3-datagrams. This allows multiplexing multiple datagram flows (i.e. multiple proxied connections, in our case) in a single QUIC connection.
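As a rough illustration of how thin that wrapper is, the Go sketch below frames an IP packet the way RFC 9297 and RFC 9484 describe it: a Quarter Stream ID (the request stream ID divided by four) encoded as a QUIC variable-length integer, followed by a Context ID (0 means "this is an IP packet"), followed by the packet itself. The function names are hypothetical and this is not the masque-go or go-libp2p API; it only uses the standard library and does not touch a real QUIC connection.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errShort = errors.New("buffer too short")

// appendVarint appends v as a QUIC variable-length integer (RFC 9000, Section 16).
// The two most significant bits of the first byte select a 1, 2, 4 or 8 byte encoding.
func appendVarint(b []byte, v uint64) ([]byte, error) {
	var buf [8]byte
	switch {
	case v < 1<<6:
		return append(b, byte(v)), nil
	case v < 1<<14:
		binary.BigEndian.PutUint16(buf[:], uint16(v)|0x4000)
		return append(b, buf[:2]...), nil
	case v < 1<<30:
		binary.BigEndian.PutUint32(buf[:], uint32(v)|0x80000000)
		return append(b, buf[:4]...), nil
	case v < 1<<62:
		binary.BigEndian.PutUint64(buf[:], v|0xC000000000000000)
		return append(b, buf[:]...), nil
	}
	return nil, errors.New("value does not fit in 62 bits")
}

// readVarint decodes one varint and reports how many bytes it consumed.
func readVarint(b []byte) (uint64, int, error) {
	if len(b) == 0 {
		return 0, 0, errShort
	}
	n := 1 << (b[0] >> 6) // 1, 2, 4 or 8 bytes
	if len(b) < n {
		return 0, 0, errShort
	}
	v := uint64(b[0] & 0x3f)
	for _, c := range b[1:n] {
		v = v<<8 | uint64(c)
	}
	return v, n, nil
}

// encodeIPDatagram builds the payload of a QUIC DATAGRAM frame:
// Quarter Stream ID, Context ID 0, then the raw IP packet.
func encodeIPDatagram(requestStreamID uint64, ipPacket []byte) ([]byte, error) {
	if requestStreamID%4 != 0 {
		return nil, errors.New("not a client-initiated bidirectional stream ID")
	}
	out, err := appendVarint(nil, requestStreamID/4)
	if err != nil {
		return nil, err
	}
	out = append(out, 0x00) // Context ID 0 as a 1-byte varint
	return append(out, ipPacket...), nil
}

// decodeIPDatagram reverses encodeIPDatagram. It returns an error for context
// IDs other than 0; a real endpoint would typically just drop such datagrams.
func decodeIPDatagram(b []byte) (uint64, []byte, error) {
	quarter, n, err := readVarint(b)
	if err != nil {
		return 0, nil, err
	}
	if quarter >= 1<<60 {
		return 0, nil, errors.New("quarter stream ID out of range")
	}
	ctx, m, err := readVarint(b[n:])
	if err != nil {
		return 0, nil, err
	}
	if ctx != 0 {
		return 0, nil, errors.New("unsupported context ID")
	}
	return quarter * 4, b[n+m:], nil
}

func main() {
	pkt := []byte{0x60, 0x00, 0x00, 0x00} // first bytes of a hypothetical IPv6 header
	dg, err := encodeIPDatagram(4, pkt)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", dg)
	id, inner, err := decodeIPDatagram(dg)
	fmt.Println(id, len(inner), err)
}
```

For request stream 4 the whole prefix is two bytes (0x01 for the quarter stream ID, 0x00 for the context ID), which is all the receiver needs to route the packet to the right proxied flow.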

Having HTTP/3 for streams is convenient, since it provides an easy way for the client to send a proxy request: it's just an Extended CONNECT HTTP request, and for the server to respond to that request using HTTP status codes (and HTTP header fields). Really, there isn't any more HTTP than this going on.

If you wanted, you could of course use Protobufs for the request-response part of the protocol, and define your own datagram demultiplexing wire format, and that would work equally well. But you'd basically just reinvent the wheel.

@vyzo
Contributor Author

vyzo commented Aug 15, 2024

The concern I have with HTTP/3 is that we expand the dependency set to include an HTTP/3 server.

I would like to avoid this if possible.

@vyzo
Contributor Author

vyzo commented Aug 15, 2024

But maybe it's not a big deal.

@MarcoPolo
Contributor

I think leveraging h3 would be very useful for this for a couple of reasons:

  • We could use RFC 9484 as is. Nothing to reinvent here. The work for this issue would be done!
  • We reuse existing and common semantics from HTTP. No need to reinvent these with protobufs.
  • If we didn't use h3, we'd likely have to reimplement many of the same things in order to get similar properties as RFC 9484.
  • h3 is pretty lightweight. Even if you didn't have an h3 server already, I think building the minimal one to support RFC 9484 is fairly straightforward.
  • We could collaborate with a wider pool of developers and users by building on these standards.

@vyzo
Contributor Author

vyzo commented Aug 15, 2024

ok, fair enough.
