Browser-to-server libp2p reliability #2529
Comments
Check this: silkroadnomad/libp2p-relay#3
Manually patching the autodialer retry threshold solved most of our problems. Someone else caught this last week and a fix is already on main: 767b23e. (We're mostly running with gossipsub penalties off, so no issues there. Tuning how many peers are grafted helped only marginally.) We're still encountering occasional SIGILL crashes, which were propagating up our stack and causing issues with our container host, but that may not be a js-libp2p issue and could be something lower level (filecoin-lotus users are seeing it too??), so I'll close this issue now. If anyone else reads this while testing their mesh: invest in headless browser network tests using something like docker-compose. It's not as hard as it sounds and worth it!!
@raykyri Some of the browser interop work is being covered with a demo app at https://github.com/libp2p/universal-connectivity. Browser reliability should improve when WebRTC is released in go-libp2p. See libp2p/go-libp2p#2778
@raykyri Are you still encountering SIGILL crashes? Are you using js-libp2p for the server host on Node.js?
We're using js-libp2p for the server host, yep. We haven't seen any more SIGILL issues; we eventually traced those to somewhere else.
We've been running a browser-to-server libp2p mesh for chat applications at https://play.skystrife.xyz, which uses gossipsub to distribute messages and our own service, based on GossipLog and a Prolly tree, to sync past messages. We're monitoring logs and Prometheus metrics, and we have separate instances that spin up libp2p nodes and connect to our mesh to perform health checks.
Since last week, there have been tens of players online at the same time (occasionally even 100+). We've noticed reliability issues even at the smaller scales: libp2p server nodes will randomly stop accepting messages, or stop listening on the port after a few hours. The cause isn't an OOM or anything else readily apparent from `libp2p:*:error` logging.

What's the state of reliability for browser-to-server libp2p right now? We're considering using a separate websocket service and using libp2p for server-to-server sync exclusively, as it seems unclear how others have deployed this stack in a browser environment.
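For anyone trying to catch the "stops listening on the port" symptom, here is a minimal sketch of an external liveness probe. It is not our actual health check and does not use any libp2p API: it only uses Node's built-in `net` module to test whether a TCP port still accepts connections. The names `probePort` and `consecutiveFailures` are hypothetical, as are any host, port, and threshold values you pass in.

```typescript
import { createConnection } from "node:net";

// Hypothetical helper: resolves true if a TCP connection to host:port
// opens within timeoutMs, and false on timeout or connection error.
export function probePort(host: string, port: number, timeoutMs = 3000): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = createConnection({ host, port });
    let settled = false;
    const finish = (ok: boolean): void => {
      if (settled) return;
      settled = true;
      socket.destroy(); // always release the socket, whatever the outcome
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.on("error", () => finish(false));
  });
}

// Hypothetical helper: counts how many of the most recent probes failed
// in a row, so a single dropped probe does not trigger an alert.
export function consecutiveFailures(history: boolean[]): number {
  let count = 0;
  for (let i = history.length - 1; i >= 0 && !history[i]; i--) {
    count++;
  }
  return count;
}
```

A caller could run `probePort` on a timer, append each result to a history array, and alert or restart the node once `consecutiveFailures` passes a threshold of its choosing. This only detects a dead listener; a node that accepts connections but has stopped processing messages would need a protocol-level check.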
We are currently on: