-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove peers from the peer store when they disconnect #1231
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to be careful with leaving garbage in the ds peerstore; other than that LGTM.
For test we can just manually connect and disconnect two peers and check the peerstore.
@@ -9,7 +9,7 @@ require ( | |||
github.com/ipfs/go-log/v2 v2.3.0 | |||
github.com/libp2p/go-libp2p v0.14.4 | |||
github.com/libp2p/go-libp2p-connmgr v0.2.4 | |||
github.com/libp2p/go-libp2p-core v0.11.0 | |||
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will need a stable reference here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I'm waiting with cutting the release, just in case something else comes up that we need to get into v0.16.0. Will release and then rebase all the PRs.
@@ -5,7 +5,7 @@ go 1.16 | |||
require ( | |||
github.com/gogo/protobuf v1.3.2 | |||
github.com/libp2p/go-libp2p v0.14.4 | |||
github.com/libp2p/go-libp2p-core v0.11.0 | |||
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too.
@@ -5,7 +5,7 @@ go 1.16 | |||
require ( | |||
github.com/gdamore/tcell/v2 v2.1.0 | |||
github.com/libp2p/go-libp2p v0.14.1 | |||
github.com/libp2p/go-libp2p-core v0.11.0 | |||
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amd here, lest we foeget.
@@ -23,13 +23,13 @@ require ( | |||
github.com/libp2p/go-libp2p-autonat v0.5.0 | |||
github.com/libp2p/go-libp2p-blankhost v0.2.0 | |||
github.com/libp2p/go-libp2p-circuit v0.4.0 | |||
github.com/libp2p/go-libp2p-core v0.11.0 | |||
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too.
return | ||
} | ||
for { | ||
// Note that this might shut down before the swarm has closed all connections. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might be problematic with the ds peerstore, we might leave garbage behind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's why I put the comment there. I'm a bit wary of building yet more shutdown logic (we'd need to have a chan
that is closed not when Close
is called, but when the swarm has finished shutting down). Let me see how much work that is.
7fcb3c2
to
f64679b
Compare
ev := e.(event.EvtPeerConnectednessChanged) | ||
if ev.Connectedness == network.NotConnected { | ||
h.Peerstore().RemovePeer(ev.Peer) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason we're doing this instead of letting the peerstore's have their own GC methods since they can just listen on the eventbus?
It's also not clear from here that this doesn't effect the AddrBook and that other peerstore implementations should ignore the AddrBook for RemovePeer. If we did remove peers from the AddrBook on disconect it would be the source of a lot of problems for earlier code written where this assumption wasn't true (e.g. dialing DHT servers we recently connected to, but are not currently).
What do we get out of having the other subcomponents of the peerstore clear out immediately instead of on a timer, or at their own leisure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason we're doing this instead of letting the peerstore's have their own GC methods since they can just listen on the eventbus?
The peerstore is currently a pretty independent object. It doesn't know anything about libp2p hosts, event busses, etc. (note that the constructor doesn't take any arguments: https://github.com/libp2p/go-libp2p-peerstore/blob/986d5ceedb842c664be0bbac66f598006392e2ff/pstoremem/peerstore.go#L20-L29). We could pass in an event bus there, but that would lead to a tighter coupling.
It's also not clear from here that this doesn't effect the AddrBook and that other peerstore implementations should ignore the AddrBook for RemovePeer. If we did remove peers from the AddrBook on disconect it would be the source of a lot of problems for earlier code written where this assumption wasn't true (e.g. dialing DHT servers we recently connected to, but are not currently).
The AddrBook
doesn't have a RemovePeer
method, see libp2p/go-libp2p-core@b18a4c9. We only remove data for PeerMetadata
, Metrics
, ProtoBook
and Keybook
.
What do we get out of having the other subcomponents of the peerstore clear out immediately instead of on a timer, or at their own leisure?
Currently, memory use is growing unboundedly. Clearing it immediately on disconnect seems the easiest way to prevent a DoS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, memory use is growing unboundedly. Clearing it immediately on disconnect seems the easiest way to prevent a DoS.
I see what you're saying however this makes the PeerMetadata
seem largely useless. Perhaps it's already not super useful, but if the information disappears as soon as there's a connection hiccup then even if a connection is reestablished the information is gone. I'm not sure how the Metrics
are used, but any utility they had becomes questionable also as they drop information related to peers as soon as the connection is dropped.
The KeyStore
and Protobook
are usable too, although perhaps less so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we'd forget this information after some time period instead of immediately.
- Forgetting the bandwidth metrics immediately is problem.
- One of the goals of the peerstore is to remember recently connected peers.
One solution is to record "last seen" times inside the peerstore, but that's going to be a bit of a state-syncing hell.
case e := <-sub.Out(): | ||
ev := e.(event.EvtPeerConnectednessChanged) | ||
if ev.Connectedness == network.NotConnected { | ||
h.Peerstore().RemovePeer(ev.Peer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will race with a reconnect and could leave us without peer info.
Unfortunately, to do this right, we'll need to add and remove peer info from a single goroutine. But that could have some interesting perf impact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we go to great length to avoid mutex contention, for example here: https://github.com/libp2p/go-libp2p-peerstore/blob/986d5ceedb842c664be0bbac66f598006392e2ff/pstoremem/protobook.go#L11-L20.
An alternative approach (which @aschmahmann suggested above: #1231 (comment)) would be to somehow connect the peerstore with the host and have a gc
there that listens for disconnect events.
3ea7493
to
03203a8
Compare
Resolved by #1258. |
Depends on #1230 and libp2p/go-libp2p-peerstore#174.
Any idea how to test this?