Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove peers from the peer store when they disconnect #1231

Closed
wants to merge 1 commit into from

Conversation

marten-seemann
Copy link
Contributor

Depends on #1230 and libp2p/go-libp2p-peerstore#174.

Any idea how to test this?

Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to be careful with leaving garbage in the ds peerstore; other than that LGTM.

For test we can just manually connect and disconnect two peers and check the peerstore.

@@ -9,7 +9,7 @@ require (
github.com/ipfs/go-log/v2 v2.3.0
github.com/libp2p/go-libp2p v0.14.4
github.com/libp2p/go-libp2p-connmgr v0.2.4
github.com/libp2p/go-libp2p-core v0.11.0
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will need a stable reference here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I'm waiting with cutting the release, just in case something else comes up that we need to get into v0.16.0. Will release and then rebase all the PRs.

@@ -5,7 +5,7 @@ go 1.16
require (
github.com/gogo/protobuf v1.3.2
github.com/libp2p/go-libp2p v0.14.4
github.com/libp2p/go-libp2p-core v0.11.0
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here too.

@@ -5,7 +5,7 @@ go 1.16
require (
github.com/gdamore/tcell/v2 v2.1.0
github.com/libp2p/go-libp2p v0.14.1
github.com/libp2p/go-libp2p-core v0.11.0
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amd here, lest we foeget.

@@ -23,13 +23,13 @@ require (
github.com/libp2p/go-libp2p-autonat v0.5.0
github.com/libp2p/go-libp2p-blankhost v0.2.0
github.com/libp2p/go-libp2p-circuit v0.4.0
github.com/libp2p/go-libp2p-core v0.11.0
github.com/libp2p/go-libp2p-core v0.11.1-0.20211024101752-b18a4c9c5629
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here too.

return
}
for {
// Note that this might shut down before the swarm has closed all connections.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might be problematic with the ds peerstore, we might leave garbage behind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's why I put the comment there. I'm a bit wary of building yet more shutdown logic (we'd need to have a chan that is closed not when Close is called, but when the swarm has finished shutting down). Let me see how much work that is.

Comment on lines +539 to +542
ev := e.(event.EvtPeerConnectednessChanged)
if ev.Connectedness == network.NotConnected {
h.Peerstore().RemovePeer(ev.Peer)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason we're doing this instead of letting the peerstore's have their own GC methods since they can just listen on the eventbus?

It's also not clear from here that this doesn't effect the AddrBook and that other peerstore implementations should ignore the AddrBook for RemovePeer. If we did remove peers from the AddrBook on disconect it would be the source of a lot of problems for earlier code written where this assumption wasn't true (e.g. dialing DHT servers we recently connected to, but are not currently).

What do we get out of having the other subcomponents of the peerstore clear out immediately instead of on a timer, or at their own leisure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason we're doing this instead of letting the peerstore's have their own GC methods since they can just listen on the eventbus?

The peerstore is currently a pretty independent object. It doesn't know anything about libp2p hosts, event busses, etc. (note that the constructor doesn't take any arguments: https://github.com/libp2p/go-libp2p-peerstore/blob/986d5ceedb842c664be0bbac66f598006392e2ff/pstoremem/peerstore.go#L20-L29). We could pass in an event bus there, but that would lead to a tighter coupling.

It's also not clear from here that this doesn't effect the AddrBook and that other peerstore implementations should ignore the AddrBook for RemovePeer. If we did remove peers from the AddrBook on disconect it would be the source of a lot of problems for earlier code written where this assumption wasn't true (e.g. dialing DHT servers we recently connected to, but are not currently).

The AddrBook doesn't have a RemovePeer method, see libp2p/go-libp2p-core@b18a4c9. We only remove data for PeerMetadata, Metrics, ProtoBook and Keybook.

What do we get out of having the other subcomponents of the peerstore clear out immediately instead of on a timer, or at their own leisure?

Currently, memory use is growing unboundedly. Clearing it immediately on disconnect seems the easiest way to prevent a DoS.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, memory use is growing unboundedly. Clearing it immediately on disconnect seems the easiest way to prevent a DoS.

I see what you're saying however this makes the PeerMetadata seem largely useless. Perhaps it's already not super useful, but if the information disappears as soon as there's a connection hiccup then even if a connection is reestablished the information is gone. I'm not sure how the Metrics are used, but any utility they had becomes questionable also as they drop information related to peers as soon as the connection is dropped.

The KeyStore and Protobook are usable too, although perhaps less so.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd forget this information after some time period instead of immediately.

  • Forgetting the bandwidth metrics immediately is problem.
  • One of the goals of the peerstore is to remember recently connected peers.

One solution is to record "last seen" times inside the peerstore, but that's going to be a bit of a state-syncing hell.

case e := <-sub.Out():
ev := e.(event.EvtPeerConnectednessChanged)
if ev.Connectedness == network.NotConnected {
h.Peerstore().RemovePeer(ev.Peer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will race with a reconnect and could leave us without peer info.

Unfortunately, to do this right, we'll need to add and remove peer info from a single goroutine. But that could have some interesting perf impact?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we go to great length to avoid mutex contention, for example here: https://github.com/libp2p/go-libp2p-peerstore/blob/986d5ceedb842c664be0bbac66f598006392e2ff/pstoremem/protobook.go#L11-L20.

An alternative approach (which @aschmahmann suggested above: #1231 (comment)) would be to somehow connect the peerstore with the host and have a gc there that listens for disconnect events.

@marten-seemann marten-seemann force-pushed the gc-peerstore branch 2 times, most recently from 3ea7493 to 03203a8 Compare November 26, 2021 13:39
@marten-seemann
Copy link
Contributor Author

Resolved by #1258.

@marten-seemann marten-seemann deleted the gc-peerstore branch November 8, 2022 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants