Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swarm/: Patch reporting on banned peer connections #2350

Merged
merged 25 commits into from
Nov 26, 2021

Conversation

divagant-martian
Copy link
Contributor

@divagant-martian divagant-martian commented Nov 18, 2021

Closes #2337

  • Track banned connections in the swarm.
  • Prevent notifications to the behaviour for banned connections.
  • Include contract assertions to the CallTraceBehaviour.
  • Rework ban test to check for verious conditions.
  • inject_connected and inject_closed are called for the first and last allowed connections, respectively.

@divagant-martian divagant-martian marked this pull request as ready for review November 18, 2021 23:47
@divagant-martian
Copy link
Contributor Author

I added an extremely ugly test to check that inject_connection_closed is not called for a banned connection. The test fails on master, succeeds here. I'm not sure if the test adds value, so please let me know if you'd like to maintain it to try to make it more up to your (and mine) standards

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @divagant-martian for the patch!

I would like this change to be contained within libp2p-swarm, i.e. the libp2p-swarm concept of banning a peer should not leak into libp2p-core (here via is_allowed).

Instead of the current is_allowed, could one keep track of the banned connections in Swarm instead? E.g. something along the lines of:

diff --git a/swarm/src/lib.rs b/swarm/src/lib.rs
index 0059a21a..0d422561 100644
--- a/swarm/src/lib.rs
+++ b/swarm/src/lib.rs
@@ -275,6 +275,10 @@ where
     /// List of nodes for which we deny any incoming connection.
     banned_peers: HashSet<PeerId>,
 
+    /// List of banned connections and whether they have been reported as open to the
+    /// [`NetworkBehaviour`].
+    banned_connections: HashMap<ConnectionId, bool>,
+
     /// Pending event to be delivered to connection handlers
     /// (or dropped if the peer disconnected) before the `behaviour`
     /// can be polled again.

Combining the above with @MarcoPolo laws:

  • If a peer is banned, all active connections are still in the "unbanned state", but the connections are made to disconnect.

    Active connections to now banned peer are added to Swarm::banned_connections with true.

  • If a peer is banned, all new connections will be in the "banned state" and thus not emit any events to the network behavior.

    New connections to banned peer are added to Swarm::banned_connections with false.

  • If a peer is unbanned (and previously banned), all active connections are still in the "banned state".

    Active connections stay in Swarm::banned_connections and are only removed from Swarm::banned_connections on closing. On closing check the bool in Swarm::banned_connections to tell whether the closing should be reported to the NetworkBehaviour.

  • If a peer is unbanned, all new connections will be in the "unbanned state".

    New connections for unbanned peer are not added to Swarm::banned_connections.

@divagant-martian
Copy link
Contributor Author

divagant-martian commented Nov 19, 2021

Hey,

I would like this change to be contained within libp2p-swarm, i.e. the libp2p-swarm concept of banning a peer should not leak into libp2p-core (here via is_allowed).

Alright, makes sense. As a library user I still see it all as a single unit but if the ban is contained in the swarms, sounds like the best option.

Instead of the current is_allowed, could one keep track of the banned connections in Swarm instead? E.g. something along the lines of:

diff --git a/swarm/src/lib.rs b/swarm/src/lib.rs
index 0059a21a..0d422561 100644
--- a/swarm/src/lib.rs
+++ b/swarm/src/lib.rs
@@ -275,6 +275,10 @@ where
     /// List of nodes for which we deny any incoming connection.
     banned_peers: HashSet<PeerId>,
 
+    /// List of banned connections and whether they have been reported as open to the
+    /// [`NetworkBehaviour`].
+    banned_connections: HashMap<ConnectionId, bool>,
+
     /// Pending event to be delivered to connection handlers
     /// (or dropped if the peer disconnected) before the `behaviour`
     /// can be polled again.

I think a HashSet would be enough, where !conn.is_allowed() is equivalent to banned_connections.contains(&conn.id()).

@divagant-martian
Copy link
Contributor Author

@mxinden Moved the logic to the swarm. I also removed the ugly test and reworked the existing ban test (see updated PR description) The test however, does not finish every now and then. I'd appreciate if you or @MarcoPolo can give it a look

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment. Still need to give this an in-depth review but looking good overall. Thanks for the work @divagant-martian!

swarm/src/lib.rs Outdated Show resolved Hide resolved
@divagant-martian
Copy link
Contributor Author

Not sure why I can't re-request a review from @MarcoPolo. But please both give this a review. I think it's ready

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @divagant-martian. This is great work.

I have a bunch of wording suggestions, but overall I think this is the way to go.

core/src/connection/pool.rs Outdated Show resolved Hide resolved
core/src/network/event.rs Outdated Show resolved Hide resolved
core/src/network/event.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Outdated Show resolved Hide resolved
swarm/src/test.rs Show resolved Hide resolved
@divagant-martian
Copy link
Contributor Author

@mxinden all suggestions taken and applied. I think it's a good advice to keep the language consistent across the code 👍

@mxinden
Copy link
Member

mxinden commented Nov 22, 2021

Unfortunately this pull request does introduce a breaking change to libp2p-core, thus we will need to release a new minor version of libp2p-core and all of its dependents.

I prepared the corresponding cargo and changelog entries (small hacky Python script). Could you cherry-pick 10f7c18 into this pull request?

Also could you add a changelog entry to libp2p-swarm describing the bug fix?

Final ask: Can you run this on one of your lighthouse nodes for a bit to make sure all panics are fixed?

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind including this diff as well?

diff --git a/swarm/src/lib.rs b/swarm/src/lib.rs
index 0059a21a..2b2262eb 100644
--- a/swarm/src/lib.rs
+++ b/swarm/src/lib.rs
@@ -540,6 +540,10 @@ where
     pub fn ban_peer_id(&mut self, peer_id: PeerId) {
         if self.banned_peers.insert(peer_id) {
             if let Some(peer) = self.network.peer(peer_id).into_connected() {
+                // Note that established connections to the now banned peer are closed but not added
+                // to [`Swarm::banned_peer_connections`]. They have been previously reported as open
+                // to the behaviour and need be reported as closed once closing the connection
+                // finished.
                 peer.disconnect();
             }
         }

core/src/connection/pool.rs Outdated Show resolved Hide resolved
@divagant-martian
Copy link
Contributor Author

all done. We'll report in a few days with our findings

swarm/src/lib.rs Outdated
this.behaviour.inject_event(peer, connection, event);
let conn_id = connection.id();
if this.banned_peer_connections.contains(&conn_id)
|| this.banned_peers.contains(&peer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we do the || this.banned_peers.contains(&peer) check?

My guess: So that if a peer banned we don't inject_event even on connections from before the ban. This is done because the disconnect of existing connections takes a bit and we don't want to get events from banned peers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the reason. That, and every possible condition of peers being banned, unbanned, etc and connections not closing on time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the end, it's also the safer bet

Copy link
Member

@mxinden mxinden Nov 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry in advance for reviving this discussion once more, especially since you already tested this version on your infra @divagant-martian.

Imagine the scenario below:

  • Local peer is connected to peer A via a single connection C1.
  • Peer A sends two events to the local peer via C1. The second event depends on the first event. The events are about to be delivered to the local Swarm from the local Network.
  • User banns peer A.
  • User polls the Swarm.
    • The Swarm receives the first event from Network sent via C1.
    • Swarm drops it due to this.banned_peers.contains(&peer).
  • User unbanns A.
  • User polls the Swarm.
    • The Swarm receives the second event from Network sent via C1.
    • The Swarm does not drop the second event as this.banned_peers.contains(&peer) is no longer true.
    • Swarm delivers the second event to the NetworkBehaviour.

Result: NetworkBehaviour receives the second but not the first event from peer A.

I don't think the behavior above is intuitive for an implementor of NetworkBehaviour. In other words, as an implementor of NetworkBehaviour I would expect to either receive all events from a single remote peer in their correct order or not at all.

With the above in mind, would it not be better to remove || this.banned_peers.contains(&peer) @divagant-martian and @MarcoPolo? A NetworkBehaviour would potentially receive events from banned peers until already existing connections are closed, but a NetworkBehaviour would never receive events out of order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in b43a169

@divagant-martian
Copy link
Contributor Author

divagant-martian commented Nov 25, 2021

Based on runs, I think this looks fine @mxinden
The last CI run failed but it looks like a random infra failure

EDIT: waiting one more day due to changes addressing this comment

@divagant-martian
Copy link
Contributor Author

all looking bright on our side

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏 thanks!

@mxinden mxinden changed the title patch reporting on banned peer connections swarm/: Patch reporting on banned peer connections Nov 26, 2021
@mxinden mxinden merged commit fd41751 into libp2p:master Nov 26, 2021
vnermolaev pushed a commit to vnermolaev/rust-libp2p that referenced this pull request Nov 30, 2021
Don't report events of a connection to the `NetworkBehaviour`, if connection has
been established while the remote peer was banned. Among other guarantees this
upholds that `NetworkBehaviour::inject_event` is never called without a previous
`NetworkBehaviour::inject_connection_established` for said connection.

Co-authored-by: Max Inden <[email protected]>
@AgeManning AgeManning deleted the banned-peer-connections branch February 15, 2022 05:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

swarm/: Never call Behaviour::inject_event without a previous Behaviour::inject_connection_established
3 participants