Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Allow libp2p to determine listening addresses #4700

Closed
wants to merge 27 commits into from

Conversation

jmcph4
Copy link
Member

@jmcph4 jmcph4 commented Sep 5, 2023

Issue Addressed

#4675

Proposed Changes

  • Update local ENR (only port numbers) with local addresses received from libp2p (via SwarmEvent::NewListenAddr)
  • Only use the zero port for CLI tests

Additional Info

See Also

@jmcph4 jmcph4 changed the base branch from stable to unstable September 5, 2023 03:12
@jmcph4 jmcph4 self-assigned this Sep 5, 2023
@jmcph4 jmcph4 added enhancement New feature or request test improvement Improve tests Networking labels Sep 5, 2023
@jmcph4 jmcph4 closed this Sep 19, 2023
@jmcph4 jmcph4 force-pushed the 4675-bind-zero-port branch from 804a9a0 to 0c23c86 Compare September 19, 2023 00:55
@jmcph4 jmcph4 reopened this Sep 19, 2023
@jmcph4 jmcph4 marked this pull request as ready for review September 19, 2023 03:55
@jmcph4 jmcph4 added work-started work-in-progress PR is a work-in-progress labels Sep 19, 2023
@jmcph4
Copy link
Member Author

jmcph4 commented Sep 19, 2023

Currently failing on AddrInUse from warp (https://github.com/sigp/lighthouse/actions/runs/6231636385/job/16913464506?pr=4700)

Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me but I'm lacking context to properly assess the discovery changes. Will wait for Age or João

@michaelsproul michaelsproul added ready-for-review The code is ready for review v4.5.0 ETA Q4 2023 and removed work-in-progress PR is a work-in-progress work-started labels Sep 20, 2023
Copy link
Member

@AgeManning AgeManning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in general look fine. Some comments we probably should address tho.

@@ -1036,12 +1038,107 @@ impl<TSpec: EthSpec> NetworkBehaviour for Discovery<TSpec> {
FromSwarm::DialFailure(DialFailure { peer_id, error, .. }) => {
self.on_dial_failure(peer_id, error)
}
FromSwarm::NewListenAddr(ev) => {
/* TODO(jmcph4): plumb user override status into config somehow */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we handle this before merging?

}
};

let mut should_update_ip: bool = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should never update the ENR IP here.

A few reasons:

  1. The local address we are binding to (i.e the address that libp2p will return) will almost never be a globally routable address. Most computers will be behind a NAT. Unless they are using ipv6 etc. But in any case its better to let discovery handle the IP field of the ENR.
  2. Discovery sets this field based on how others connect to us. We should not interfere with this I dont think.

We should only really set the TCP or QUIC fields of the ENR based on libp2p's response.

Protocol::Tcp(port) => port,
Protocol::Udp(port) => port,
_ => {
crit!(self.log, "Attempt to update discovery with unsupported transport (this should never occur)"; "transport" => ?transport);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think on every start we now also listen on the QUICv1 protocol.

I would imagine that means running this code would immediately emit this crit.

I usually run an instance of lighthouse locally (with backfill disabled and weak subjectivity sync) to see if it starts up and connects and doesn't emit crits.

It's usually a good self-test to catch bugs like this.

Copy link
Member Author

@jmcph4 jmcph4 Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think on every start we now also listen on the QUICv1 protocol.

As of #4402 this is correct.

I would imagine that means running this code would immediately emit this crit.

When we listen on QUICv1, the multiaddress looks like this: /ip4/127.0.0.1/udp/9001/quic-v1/. In this handler (for FromSwarm::NewListenAddr), we parse the multiaddress from left-to-right and only ever two Protocols deep -- so we'll never encounter this crit! log emission during normal operation of the BN. This is still a bug obviously as only ever parsing two Protocols deep will completely miss QUICv1 altogether.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. I saw this line:
https://github.com/jmcph4/lighthouse/blob/10263d3e003218c529f420d7e34ad658ff2f1ca1/beacon_node/lighthouse_network/src/discovery/mod.rs#L1131

                match addr.iter().nth(1) {
                    Some(Protocol::Tcp(port)) => update_enr(ip, Protocol::Tcp(port)),
                    Some(Protocol::Udp(port)) => update_enr(ip, Protocol::Udp(port)),
                    Some(Protocol::Quic) => update_enr(ip, Protocol::Quic),
                    _ => {
                        debug!(self.log, "Encountered unacceptable multiaddr for listening (invalid transport)"; "addr" => ?addr);
                    }
                };

which explicitly passes the Quic protocol to that function. I guess this case just never occurs because you only look at the nth(1).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's somewhat regrettable that the current fix (i.e., 599c706) duplicates the port number itself (as the Protocol::QuicV1 variant doesn't store the port unlike its TCP and UDP counterparts) but is much more explicit about the mapping from multiaddress to ENR update information.

match ip {
IpAddr::V4(_) => {
if transport_is_tcp {
insert_into_enr("tcp4", &port);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also probably should check that the tcp port isn't already set from a previous run. I.e we load it from disk. I can't recall maybe the insert() function already checks. But we probably dont want to be increasing the sequence number if we don't have to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also probably should check that the tcp port isn't already set from a previous run. I.e we load it from disk.

Are you referring to previous instances of us entering this swarm event handler throughout the duration of the BN runtime? Namely, if we receive multiple local addresses to listen on from the swarm (which virtually always happens during normal operation), when we perform the updates serially, we keep thrashing the ENR's sequence number.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of the case like, we run Lighthouse on port 9000 as default. It creates an ENR from fresh because its our first ever time running. We set the tcp port on the ENR.

We then shutdown lighthouse and save the ENR to disk.

Then we run lighthouse again. This loads the saved ENR from disk which has the TCP field already set. If we are using the exact same CLI params (i.e all we have done is restart lighthouse), the tcp field in the ENR is set to the one that libp2p would return to us so we probably don't need to reset it.

match addr.iter().nth(1) {
Some(Protocol::Tcp(port)) => update_enr(ip, Protocol::Tcp(port)),
Some(Protocol::Udp(port)) => update_enr(ip, Protocol::Udp(port)),
Some(Protocol::Quic) => update_enr(ip, Protocol::Quic),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might be using QuicV1. Should double check this.

In either case a simple run should show this debug error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might be using QuicV1.

We are; good catch.

In either case a simple run should show this debug error.

This will almost never occur (see above for context)

Copy link
Member

@AgeManning AgeManning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the logic looks good to me. Just a few nitpicks.

beacon_node/lighthouse_network/src/discovery/mod.rs Outdated Show resolved Hide resolved
beacon_node/lighthouse_network/src/discovery/mod.rs Outdated Show resolved Hide resolved
beacon_node/lighthouse_network/src/discovery/mod.rs Outdated Show resolved Hide resolved
beacon_node/lighthouse_network/src/discovery/mod.rs Outdated Show resolved Hide resolved
beacon_node/lighthouse_network/src/discovery/mod.rs Outdated Show resolved Hide resolved
beacon_node/lighthouse_network/src/discovery/mod.rs Outdated Show resolved Hide resolved
Copy link
Member

@AgeManning AgeManning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets gooo!

@AgeManning AgeManning added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Sep 25, 2023
@paulhauner
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Sep 29, 2023
## Issue Addressed

#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - #4705 
 - #4402 
 - #4745
@bors
Copy link

bors bot commented Sep 29, 2023

Build failed (retrying...):

@paulhauner
Copy link
Member

bors r-

@bors
Copy link

bors bot commented Sep 29, 2023

Canceled.

@paulhauner
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Sep 29, 2023
## Issue Addressed

#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - #4705 
 - #4402 
 - #4745
@paulhauner
Copy link
Member

bors r-

@bors
Copy link

bors bot commented Sep 29, 2023

Canceled.

@paulhauner
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Sep 29, 2023
## Issue Addressed

#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - #4705 
 - #4402 
 - #4745
@bors
Copy link

bors bot commented Sep 29, 2023

Build failed (retrying...):

bors bot pushed a commit that referenced this pull request Sep 29, 2023
## Issue Addressed

#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - #4705 
 - #4402 
 - #4745
@bors
Copy link

bors bot commented Sep 29, 2023

Build failed:

@paulhauner
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Oct 3, 2023
## Issue Addressed

#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - #4705 
 - #4402 
 - #4745
@bors
Copy link

bors bot commented Oct 3, 2023

Pull request successfully merged into unstable.

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

@bors bors bot changed the title Allow libp2p to determine listening addresses [Merged by Bors] - Allow libp2p to determine listening addresses Oct 3, 2023
@bors bors bot closed this Oct 3, 2023
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
## Issue Addressed

sigp#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - sigp#4705 
 - sigp#4402 
 - sigp#4745
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
## Issue Addressed

sigp#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - sigp#4705 
 - sigp#4402 
 - sigp#4745
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
## Issue Addressed

sigp#4675 

## Proposed Changes

 - Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
 - Only use the zero port for CLI tests

## Additional Info

### See Also ###

 - sigp#4705 
 - sigp#4402 
 - sigp#4745
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Networking ready-for-merge This PR is ready to merge. test improvement Improve tests v4.6.0 ETA Q1 2024
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants