Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPV client utility for syncing a lightning node #791

Merged
merged 5 commits into from
Feb 26, 2021

Conversation

jkczyz
Copy link
Contributor

@jkczyz jkczyz commented Feb 5, 2021

Completes the lightning-block-sync crate with the following components:

  • SpvClient capable of syncing chain listeners from a variety of block sources, including fork detection
  • Poll trait for polling one or more BlockSource implementations for new block data
  • ChainPoller for implementing the Poll trait for one blocks source

This PR is the second half of #763.

@codecov
Copy link

codecov bot commented Feb 5, 2021

Codecov Report

Merging #791 (8bfdfdc) into main (94bb0c9) will increase coverage by 0.84%.
The diff coverage is 94.38%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #791      +/-   ##
==========================================
+ Coverage   90.80%   91.64%   +0.84%     
==========================================
  Files          45       48       +3     
  Lines       24547    28747    +4200     
==========================================
+ Hits        22289    26345    +4056     
- Misses       2258     2402     +144     
Impacted Files Coverage Δ
lightning/src/chain/chainmonitor.rs 94.11% <ø> (ø)
lightning/src/chain/channelmonitor.rs 95.68% <ø> (ø)
lightning/src/chain/mod.rs 100.00% <ø> (ø)
lightning/src/ln/channelmanager.rs 85.24% <ø> (ø)
lightning-block-sync/src/poll.rs 92.04% <92.04%> (ø)
lightning-block-sync/src/init.rs 93.56% <93.56%> (ø)
lightning-block-sync/src/test_utils.rs 93.87% <93.87%> (ø)
lightning-block-sync/src/lib.rs 95.37% <96.30%> (+95.37%) ⬆️
lightning/src/ln/functional_tests.rs 96.77% <0.00%> (-0.22%) ⬇️
lightning/src/ln/features.rs 98.64% <0.00%> (-0.15%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 94bb0c9...8bfdfdc. Read the comment docs.

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big fan of where the traits ended up. All the below comments are simple code-level things. As for init_sync_listener, roughly what I was thinking was the below. Its not ideal - passing in a ValidatedBlockHeader instead of a BlockHash is probably impractical for users, the add_listener method really needs to return a Result and be able to indicate that it failed to find some blocks it needs to disconnect couldn't be found, and the call needs to probably not modify the cache while calling sync_listener. That said, I think the safer API around having multiple listeners (most users will have two - channel and chain monitoring) is nice. There may be some other way to capture this, maybe some wrapper listener object that handles it during setup, I'm not sure.

diff --git a/lightning-block-sync/src/lib.rs b/lightning-block-sync/src/lib.rs
index 72332b07..beea7c5a 100644
--- a/lightning-block-sync/src/lib.rs
+++ b/lightning-block-sync/src/lib.rs
@@ -151,11 +151,11 @@ pub struct BlockHeaderData {
 /// custom cache eviction policy. This offers flexibility to those sensitive to resource usage.
 /// Hence, there is a trade-off between a lower memory footprint and potentially increased network
 /// I/O as headers are re-fetched during fork detection.
-pub struct SpvClient<P: Poll, C: Cache, L: ChainListener> {
+pub struct SpvClient<'a, P: Poll, C: Cache> {
        chain_tip: ValidatedBlockHeader,
        chain_poller: P,
        chain_notifier: ChainNotifier<C>,
-       chain_listener: L,
+       chain_listeners: Vec<&'a mut dyn ChainListener>,
 }
 
 /// Adaptor used for notifying when blocks have been connected or disconnected from the chain.
@@ -190,9 +190,9 @@ pub async fn init_sync_listener<CL: ChainListener, B: BlockSource>(
        let old_header = block_source
                .get_header(&old_block, None).await.unwrap()
                .validate(old_block).unwrap();
-       let mut chain_poller = poll::ChainPoller::new(block_source as &mut dyn BlockSource, network);
+       let mut chain_poller = poll::ChainPoller::new(block_source, network);
        let mut chain_notifier = ChainNotifier { header_cache: UnboundedCache::new() };
-       chain_notifier.sync_listener(new_header, &old_header, &mut chain_poller, chain_listener).await.unwrap();
+       chain_notifier.sync_listener(new_header, &old_header, &mut chain_poller, &mut [chain_listener]).await.unwrap();
 }
 
 /// The `Cache` trait defines behavior for managing a block header cache, where block headers are
@@ -230,7 +230,7 @@ impl Cache for UnboundedCache {
        }
 }
 
-impl<P: Poll, C: Cache, L: ChainListener> SpvClient<P, C, L> {
+impl<'a, P: Poll, C: Cache> SpvClient<'a, P, C> {
        /// Creates a new SPV client using `chain_tip` as the best known chain tip.
        ///
        /// Subsequent calls to [`poll_best_tip`] will poll for the best chain tip using the given chain
@@ -245,10 +245,17 @@ impl<P: Poll, C: Cache, L: ChainListener> SpvClient<P, C, L> {
                chain_tip: ValidatedBlockHeader,
                chain_poller: P,
                header_cache: C,
-               chain_listener: L,
        ) -> Self {
                let chain_notifier = ChainNotifier { header_cache };
-               Self { chain_tip, chain_poller, chain_notifier, chain_listener }
+               Self { chain_tip, chain_poller, chain_notifier, chain_listeners: Vec::new() }
+       }
+
+       pub async fn add_listener(&mut self, listener: &'a mut dyn ChainListener, old_header: ValidatedBlockHeader) {
+               if &old_header.block_hash[..] == &[0; 32] { return; } // Remove?
+               if old_header.block_hash == self.chain_tip.block_hash { return; }
+
+               let new_header = self.chain_tip;
+               self.chain_notifier.sync_listener(new_header, &old_header, &mut self.chain_poller, &mut [listener]).await.unwrap();
        }
 
        /// Polls for the best tip and updates the chain listener with any connected or disconnected
@@ -277,7 +284,7 @@ impl<P: Poll, C: Cache, L: ChainListener> SpvClient<P, C, L> {
        /// Updates the chain tip, syncing the chain listener with any connected or disconnected
        /// blocks. Returns whether there were any such blocks.
        async fn update_chain_tip(&mut self, best_chain_tip: ValidatedBlockHeader) -> bool {
-               match self.chain_notifier.sync_listener(best_chain_tip, &self.chain_tip, &mut self.chain_poller, &mut self.chain_listener).await {
+               match self.chain_notifier.sync_listener(best_chain_tip, &self.chain_tip, &mut self.chain_poller, &mut self.chain_listeners).await {
                        Ok(_) => {
                                self.chain_tip = best_chain_tip;
                                true
@@ -315,12 +322,12 @@ impl<C: Cache> ChainNotifier<C> {
        /// disconnected to the fork point. Thus, this may return an `Err` that includes where the tip
        /// ended up which may not be `new_header`. Note that iff the returned `Err` contains `Some`
        /// header then the transition from `old_header` to `new_header` is valid.
-       async fn sync_listener<L: ChainListener, P: Poll>(
+       async fn sync_listener<P: Poll>(
                &mut self,
                new_header: ValidatedBlockHeader,
                old_header: &ValidatedBlockHeader,
                chain_poller: &mut P,
-               chain_listener: &mut L,
+               chain_listener: &mut [&mut dyn ChainListener],
        ) -> Result<(), (BlockSourceError, Option<ValidatedBlockHeader>)> {
                let mut events = self.find_fork(new_header, old_header, chain_poller).await.map_err(|e| (e, None))?;
 
@@ -333,7 +340,9 @@ impl<C: Cache> ChainNotifier<C> {
                                        if let Some(cached_header) = self.header_cache.remove(&header.block_hash) {
                                                assert_eq!(cached_header, *header);
                                        }
-                                       chain_listener.block_disconnected(&header.header, header.height);
+                                       for l in chain_listener.iter_mut() {
+                                               l.block_disconnected(&header.header, header.height);
+                                       }
                                        last_disconnect_tip = Some(header.header.prev_blockhash);
                                },
                                &ForkStep::ForkPoint(ref header) => {
@@ -361,7 +370,9 @@ impl<C: Cache> ChainNotifier<C> {
 
                                println!("Connecting block {}", header.block_hash);
                                self.header_cache.insert(header.block_hash, header);
-                               chain_listener.block_connected(&block, header.height);
+                               for l in chain_listener.iter_mut() {
+                                       l.block_connected(&block, header.height);
+                               }
                                new_tip = Some(header);
                        }
                }

impl Validate for Block {
type T = ValidatedBlock;

fn validate(self, block_hash: BlockHash) -> BlockSourceResult<Self::T> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to also check the PoW on the header here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I should probably add some unit tests for validation. Some cases are tested indirectly elsewhere only.

lightning-block-sync/src/poll.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/lib.rs Show resolved Hide resolved
lightning-block-sync/src/poll.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
/// blocks accordingly.
///
/// Returns the best polled chain tip relative to the previous best known tip and whether any
/// blocks were indeed connected or disconnected.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Note that the current best chain active may not be the ChainTip if any errors occurred" or something like that. but maybe we should change the api to actually return the current tip instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we made self.chain_tip available, they could compare against that, which I'm doing in the unit tests. But perhaps that wouldn't be the cleanness API. Would it be worth returning an Err to the user in that case? Right now we'll return an error when Poll::poll_chain_tip fails but not when any failures originate from update_chain_tip. Maybe to ask this a different way, does poll_best_tip need to return a bool? What is a user expected to do with it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bool was, at least at some point, used in the rpcclient crate to determine whether it should check for new ChannelManager/ChainMonitor Events. Its not the most interesting thing in the world, but its somewhat useful.

lightning-block-sync/src/poll.rs Show resolved Hide resolved
lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
@jkczyz jkczyz force-pushed the 2021-01-spv-client branch from 7942383 to 075724d Compare February 9, 2021 17:53
@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 9, 2021

Somehow missed @devrandom's comments. Will also address @TheBlueMatt's top-level comment in the next response.

@jkczyz jkczyz force-pushed the 2021-01-spv-client branch from 075724d to c12631d Compare February 9, 2021 20:23
@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 9, 2021

As for init_sync_listener, roughly what I was thinking was the below. Its not ideal - passing in a ValidatedBlockHeader instead of a BlockHash is probably impractical for users, the add_listener method really needs to return a Result and be able to indicate that it failed to find some blocks it needs to disconnect couldn't be found, and the call needs to probably not modify the cache while calling sync_listener. That said, I think the safer API around having multiple listeners (most users will have two - channel and chain monitoring) is nice. There may be some other way to capture this, maybe some wrapper listener object that handles it during setup, I'm not sure.

Agree that a safer API is ideal. Parameterizing by a single listener which may wrap multiple objects seems the safest since setup would occur before SpvClient is initialized. It would also users to determine any ordering guarantees.

The more difficult issue is when needing to initialize multiple ChannelMonitors on setup before passing them over to ChainMonitor. Here, each monitor may be at a different block hash. Calling sync_listener on each would work but leads to duplicate fetches for headers along the common path. Reusing the cache would help here, but re-fetching common blocks couldn't be avoided.

Some sort of wrapper approach that you suggested may work. The issue that I see is determining which block hash across all ChannelMonitors is the earliest in the chain without knowing the height. Hmmm... then again, we should have the height once we initially fetch the header within init_sync_listener. So we could get them to a common ancestor and then proceed to sync in unison going forward.

@TheBlueMatt
Copy link
Collaborator

Right, @jkczyz, I didnt intend to imply I'm married to the above design, only suggested it as a sample of could be accomplished with some kind of safe "here's a bucket of things, go figure out how to get them on the same chain tip" API :).

@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 11, 2021

Right, @jkczyz, I didnt intend to imply I'm married to the above design, only suggested it as a sample of could be accomplished with some kind of safe "here's a bucket of things, go figure out how to get them on the same chain tip" API :).

Let me know what you think of the new API in 99881bd. I'm thinking about updating SpvClient to take a ChainNotifier rather than a Cache so the one passed to sync_listeners could be re-used. Either way I should unify the APIs to take one or the other.

Another concern that came out of this is being able to disconnect stale blocks upon restart. Seems either the source needs to maintain these or we would need to persist the header cache across restarts.

@TheBlueMatt
Copy link
Collaborator

Let me know what you think of the new API in 99881bd.

Hmm, how does one initialize an SpvClient after calling sync_listeners - you get the listeners synced to a common ancestor, but does it tell you what the ancestor is, and how do you initialize an SpvClient to start connecting blocks from that ancestor?

I'm thinking about updating SpvClient to take a ChainNotifier rather than a Cache so the one passed to sync_listeners could be re-used. Either way I should unify the APIs to take one or the other.

Either is fine to me.

Another concern that came out of this is being able to disconnect stale blocks upon restart. Seems either the source needs to maintain these or we would need to persist the header cache across restarts.

Right, I think this is more of something we punt to clients - for the Rest and RPC clients it should be no issue as long as you point them to the same bitcoind as was running previously, but this should be documented. No issue in making the cache serializable, but I feel like its up to the user to decide how they want to tackle it.

@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 11, 2021

Let me know what you think of the new API in 99881bd.

Hmm, how does one initialize an SpvClient after calling sync_listeners - you get the listeners synced to a common ancestor, but does it tell you what the ancestor is, and how do you initialize an SpvClient to start connecting blocks from that ancestor?

They are all synced to the passed in new_block hash, so the user would initialize SpvClient with that. The common ancestor is just an intermediary step that allows sync_listeners to know how far back it needs to connect blocks.

BTW, given that SpvClient takes a ValidatedBlockHeader rather than a BlockHash, I may settle on using that in both interfaces. Let me know if you have any concerns. Earlier you had said:

passing in a ValidatedBlockHeader instead of a BlockHash is probably impractical for users

But I'm not sure that I follow why it is impractical given sync_listeners does that immediately. Open to using BlockHash instead and hiding the look-up in SpvClient if you prefer. It would make user code a little simpler I suppose.

@TheBlueMatt
Copy link
Collaborator

They are all synced to the passed in new_block hash, so the user would initialize SpvClient with that. The common ancestor is just an intermediary step that allows sync_listeners to know how far back it needs to connect blocks.

Ah, right, it just still feels really awkward given we could create a type-safe API which enforces that a user cannot end up passing in a different common block hash than what they give to SpvClient to start.

BTW, given that SpvClient takes a ValidatedBlockHeader rather than a BlockHash, I may settle on using that in both interfaces. Let me know if you have any concerns. Earlier you had said:

I find it really awkward because the API they have to deserialize objects returns a blockhash, but then we expect users to call something to fetch the header and then validate it, when we could otherwise do that for them.

@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 12, 2021

Ah, right, it just still feels really awkward given we could create a type-safe API which enforces that a user cannot end up passing in a different common block hash than what they give to SpvClient to start.

Yeah, enforcing a common tip -- rather than common ancestor -- through the API is a different sort of problem. What makes this a bit tricky is that there are two sets of listeners.

  1. During setup: ChannelManager and a ChannelMonitor per channel
  2. While polling: ChannelManager and ChainMonitor

Further, (1) requires using a single trusted BlockSource while (2) allows multiple source behind a Poll implementation. Note also that fetching an arbitrary header can be done through BlockSource but not Poll.

So my solution was addressing (1). I could add a utility function that glues (1) and (2) together. It may be a common -- albeit very specific -- sort of use case, so I was hoping to leave it out of lightning-block-sync in favor of putting it in the sample node. Then there wouldn't be any dependencies on specific listeners in this crate.

I find it really awkward because the API they have to deserialize objects returns a blockhash, but then we expect users to call something to fetch the header and then validate it, when we could otherwise do that for them.

Gotcha. I'd imagine the utility mentioned above would manage this for them.

@TheBlueMatt
Copy link
Collaborator

Yeah, enforcing a common tip -- rather than common ancestor -- through the API is a different sort of problem. What makes this a bit tricky is that there are two sets of listeners.

Oh, right, duh, I'd totally forgotten about this, that changes things. It does seem like an API that doesn't include knowledge of the specific listeners is ideal, but I wonder if we can still get a similar result. What do you think about passing block hashes into the utility method and then having it return a ValidatedBlockHeader, which users then pass into the construction of the SpvClient to tell it where to start.

@jkczyz jkczyz force-pushed the 2021-01-spv-client branch from 99881bd to 136f36d Compare February 12, 2021 22:06
@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 12, 2021

Updated as discussed to return the ValidatedBlockHeader. Also, cleaned up the interfaces a bit to accept a reference to a Cache so that users can decide on whether it should be persisted.

use bitcoin::hash_types::BlockHash;
use bitcoin::network::constants::Network;

/// Performs a one-time sync of chain listeners using a single *trusted* block source, bringing each
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to specify what we mean by "trusted" here? (ie that they could get us stuck out on a fork we cannot disconnect from, though maybe thats not the case with the cache now?) We do validate the headers given from the source, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has indeed shifted since the introduction of the cache and enforced validation. Are there other reasons why a trusted source is needed? I suppose an untrusted source could not provide enough blocks to get all listeners in sync.

@TheBlueMatt
Copy link
Collaborator

Updated as discussed to return the ValidatedBlockHeader.

The API is looking good. I'll do a more careful review of the full code over the weekend or next week.

@jkczyz jkczyz force-pushed the 2021-01-spv-client branch 2 times, most recently from c94c272 to f372c16 Compare February 13, 2021 17:59
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only doc comments, I'm happy!

lightning-block-sync/src/poll.rs Show resolved Hide resolved
lightning-block-sync/src/poll.rs Show resolved Hide resolved
lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
///
/// Useful during startup to bring the [`ChannelManager`] and each [`ChannelMonitor`] in sync before
/// switching to [`SpvClient`].
///
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a doc example here that shows how to read 2 channelmonitors and a channelmanager from disk and create an SpvClient from them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent idea! This may get a little verbose as I'll need to implement ChainListener a few times. Do you think it is worth adding these implementations to the crate? I left them off from the draft but I could see them being useful to others since they would be commonly used adapters.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I presume there's little point not having them given the crate already depends on lightning. Feel free to do it in the next PR and add docs then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, turns out the crate dependency was only needed for these implementations. It was an oversight to include it in this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I mean I guess we could feature-gate it if we want in the future, but for now I see little reason to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a stab at writing an example, which turned out to be a good exercise in seeing how suitable the interface works in practice. See 1d2f5b3.

I ended up changing ChainListener's methods to take &self instead of &mut self. This was probably not entirely necessary but did make the code a little cleaner as otherwise the mutable borrows needed to be properly scoped. Definitely open to changing this if you have any strong opinions.

This did require implementing ChainListener for RefCell<ChannelMonitor> since it needs to call methods that take &mut self. I don't have a firm grasp of when trait methods should use borrows vs mutable borrows, but this seems to give some advice:

https://doc.rust-lang.org/std/cell/index.html#when-to-choose-interior-mutability

Ideally, we are at least consist in how we decide to design these interfaces that goes beyond "placates the borrow checker". :P

use bitcoin::network::constants::Network;

/// Performs a one-time sync of chain listeners using a single *trusted* block source, bringing each
/// listener's view of the chain from its paired block hash to `block_source`'s best chain tip.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention that, for users who wish to do less work during initialization, it may make sense to only bring them in sync, and not necessarily to the current tip, but that that is not yet supported by ChannelManager (we need a mode where we only accept incoming payments, and never make outgoing payments/forward).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently isn't supported by the utility either since it polls the source for the best tip.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was just thinking outloud more than anything. Its something we should eventually support, though. Will open an issue.

@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 22, 2021

Rebased

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I don't think we need to fall back to RefCell, we can take &mut self to a non-mut reference and it should work fine.

impl X for &Y { fn (&mut self) { /* Y is a non-mut ref, but we can modify where it points to */ }
impl X for &mut Y { fn (&mut self) { /* Now Y is mut! */ }

}
}

impl<S, C: Deref, T: Deref, F: Deref, L: Deref, P: Deref> ChainListener for &ChainMonitor<S, C, T, F, L, P>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and above) could be Z : Deref<Target=ChainMonitor<...>> instead of &ChainMonitor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented for Z? When attempting this for these, I get:

error[E0119]: conflicting implementations of trait `ChainListener`:`

Were you able to get something to compile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More specifically, after temporarily removing the other two implementations used in the example, I get this error:

error[E0119]: conflicting implementations of trait `ChainListener`:
   --> lightning-block-sync/src/init.rs:246:1
    |
226 | / impl<T, S, M: Deref, B: Deref, K: Deref, F: Deref, L: Deref> ChainListener for T
227 | | where
228 | |     T: Deref<Target = ChannelManager<S, M, B, K, F, L>>,
229 | |     S: keysinterface::Sign,
...   |
243 | |     }
244 | | }
    | |_- first implementation here
245 | 
246 | / impl<T, S, C: Deref, B: Deref, F: Deref, L: Deref, P: Deref> ChainListener for T
247 | | where
248 | |     T: Deref<Target = ChainMonitor<S, C, B, F, L, P>>,
249 | |     S: keysinterface::Sign,
...   |
263 | |     }
264 | | }
    | |_^ conflicting implementation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I didn't try to implement this suggestion but that makes sense - Trait<A=B> is the same type as Trait<A=C> (but Trait<A> is not the same as Trait<B>). You could, however, implement it for ChainMonitor and ChannelManager and then also for Deref<Target=B: ChainListener>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! That avoids the conflict but note that when implementing for Deref I get this warning:

warning: trait objects without an explicit `dyn` are deprecated
   --> lightning-block-sync/src/init.rs:281:42
    |
281 | impl<T: ChainListener> ChainListener for Deref<Target = T> {
    |                                          ^^^^^^^^^^^^^^^^^ help: use `dyn`: `dyn Deref<Target = T>`
    |
    = note: `#[warn(bare_trait_objects)]` on by default

Which I think we want to avoid doing? If I introduce another type parameter for Deref that brings me back to having conflicting implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I was referring to though when I said it brings me back to having a conflicting implementation.

error[E0119]: conflicting implementations of trait `ChainListener` for type `lightning::ln::channelmanager::ChannelManager<_, _, _, _, _, _>`:
   --> lightning-block-sync/src/init.rs:281:1
    |
243 | / impl<S, M: Deref, B: Deref, K: Deref, F: Deref, L: Deref> ChainListener for ChannelManager<S, M, B, K, F, L>
244 | | where
245 | |     S: keysinterface::Sign,
246 | |     M::Target: chain::Watch<S>,
...   |
259 | |     }
260 | | }
    | |_- first implementation here
...
281 |   impl<T: ChainListener, U: Deref<Target = T>> ChainListener for U {
    |   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `lightning::ln::channelmanager::ChannelManager<_, _, _, _, _, _>`
    |
    = note: upstream crates may add a new impl of trait `std::ops::Deref` for type `lightning::ln::channelmanager::ChannelManager<_, _, _, _, _, _>` in future versions

error: aborting due to previous error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... so if I'm reading the error correctly, I think we can move ChainListener to the lightning crate and define the relevant listeners there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Yes, I think that would fix the issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, but what exactly do you mean by defining the relevant listeners there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More precisely, implementing ChainListener for types in the lightning crate (e.g., ChannelManager).

lightning-block-sync/src/init.rs Outdated Show resolved Hide resolved
Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a first parse to get in mind the changes, still have to parse poll.rs
Great work, I'll try to integrate it to the sample soon :)

lightning-block-sync/src/init.rs Show resolved Hide resolved
lightning-block-sync/src/init.rs Show resolved Hide resolved
lightning-block-sync/src/init.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/init.rs Outdated Show resolved Hide resolved
/// Block headers for the best chain are maintained in the parameterized cache, allowing for a
/// custom cache eviction policy. This offers flexibility to those sensitive to resource usage.
/// Hence, there is a trade-off between a lower memory footprint and potentially increased network
/// I/O as headers are re-fetched during fork detection.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Note that re-fetching headers from a given height is likely to constitute an observable fingerprint and as such a privacy leak"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think given that these examples are meant to serve as minimum viable demo clients, that is something that can be documented and annotated with a warning somewhere – as long as we remember to do it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I tend to agree with @arik-so. Such warnings dispersed throughout the documentation may not be the best way to surface this. Additionally, I would be hesitant to add such a message without also stating what course of action a user should take to avoid the problem.

To the actual point, why is this a privacy leak? The headers are fetched when a new best block is found and when identifying forks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say you're a lightclient shuting down at block 100, sheltered behind Tor addr XYZ. I'm your BIP157 server and logs the fact you're not downloading anymore headers. Few hours, later you're back with a new tor address ABC, but requests back headers 99, 98 to handle the reorg which did happen meantime. I think it's a hint XYZ == ABC ?

Bookmarking in #565, we already have an issue for this.

///
/// The client is parameterized by a chain poller which is responsible for polling one or more block
/// sources for the best chain tip. During this process it detects any chain forks, determines which
/// constitutes the best chain, and updates the listener accordingly with any blocks that were
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should say "most-height' instead of best. AFAICT, we don't even do stateless block headers verification ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. We are checking the chainwork in ChainPoller. Am I misunderstanding your comment?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Best chain tip" is really unclear and doesn't say if it's most-work or not, semi-validated or not. If we verify chainwork, I would say "most-work chain tip".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "best" is used many places throughout this PR and including in identifiers. It would better to define what we mean by "best" in one place if there is ambiguity, IMHO.

/// forks reliably, caches may be entirely unnecessary.
///
/// [`ChainNotifier`]: struct.ChainNotifier.html
pub trait Cache {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we parameterized cache size by default to a week of blocks ? IIRC our onchain reorg handling stuff as a reorg security of only 6 blocks...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can chose how to implement their cache, including keeping the entire header chain if they wish. I'm not sure how we would enforce such parameterization.

Copy link

@ariard ariard Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and I'm not sure if it's really worthy for a lightclient to keep the entire header chain. We might give perf recommendations later when we know better.

lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
/// Finds the first common ancestor between `new_header` and `old_header`, disconnecting blocks
/// from `old_header` to get to that point and then connecting blocks until `new_header`.
///
/// Validates headers along the transition path, but doesn't fetch blocks until the chain is
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should say consistency-check to dissociate from valid-headers according to a full node verification logic ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this given "validate" is a fairly general term and used throughout poll, including by validate_pow. Any suggestions for improving Validate's documentation would be useful and could be linked from here, though.

Copy link
Contributor

@arik-so arik-so left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple nits. Looks great to me, all ongoing discussions notwithstanding, especially given it's a sample module.

lightning-block-sync/src/lib.rs Outdated Show resolved Hide resolved
}
}

impl<S, B: Deref, F: Deref, L: Deref> ChainListener for (RefCell<ChannelMonitor<S>>, B, F, L)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it might look a bit nicer to just declare a type for (RefCell, …)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming suggestions? :)

}
}

impl<S, C: Deref, T: Deref, F: Deref, L: Deref, P: Deref> ChainListener for &ChainMonitor<S, C, T, F, L, P>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, but what exactly do you mean by defining the relevant listeners there?

/// Block headers for the best chain are maintained in the parameterized cache, allowing for a
/// custom cache eviction policy. This offers flexibility to those sensitive to resource usage.
/// Hence, there is a trade-off between a lower memory footprint and potentially increased network
/// I/O as headers are re-fetched during fork detection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think given that these examples are meant to serve as minimum viable demo clients, that is something that can be documented and annotated with a warning somewhere – as long as we remember to do it

lightning-block-sync/src/init.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/init.rs Outdated Show resolved Hide resolved
lightning-block-sync/src/poll.rs Show resolved Hide resolved
@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 23, 2021

Hmmm, I don't think we need to fall back to RefCell, we can take &mut self to a non-mut reference and it should work fine.

impl X for &Y { fn (&mut self) { /* Y is a non-mut ref, but we can modify where it points to */ }
impl X for &mut Y { fn (&mut self) { /* Now Y is mut! */ }

I didn't quite follow what you meant by this. The functions that I need to implement take &self, not &mut self. So there is a compilation error with something like this:

impl<Signer: Sign, T: Deref, F: Deref, L: Deref> ChainListener for (&mut ChannelMonitor<Signer>, T, F, L)
where
       T::Target: BroadcasterInterface,
       F::Target: FeeEstimator,
       L::Target: Logger,
{
       fn block_connected(&self, block: &Block, height: u32) {
               let txdata: Vec<_> = block.txdata.iter().enumerate().collect();
               self.0.block_connected(&block.header, &txdata, height, &*self.1, &*self.2, &*self.3);
       }

       fn block_disconnected(&self, header: &BlockHeader, height: u32) {
               self.0.block_disconnected(header, height, &*self.1, &*self.2, &*self.3);
       }
}
error[E0596]: cannot borrow `*self.0` as mutable, as it is behind a `&` reference
    --> lightning/src/chain/channelmonitor.rs:2310:3
     |
2308 |     fn block_connected(&self, block: &Block, height: u32) {
     |                        ----- help: consider changing this to be a mutable reference: `&mut self`
2309 |         let txdata: Vec<_> = block.txdata.iter().enumerate().collect();
2310 |         self.0.block_connected(&block.header, &txdata, height, &*self.1, &*self.2, &*self.3);
     |         ^^^^^^ `self` is a `&` reference, so the data it refers to cannot be borrowed as mutable

error[E0596]: cannot borrow `*self.0` as mutable, as it is behind a `&` reference
    --> lightning/src/chain/channelmonitor.rs:2314:3
     |
2313 |     fn block_disconnected(&self, header: &BlockHeader, height: u32) {
     |                           ----- help: consider changing this to be a mutable reference: `&mut self`
2314 |         self.0.block_disconnected(header, height, &*self.1, &*self.2, &*self.3);
     |         ^^^^^^ `self` is a `&` reference, so the data it refers to cannot be borrowed as mutable

Were you suggesting that there is a workaround not involving the internal mutability provided by RefCell?

@TheBlueMatt
Copy link
Collaborator

I didn't quite follow what you meant by this. The functions that I need to implement take &self, not &mut self. So there is a compilation error with something like this:

Right, I was suggesting swapping the functions back to take &mut self. If you do that, its fine to implement the trait on a &Struct, which means the &mut self is now a mutable reference to a regular reference, so multiple access of the &Struct is allowed.

@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 23, 2021

I didn't quite follow what you meant by this. The functions that I need to implement take &self, not &mut self. So there is a compilation error with something like this:

Right, I was suggesting swapping the functions back to take &mut self. If you do that, its fine to implement the trait on a &Struct, which means the &mut self is now a mutable reference to a regular reference, so multiple access of the &Struct is allowed.

Gotcha. My larger question was around interface design. That is, when should a trait's methods take &mut self. I find the answer "whenever an implementation needs to mutate" lacking as presumably there may always be some implementation that requires mutation.

As a different example, a look-up method for a cache shouldn't take &mut self even though an LRU cache implementation would require it. Thus, it would need to rely on internal mutability to achieve its goals.

For listening, it seems like a grey area at first. A listener may be passive and simply log events or may need to modify some state based on events. For the passive logging example, the logger may itself need to modify its own internal state. But would a logger interface take &mut self? I'd wager not. Further, more generally, whatever is notifying the listeners would need to either own or hold a reference to them, making the interface dictate which type of reference to hold.

So I tend towards choosing &self here, but would be interested in any counterpoints to this specific example and to the larger question.

(Note that both ChannelManager and ChainMonitor use internal mutability which makes it possible to take &self for them.)

@TheBlueMatt
Copy link
Collaborator

Right, I don't think Rust has a really good answer for this - interior mutability means you lose one of rustc's greatest safety features (not to mention runtime checking costs), while, I agree, making everything mut and wrapping in a Deref to avoid needing mut is another layer of indirection that makes the API hard to read (not to mention runtime cost of a pointer indirection in some cases). You're sort of damned if you do, damned if you don't here, in general.

In this specific case, I'd vote for the &mut option and implementing it on Deref here because we expect users to primarily work with Deref things anyway - Arcs to manager and chainmonitors in the multi-threaded case, and in the init case we enforce single ownership of the ChannelMonitor directly in the compiler, which is nice.

@jkczyz jkczyz force-pushed the 2021-01-spv-client branch from 5e4b218 to da79fb0 Compare February 24, 2021 03:41
@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 24, 2021

In this specific case, I'd vote for the &mut option and implementing it on Deref here because we expect users to primarily work with Deref things anyway - Arcs to manager and chainmonitors in the multi-threaded case, and in the init case we enforce single ownership of the ChannelMonitor directly in the compiler, which is nice.

Ok, in order to get everything working with Deref, I needed to parameterize SpvClient with L: Deref where L::Target: ChainListener. However, IIUC, this means I must us RefCell for ChannelMonitor because making ChainListener methods take &mut self would require changing these to use DerefMut, which Arc does not implement.

I think I touched on most of this offline, but let me know if I'm off on anything.

Also, I ended up pushing the change through to ChainNotifier as it made it a bit cleaner. Renamed ChainListener to Listen now that it is under lightning::chain, too.

@TheBlueMatt
Copy link
Collaborator

Ok, in order to get everything working with Deref, I needed to parameterize SpvClient with L: Deref where L::Target: ChainListener

Wait, why? Can't we switch to SpvClient using just T: ChainListener and then implement Chainlistener for &TheThings?

@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 24, 2021

Ok, in order to get everything working with Deref, I needed to parameterize SpvClient with L: Deref where L::Target: ChainListener

Wait, why? Can't we switch to SpvClient using just T: ChainListener and then implement Chainlistener for &TheThings?

That will work for references but breaks on any other types that implement Deref like Arc. I could implement it for Arc, too, but why not parameterize SpvClient with Deref just as we do for types like ChannelManager? Then it would support any type that implements Deref.

Copy link
Contributor Author

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will get to the remaining comments in the morning.

lightning-block-sync/src/init.rs Show resolved Hide resolved
lightning-block-sync/src/init.rs Show resolved Hide resolved
lightning-block-sync/src/poll.rs Show resolved Hide resolved
@TheBlueMatt
Copy link
Collaborator

That will work for references but breaks on any other types that implement Deref like Arc. I could implement it for Arc, too, but why not parameterize SpvClient with Deref just as we do for types like ChannelManager? Then it would support any type that implements Deref.

Oops, sorry, I meant implement Listener for Deref types.

@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 24, 2021

Oops, sorry, I meant implement Listener for Deref types.

I think I'm already doing so here:

e012c4c#diff-6dfc691076e612ec16f5f292503bf11fe71edf95700ee7e9a8a67b19ce6719e3R139

But that gives me compilation errors when passing references. If I also implement for &T, the example fails to compile when modifying it to use Rc. I might be missing something, so if you want to play around with it I made a branch with the change: jkczyz@1bbfe12.

@jkczyz jkczyz force-pushed the 2021-01-spv-client branch from da79fb0 to c28cb71 Compare February 24, 2021 21:14
@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 24, 2021

Only a couple nits. Looks great to me, all ongoing discussions notwithstanding, especially given it's a sample module.

Note that this is intended to be used outside of a sample (but not required), though the various traits would likely be customized to something more suitable for the use case.

Comment on lines +65 to +78
/// let (monitor_block_hash, mut monitor) = <(BlockHash, ChannelMonitor<S>)>::read(
/// &mut Cursor::new(&serialized_monitor), keys_manager).unwrap();
///
/// // Read the channel manager paired with the block hash when it was persisted.
/// let serialized_manager = "...";
/// let (manager_block_hash, mut manager) = {
/// let read_args = ChannelManagerReadArgs::new(
/// keys_manager,
/// fee_estimator,
/// chain_monitor,
/// tx_broadcaster,
/// logger,
/// config,
/// vec![&mut monitor],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm @TheBlueMatt is it the case that ChannelMonitors need to be synced to chain before being handed to the ChannelManager on deserialization? I thought it was but maybe wrong...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so the docs for ChannelManagerReadArgs seem to say you should reconnect after deserializing the ChannelManager (and I tend to trust docs written at the time the code was written more than my recollection). If I had to guess, I'd think it would work ok either way, you just end up with events already ready to be handled by the manager before its even ready to go, but the only thing its supposed to do that requires mut is broadcast using the monitor iff the monitor is out of sync with the updates the manager sent, and the monitor deciding to close itself based on blockchain data is not such an update.

///
/// Validates headers along the transition path, but doesn't fetch blocks until the chain is
/// disconnected to the fork point. Thus, this may return an `Err` that includes where the tip
/// ended up which may not be `new_header`. Note that if the returned `Err` contains `Some`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the previous use of iff instead of if was not a typo - the inverse is also true, and important to note.

Copy link
Contributor Author

@jkczyz jkczyz Feb 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/inverse/converse :)

Right, I mistakenly interpreted this to include the Ok case. Switching back.

/// required to be built in terms of it to ensure chain data validity.
///
/// [`ChainPoller`]: ../struct.ChainPoller.html
pub trait Poll {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have other ideas for trait name ? It's conflicting with return type of poll_recv used in the sample, see https://docs.rs/tokio/1.2.0/tokio/sync/mpsc/struct.Receiver.html.

Or we can just alias the type ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you fully-qualify it as std::task::Poll instead? It's a pretty short name even when fully qualified. I'd imagine you could break the code up reasonably where they wouldn't need to be used in the same file, as well.

SPV clients need to poll one or more block sources for the best chain
tip and to retrieve related chain data. The Poll trait serves as an
adaptor interface for BlockSource. Implementations may define an
appropriate polling strategy.
ChainPoller defines a strategy for polling a single BlockSource. It
handles validating chain data returned from the BlockSource. Thus, other
implementations of Poll must be defined in terms of ChainPoller.
Add an interface for being notified of block connected and disconnected
events, along with a notifier for generating such events. Used while
polling block sources for a new tip in order to feed these events into
ChannelManager and ChainMonitor.
Adds a lightweight client for polling one or more block sources for the
best chain tip. Notifies listeners of blocks connected or disconnected
since the last poll. Useful for keeping a Lightning node in sync with
the chain.
Add a utility for syncing a set of chain listeners to a common chain
tip. Required to use before creating an SpvClient when the chain
listener used with the client is actually a set of listeners each of
which may have had left off at a different block. This would occur when
the listeners had been persisted individually at different frequencies
(e.g., a ChainMonitor's individual ChannelMonitors).
@jkczyz jkczyz force-pushed the 2021-01-spv-client branch from 0ed242e to 8bfdfdc Compare February 26, 2021 06:55
@jkczyz
Copy link
Contributor Author

jkczyz commented Feb 26, 2021

Squashed!

@TheBlueMatt TheBlueMatt merged commit e77b160 into lightningdevkit:main Feb 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants