
Initial server commit. #1

Closed
wants to merge 90 commits into from

Conversation


@arik-so arik-so commented May 20, 2022

No description provided.

Arik Sosman and others added 30 commits January 11, 2022 20:09
…es, efficacy measurement and response length visibility
* add channel update delta tallies, and omit updates that have no delta

* Update the server full gossip data once the downloader is caught up with the gossip.

* Cache the warp response type directly.
…hether intermediate updates should be included, and preparing an easily traversable serialization set.
Conflicts:
	src/download.rs
	src/main.rs
	src/server.rs
Contributor

@TheBlueMatt TheBlueMatt left a comment

Two more notes about being robust against a slow db

impl GossipPersister {
	pub fn new(server_sync_completion_sender: mpsc::Sender<()>, network_graph: Arc<NetworkGraph<Arc<TestLogger>>>) -> Self {
		let (gossip_persistence_sender, gossip_persistence_receiver) =
			mpsc::channel::<DetectedGossipMessage>(10000);
Contributor

This should probably not be so huge - if the DB gets behind we'll end up creating snapshots that are missing db entries that are still in the queue.

Contributor Author

We won't actually. The snapshot generation is triggered by the persistence tasks being completed, not the gossip getting caught up.

Contributor

But isn't the snapshot generation based on an entry in the queue? There can still be updates in the queue after that entry.

Contributor Author

Yes, but there will always be updates in the queue after that entry, because you're going to keep receiving new gossip. The criterion is "sufficiently caught up"; considering that snapshot generation takes a couple of minutes to calculate, the odds of not receiving any gossip during that time are vanishingly small. We can't stop and restart snapshot generation every time we receive a new gossip message, right? We'd never produce any snapshots.

let gossip_message = GossipMessage::ChannelAnnouncement(msg.clone());
let detected_gossip_message = DetectedGossipMessage {
	message: gossip_message,
	timestamp_seen: timestamp_seen as u32,
Contributor

We should create the timestamp when we insert, or we get a race condition: something can sit in the queue with a timestamp from a second ago while we create a snapshot with "now". I think we should instead rely on the DB to add the timestamp.

Contributor Author

One issue with letting Postgres generate the timestamp is that the DB's clock may not be in sync with the clock the Rust server is running on (thinking of hosted environments here). I think moving the timestamp to the persistence job, and allowing for sub-second deltas, is the prudent way forward.
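A sketch of that direction, assuming a tokio_postgres client and the channel_updates column names used elsewhere in this PR (the exact schema may differ): the persistence job creates the seen timestamp itself at insert time and binds it as a query parameter, rather than relying on a database-side default.

use std::time::SystemTime;
use tokio_postgres::Client;

// Sketch only: the persistence task stamps `seen` with the server's clock at
// insert time and passes it as a bind parameter, avoiding any skew between
// the Rust server's clock and the database's clock.
async fn insert_channel_update(client: &Client, short_channel_id: i64, blob_signed: &[u8]) -> Result<(), tokio_postgres::Error> {
	let seen = SystemTime::now();
	client.execute(
		"INSERT INTO channel_updates (short_channel_id, seen, blob_signed) VALUES ($1, $2, $3)",
		&[&short_channel_id, &seen, &blob_signed],
	).await?;
	Ok(())
}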

Contributor

I think that's fine? As long as we do the filtering in SQL itself it'll work itself out.

timestamp_seen: timestamp_seen as u32,
};
let sender = self.sender.clone();
tokio::spawn(async move {
Contributor

Rather than spawning here, and in order to block if we get too far ahead of the DB, we should first do a sender.try_send and then, if that fails, do a tokio::block_on.
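A minimal sketch of that backpressure pattern, reusing the DetectedGossipMessage name from the snippet above (illustrative only, not the PR's eventual implementation): try a non-blocking send first, and only fall back to blocking when the queue is full.

use tokio::sync::mpsc::{error::TrySendError, Sender};

// Hypothetical stand-in for the PR's DetectedGossipMessage type.
struct DetectedGossipMessage { /* fields elided */ }

// Sketch only: enqueue without spawning; if the persister's queue is full,
// block this worker thread until space frees up. block_in_place requires
// tokio's multi-threaded runtime.
fn enqueue_with_backpressure(sender: &Sender<DetectedGossipMessage>, msg: DetectedGossipMessage) {
	match sender.try_send(msg) {
		Ok(()) => {},
		Err(TrySendError::Full(msg)) => {
			tokio::task::block_in_place(|| {
				tokio::runtime::Handle::current().block_on(async {
					sender.send(msg).await.expect("persister receiver dropped");
				});
			});
		},
		Err(TrySendError::Closed(_)) => panic!("persister receiver dropped"),
	}
}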

Contributor Author

Is that still applicable considering #1 (comment)? Also, keep in mind that large divergences are only possible during initial sync, where the snapshot process is only triggered after the persistence is caught up with the downloaded gossip.

src/snapshot.rs Outdated
	fs::remove_file(&symlink_path).unwrap();
}
println!("Recreating symlink: {} -> {}", symlink_path, snapshot_path);
symlink(&canonical_snapshot_path, &symlink_path).unwrap();
Contributor

Can we make these symlinks relative somehow? I want to be able to rsync the two folders around and have the symlinks still be valid.

Contributor Author

They were originally relative, but that wasn't working on my machine. Would you mind experimenting here a little bit?
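One direction to experiment with (a sketch only; it assumes the snapshot directory and the symlink directory are sibling folders named "snapshots" and "symlinks", which may not match the server's actual layout) is to express the link target relative to the symlink's own directory, so rsyncing both folders keeps the links valid:

use std::os::unix::fs::symlink;
use std::path::Path;

// Sketch only: create a symlink whose target is a relative path, so the pair
// of directories can be copied elsewhere together and stay consistent.
fn create_relative_symlink(snapshot_file_name: &str, symlink_dir: &Path) -> std::io::Result<()> {
	// e.g. "../snapshots/<snapshot_file_name>" relative to the symlink directory
	let relative_target = Path::new("../snapshots").join(snapshot_file_name);
	let link_path = symlink_dir.join(snapshot_file_name);
	if link_path.exists() {
		std::fs::remove_file(&link_path)?;
	}
	symlink(&relative_target, &link_path)
}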

@arik-so
Contributor Author

arik-so commented Aug 21, 2022

#1 (comment)

Will do.

@TheBlueMatt
Contributor

TheBlueMatt commented Aug 21, 2022

Things left, I think:

  • Initial server commit. #1 (comment)
  • blocking when db gets behind
  • moving dates into SQL instead of doing them locally
  • CI (incl MSRV)
  • relative symlinks (and always generating, say, 10k symlinks?)
  • peer reconnection
  • better detection of "we're synced" on multiple peers

tokio::spawn(async move {
	disconnection_future.await;
	eprintln!("Disconnected from peer {}@{}", current_peer.0.to_hex(), current_peer.1.to_string());
	monitor_peer_connection(current_peer.clone(), peer_manager_clone);
Contributor

This will just silently stop trying to reconnect if the peer reboots and is refusing connections for a while, no? We should retry always.

Contributor Author

So we actually have a bigger issue: now that I've made monitor_peer_connection async, I can't await it without spawning on the executor, because the compiler needs to recursively type-check it and detects a cycle.

Contributor Author

error[E0391]: cycle detected when unsafety-checking tracking::monitor_peer_connection
--> src/tracking.rs:158:1
|
158 | async fn monitor_peer_connection(current_peer: (PublicKey, SocketAddr), peer_manager: GossipPeerManager) -> bool {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: ...which requires building MIR for tracking::monitor_peer_connection...
--> src/tracking.rs:158:1
|
158 | async fn monitor_peer_connection(current_peer: (PublicKey, SocketAddr), peer_manager: GossipPeerManager) -> bool {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: ...which requires type-checking tracking::monitor_peer_connection...
--> src/tracking.rs:171:3
|
171 | tokio::spawn(async move {
| ^^^^^^^^^^^^
= note: ...which requires evaluating trait selection obligation for<'r> {std::future::ResumeTy, impl futures::Future, (), &'r (bitcoin::secp256k1::PublicKey, std::net::SocketAddr), (bitcoin::secp256k1::PublicKey, std::net::SocketAddr), std::sync::Arc<lightning::ln::peer_handler::PeerManager<lightning_net_tokio::SocketDescriptor, std::sync::Arc<lightning::ln::peer_handler::ErroringMessageHandler>, std::sync::Arc<downloader::GossipRouter>, std::sync::Arc<types::TestLogger>, std::sync::Arc<lightning::ln::peer_handler::IgnoringMessageHandler>>>, impl futures::Future}: std::marker::Send...
note: ...which requires computing type of tracking::monitor_peer_connection::{opaque#0}...
--> src/tracking.rs:158:109
|
158 | async fn monitor_peer_connection(current_peer: (PublicKey, SocketAddr), peer_manager: GossipPeerManager) -> bool {
| ^^^^
= note: ...which again requires unsafety-checking tracking::monitor_peer_connection, completing the cycle
note: cycle used when computing type of <impl at src/lib.rs:57:1: 132:2>::start_sync::{opaque#0}
--> src/lib.rs:86:33
|
86 | pub async fn start_sync(&self) {
| ^

Contributor Author

Matt, I really need your help here.
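(For reference, one common workaround for this kind of cycle in a recursive async fn is to return a boxed future, so the compiler does not have to compute a self-referential opaque type. A generic sketch, not the code in this PR:)

use std::future::Future;
use std::pin::Pin;

// Sketch only: recursing through a Box<dyn Future> sidesteps the infinitely
// recursive opaque type an async fn would otherwise produce.
fn monitor_with_retries(attempts_left: u32) -> Pin<Box<dyn Future<Output = ()> + Send>> {
	Box::pin(async move {
		if attempts_left == 0 {
			return;
		}
		// ... connect, await the disconnection future, etc. ...
		monitor_with_retries(attempts_left - 1).await;
	})
}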

Contributor

Oh, also, rustc complains here now:

Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]: warning: unused implementer of `futures::Future` that must be used
Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]:    --> src/tracking.rs:176:4
Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]:     |
Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]: 176 |             monitor_peer_connection(current_peer.clone(), peer_manager_clone);
Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]:     |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]:     |
Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]:     = note: `#[warn(unused_must_use)]` on by default
Aug 21 23:54:41 ldk-gossip-sync-server.bitcoin.ninja run.sh[2348]:     = note: futures do nothing unless you `.await` or poll them

Contributor Author

OK, I was wondering why you weren't responding to this thread, where I had already asked about that like two hours ago: #1 (comment)

But it turns out GitHub didn't upload my comments, sorry.

Contributor

Heh, okay, just leave this one and fix the other stuff; we'll land this and I'll do it as a follow-up PR.

src/tracking.rs Outdated
tokio::spawn(async move {
	disconnection_future.await;
	eprintln!("Disconnected from peer {}@{}", current_peer.0.to_hex(), current_peer.1.to_string());
	// TODO: figure out how to await this
Contributor

You can just have the failure case, in the else/Failed block, retry.

Contributor Author

It would be nice not to keep retrying indefinitely, especially on first connect. If the peer is no longer available, it should ideally crash and instruct the user to find better peers.
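A sketch of that policy (the connect helper is a hypothetical stand-in, not one of this PR's actual functions): panic if the very first connection attempt fails, but keep reconnecting after later disconnections.

use std::net::SocketAddr;
use std::time::Duration;
use bitcoin::secp256k1::PublicKey;

// Hypothetical stand-in for the crate's connect-and-monitor logic.
async fn connect_and_wait_for_disconnection(_peer: &(PublicKey, SocketAddr)) -> Result<(), std::io::Error> {
	unimplemented!()
}

// Sketch only: crash if we never managed to connect (so the operator picks a
// better peer), but retry forever once we have connected at least once.
async fn monitor_peer(peer: (PublicKey, SocketAddr)) {
	let mut ever_connected = false;
	loop {
		match connect_and_wait_for_disconnection(&peer).await {
			Ok(()) => {
				ever_connected = true;
				eprintln!("Disconnected from peer {}@{}, reconnecting", peer.0, peer.1);
			},
			Err(e) if !ever_connected => panic!("Could not connect to peer {}@{}: {}", peer.0, peer.1, e),
			Err(e) => eprintln!("Reconnection to {}@{} failed ({}), retrying", peer.0, peer.1, e),
		}
		tokio::time::sleep(Duration::from_secs(10)).await;
	}
}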

src/lookup.rs Outdated

for current_announcement_row in announcement_rows {
	let blob: String = current_announcement_row.get("announcement_signed");
	let data = hex_utils::to_vec(&blob).unwrap();
Contributor

Why are we hex-encoding the blobs? Can't we just store them as blobs?

Contributor Author

Fixed
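(For reference, with the column stored as raw bytes, the hex round-trip goes away; a sketch assuming tokio_postgres rows and the column name from the snippet above:)

for current_announcement_row in announcement_rows {
	// BYTEA columns can be read directly as Vec<u8>, no hex decoding needed.
	let data: Vec<u8> = current_announcement_row.get("announcement_signed");
	// ... deserialize `data` as before ...
}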

};

println!("Obtaining corresponding database entries");
// get all the channel announcements that are currently in the network graph
Contributor

Why aren't we filtering this by "first seen after our target time"?

Contributor Author

Because that would be incorrect. There can be announcements that were first seen before our target time, but whose first channel update, or even whose first channel update in a given direction, was seen after.

Contributor

Why do we need a channel announcement to be returned if we're only giving users the channel update?


// here is where the channels whose first update in either direction occurred after
// `last_seen_timestamp` are added to the selection
let unannounced_rows = client.query("SELECT short_channel_id, blob_signed, seen FROM (SELECT DISTINCT ON (short_channel_id) short_channel_id, blob_signed, seen FROM channel_updates ORDER BY short_channel_id ASC, seen ASC) AS first_seens WHERE first_seens.seen >= $1", &[&last_sync_timestamp_object]).await.unwrap();
Contributor

I'm confused why we need two queries here at all? We should be able to do this all in a single query, no?

Contributor Author

The reason above is why we can't.

@arik-so
Contributor Author

arik-so commented Aug 22, 2022

Are my comments not getting sent or something?

@TheBlueMatt
Contributor

TheBlueMatt commented Aug 22, 2022

No, they were not... thanks GitHub

@arik-so arik-so deleted the server_addition branch August 22, 2022 02:22