Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Block Tracker #21

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

Conversation

ProofOfKeags
Copy link
Contributor

@ProofOfKeags ProofOfKeags commented Jan 15, 2025

Description

This PR implements a reorg resistant blockchain tracker that should correctly track the full lifecycle of any transactions that a consumer wants to monitor. The PR should be reviewed in stages, giving attention to different aspects during progression. This should help reduce the amount of unnecessary review cycles, allowing us to solve the bigger, more foundational problems first, and move onto refinement after.

  • Interface design
  • Property testing completeness
  • Implementation correctness (edge case handling)
  • Algorithmic implementation efficiency
  • Documentation clarity, correctness, and completeness
  • Code organization (which files/order of presentation within files)
  • Silicon implementation efficiency

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature/Enhancement (non-breaking change which adds functionality or enhances an existing one)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactor

Checklist

  • I have performed a self-review of my code.
  • I have commented my code where necessary.
  • I have updated the documentation if needed.
  • My changes do not introduce new warnings.
  • I have added tests that prove my changes are effective or that my feature works.
  • New and existing tests pass with my changes.

Related Issues

}
}

#[derive(Clone)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always derive Debug as well.

}
}

#[derive(Clone)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always derive Debug as well.

Comment on lines 46 to 103
impl BtcZmqConfig {
/// This generates a default config that will not connect to any of the bitcoind zeromq interfaces. It is useful in
/// conjunction with subsequent mutations for partial initialization.
pub fn empty() -> BtcZmqConfig {
BtcZmqConfig {
bury_depth: 6,
hashblock_connection_string: None,
hashtx_connection_string: None,
rawblock_connection_string: None,
rawtx_connection_string: None,
sequence_connection_string: None,
}
}

/// Updates the BtcZmqConfig with a zmqpubhashblock connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_hashblock_connection_string(mut self, s: &str) -> Self {
self.hashblock_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubhashtx connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_hashtx_connection_string(mut self, s: &str) -> Self {
self.hashtx_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubrawblock connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_rawblock_connection_string(mut self, s: &str) -> Self {
self.rawblock_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubrawtx connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_rawtx_connection_string(mut self, s: &str) -> Self {
self.rawtx_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubsequence connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_sequence_connection_string(mut self, s: &str) -> Self {
self.sequence_connection_string = Some(s.to_string());
self
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the builder pattern. I like it. But it is missing the .build() method.
See https://rust-unofficial.github.io/patterns/patterns/creational/builder.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because then build() -> BtcZmq, no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, as it stands right now, isn't exactly the builder pattern since there is no intermediate type to call .build upon to get the actual type that is being built. That said, you might want to check out bon to reduce some of the boilerplate involved.

The nicest way to do this is to use the typestate pattern to only expose accessors if the corresponding value has been set. But that is a massive overkill for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow bon seems awesome :)

Comment on lines 104 to 146
pub struct Subscription<T> {
receiver: mpsc::Receiver<T>,
}

impl<T> Subscription<T> {
fn from_receiver(receiver: mpsc::Receiver<T>) -> Subscription<T> {
Subscription {
receiver,
}
}
}

impl<T> Stream for Subscription<T> {
type Item = T;

fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
self.get_mut().receiver.poll_recv(cx)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need trait bounds (or trait markers) in T.
I'm not sure, but having a bound is always a good idea, if applicable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why we'd want to constrain it at all. The construction works independent of T

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having a bound is always a good idea, if applicable

Hard disagree. Code should be written in as general way as possible. The trait bound should only be invoked if the essential aspects of that trait are used. In this case, the subscription is a stream irrespective of the contents of the subscription. This is a somewhat trivial case since it's really just a trivial indirection from the underlying tokio channel but the principle still holds. Regardless of what you are subscribing to, the subscription behaves like a stream.

Copy link
Member

@storopoli storopoli Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this T need to be Sync?
Yeah your argument makes sense. We can keep T as it is if it works fine

Copy link
Collaborator

@Rajil1213 Rajil1213 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, I would be tempted to at least constrain the generic to Sized. But that's mostly because when using an unconstrained generic, I usually run into issues that is solved by constraining it with Sized. If that isn't the problem here, I'm fine with leaving it unconstrained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can definitely see sized being something that kicks in as a requirement at some point so you may be right.

Copy link
Collaborator

@Rajil1213 Rajil1213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few cosmetic nits and some questions.

I like how the subscription API is coming along. Definitely needs a bunch of tests though. There is a lot of async coordination going on that I couldn't completely reason about looking at the code alone.

use std::pin::Pin;
use std::sync::Arc;
use std::task::Context;
use std::task::Poll;
Copy link
Collaborator

@Rajil1213 Rajil1213 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rustfmt should have grouped these more nicely, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll address before finality but I want to focus on things in the following order:

  1. Interface design
  2. Property testing completeness
  3. Implementation correctness (edge case handling)
  4. Algorithmic implementation efficiency
  5. Documentation clarity, correctness, and completeness
  6. Code organization (which files/order of presentation within files)
  7. Silicon implementation efficiency

I still consider this to be in stage 1. Feel free to leave feedback on any of the stages but I'll address in roughly this order. To the extent that you can focus review on the stage we're in (currently 1 and 2), that would be ideal. If something is preventing you from properly evaluating those two, let me know.

I'll add this stage checklist to the main body of the PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. I only wanted to flag this because it's something that your editor should take care of for you instead of having to dedicate actual hours on this.

Comment on lines +22 to +26
/// BtcZmqConfig is the main configuration type used to establish the connection with the ZMQ interface of Bitcoin. It
/// accepts independent connection strings for each of the stream types. Any connection strings that are left as None
/// when initializing the BtcZmqClient will result in those streams going unmonitored. In the limit, this means that the
/// default BtcZmqConfig will result in a BtcZmqClient that does absolutely nothing (NOOP).
pub struct BtcZmqConfig {
Copy link
Collaborator

@Rajil1213 Rajil1213 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally structure these docs to have a single header line followed by a longer paragraph. There is also a clippy lint for this that is allowed by default but something we might deny later. Tidying this up now will make our work much easier in the future.

The content itself is top-notch though!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the content is REALLY good.

@ProofOfKeags check the engineering best practices for docstrings in this section of our notion doc: https://www.notion.so/Engineering-Rust-Best-Practices-14c901ba000f8025a974ec8e2a70b4af?pvs=4#14c901ba000f80b38a1fc290ef3a900a

Comment on lines 46 to 103
impl BtcZmqConfig {
/// This generates a default config that will not connect to any of the bitcoind zeromq interfaces. It is useful in
/// conjunction with subsequent mutations for partial initialization.
pub fn empty() -> BtcZmqConfig {
BtcZmqConfig {
bury_depth: 6,
hashblock_connection_string: None,
hashtx_connection_string: None,
rawblock_connection_string: None,
rawtx_connection_string: None,
sequence_connection_string: None,
}
}

/// Updates the BtcZmqConfig with a zmqpubhashblock connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_hashblock_connection_string(mut self, s: &str) -> Self {
self.hashblock_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubhashtx connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_hashtx_connection_string(mut self, s: &str) -> Self {
self.hashtx_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubrawblock connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_rawblock_connection_string(mut self, s: &str) -> Self {
self.rawblock_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubrawtx connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_rawtx_connection_string(mut self, s: &str) -> Self {
self.rawtx_connection_string = Some(s.to_string());
self
}

/// Updates the BtcZmqConfig with a zmqpubsequence connection string and returns the updated config. Useful for a
/// builder pattern with dotchaining.
pub fn with_sequence_connection_string(mut self, s: &str) -> Self {
self.sequence_connection_string = Some(s.to_string());
self
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, as it stands right now, isn't exactly the builder pattern since there is no intermediate type to call .build upon to get the actual type that is being built. That said, you might want to check out bon to reduce some of the boilerplate involved.

The nicest way to do this is to use the typestate pattern to only expose accessors if the corresponding value has been set. But that is a massive overkill for this.

Comment on lines 104 to 146
pub struct Subscription<T> {
receiver: mpsc::Receiver<T>,
}

impl<T> Subscription<T> {
fn from_receiver(receiver: mpsc::Receiver<T>) -> Subscription<T> {
Subscription {
receiver,
}
}
}

impl<T> Stream for Subscription<T> {
type Item = T;

fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
self.get_mut().receiver.poll_recv(cx)
}
}
Copy link
Collaborator

@Rajil1213 Rajil1213 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, I would be tempted to at least constrain the generic to Sized. But that's mostly because when using an unconstrained generic, I usually run into issues that is solved by constraining it with Sized. If that isn't the problem here, I'm fine with leaving it unconstrained.


fn setup() -> Result<(BtcZmqClient, corepc_node::Node), Box<dyn std::error::Error>> {
let mut bitcoin_conf = corepc_node::Conf::default();
bitcoin_conf.view_stdout = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to get away with just setting RUST_LOG to trace for debugging purposes. Is there any reason we need the stdout too though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was for my own ability to debug stuff since I wanted to ensure that the bitcoind side was actually publishing on the zmq interface.

Can you clarify whatyou mean by:

You should be able to get away with just setting RUST_LOG to trace for debugging purposes.

Copy link
Collaborator

@Rajil1213 Rajil1213 Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set the environment variable RUST_LOG to trace and that should produce a lot more logs from corepc-node including the actual requests and response. This requires that you first initialize the logger which you can do via strata_common::logging::init(...). More on this here: https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/struct.EnvFilter.html

let newly_mined = bitcoind.client.generate_to_address(1, &bitcoind.client.new_address()?)?.into_model()?;
let blk = block_sub.next().await.map(|b|b.block_hash());
assert_eq!(newly_mined.0.first(), blk.as_ref());
drop(client);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this explicit drop necessary? If we are doing this, should we also check if the thread has indeed been aborted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that rustc does not get cute with doing a premature drop, causing the producer thread to terminate prematurely (by virtue of its drop trait)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also check if the thread has indeed been aborted?

That's a great property to test. I'd have to think about how to design such a test though.

Copy link
Collaborator

@Rajil1213 Rajil1213 Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplest way to change the visible of the JoinHandle to pub(crate) and then, check for is_canceled. But that may be too literal/specific to qualify as a good test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the other test design here would be to ensure that all subscription channels have their send sides closed when the client is dropped. Because the subscription doesn't contain references to the client, it isn't automatically invalidated by lifetime analysis (which is probably a good thing). I'm hesitant to leak the JoinHandle since it breaks some of the abstraction. I'm focusing on tests right now so I'll probably have better ideas in the next day or so here.

self.tx_lifecycles.insert(matched_tx.compute_txid(), lifecycle);
}
Some(lifecycle) => {
match (&lifecycle.raw, lifecycle.block, lifecycle.seq_received) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there are a bunch of runtime invariants that we are trying to handle. The comments definitely help but is there a way to encode this at compile time instead? Perhaps with enums with struct variants?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely agree with this push. The answer is I believe that we can. As I was going through implementation I think I figured out a way to do a CBC encoding of this to reduce and possibly eliminate the runtime invariants, but I wanted to complete a full sketch before going back and refining.

pub async fn subscribe_transactions(&mut self, f: impl Fn(&Transaction) -> bool + Sync + Send + 'static) ->
Subscription<(Transaction, TxStatus)> {

let (send, recv) = mpsc::channel(4);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any specific reason to go with this queue size?. Same question for the choice of 10 in the subscribe_blocks method below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No they were just best guesses that I went with to start. I'll drop a TODO in there to give a more careful thought to the magick numbers.

@ProofOfKeags ProofOfKeags force-pushed the STR-821-duty-tracker-interface branch from d3a76dc to 0083e8c Compare January 15, 2025 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants