Merge branch 'release/0.6.0'
gcarq committed Oct 25, 2016
2 parents 8b90c21 + 6d3355c commit 981fc75
Showing 12 changed files with 156 additions and 247 deletions.
6 changes: 3 additions & 3 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "rusty-blockparser"
version = "0.5.4"
version = "0.6.0"
authors = ["gcarq <[email protected]>"]
include = ["src/*", "sql/*", "LICENSE", "README.md", "Cargo.toml"]
description = "Multithreaded Blockchain Parser for most common Cryptocurrencies based on Bitcoin"
@@ -13,12 +13,12 @@ license = "GPL-3.0"
[dependencies]
time = "~0.1"
log = "~0.3"
clap = "~2.9"
clap = "~2.16"
rust-crypto = "~0.2"
rustc-serialize = "~0.3"
byteorder = "~0.5"
rust-base58 = "~0.0"

seek_bufread = "~1.2"

# The development profile, used for `cargo build`
[profile.dev]
65 changes: 27 additions & 38 deletions README.md
@@ -15,7 +15,7 @@ It assumes a local copy of the blockchain, typically downloaded by Bitcoin core.
The program flow is split into two parts.
Let's call them ParseModes:

* **HeaderOnly**
* **Indexing**

If the parser is started for the first time, it iterates over all blk.dat files and seeks from header to header. It doesn't evaluate the whole block; it just calculates the block hashes to determine the main chain. So we only need to keep ~50 MB in RAM instead of the whole blockchain. This process is very fast and takes only **7-8 minutes with 2-3 threads and an average HDD (the bottleneck here is I/O)**.
The main chain is saved as a JSON file, let's call it ChainStorage. (The path can be specified with `--chain-storage`)
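The header-to-header seek described above can be sketched with a minimal, self-contained example. This is a hedged illustration, not the parser's actual code: it assumes the raw blk.dat layout (4-byte magic, 4-byte little-endian block size, then the block, whose first 80 bytes are the header), and `scan_headers` is an illustrative name.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

/// Collects the 80-byte headers from a raw blk.dat-style buffer,
/// skipping over each block body instead of parsing it.
fn scan_headers(data: &[u8]) -> Vec<[u8; 80]> {
    let mut cur = Cursor::new(data);
    let mut headers = Vec::new();
    let mut prefix = [0u8; 8]; // 4-byte magic + 4-byte block size
    while cur.read_exact(&mut prefix).is_ok() {
        let size = u32::from_le_bytes([prefix[4], prefix[5], prefix[6], prefix[7]]) as i64;
        let mut header = [0u8; 80];
        if cur.read_exact(&mut header).is_err() {
            break;
        }
        headers.push(header);
        // Seek past the rest of the block without evaluating it
        cur.seek(SeekFrom::Current(size - 80)).unwrap();
    }
    headers
}

fn main() {
    // Two fake "blocks": magic, size (100), then 100 bytes of payload each
    let mut data = Vec::new();
    for _ in 0..2 {
        data.extend_from_slice(&0xd9b4bef9u32.to_le_bytes()); // Bitcoin main magic
        data.extend_from_slice(&100u32.to_le_bytes());
        data.extend_from_slice(&[0u8; 100]);
    }
    println!("{}", scan_headers(&data).len());
}
```

Double-SHA256-hashing each collected header then yields the block hashes used to link up the main chain.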
@@ -117,7 +117,7 @@ Transaction Types:

* **Resume scans**

If you sync the blockchain at some point later, you don't need to make a FullData rescan. Just use `--resume` to force a HeaderOnly scan followed by a FullData scan which parses only new blocks. If you want a complete FullData rescan delete the ChainStorage json file.
If you sync the blockchain at some point later, you don't need to make a FullData rescan. Just use `--resume` to force a reindexing followed by a FullData scan which parses only new blocks. If you want a complete FullData rescan, delete the ChainStorage JSON file.
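A hedged example of what this might look like on the command line (binary name, thread count, and paths are illustrative):

```shell
# Reindex, then parse only blocks added since the last run
./rusty-blockparser --resume -t 3 csvdump /path/to/dump/

# Force a complete FullData rescan by deleting the internal state file
rm chain.json
./rusty-blockparser -t 3 csvdump /path/to/dump/
```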

## Installing

@@ -179,51 +179,42 @@ Now export this wrapper with: `export RUSTC="./rustc-wrapper.sh"` and execute `

## Usage
```
Usage:
target/debug/rusty-blockparser [OPTIONS] CALLBACK ARGUMENTS [...]
Multithreaded Blockchain Parser written in Rust
positional arguments:
callback Set a callback to execute. See `--list-callbacks`
arguments All following arguments are consumed by this callback.
optional arguments:
-h,--help show this help message and exit
--list-coins Lists all implemented coins
--list-callbacks Lists all available callbacks
-c,--coin COINNAME Specify blockchain coin (default: bitcoin)
-d,--blockchain-dir PATH
Set blockchain directory which contains blk.dat files
(default: ~/.bitcoin/blocks)
--verify-merkle-root BOOL
Verify merkle root (default: false)
-t,--threads COUNT Thread count (default: 2)
-r,--resume Resume from latest known block
--new Force complete rescan
-s,--chain-storage PATH
Specify path to chain storage. This is just a internal
state file (default: chain.json)
--backlog COUNT Set maximum worker backlog (default: 100)
-v,--verbose Increases verbosity level. Error=0, Info=1, Debug=2,
Trace=3 (default: 1)
--version Show version
USAGE:
rusty-blockparser [FLAGS] [OPTIONS] [SUBCOMMAND]
FLAGS:
-h, --help Prints help information
-n, --reindex Force complete reindexing
-r, --resume Resume from latest known block
-V, --version Prints version information
-v Increases verbosity level. Info=0, Debug=1, Trace=2 (default: 0)
--verify-merkle-root Verifies the merkle root of each block
OPTIONS:
--backlog <COUNT> Sets maximum worker backlog (default: 100)
-d, --blockchain-dir <blockchain-dir> Sets blockchain directory which contains blk.dat files (default: ~/.bitcoin/blocks)
        --chain-storage <FILE>                 Specify path to chain storage. This is just an internal state file (default: chain.json)
-c, --coin <NAME> Specify blockchain coin (default: bitcoin) [values: bitcoin, testnet3, namecoin, litecoin, dogecoin, myriadcoin,
unobtanium]
-t, --threads <COUNT> Thread count (default: 2)
SUBCOMMANDS:
csvdump Dumps the whole blockchain into CSV files
help Prints this message or the help of the given subcommand(s)
simplestats Shows various Blockchain stats
```
### Example

To make a `csvdump` of the Bitcoin blockchain, your command would look like this:
```
# ./blockparser -t 3 csvdump /path/to/dump/
[00:42:19] INFO - main: Starting blockparser-0.3.0 ...
[00:42:19] INFO - init: No header file found. Generating a new one ...
[00:42:19] INFO - main: Starting rusty-blockparser v0.6.0 ...
[00:42:19] INFO - blkfile: Reading files from folder: ~/.bitcoin/blocks
[00:42:19] INFO - parser: Parsing with mode HeaderOnly (first run).
[00:42:19] INFO - parser: Building blockchain index ...
...
[00:50:46] INFO - dispatch: All threads finished.
[00:50:46] INFO - dispatch: Done. Processed 393496 blocks in 8.45 minutes. (avg: 776 blocks/sec)
[00:50:47] INFO - chain: Inserted 393489 new blocks ...
[00:50:48] INFO - main: Iteration 1 finished.
[00:50:49] INFO - blkfile: Reading files from folder: ~/.bitcoin/blocks
[00:50:49] INFO - parser: Parsing 393489 blocks with mode FullData.
[00:50:49] INFO - callback: Using `csvdump` with dump folder: csv-dump/ ...
@@ -234,8 +225,6 @@ Dumped all blocks: 393489
-> transactions: 103777752
-> inputs: 274278239
-> outputs: 308285408
[02:04:42] INFO - chain: Inserted 0 new blocks ...
[02:04:42] INFO - main: Iteration 2 finished.
```


75 changes: 62 additions & 13 deletions src/blockchain/parser/chain.rs
@@ -64,17 +64,15 @@ impl ChainStorage {
let latest_known_idx = transform!(headers.iter().position(|h| h.hash == latest_hash));

let mut new_hashes = hashes.split_off(latest_known_idx + 1);

if new_hashes.len() > 0 {
debug!(target: "chain.extend", "\n -> latest known: {}\n -> first new: {}",
debug!(target: "chain", "\n -> latest known block: {}\n -> first new block: {}",
utils::arr_to_hex_swapped(transform!(self.hashes.last())),
utils::arr_to_hex_swapped(transform!(new_hashes.first())));
self.hashes.append(&mut new_hashes);
}
self.hashes.append(&mut new_hashes);
}
debug!(target: "chain", "Inserted {} new blocks ...", self.hashes.len() - self.hashes_len);
}

debug!(target: "chain", "Inserted {} new blocks ...", self.hashes.len() - self.hashes_len);
self.hashes_len = self.hashes.len();
self.latest_blk_idx = latest_blk_idx;
Ok(())
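The `split_off`/`append` dance in `extend` above can be illustrated with a minimal stdlib sketch (variable names are illustrative, not the actual `ChainStorage` fields):

```rust
// A freshly rebuilt chain overlaps the stored prefix; split_off at the element
// after the latest known hash keeps the prefix and yields only the new suffix,
// which is then appended to the persisted hashes.
fn main() {
    let mut rebuilt = vec!["a", "b", "c", "d", "e"]; // freshly extracted chain
    let latest_known_idx = 2;                        // "c" was the last stored hash
    let mut new_hashes = rebuilt.split_off(latest_known_idx + 1);
    assert_eq!(new_hashes, ["d", "e"]);

    let mut stored = vec!["a", "b", "c"];            // previously persisted hashes
    if !new_hashes.is_empty() {
        stored.append(&mut new_hashes);              // drains new_hashes
    }
    assert_eq!(stored, ["a", "b", "c", "d", "e"]);
    println!("{} new blocks", stored.len() - 3);
}
```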
@@ -88,7 +86,7 @@ impl ChainStorage {
try!(file.read_to_string(&mut encoded));

let storage = try!(json::decode::<ChainStorage>(&encoded));
debug!(target: "chain.load", "Imported {} hashes from {}. Current block height: {} ... (latest blk.dat index: {})",
debug!(target: "chain", "Imported {} hashes from {}. Current block height: {} ... (latest blk.dat index: {})",
storage.hashes.len(), path.display(), storage.get_cur_height(), storage.latest_blk_idx);
Ok(storage)
}
@@ -98,7 +96,7 @@ impl ChainStorage {
let encoded = try!(json::encode(&self));
let mut file = try!(File::create(&path));
try!(file.write_all(encoded.as_bytes()));
debug!(target: "chain.serialize", "Serialized {} hashes to {}. Current block height: {} ... (latest blk.dat index: {})",
debug!(target: "chain", "Serialized {} hashes to {}. Latest processed block height: {} ... (latest blk.dat index: {})",
self.hashes.len(), path.display(), self.get_cur_height(), self.latest_blk_idx);
Ok(encoded.len())
}
@@ -116,7 +114,7 @@ impl ChainStorage {
if self.index < self.hashes_len {
self.index += 1;
} else {
panic!("consume_next() index > len");
panic!("FATAL: consume_next() index > len! Please report this issue.");
}
}

@@ -198,7 +196,6 @@ impl<'a> ChainBuilder<'a> {
}



impl<'a> IntoIterator for &'a ChainBuilder<'a> {
type Item = Hashed<BlockHeader>;
type IntoIter = RevBlockIterator<'a>;
@@ -271,8 +268,7 @@ mod tests {
use blockchain::parser::types::{CoinType, Bitcoin};

#[test]
fn test_chain_storage() {

fn chain_storage() {
let mut chain_storage = ChainStorage::default();
let new_header = BlockHeader::new(
0x00000001,
@@ -315,7 +311,60 @@

#[test]
#[should_panic]
fn test_load_bogus_chain_storage() {
fn chain_storage_insert_bogus_header() {
let mut chain_storage = ChainStorage::default();
let new_header = BlockHeader::new(
0x00000001,
[0u8; 32],
[0x3b, 0xa3, 0xed, 0xfd, 0x7a, 0x7b, 0x12, 0xb2,
0x7a, 0xc7, 0x2c, 0x3e, 0x67, 0x76, 0x8f, 0x61,
0x7f, 0xc8, 0x1b, 0xc3, 0x88, 0x8a, 0x51, 0x32,
0x3a, 0x9f, 0xb8, 0xaa, 0x4b, 0x1e, 0x5e, 0x4a],
1231006505,
0x1d00ffff,
2083236893);

assert_eq!(0, chain_storage.latest_blk_idx);
assert_eq!(0, chain_storage.get_cur_height());

// Extend storage and match genesis block
let coin_type = CoinType::from(Bitcoin);
chain_storage.extend(vec![Hashed::double_sha256(new_header)], &coin_type, 1).unwrap();
assert_eq!(coin_type.genesis_hash, chain_storage.get_next().unwrap());
assert_eq!(1, chain_storage.latest_blk_idx);

// try to insert same header again
let same_header = BlockHeader::new(
0x00000001,
[0u8; 32],
[0x3b, 0xa3, 0xed, 0xfd, 0x7a, 0x7b, 0x12, 0xb2,
0x7a, 0xc7, 0x2c, 0x3e, 0x67, 0x76, 0x8f, 0x61,
0x7f, 0xc8, 0x1b, 0xc3, 0x88, 0x8a, 0x51, 0x32,
0x3a, 0x9f, 0xb8, 0xaa, 0x4b, 0x1e, 0x5e, 0x4a],
1231006505,
0x1d00ffff,
2083236893);
chain_storage.extend(vec![Hashed::double_sha256(same_header)], &coin_type, 1).unwrap();
assert_eq!(coin_type.genesis_hash, chain_storage.get_next().unwrap());
assert_eq!(1, chain_storage.latest_blk_idx);

// try to insert bogus header
let bogus_header = BlockHeader::new(
0x00000001,
[1u8; 32],
[0x3b, 0xa3, 0xed, 0xfd, 0x7a, 0x7b, 0x12, 0xb2,
0x7a, 0xc7, 0x2c, 0x3e, 0x67, 0x76, 0x8f, 0x61,
0x7f, 0xc8, 0x1b, 0xc3, 0x88, 0x8a, 0x51, 0x32,
0x3a, 0x9f, 0xb8, 0xaa, 0x4b, 0x1e, 0x5e, 0x4a],
1231006505,
0x1d00ffff,
2083236893);
chain_storage.extend(vec![Hashed::double_sha256(bogus_header)], &coin_type, 1).unwrap();
}

#[test]
#[should_panic]
fn load_bogus_chain_storage() {
// Must fail
let encoded = String::from("AABAAAFKAAANANFANAAMMDDMDAMDADNNDANANDNAVCACANAFMAFAMMAMDAMDM");
match json::decode::<ChainStorage>(&encoded) {
@@ -326,7 +375,7 @@ mod tests {

#[test]
#[should_panic]
fn test_serialize_bogus_chain_storage() {
fn serialize_bogus_chain_storage() {
let encoded = String::from("AABAAAFKAAANANFANAAMMDDMDAMDADNNDANANDNAVCACANAFMAFAMMAMDAMDM");
match json::decode::<ChainStorage>(&encoded) {
Ok(_) => return,
28 changes: 14 additions & 14 deletions src/blockchain/parser/mod.rs
@@ -17,20 +17,20 @@ pub mod worker;
pub mod chain;
pub mod types;

/// Specifies ParseMode. The first time the blockchain is scanned with HeaderOnly,
/// Specifies ParseMode. On the first run, the blockchain needs to be indexed,
/// because we just need the block hashes to determine the longest chain.
#[derive(Clone, Debug, PartialEq)]
pub enum ParseMode {
FullData,
HeaderOnly
Indexing
}

/// Wrapper to pass different data between threads. Specified by ParseMode
pub enum ParseResult {
FullData(Block),
HeaderOnly(BlockHeader),
Indexing(BlockHeader),
Complete(String), // contains the name of the finished thread
Error(OpError) // Indicates critical error
Error(OpError) // Indicates critical error
}
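The `ParseResult` wrapper above is how data moves between threads. A hedged stdlib sketch of that pattern, using a simplified two-variant enum and illustrative names rather than the parser's actual types:

```rust
use std::sync::mpsc;
use std::thread;

// Simplified stand-in for the parser's ParseResult
enum ParseResult {
    Indexing([u8; 80]),  // a raw block header found during indexing
    Complete(String),    // name of the finished worker thread
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        tx.send(ParseResult::Indexing([0u8; 80])).unwrap();
        tx.send(ParseResult::Complete("worker-0".to_string())).unwrap();
        // tx is dropped here, which ends the receiver loop below
    });

    let mut headers = 0;
    for msg in rx {
        match msg {
            ParseResult::Indexing(_) => headers += 1,
            ParseResult::Complete(name) => println!("{} finished", name),
        }
    }
    worker.join().unwrap();
    println!("{}", headers);
}
```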

/// Small struct to hold statistics together
@@ -48,7 +48,7 @@ pub struct BlockchainParser<'a> {
unsorted_blocks: HashMap<[u8; 32], Block>, /* holds all blocks in parse mode FullData */
remaining_files: Arc<Mutex<VecDeque<BlkFile>>>, /* Remaining files (shared between all threads) */
h_workers: Vec<JoinHandle<()>>, /* Worker job handles */
mode: ParseMode, /* ParseMode (FullData or HeaderOnly) */
mode: ParseMode, /* ParseMode (FullData or Indexing) */
options: &'a mut ParserOptions, /* struct to hold cli arguments */
chain_storage: chain::ChainStorage, /* Hash storage with the longest chain */
stats: WorkerStats, /* struct for thread management & statistics */
@@ -57,16 +57,16 @@

impl<'a> BlockchainParser<'a> {

/// Instantiats a new Parser but does not start the workers.
/// Instantiates a new Parser but does not start the workers.
pub fn new(options: &'a mut ParserOptions,
parse_mode: ParseMode,
blk_files: VecDeque<BlkFile>,
chain_storage: chain::ChainStorage) -> Self {

info!(target: "parser", "Parsing {} blockchain ...", options.coin_type.name);
match parse_mode {
ParseMode::HeaderOnly => {
info!(target: "parser", "Parsing with mode HeaderOnly (first run).");
ParseMode::Indexing => {
info!(target: "parser", "Building blockchain index ...");
}
ParseMode::FullData => {
info!(target: "parser", "Parsing {} blocks with mode FullData.", chain_storage.remaining());
@@ -95,7 +95,7 @@ impl<'a> BlockchainParser<'a> {

// save latest blk file index for resume mode.
self.stats.latest_blk_idx = match self.mode {
ParseMode::HeaderOnly => self.chain_storage.latest_blk_idx,
ParseMode::Indexing => self.chain_storage.latest_blk_idx,
ParseMode::FullData => transform!(try!(self.remaining_files.lock()).back()).index
};

@@ -151,8 +151,8 @@ impl<'a> BlockchainParser<'a> {
if now - t_last_log > t_measure_frame {
let blocks_sec = self.stats.n_valid_blocks.checked_div((now - self.t_started) as u64).unwrap_or(1);
match self.mode {
ParseMode::HeaderOnly => {
info!(target:"dispatch", "Status: {:6} Headers scanned. (avg: {:5.2} blocks/sec)",
ParseMode::Indexing => {
info!(target:"dispatch", "Status: {:6} Blocks added to index. (avg: {:5.2} blocks/sec)",
self.stats.n_valid_blocks, blocks_sec);
}
ParseMode::FullData => {
@@ -202,7 +202,7 @@ impl<'a> BlockchainParser<'a> {
}
}
// Collect headers to build a valid blockchain
ParseResult::HeaderOnly(header) => {
ParseResult::Indexing(header) => {
let header = Hashed::double_sha256(header);
self.unsorted_headers.insert(header.hash, header.value);
self.stats.n_valid_blocks += 1;
@@ -249,10 +249,10 @@ impl<'a> BlockchainParser<'a> {

/// Searches for the longest chain and writes the hashes t
fn save_chain_state(&mut self) -> OpResult<usize> {
debug!(target: "dispatch", "Saving block headers as {}", self.options.chain_storage_path.display());
info!(target: "dispatch", "Saving block headers as {} ...", self.options.chain_storage_path.display());
// Update chain storage
let headers = match self.mode {
ParseMode::HeaderOnly => try!(chain::ChainBuilder::extract_blockchain(&self.unsorted_headers)),
ParseMode::Indexing => try!(chain::ChainBuilder::extract_blockchain(&self.unsorted_headers)),
ParseMode::FullData => Vec::new()
};
try!(self.chain_storage.extend(headers, &self.options.coin_type, self.stats.latest_blk_idx));
2 changes: 1 addition & 1 deletion src/blockchain/parser/types.rs
@@ -94,7 +94,7 @@ impl Coin for Dash {
}*/

#[derive(Clone)]
// Holds the selected coin type informations
// Holds the selected coin type information
pub struct CoinType {
pub name: String,
pub magic: u32,
