A tool for counting exact K-mer occurrences in a DNA or RNA sequence very, very quickly (where K=32).
the_count <haystack> <needles> <output>
where the haystack is a FASTA file that contains the sequences to be searched and the needles are a FASTA file that contains the 32-mers to search for.
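To illustrate why K=32 is a convenient size, here is a minimal sketch of one common way to count exact 32-mers: each base is encoded in 2 bits, so a 32-mer fits exactly in a single u64 and a sliding window can be compared against the needles with one hash lookup per position. This is not necessarily how The Count itself is implemented; the names encode_base, pack_kmer and count_kmers are hypothetical, and FASTA parsing is left out (the functions take raw sequence bytes).

```rust
use std::collections::HashMap;

const K: usize = 32;

/// Map a nucleotide to a 2-bit code. Anything else (for example N) yields None.
fn encode_base(b: u8) -> Option<u64> {
    match b.to_ascii_uppercase() {
        b'A' => Some(0),
        b'C' => Some(1),
        b'G' => Some(2),
        b'T' | b'U' => Some(3),
        _ => None,
    }
}

/// Pack a needle of exactly K bases into a u64 (2 bits per base).
fn pack_kmer(kmer: &[u8]) -> Option<u64> {
    if kmer.len() != K {
        return None;
    }
    let mut packed = 0u64;
    for &b in kmer {
        packed = (packed << 2) | encode_base(b)?;
    }
    Some(packed)
}

/// Count exact (overlapping) occurrences of each needle in the haystack.
/// Counts are returned in the same order as the needles; needles that
/// cannot be packed (wrong length, ambiguous base) get a count of 0.
fn count_kmers(haystack: &[u8], needles: &[&[u8]]) -> Vec<u64> {
    let mut counts: HashMap<u64, u64> = HashMap::new();
    for &n in needles {
        if let Some(p) = pack_kmer(n) {
            counts.insert(p, 0);
        }
    }

    let mut window = 0u64; // the last (up to) 32 bases, 2 bits each
    let mut valid = 0usize; // consecutive valid bases seen so far
    for &b in haystack {
        match encode_base(b) {
            Some(code) => {
                // With K = 32 the oldest base falls off the top of the u64.
                window = (window << 2) | code;
                valid += 1;
            }
            None => {
                // An ambiguous base breaks every window that spans it.
                window = 0;
                valid = 0;
            }
        }
        if valid >= K {
            if let Some(c) = counts.get_mut(&window) {
                *c += 1;
            }
        }
    }

    needles
        .iter()
        .map(|&n| pack_kmer(n).and_then(|p| counts.get(&p).copied()).unwrap_or(0))
        .collect()
}

fn main() {
    let haystack = vec![b'A'; 34];
    let needle = vec![b'A'; K];
    let counts = count_kmers(&haystack, &[&needle[..]]);
    println!("{:?}", counts); // prints [3]: the windows start at offsets 0, 1 and 2
}
```

Packing into a u64 keeps each comparison to a single integer hash lookup, which is one reason a fixed K of 32 lends itself to fast exact matching.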
The Count is implemented in the Rust programming language and supports Rust 1.43 and later. The tooling instructions below assume you already have the Rust toolchain installed. If you do not, see https://rustup.rs.
- Run unit tests:
cargo test
- Run the demo:
cargo run
- Create a release build (faster):
cargo build --release
The binary will end up in target/release/.
- Format the code (do this before pushing):
cargo fmt
To run the benchmarks, you will need to install hyperfine. On a Mac this can be done through Homebrew using brew install hyperfine. You can also use the setup-mac make target: make setup-mac.
Benchmarks may then be run with make benchmark. The default benchmark searches a file with 1 million auto-generated sequences for 999 auto-generated 32-mers.
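For readers who want comparable input of their own, the following is a rough sketch of how random FASTA data could be generated without external crates. It is not the project's actual generator: the XorShift64 type, the random_dna and random_fasta functions, the seqN record names, and the sequence length used in main are all assumptions made for illustration.

```rust
/// Tiny xorshift64 PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generate a random DNA string of the given length.
fn random_dna(rng: &mut XorShift64, len: usize) -> String {
    const BASES: [char; 4] = ['A', 'C', 'G', 'T'];
    // Use the top two bits of each random value to pick a base.
    (0..len)
        .map(|_| BASES[(rng.next_u64() >> 62) as usize])
        .collect()
}

/// Build FASTA text with n_records records, each holding one sequence line.
fn random_fasta(rng: &mut XorShift64, n_records: usize, len: usize) -> String {
    let mut out = String::new();
    for i in 0..n_records {
        out.push_str(&format!(">seq{}\n{}\n", i, random_dna(rng, len)));
    }
    out
}

fn main() {
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15);
    // Three 32-mers that could serve as a small needles file.
    print!("{}", random_fasta(&mut rng, 3, 32));
}
```

Using a fixed seed makes the generated files reproducible, which helps when comparing timings between runs.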
- Sarah Walling
- Travis Wheeler
- George Lesica