Implementation of the Routed Sparse Graph approach described here
Creates a graph-based representation of short-read, long-read or pre-assembled sequences. The current implementation is primarily optimized for short reads, and partially for pre-assembled sequences. Modest long-read datasets can also be used.
Graph building is divided into two phases: indexing, which determines a suitable set of nodes, and building (called routing in the commands below), which adds edges and routes to the chosen set of nodes.
This is an early code release, for evaluation purposes.
Building the graph should be feasible on a 32GB RAM desktop for most Illumina datasets up to ~100Gbp (preprocessing with Trimmomatic for adapter removal and filtering at Q20 is recommended). Creation of pan-genome graphs from multiple gigabase-scale assembled references should also work, and has been tested with Human vs Chimp and 10 human genomes.
Basic but inefficient example code for extracting sequences is included. Code for further analysing the resulting graph is under development.
Reading/Writing of the graph in binary formats is now available.
Requirements: any recent version of gcc, make, gperftools, a JDK and ant.
- Change to the `LOGAN-Graph` directory, and build the Java part using `ant`.
- Change to the `LOGAN-Graph-Native` directory, build the JNI headers using `./scripts/jHeaders.sh`, then build the native code using `make`. You may need to modify `JNI_INC` in `makefile` to correctly locate the JNI headers (`jni.h` etc.).
- Change to `LOGAN-Graph-Native/bin`.
- Run using `./LOGAN TestIndexAndRoute <indexingThreads> <routingThreads> <file1>...`
- Files can be either reads (from Illumina, PacBio, Oxford Nanopore etc.) in FASTQ format (`.fq` extension), or reference / pre-assembled sequences in FASTA format (`.fa` extension). Support for compressed files is planned.
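
The build and first test run described above can be collected into a small shell sketch. It assumes the `LOGAN-Graph` and `LOGAN-Graph-Native` directories sit under the current directory, that input paths are absolute, and that 4 indexing and 4 routing threads are reasonable for your machine. The helper names `check_inputs` and `build_and_test` are hypothetical and not part of LOGAN.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of LOGAN.

# Succeed only if every argument ends in .fq (FASTQ) or .fa (FASTA),
# the two extensions named in the list above. Compressed files
# (e.g. .fq.gz) are rejected, since support for them is only planned.
check_inputs() {
    [ "$#" -gt 0 ] || { echo "no input files given" >&2; return 1; }
    for f in "$@"; do
        case "$f" in
            *.fq|*.fa) ;;
            *) echo "unsupported extension: $f" >&2; return 1 ;;
        esac
    done
}

# Build the Java and native parts, then run a test graph build.
# Assumes it is started from the directory that contains LOGAN-Graph
# and LOGAN-Graph-Native, and that all input paths are absolute.
build_and_test() {
    check_inputs "$@" || return 1
    (cd LOGAN-Graph && ant) || return 1
    (cd LOGAN-Graph-Native && ./scripts/jHeaders.sh && make) || return 1
    # 4 and 4 are arbitrary example thread counts.
    (cd LOGAN-Graph-Native/bin && ./LOGAN TestIndexAndRoute 4 4 "$@")
}
```

A call such as `build_and_test /data/sample_1.fq /data/sample_2.fq` (paths are placeholders) would then build everything and run a graph build without saving the result. If `make` cannot find `jni.h`, adjust `JNI_INC` in `makefile` as noted above before retrying.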
General format is: `LOGAN <command> <args>`

- `LOGAN Index <indexingThreads> <graph> <files...>`: Performs the indexing phase of graph building for the provided FASTQ/A files and outputs the resulting nodes in `graph.nodes`.
- `LOGAN Route <routingThreads> <graph> <files...>`: Performs the routing phase of graph building for the provided FASTQ/A files, based on the existing `graph.nodes` file, and outputs the resulting edges/routes in `graph.edges` and `graph.routes`.
- `LOGAN IndexAndRoute <indexingThreads> <routingThreads> <graph> <files...>`: Performs both phases of graph building for the provided FASTQ/A files and outputs the resulting graph elements in `graph.nodes`, `graph.edges` and `graph.routes`.
- `LOGAN TestIndex <indexingThreads> <files...>`: Performs the indexing phase of graph building for the provided FASTQ/A files, but does not save the result.
- `LOGAN TestRoute <routingThreads> <files...>`: Performs the routing phase of graph building for the provided FASTQ/A files, based on the existing `graph.nodes` file, but does not save the result.
- `LOGAN TestIndexAndRoute <indexingThreads> <routingThreads> <files...>`: Performs both phases of graph building for the provided FASTQ/A files, but does not save the result.
- `LOGAN ReadGraph`: Loads the graph previously stored in `graph.nodes`, `graph.edges` and `graph.routes`.
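
As a sketch of how the two saved phases fit together, the wrapper below runs `Index` followed by `Route` on the same graph name and the same input files. The function `two_phase_build` and the `LOGAN` environment variable are hypothetical conveniences, not part of the tool. That this sequence leaves the same files on disk as a single `IndexAndRoute` call is an assumption based on the descriptions above, not something verified here.

```shell
#!/bin/sh
# Sketch only: two_phase_build is a hypothetical wrapper, not a LOGAN command.
# Set LOGAN to the path of the binary; it defaults to ./LOGAN.
two_phase_build() {
    if [ "$#" -lt 4 ]; then
        echo "usage: two_phase_build <indexingThreads> <routingThreads> <graph> <files...>" >&2
        return 1
    fi
    index_threads=$1
    routing_threads=$2
    graph=$3
    shift 3
    logan=${LOGAN:-./LOGAN}
    # Phase 1 chooses the node set; phase 2 adds edges and routes to it.
    # Routing only starts if indexing exited successfully.
    "$logan" Index "$index_threads" "$graph" "$@" &&
    "$logan" Route "$routing_threads" "$graph" "$@"
}
```

Run from `LOGAN-Graph-Native/bin`, a call such as `two_phase_build 8 4 sample reads_1.fq reads_2.fq` (names are placeholders) would index with 8 threads and then route with 4. Keeping the phases separate makes it possible to rerun routing alone against an existing node file.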