Demo dataset
This demo dataset is a small 'genome' consisting of some E. coli plasmids. By excluding the chromosome, the file sizes are kept smaller, making this demo faster to download and assemble. This dataset provides a practical way to test Autocycler's workflow and become familiar with its commands.
You can download the demo dataset from here: autocycler-demo-dataset.tar
The autocycler-demo-dataset.tar file contains the following:
- reads.fastq.gz: 75 Mbp of ONT reads
- truth.fasta: an error-free reference
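Before starting, it can help to unpack the archive and confirm that both files are present. The following is a minimal sketch, assuming the archive has already been downloaded. The helper name check_demo_files is hypothetical, and since this page does not say whether the archive unpacks into the current directory or a subdirectory, the helper takes the directory as an argument.

```bash
# Hypothetical helper (not part of Autocycler): succeed only if the given
# directory contains non-empty copies of both demo files.
check_demo_files() (
    dir=${1:-.}
    missing=0
    for f in reads.fastq.gz truth.fasta; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
)

# Example usage (adjust the directory if the files unpack elsewhere):
#   tar -xf autocycler-demo-dataset.tar
#   check_demo_files . && echo "demo dataset looks complete"
```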
The following commands will guide you through running a fully automated assembly on the demo dataset. These commands use only three different assemblers to minimise processing time.
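threads="16"
genome_size="242000"

# Subsample the reads into separate read sets
autocycler subsample --reads reads.fastq.gz --out_dir subsampled_reads --genome_size "$genome_size"

# Assemble each subsampled read set with each assembler
mkdir assemblies
for assembler in flye miniasm raven; do
    for i in 01 02 03 04; do
        "$assembler".sh subsampled_reads/sample_"$i".fastq assemblies/"$assembler"_"$i" "$threads" "$genome_size"
    done
done

# Remove the subsampled reads now that the assemblies are done
rm subsampled_reads/*.fastq

# Run Autocycler compress and cluster on the input assemblies
autocycler compress -i assemblies -a autocycler_out
autocycler cluster -a autocycler_out

# Trim and resolve each cluster that passed QC
for c in autocycler_out/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done

# Combine the resolved clusters into the final consensus assembly
autocycler combine -a autocycler_out -i autocycler_out/clustering/qc_pass/cluster_*/5_final.gfa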
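The commands above call autocycler plus one wrapper script per assembler (flye.sh, miniasm.sh and raven.sh), so all of these need to be on your PATH, presumably along with the assemblers the wrappers invoke. A small pre-flight check along the following lines can catch a missing tool before the run starts. This is only a sketch, and missing_tools is a hypothetical helper name, not an Autocycler command.

```bash
# Hypothetical helper: print each named command that cannot be found on
# PATH, and succeed only if none are missing.
missing_tools() (
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    [ "$status" -eq 0 ]
)

# Example usage with the commands this walkthrough calls:
#   missing_tools autocycler flye.sh miniasm.sh raven.sh \
#       || echo "the tools listed above were not found on PATH"
```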
The final consensus assembly will be saved as autocycler_out/consensus_assembly.fasta. This assembly should closely (ideally exactly) match truth.fasta, but since the plasmids are circular, the sequences will probably differ in strand and starting position.
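Because of the strand and starting-position differences, a plain diff of the two FASTA files is not a useful comparison. One way to test for an exact match between two circular sequences is to check whether one is a rotation of the other or of its reverse complement. The sketch below illustrates that idea on sequence strings. The function names revcomp and same_circular are hypothetical, the comparison is case-sensitive, and it assumes the rev utility is available.

```bash
# Reverse complement of a DNA string (A/C/G/T, upper or lower case).
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Succeed if $2 is a rotation of $1 or of the reverse complement of $1.
# A rotation of a sequence always appears as a substring of that
# sequence written twice in a row.
same_circular() (
    a=$1
    b=$2
    [ "${#a}" -eq "${#b}" ] || exit 1
    rc=$(revcomp "$a")
    case "$a$a" in *"$b"*) exit 0 ;; esac
    case "$rc$rc" in *"$b"*) exit 0 ;; esac
    exit 1
)

# Example usage on toy sequences:
#   same_circular AACGT CGTAA && echo "same circular sequence"
```

To apply this to the real files, each record's sequence lines would first need to be joined into a single string, and the contigs in the two files would need to be paired up, which this sketch leaves out. It only detects exact matches, so a single base difference makes it fail.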
You can also try running Autocycler on the Trycycler demo datasets which contain pre-made assemblies. These are a little bit dated (the assemblies have a higher error rate with lots of homopolymer-length errors) but will still work with Autocycler. The 'great', 'good' and 'mediocre' datasets should yield a structurally correct assembly.