Manually curated assembly

Ryan Wick edited this page Sep 4, 2024 · 29 revisions

In addition to Autocycler, these commands require the following assemblers and supporting tools: Canu, canu_trim.py, Flye, miniasm, Minipolish, Raven and any2fasta.

# Set these variables as appropriate for your system and genome:
threads=16
genome_size="5500000"

# Subsample the long-read set into 24 files (one per assembly below):
autocycler subsample --reads ont.fastq --out_dir subsampled_reads --genome_size "$genome_size" --count 24
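
The assembly loops below read subsampled_reads/sample_01.fastq through sample_24.fastq, so it can help to confirm that all of them exist before starting long assembler runs. The following is a minimal sketch of such a check; the function name missing_samples is hypothetical and not part of Autocycler.

```shell
# Hypothetical helper (not part of Autocycler): print the path of every
# expected subsampled read file that is missing or empty.
# Arguments: the read directory, then the number of expected samples.
missing_samples() {
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # File names are zero-padded to two digits, e.g. sample_01.fastq.
        f=$(printf '%s/sample_%02d.fastq' "$dir" "$i")
        [ -s "$f" ] || echo "$f"
        i=$((i + 1))
    done
}

# Usage: no output means every file the loops need is present.
missing_samples subsampled_reads 24
```

If any paths are printed, the corresponding assembler loop will fail on that sample, so it is worth resolving before continuing.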

# Assemble each subsampled file:
mkdir -p assemblies
for i in 01 05 09 13 17 21; do
    canu -p canu -d canu_temp_"$i" -fast genomeSize="$genome_size" useGrid=false maxThreads="$threads" -nanopore subsampled_reads/sample_"$i".fastq
    canu_trim.py canu_temp_"$i"/canu.contigs.fasta > assemblies/canu_"$i".fasta
    rm -rf canu_temp_"$i"
done
for i in 02 06 10 14 18 22; do
    flye --nano-hq subsampled_reads/sample_"$i".fastq --threads "$threads" --out-dir flye_temp_"$i"
    cp flye_temp_"$i"/assembly.fasta assemblies/flye_"$i".fasta
    rm -r flye_temp_"$i"
done
for i in 03 07 11 15 19 23; do
    miniasm_and_minipolish.sh subsampled_reads/sample_"$i".fastq "$threads" > assemblies/miniasm_"$i".gfa
    any2fasta assemblies/miniasm_"$i".gfa > assemblies/miniasm_"$i".fasta
done
for i in 04 08 12 16 20 24; do
    raven --threads "$threads" --disable-checkpoints subsampled_reads/sample_"$i".fastq > assemblies/raven_"$i".fasta
done

MANUAL STEP: CURATE INPUT ASSEMBLIES
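
When curating the input assemblies, a quick overview of contig counts and total lengths can make outliers easier to spot, for example an assembly far larger or smaller than the expected genome size. The sketch below uses only awk; the function name fasta_summary is hypothetical, and deciding which assemblies to keep remains a manual judgement.

```shell
# Hypothetical helper (not part of Autocycler): for each FASTA file, print
# the file name, number of contigs and total number of bases (tab-separated).
fasta_summary() {
    for f in "$@"; do
        # Skip arguments that are not existing files (e.g. an unmatched glob).
        [ -e "$f" ] || continue
        awk -v name="$f" '
            /^>/ { contigs++; next }                          # header line
            { gsub(/[ \t\r]/, ""); bases += length($0) }      # sequence line
            END { printf "%s\t%d\t%d\n", name, contigs, bases }
        ' "$f"
    done
}

# Usage: summarise every input assembly produced above.
if [ -d assemblies ]; then
    fasta_summary assemblies/*.fasta
fi
```

Comparing the last column against "$genome_size" gives a rough sanity check for each assembly.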

# Compress the input assemblies into a unitig graph:
autocycler compress -i assemblies -a autocycler

# Cluster the input contigs into putative replicons:
autocycler cluster -a autocycler

MANUAL STEP: CURATE CLUSTERS

# Trim and resolve each QC-pass cluster:
for c in autocycler/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done

MANUAL STEP: EXAMINE DOTPLOTS

MANUAL STEP: EXAMINE 4_merged.gfa FILES
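
Before opening each graph in a viewer, counting its segments and links gives a rough first impression. The sketch below relies only on the GFA convention that segment lines start with S and link lines start with L; the function name gfa_summary is hypothetical. It assumes that a cleanly resolved cluster would typically contain very few segments, so a high count is a hint to look more closely, not a verdict.

```shell
# Hypothetical helper (not part of Autocycler): for each GFA file, print
# the file name, number of segments and number of links (tab-separated).
gfa_summary() {
    for g in "$@"; do
        # Skip arguments that are not existing files (e.g. an unmatched glob).
        [ -e "$g" ] || continue
        awk -F'\t' -v name="$g" '
            $1 == "S" { segments++ }   # segment (sequence) record
            $1 == "L" { links++ }      # link between two segments
            END { printf "%s\t%d\t%d\n", name, segments, links }
        ' "$g"
    done
}

# Usage: summarise the merged graph of every QC-pass cluster.
gfa_summary autocycler/clustering/qc_pass/cluster_*/4_merged.gfa
```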

# Combine resolved clusters into a final assembly:
autocycler combine -a autocycler -i autocycler/clustering/qc_pass/cluster_*/5_final.gfa

The final consensus assembly will be named: autocycler/consensus_assembly.fasta