Manually curated assembly
Ryan Wick edited this page Sep 4, 2024 · 29 revisions
In addition to Autocycler, these commands require the following assemblers and supporting tools: Canu, canu_trim.py, Flye, miniasm, Minipolish, Raven and any2fasta.
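Before running the commands, it can help to confirm that each tool is available on your PATH. The following is a minimal sketch and not part of Autocycler: the `missing_tools` helper is hypothetical, and the executable names are taken from the commands on this page (`miniasm` and `minipolish` are assumed to be the executable names for those two tools).

```shell
# Print the name of every listed command that is not found on PATH.
# (Hypothetical helper for illustration; not shipped with Autocycler.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names as used on this page (assumed to match your installation).
missing=$(missing_tools autocycler canu canu_trim.py flye miniasm minipolish \
    miniasm_and_minipolish.sh raven any2fasta)

if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
else
    echo "All required tools found."
fi
```

The check only reports what is missing and does not stop the shell, so it can be pasted into an interactive session safely.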
# Set these variables as appropriate for your system and genome:
threads=16
genome_size="5500000"
# Subsample the long-read set into multiple files:
autocycler subsample --reads ont.fastq --out_dir subsampled_reads --genome_size "$genome_size"
# Assemble each subsampled file:
mkdir assemblies
for i in 01 05 09 13 17 21; do
    canu -p canu -d canu_temp_"$i" -fast genomeSize="$genome_size" useGrid=false maxThreads="$threads" -nanopore subsampled_reads/sample_"$i".fastq
    canu_trim.py canu_temp_"$i"/canu.contigs.fasta > assemblies/canu_"$i".fasta
    rm -rf canu_temp_"$i"
done
for i in 02 06 10 14 18 22; do
    flye --nano-hq subsampled_reads/sample_"$i".fastq --threads "$threads" --out-dir flye_temp_"$i"
    cp flye_temp_"$i"/assembly.fasta assemblies/flye_"$i".fasta
    rm -r flye_temp_"$i"
done
for i in 03 07 11 15 19 23; do
    miniasm_and_minipolish.sh subsampled_reads/sample_"$i".fastq "$threads" > assemblies/miniasm_"$i".gfa
    any2fasta assemblies/miniasm_"$i".gfa > assemblies/miniasm_"$i".fasta
done
for i in 04 08 12 16 20 24; do
    raven --threads "$threads" --disable-checkpoints subsampled_reads/sample_"$i".fastq > assemblies/raven_"$i".fasta
done
MANUAL STEP: CURATE INPUT ASSEMBLIES
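A quick per-file summary may help with this manual step, for example to spot an input assembly whose sequence count or total size differs sharply from the others. This is only a sketch: the `fasta_summary` helper is hypothetical, and which assemblies to keep or exclude remains a judgment call that the numbers alone cannot make.

```shell
# Print "<sequence count><TAB><total bases>" for one FASTA file.
# (Hypothetical helper for illustration; not part of Autocycler.)
fasta_summary() {
    awk '/^>/ {n++; next} {bases += length($0)} END {printf "%d\t%d\n", n, bases}' "$1"
}

# Summarise every input assembly written by the loops above.
for f in assemblies/*.fasta; do
    [ -e "$f" ] || continue   # skip the unexpanded glob if no assemblies exist yet
    printf '%s\t%s\n' "$f" "$(fasta_summary "$f")"
done
```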
# Compress the input assemblies into a unitig graph:
autocycler compress -i assemblies -a autocycler
# Cluster the input contigs into putative replicons:
autocycler cluster -a autocycler
MANUAL STEP: CURATE CLUSTERS
# Trim and resolve each QC-pass cluster:
for c in autocycler/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done
MANUAL STEP: EXAMINE DOTPLOTS
MANUAL STEP: EXAMINE 4_merged.gfa FILES
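As a starting point for this examination, the number of segments in each cluster's 4_merged.gfa can be listed from the command line. The sketch below relies only on the GFA convention that segment records begin with `S`; the `count_segments` helper is hypothetical, and the assumption that a cluster with many segments deserves a closer look in a graph viewer is mine rather than a rule from Autocycler.

```shell
# Count segment ("S") records in a GFA file.
# (Hypothetical helper for illustration; not part of Autocycler.)
count_segments() {
    awk '$1 == "S" {n++} END {printf "%d\n", n}' "$1"
}

# Report the segment count for each QC-pass cluster that has a merged graph.
for c in autocycler/clustering/qc_pass/cluster_*; do
    gfa="$c/4_merged.gfa"
    [ -e "$gfa" ] || continue   # skip clusters that have not been resolved yet
    printf '%s\t%s\n' "$gfa" "$(count_segments "$gfa")"
done
```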
# Combine resolved clusters into a final assembly:
autocycler combine -a autocycler -i autocycler/clustering/qc_pass/cluster_*/5_final.gfa
The final consensus assembly will be named: autocycler/consensus_assembly.fasta
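Once the combine step has finished, a short sanity check can confirm that the output exists and report how many sequences it contains. This is a minimal sketch: the `count_contigs` helper is hypothetical, and the file path is the one named above.

```shell
# Count FASTA records (header lines starting with ">") in a file.
# (Hypothetical helper for illustration; not part of Autocycler.)
count_contigs() {
    grep -c '^>' "$1"
}

final=autocycler/consensus_assembly.fasta
if [ -s "$final" ]; then
    echo "Sequences in $final: $(count_contigs "$final")"
else
    echo "Not found or empty: $final" >&2
fi
```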
- Step 1: Autocycler subsample
- Step 2: Generating input assemblies
- Step 3: Autocycler compress
- Step 4: Autocycler cluster
- Step 5: Autocycler trim
- Step 6: Autocycler resolve
- Step 7: Autocycler combine