-
Notifications
You must be signed in to change notification settings - Fork 136
An example of real assembly
We use a human gut dataset (SRR341725) in this tutorial.
# clone MEGAHIT to the folder "megahit"
git clone https://github.com/voutcn/megahit.git
# compile MEGAHIT
cd megahit
make -j
# test MEGAHIT
make test
# go back to the original directory
cd ..
wget http://www.bio8.cs.hku.hk/dataset/MEGAHIT/SRR341725_1.fastq.gz
wget http://www.bio8.cs.hku.hk/dataset/MEGAHIT/SRR341725_2.fastq.gz
megahit/megahit -1 SRR341725_1.fastq.gz -2 SRR341725_2.fastq.gz -o SRR341725.megahit_asm
When the assembly finishes, you will see a screen message like:
--- [Wed Aug 5 15:27:33 2015] Merging to output final contigs ---
--- [STAT] 70771 contigs, total 117990591 bp, min 200 bp, max 217183 bp, avg 1667 bp, N50 3897 bp
--- [Wed Aug 5 15:27:35 2015] ALL DONE. Time elapsed: 1992.627266 seconds ---
Assembled contigs are located in SRR341725.megahit_asm/final.contigs.fa.
Tips:
- Single-end reads could be specified by
-r
- Interleaved PE reads could be specified by
--12
-
-r
,--12
,-1
and-2
could be specified for multiple times to assemble multiple libraries; be careful that files in-1
and-2
should be paired and in order
Here we only demonstrate how to calculate the read coverage per contig and extract unassembled reads using BBMap and samtools. You may pass the contigs your own subsequent analyses.
4.1 Download & install BBMap and samtools:
# BBmap
wget http://downloads.sourceforge.net/project/bbmap/BBMap_35.24.tar.gz
tar zvxf BBMap_35.24.tar.gz
# samtools
wget https://github.com/samtools/samtools/releases/download/1.2/samtools-1.2.tar.bz2
tar xvjf samtools-1.2.tar.bz2
cd samtools-1.2 && make -j && cd ..
4.2 Align reads with bbwrap.sh
(which accepts multiple read files):
bbmap/bbwrap.sh ref=SRR341725.megahit_asm/final.contigs.fa in=SRR341725_1.fastq.gz in2=SRR341725_2.fastq.gz out=aln.sam.gz
4.3 Output per contig coverage to cov.txt
with pileup.sh
:
bbmap/pileup.sh in=aln.sam.gz out=cov.txt
4.4 Extract unmapped reads (SE to unmapped.se.fq
and PE to unmapped.pe.fq
):
samtools-1.2/samtools view -u -f4 aln.sam.gz | samtools-1.2/samtools bam2fq -s unmapped.se.fq - > unmapped.pe.fq