Structural Variation AUTOmated PIpeLine Optimization Tool

by: Wai Yi Leung, Tobias Marschall, Laurent Falquet, Yogesh Paudel, Hailiang Mei, Alex Schoenhuth and Tiffanie Yael Moss

This repository is used to store scripts written during the hackathon of ALLBio Testcase 2.

We aim to provide:

  • a pipeline for automated structural variation (SV) calling
  • an automated approach for benchmarking (new) SV tools.

More information about the project can be found at the following websites:

ALLBio Bioinformatics, Testcase#2, Google site, members only!

How to install

Clone this repository from GitHub into your home folder, where it is stored as allbiotc2:

cd ~
git clone
cd allbiotc2/
make install

The make install command will do a system-wide install. This step requires sudo rights.

Installation instructions for sysadmins (advanced)

The installation scripts are located in the following repository. These scripts were used to install the workshop-ready and production-ready virtual machines.

Comments are welcome via the GitHub ticketing system.

Preprocessing reference VCF (optional)

If the reference calls are provided in SDI format, they can be converted from SDI to VCF as follows:

make -f ../scripts/Makefile \
    REFERENCE_VCF=~/myworkdir/ref_all.complete.vcf \
    SDI_FILE=~/myworkdir/ler_0.v7c.sdi

Installing the software

All software used by the pipeline is installed in one central location, laid out as follows:

allbio@workbench:/virdir/Scratch/software$ tree -L 1
├── bowtie2-2.1.0
├── breakdancer
├── bwa-0.7.4
├── circos-0.63-4
├── clever-sv
├── delly_v0.0.9
├── dwac-seq0.7
├── FastQC
├── gasv
├── picard-tools-1.86
├── pindel
├── PRISM_1_1_6
├── samtools-0.1.19
├── sickle-master
└── SVDetect_r0.8b
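
Before running the pipeline, it can help to confirm that the central software directory is complete. The following is a minimal sketch rather than part of the repository: the function name check_tools is hypothetical, and the directory names are copied from the listing above, so they need adjusting if other versions are installed.

```shell
#!/bin/sh
# Sketch: report which expected tool directories are missing under a
# software root. check_tools is a hypothetical helper, not a pipeline script.
check_tools() {
    programs_dir=$1
    missing=0
    for tool in bowtie2-2.1.0 breakdancer bwa-0.7.4 circos-0.63-4 clever-sv \
                delly_v0.0.9 dwac-seq0.7 FastQC gasv picard-tools-1.86 pindel \
                PRISM_1_1_6 samtools-0.1.19 sickle-master SVDetect_r0.8b; do
        if [ ! -d "$programs_dir/$tool" ]; then
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every directory was found.
    [ "$missing" -eq 0 ]
}
```

For example, check_tools /virdir/Scratch/software prints one line per missing directory and returns a non-zero status if any are absent.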

Running the pipeline

The pipeline is configured through variables, which can be set in the configuration or passed on the command line when invoking the pipeline.

The most important and required variables are:

  • PROGRAMS: Path to the directory where the programs are installed
  • PYTHON_EXE: Path to the Python executable; defaults to python (the system-distributed version)
  • REFERENCE_DIR: Path to the reference
  • REFERENCE_VCF: Full path to the VCF file with reference SV calls for benchmarking
  • FASTQ_EXTENSION: Filename extension of the FastQ files
  • PEA_MARK: Marker in the filename of the left reads of a FastQ pair: sample-PEA_MARK.FASTQ_EXTENSION
  • PEB_MARK: Marker in the filename of the right reads of a FastQ pair: sample-PEB_MARK.FASTQ_EXTENSION
  • *_THREADS: Number of cores to be used by the programs.
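
To make the naming variables concrete, here is a small sketch of how a sample name, a mark and the extension could combine into the paired FastQ filenames. It assumes the filename is the sample name followed directly by the mark, a dot and the extension. This is inferred from the example value PEA_MARK=.1 and from filenames such as sim-reads.511_10.1.fastq, and is not confirmed against the Makefile. The helper names fastq_pair and sample_name are hypothetical.

```shell
#!/bin/sh
# Sketch under an assumed naming rule: <sample><MARK>.<FASTQ_EXTENSION>

# Print the left and right FastQ filenames for one sample.
# $1 = sample, $2 = PEA_MARK, $3 = PEB_MARK, $4 = FASTQ_EXTENSION
fastq_pair() {
    printf '%s%s.%s\n' "$1" "$2" "$4"
    printf '%s%s.%s\n' "$1" "$3" "$4"
}

# Recover the sample name by stripping "<MARK>.<FASTQ_EXTENSION>" from a filename.
# $1 = filename, $2 = mark, $3 = FASTQ_EXTENSION
sample_name() {
    suffix="$2.$3"
    printf '%s\n' "${1%"$suffix"}"
}
```

Under this assumption, fastq_pair sim-reads.511_10 .1 .2 fastq prints sim-reads.511_10.1.fastq and sim-reads.511_10.2.fastq, and sample_name sim-reads.511_10.1.fastq .1 fastq prints sim-reads.511_10.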

Example invocation of the pipeline:


make -f ../scripts/Makefile \
    REFERENCE_DIR=../input/reference_tair9 \
    PEA_MARK=.1 \
    PEB_MARK=.2
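
A pre-flight check on REFERENCE_DIR can catch a wrong path before a long run starts. The sketch below assumes the reference directory is laid out like the one in the example directory listing in this README (reference.fa, reference.fa.fai, and bowtie2 and bwa index directories). Whether the pipeline strictly requires all four is an assumption, and check_reference is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: verify that a reference directory has the assumed layout.
# The expected names are taken from the example directory listing,
# not from the Makefile itself.
check_reference() {
    ref_dir=$1
    status=0
    for f in reference.fa reference.fa.fai; do
        if [ ! -f "$ref_dir/$f" ]; then
            echo "missing file: $ref_dir/$f"
            status=1
        fi
    done
    for d in bowtie2 bwa; do
        if [ ! -d "$ref_dir/$d" ]; then
            echo "missing directory: $ref_dir/$d"
            status=1
        fi
    done
    return "$status"
}
```

It can be chained in front of the invocation, for example check_reference ../input/reference_tair9 && make -f ../scripts/Makefile REFERENCE_DIR=../input/reference_tair9 PEA_MARK=.1 PEB_MARK=.2.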

Example setup of pipeline directories

allbio@workbench:/opt/allbio/runs/synthetic_run$ tree -L 1
├── input
│   ├── reference_tair10
│   │   ├── bowtie2
│   │   ├── bwa
│   │   ├── reference.fa
│   │   └── reference.fa.fai
│   ├── sim-reads_1.fastq
│   ├── sim-reads_2.fastq
│   ├── sim-reads.409_10.1.fastq
│   ├── sim-reads.409_10.2.fastq
│   ├── sim-reads.511_10.1.fastq
│   └── sim-reads.511_10.2.fastq
├── log
├── run_integrationtest
│   ├── bd.cfg
│   ├── comparison.tex
│   ├──
│   ├── sim-read-511_10.1.fastq -> ../input/sim-reads.511_10.1.fastq
│   ├── sim-read-511_10.1.filtersync.stats
│   ├──
│   ├── sim-read-511_10.1.trimmed.fastq
│   ├── sim-read-511_10.2.fastq -> ../input/sim-reads.511_10.2.fastq
│   ├── sim-read-511_10.2.trimmed.fastq
│   ├── sim-read-511_10.bam
│   ├── sim-read-511_10.bam.bai
│   ├──
│   ├── sim-read-511_10.breakdancer
│   ├── sim-read-511_10.delly
│   ├── sim-read-511_10.delly.vcf
│   ├── sim-read-511_10.flagstat
│   ├── sim-read-511_10.gasv
│   ├── sim-read-511_10.gasv.vcf
│   ├── sim-read-511_10.pindel
│   ├── sim-read-511_10.pindel.vcf
│   ├── sim-read-511_10.prism
│   ├── sim-read-511_10.prism.vcf
│   ├── sim-read-511_10.raw_fastqc
│   ├── sim-read-511_10.sam
│   ├── sim-read-511_10.trimmed_fastqc
│   └── sim-read-511_10.unsort.bam
└── scripts
    └── Makefile -> ~/allbiotc2/Makefile
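
The layout above can be prepared with a few commands. The following sketch creates the input, log, run and scripts directories, links the Makefile into scripts, and links each FastQ file from input into the run directory with a relative target. The function name setup_run and its arguments are hypothetical. Unlike the listing, where link names differ slightly from their targets, this sketch reuses the input filenames unchanged.

```shell
#!/bin/sh
# Sketch: prepare a run directory tree similar to the example listing.
# $1 = run root, $2 = name of the run directory, $3 = path to the pipeline Makefile
setup_run() {
    root=$1
    run=$2
    makefile=$3
    mkdir -p "$root/input" "$root/log" "$root/$run" "$root/scripts"
    # Symlink the Makefile so that ../scripts/Makefile resolves from the run directory.
    ln -sf "$makefile" "$root/scripts/Makefile"
    # Link every FastQ file from input/ into the run directory (relative links).
    for fq in "$root"/input/*.fastq; do
        [ -e "$fq" ] || continue   # skip the unexpanded pattern when input/ is empty
        name=$(basename "$fq")
        ln -sf "../input/$name" "$root/$run/$name"
    done
}
```

For example, setup_run /opt/allbio/runs/synthetic_run run_integrationtest ~/allbiotc2/Makefile sets up the tree, after which the pipeline can be invoked from inside run_integrationtest with make -f ../scripts/Makefile.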