seq_pipeline

Running the pipeline

Run the pipeline on the local machine, with conda environments, using:

snakemake --cores {cores} --resources ncbi_connection=1 --use-conda --conda-frontend mamba all
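When the pipeline is launched from a script rather than typed by hand, the command above can be assembled as an argument list. This is only a sketch: `build_snakemake_command` is a hypothetical helper name, not part of the repository, and the flags are copied from the command above.

```python
import shlex


def build_snakemake_command(cores: int, target: str = "all") -> list[str]:
    """Assemble the documented snakemake invocation as an argument list.

    Hypothetical helper for illustration; it is not part of seq_pipeline.
    """
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return [
        "snakemake",
        "--cores", str(cores),
        "--resources", "ncbi_connection=1",
        "--use-conda",
        "--conda-frontend", "mamba",
        target,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before it is run.
    print(shlex.join(build_snakemake_command(8)))
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the repository root.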

Config File `config.yml` and Metadata File `metadata.tsv`

For examples of different config files, see parameter_templates/.

Aligners

The pipeline supports either STAR or bowtie2 for alignment, selected in config.yml with aligner: star or aligner: bowtie2.

Input Files

Currently the pipeline only accepts paired-end data, except for bam file inputs.

It expects readlength: {the_readlength} in config.yml. Optionally, adapters to trim before aligning can be specified in config.yml (see parameter_templates/adapter_trimming.yml); the type of adapter present is passed on to cutadapt.
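The aligner and readlength settings described above can be checked before a run. The sketch below assumes config.yml has already been loaded into a plain dict (for example with a YAML parser); `check_alignment_config` is a hypothetical name, and the pipeline itself may validate these values differently or not at all.

```python
SUPPORTED_ALIGNERS = {"star", "bowtie2"}


def check_alignment_config(config: dict) -> list[str]:
    """Return a list of problems with the aligner/readlength settings."""
    problems = []

    aligner = config.get("aligner")
    if aligner not in SUPPORTED_ALIGNERS:
        problems.append(
            f"aligner must be one of {sorted(SUPPORTED_ALIGNERS)}, got {aligner!r}"
        )

    readlength = config.get("readlength")
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_int = isinstance(readlength, int) and not isinstance(readlength, bool)
    if not is_int or readlength <= 0:
        problems.append(f"readlength must be a positive integer, got {readlength!r}")

    return problems


if __name__ == "__main__":
    for problem in check_alignment_config({"aligner": "star", "readlength": 150}):
        print(problem)
```

Returning a list of messages instead of raising on the first error lets every problem be reported in one pass.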

SRR

Set metadata_files: srr in config.yml and use SRRxxxxxxxx accessions as the sample_ids in metadata.tsv.

fastq

Set metadata_files: fastq in config.yml and provide R1 and R2 columns in metadata.tsv with paths to the respective fastq files.

bam

Set metadata_files: bam in config.yml and provide a bam column in metadata.tsv with paths to the respective bam files.
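The three input modes differ only in which metadata.tsv columns they need, which can be written down as a small lookup table. The sketch below is illustrative: `check_input_metadata` is a hypothetical name, the identifier column is assumed to be called sample_id, and the accession pattern (SRR followed by digits) is an assumption based on the SRRxxxxxxxx placeholder.

```python
import csv
import io
import re

# Columns each input mode needs in addition to sample_id.
MODE_COLUMNS = {"srr": [], "fastq": ["R1", "R2"], "bam": ["bam"]}
# Assumption: SRR accessions are "SRR" followed by digits.
SRR_PATTERN = re.compile(r"SRR\d+")


def check_input_metadata(metadata_files: str, tsv_text: str) -> list[str]:
    """Return problems found in metadata.tsv for the given input mode."""
    if metadata_files not in MODE_COLUMNS:
        return [f"unknown metadata_files value: {metadata_files!r}"]

    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    required = ["sample_id"] + MODE_COLUMNS[metadata_files]
    missing = [column for column in required if column not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sample_id = row["sample_id"] or ""
        if metadata_files == "srr" and not SRR_PATTERN.fullmatch(sample_id):
            problems.append(f"line {line_no}: {sample_id!r} is not an SRR accession")
        for column in MODE_COLUMNS[metadata_files]:
            if not row[column]:
                problems.append(f"line {line_no}: empty {column} column")
    return problems


if __name__ == "__main__":
    example = "sample_id\tR1\tR2\ns1\ts1_1.fastq.gz\ts1_2.fastq.gz\n"
    print(check_input_metadata("fastq", example))
```

The check only looks at the table itself; it does not test whether the listed fastq or bam paths exist on disk.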

Pipelines

Bulk

Runs QC and generates a basic count matrix ready for downstream analysis.

Ripseq

The pipeline runs peak calling using MACS2, and then runs the data through various peak detection methods:

IDR
PePr
DEQ
Thor
Genrich

before saving pileup summaries of these analysis tools. The pipeline strictly requires a unique matched input control for each condition.

The metadata.tsv file requires one line per sample with the following columns:

sample_id
condition
method: IP or Input
matching_input_control: the sample_id of the matching input control

The config file requires the following terms in config.yml:

control_condition: sets the condition that all other conditions are compared against

See parameter_templates/ripseq_config.yml and parameter_templates/ripseq_metadata.tsv for an example template
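The Ripseq metadata and config requirements above lend themselves to a pre-flight check. The sketch below is one possible reading, not the pipeline's own validation: `check_ripseq_metadata` is a hypothetical name, and "unique matched input control for each condition" is interpreted here as "an Input sample may not serve as the control for more than one condition".

```python
import csv
import io


def check_ripseq_metadata(tsv_text: str, control_condition: str) -> list[str]:
    """Return problems found in a Ripseq metadata.tsv."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    by_id = {row["sample_id"]: row for row in rows}
    problems = []

    if control_condition not in {row["condition"] for row in rows}:
        problems.append(f"control_condition {control_condition!r} is not in metadata.tsv")

    control_owner = {}  # input control sample_id -> condition that uses it
    for row in rows:
        sample_id, method = row["sample_id"], row["method"]
        if method not in ("IP", "Input"):
            problems.append(f"{sample_id}: method must be IP or Input, got {method!r}")
            continue
        if method != "IP":
            continue  # only IP samples need a matching input control

        control_id = row["matching_input_control"]
        control = by_id.get(control_id)
        if control is None or control["method"] != "Input":
            problems.append(f"{sample_id}: {control_id!r} is not an Input sample")
            continue

        # Assumed interpretation: one control must not span several conditions.
        owner = control_owner.setdefault(control_id, row["condition"])
        if owner != row["condition"]:
            problems.append(f"{sample_id}: control {control_id!r} already used by {owner!r}")
    return problems


if __name__ == "__main__":
    example = (
        "sample_id\tcondition\tmethod\tmatching_input_control\n"
        "ip_wt\twt\tIP\tin_wt\n"
        "in_wt\twt\tInput\t\n"
    )
    print(check_ripseq_metadata(example, control_condition="wt"))
```

The sample names and conditions in the example are made up; parameter_templates/ripseq_metadata.tsv remains the authoritative template.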

Stamp

The pipeline runs the Bullseye C-to-T editing pipeline and performs analysis on the data, plotting pileups on edited genes, optionally alongside other bam files.

The metadata.tsv file requires one line per sample with the following columns:

sample_id
condition
method: IP or Input
matching_{condition}: One column per condition providing the matching samples of the same condition

The config file requires the following terms in config.yml:

complex_comparisons: allowing comparisons of one condition against multiple conditions
simple_comparisons: allowing comparisons of pairs of conditions
display_order: determining the order of conditions in the plots

See parameter_templates/stamp_config.yml and parameter_templates/stamp_metadata.tsv for an example template
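Because Stamp needs one matching_{condition} column per condition, the expected column names can be derived from the condition column itself. The sketch below assumes the metadata rows and config.yml are already loaded as Python dicts, and that display_order is a list of condition names; `check_stamp_inputs` is a hypothetical name, and the structure of the comparison settings is not inspected because it is not described here.

```python
REQUIRED_CONFIG_KEYS = ("complex_comparisons", "simple_comparisons", "display_order")


def check_stamp_inputs(rows: list[dict], config: dict) -> list[str]:
    """Return problems found in Stamp metadata rows and config."""
    problems = [
        f"config.yml is missing {key}"
        for key in REQUIRED_CONFIG_KEYS
        if key not in config
    ]

    conditions = sorted({row["condition"] for row in rows})
    columns = set(rows[0]) if rows else set()
    for condition in conditions:
        column = f"matching_{condition}"
        if column not in columns:
            problems.append(f"metadata.tsv is missing column {column}")

    # Assumption: display_order is a list of condition names.
    for name in config.get("display_order", []):
        if name not in conditions:
            problems.append(f"display_order names unknown condition {name!r}")
    return problems


if __name__ == "__main__":
    rows = [
        {"sample_id": "a", "condition": "wt", "method": "IP", "matching_wt": "a"},
    ]
    config = {
        "complex_comparisons": {},
        "simple_comparisons": [],
        "display_order": ["wt"],
    }
    print(check_stamp_inputs(rows, config))
```

The rows can be produced with `csv.DictReader(..., delimiter="\t")`; see parameter_templates/stamp_config.yml for the real shape of the comparison entries.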

Dependencies

These are downloaded automatically by the pipeline when it is run using conda.

Base

snakemake
sra-tools
parallel-fastq-dump
samtools
[bedtools](http://bioinformatics.oxfordjournals.org/content/26/6/841.short)
pybedtools
tidyverse
pandas
STAR
bowtie2
cutadapt
fastqc
multiqc
biomaRt
GenomicRanges
Rsamtools
rtracklayer
ggseqlogo

STAMP

BULLSEYE

Ripseq

MACS2
IDR
piranha
genrich
THOR
PePr
DEQ modified to run on more modern versions of R
