This document describes the output produced by the pipeline.
The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
The pipeline is built using Nextflow and processes data using the following steps:
- Group reads - Group reads in specific regions according to the genotypes at the selected markers
- Call variants - Jointly call variants in the meta-samples for each variant of interest
- Predict variant effects - Predict variant effects using ENSEMBL VEP
Group sample reads around a region of interest according to user-defined grouping criteria and the genotypes at a selected marker.
Output files
- `reads/group_<GROUP_ID>_variant_<VARIANT_ID>_gt_<GT>/`
  - `*.cram`: sequencing reads in CRAM format
  - `*.crai`: CRAM file index
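The per-group directory name above is assembled from the group, variant and genotype identifiers. As a minimal sketch (the identifier values are hypothetical, and replacing `/` in the genotype with `-` is an assumption about how `<GT>` is encoded in the path):

```python
def group_dir(group_id: str, variant_id: str, gt: str) -> str:
    """Build the reads/ subdirectory name for one meta-sample.

    Assumes '/' in a diploid genotype string (e.g. '0/1') is replaced
    with '-' so the value is safe to use in a directory name.
    """
    safe_gt = gt.replace("/", "-")
    return f"reads/group_{group_id}_variant_{variant_id}_gt_{safe_gt}"

print(group_dir("controls", "rs123", "0/1"))
# reads/group_controls_variant_rs123_gt_0-1
```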
Use GATK4 joint germline variant calling to detect variants in the grouped sequencing reads, per group and genotype. Grouping by genotype makes it possible to detect variants in linkage disequilibrium with the marker of interest.
Output files
- `variants/variant_<VARIANT_ID>/`
  - `*.vcf.gz`: variant calls for all the meta-samples in VCF format
  - `*.tbi`: VCF file index
Use ENSEMBL VEP on the variant calls obtained in the previous step to determine variant consequence.
Output files
- `variants/variant_<VARIANT_ID>/`
  - `*.vep.tsv.gz`: variant consequence predictions
  - `*.mut.gz`: variant consequences formatted so that they can be loaded directly into IGV
  - `*.vep.summary.html`: HTML report from VEP
Output files
- `pipeline_info/`
  - Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`.
  - Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email`/`--email_on_fail` parameters are used when running the pipeline.
  - Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.
Nextflow provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage.