examples

Young edited this page Feb 9, 2024 · 10 revisions

Example Usage

There are many ways to use Grandeur, and the full documentation can be a lot to read through. This page collects common use cases and shows how to run each one both from the command line and with a config file.

General rules

There are two main ways to specify a parameter with nextflow.

On the command line, it looks like:

nextflow run <workflow> --<name of param> <value of parameter>

In a config file (named sample.config throughout this page), it looks like:

params.<name of param> = '<value of parameter>'

The config file is then specified with -c on the command line:

nextflow run <workflow> -c sample.config
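The two styles can also be combined in one run. The sketch below is only an illustration: it writes a small config file into a scratch directory (an assumption made so nothing existing is overwritten) and builds a command that also sets a parameter on the command line. The values 'from_config' and 'from_command_line' are hypothetical placeholders. Nextflow gives parameters set on the command line precedence over the same parameters set in a config file, so the command-line value for outdir would be the one used.

```shell
# Scratch directory so the sketch does not overwrite an existing sample.config.
workdir=$(mktemp -d)
config="$workdir/sample.config"

# Parameters set in the config file.
cat > "$config" <<'EOF'
params.reads  = 'reads'
params.outdir = 'from_config'
EOF

# --outdir on the command line takes precedence over params.outdir in the file.
cmd="nextflow run UPHL-BioNGS/Grandeur -profile singularity -c $config --outdir from_command_line"

# Printed rather than executed, so the sketch runs without nextflow installed.
echo "$cmd"
```

Copy the printed command into the terminal once the config file contains the intended values.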

Example 1

Running the default workflow.

The fastq files are in a directory named 'reads', and the workflow runs on local resources with singularity.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity --reads reads

Config file:

sample.config

params.reads                    = 'reads'
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 2

Default workflow, but this uses a sample sheet and Docker (so different!)

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile docker --sample_sheet sample_sheet.csv

Config file:

sample.config

params.sample_sheet             = "sample_sheet.csv"
docker.enabled                  = true
docker.runOptions               = '-u \$(id -u):\$(id -g)'
docker.sudo                     = false
docker.temp                     = '/tmp'
docker.remove                   = true
docker.registry                 = ''
docker.fixOwnership             = true
docker.engineOptions            = ''
docker.mountFlags               = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 3

Default workflow, except instead of starting from reads, we're going to start from fasta files! These could be fasta files from a prior run of Grandeur, a different workflow, downloaded from NCBI, etc. In this example, all fasta files are in a directory named 'fastas', and the end user is using singularity.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity --fastas fastas

Config file:

sample.config

params.fastas                   = 'fastas'
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 4

What if there were fastq AND fasta files?!?!?!

This is how to run the default workflow from fastq files in a directory named 'reads' and fasta files in a different directory named 'contigs', using singularity.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity --reads reads --fastas contigs

Config file:

sample.config

params.reads                    = 'reads'
params.fastas                   = 'contigs'
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 5

Using a sample sheet is mandatory for cloud-based runs. This example uses a sample sheet and demonstrates how to turn on the phylogenetic analysis subworkflow. It also includes a reference file from NCBI, which was placed in the 'reference' directory.

Command line:

# listing the relevant parameter
nextflow run UPHL-BioNGS/Grandeur -profile singularity --sample_sheet sampleSheet.csv --fastas reference --msa

# with a profile
nextflow run UPHL-BioNGS/Grandeur -profile singularity,msa --sample_sheet sampleSheet.csv --fastas reference

Config file:

sample.config

params.sample_sheet             = 'sampleSheet.csv'
params.fastas                   = 'reference'
params.msa                      = true
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 6

What if you want to use a fasta file from NCBI as the outgroup for iqtree2? This frequently throws errors because iqtree2 determines sample names independently of this workflow. In general, if the fasta file has a base filename of 'GCA_008632635.1.fna', the iqtree2_outgroup is 'GCA_008632635.1'.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity,msa --fastas fastas --iqtree2_outgroup GCA_008632635.1

Config file:

sample.config

params.fastas                   = 'fastas'
params.iqtree2_outgroup         = 'GCA_008632635.1'
params.msa                      = true
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config
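The outgroup name in this example can be derived from the fasta filename instead of typed by hand. This is a minimal sketch, assuming the file has a single trailing extension such as .fna; the path is a hypothetical placeholder. It only mirrors the naming rule described above, and iqtree2 may still name the sample differently.

```shell
# Hypothetical path to the NCBI fasta file intended as the outgroup.
fasta="fastas/GCA_008632635.1.fna"

# Drop the directory, then drop only the last extension (.fna).
filename=$(basename "$fasta")
outgroup="${filename%.*}"

# Prints GCA_008632635.1
echo "$outgroup"
```

The printed value is what would be passed to --iqtree2_outgroup. A compressed file such as one ending in .fna.gz would need a second extension removed.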

Example 7

What if you have samples in a sample sheet and you'd like to use a Kraken2 database? This example specifies a Kraken2 database in a directory named 'database' for reads listed in a sample sheet named 'sampleSheet.csv', using docker.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile docker --sample_sheet sampleSheet.csv --kraken2_db database

Config file:

sample.config

params.kraken2_db               = 'database'
params.sample_sheet             = 'sampleSheet.csv'
docker.enabled                  = true
docker.runOptions               = '-u \$(id -u):\$(id -g)'
docker.sudo                     = false
docker.temp                     = '/tmp'
docker.remove                   = true
docker.registry                 = ''
docker.fixOwnership             = true
docker.engineOptions            = ''
docker.mountFlags               = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config
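Before launching a run like this one, it can help to confirm that the directory really holds a Kraken2 database. The following is a rough sketch and not part of Grandeur: check_kraken2_db is a hypothetical helper name, and the check assumes a standard Kraken2 build, which produces hash.k2d, opts.k2d, and taxo.k2d.

```shell
# Return 0 if the directory holds the three index files of a built Kraken2 database.
check_kraken2_db() {
    db_dir=$1
    for f in hash.k2d opts.k2d taxo.k2d; do
        if [ ! -f "$db_dir/$f" ]; then
            echo "missing $db_dir/$f" >&2
            return 1
        fi
    done
    return 0
}

# 'database' is the directory name used in this example.
if check_kraken2_db database; then
    echo "database looks like a Kraken2 database"
else
    echo "database is incomplete or missing"
fi
```

The check only looks for file names; it does not verify that the files are intact.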

Example 8

What about using a custom mash reference? This example uses a custom mash reference file named 'refseqv215_mash_reference.msh' and a Kraken2 database in a directory named 'database' for fasta files in a directory named 'fastas', using singularity.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity --fastas fastas --kraken2_db database --mash_db refseqv215_mash_reference.msh

Config file:

sample.config

params.mash_db                  = 'refseqv215_mash_reference.msh'
params.kraken2_db               = 'database'
params.fastas                   = 'fastas'
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 9

Adding a genome, 'cool_genome.fasta', to the fastani references. The run uses fasta files in a directory named 'contigs', includes phylogenetic analysis, and uses singularity for container management.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity --fastas contigs --msa --fastani_ref cool_genome.fasta --current_datasets false

Config file:

sample.config

params.msa                      = true
params.fastas                   = 'contigs'
params.fastani_ref              = 'cool_genome.fasta'
params.current_datasets         = false
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 10

Skipping all "non-essential" processes while creating a phylogenetic tree for fasta files in a directory named 'contigs', using singularity.

Command line:

# using a profile
nextflow run UPHL-BioNGS/Grandeur -profile singularity,just_msa --fastas contigs
# specifying the params on the command line
nextflow run UPHL-BioNGS/Grandeur -profile singularity --fastas contigs --skip_extras --msa

Config file:

sample.config

params.msa                      = true
params.skip_extras              = true
params.fastas                   = 'contigs'
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 11

What if there's an organism that needs additional references for fastani? There is a param called 'current_datasets' that, when set to 'true', looks up each species identified by mash, kraken2 (optional), and blobtools (optional), downloads the representative genome for each relevant species, and uses those genomes with fastani. This requires an internet connection.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity --sample_sheet sampleSheet.csv --current_datasets true

Config file:

sample.config

params.current_datasets         = true
params.sample_sheet             = 'sampleSheet.csv'
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config

Example 12

Changing where the final files are written, using a sample sheet and singularity.

Command line:

nextflow run UPHL-BioNGS/Grandeur -profile singularity --sample_sheet sampleSheet.csv --outdir new_directory

Config file:

sample.config

params.sample_sheet             = 'sampleSheet.csv'
params.outdir                   = 'new_directory'
singularity.enabled             = true
singularity.autoMounts          = true
singularity.runOptions          = ''
singularity.engineOptions       = ''
singularity.cacheDir            = ''

Using the config file

nextflow run UPHL-BioNGS/Grandeur -c sample.config