Healx starting differential abundance workflow #11

Merged 48 commits on Nov 14, 2022
Changes from 46 commits
Commits
d723382
Hacky install of validatefomcomponents pending review of PR
Oct 27, 2022
2dd9ebd
Hacky install of validatefomcomponents pending review of PR
Oct 27, 2022
045e566
Basic starting version of a differential abundance workflow
Oct 28, 2022
80648f1
Make plot dir output paths more friendly
Oct 28, 2022
8758ae3
Correct selectors
Oct 28, 2022
cbeefbf
Add a study name (may not need this in the end)
Oct 28, 2022
99ae45a
Apply DESeq2 versions channel fix (currently in PR)
Oct 29, 2022
4a25bf5
Apply other module fixes currently in PR
Oct 29, 2022
cb8f9e6
Apply prettier to schema
Oct 31, 2022
d607135
appease eclint
Oct 31, 2022
e300932
Add .nf-core.yml
Oct 31, 2022
6b97e7a
Don't need local input_check
Oct 31, 2022
d1b15af
Fix test profile
Oct 31, 2022
6514378
appease eclint
Oct 31, 2022
c252e96
Add starting Tower config
Oct 31, 2022
232bc1e
Fix modules.json for fixes in modules repo
Oct 31, 2022
ea93358
Install shinyngs/validatefomcomponents from nf-core
Oct 31, 2022
01fbd1e
Dummy file to keep the local subdir
Oct 31, 2022
bbcc161
Add missing module to config (wrong SHA pending PR)
Oct 31, 2022
347e5b9
Don't need genome fasta
Oct 31, 2022
6e781c6
Add missing params to nextflow.config
Oct 31, 2022
eb223be
Handle GTF input consistently with other pipelines
Oct 31, 2022
4055a63
Try issue template update
Oct 31, 2022
bc8016d
Do the easy TODOs
Oct 31, 2022
7acc7c1
Try again with Tower config
Oct 31, 2022
7b39168
Try more useful publishing for DESeq2
Oct 31, 2022
d378ad7
fix syntax issue
Oct 31, 2022
3b82e88
fix syntax issue
Oct 31, 2022
514a32d
Try more tower config
Oct 31, 2022
8602b2a
Fix tower.yml
Oct 31, 2022
7daed58
Run prettier on tower.yml
Nov 1, 2022
ad64822
Fix mime types for Tower
Nov 3, 2022
2b7c8b3
Make gzipping on input GTF optional
Nov 3, 2022
b449476
Conditional gzip version mixing
Nov 3, 2022
45a6a9e
Re-prettify schema after UI building
Nov 3, 2022
6070916
fix up gunzip conditionality
Nov 3, 2022
18182a2
(hopefully) last gunzip fix
Nov 3, 2022
5d1e9ed
Deal with possible NAs in blocking column
Nov 3, 2022
6761a65
Allow retries for exploratory
Nov 3, 2022
2fe6757
Bump resources for exploratory plotting
Nov 3, 2022
41b5922
Revert "Bump resources for exploratory plotting"
Nov 4, 2022
7fa6377
Revert "Allow retries for exploratory"
Nov 4, 2022
f8742c0
See if we can relabel resource from nextflow.config
Nov 4, 2022
04f791a
Labels not working, specify memory directly
Nov 4, 2022
8cd6607
install staticdifferential from nf-core
Nov 8, 2022
7b24450
Update DESeq2 module and configuration
Nov 8, 2022
134bd80
Strip multiqc until we figure out how to integrate it
Nov 11, 2022
916de39
Update plotting modules to exclude _files dirs
Nov 11, 2022
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -17,6 +17,7 @@ body:
description: A clear and concise description of what the bug is.
validations:
required: true

- type: textarea
id: command_used
attributes:
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -32,7 +32,7 @@ jobs:
version: "${{ matrix.NXF_VER }}"

- name: Run pipeline with test data
# TODO nf-core: You can customise CI pipeline run tests as required
# You can customise CI pipeline run tests as required
# For example: adding multiple test runs with different parameters
# Remember that you can parallelise this by using strategy.matrix
run: |
6 changes: 6 additions & 0 deletions .nf-core.yml
@@ -0,0 +1,6 @@
repository_type: pipeline
lint:
  files_unchanged:
    - assets/email_template.html
    - assets/email_template.txt
    - lib/NfcoreTemplate.groovy
14 changes: 6 additions & 8 deletions README.md
@@ -12,7 +12,7 @@

## Introduction

<!-- TODO nf-core: Write a 1-2 sentence summary of what data the pipeline is for and what it does -->
**nf-core/differentialabundance** is a bioinformatics pipeline that can be used to analyse data represented as matrices, comparing groups of observations to generate differential statistics and downstream analyses. The initial feature set is built around RNA-seq, but we anticipate rapid expansion to include other platforms.

**nf-core/differentialabundance** is a bioinformatics best-practice analysis pipeline for differential abundance analysis.

@@ -25,10 +25,10 @@ On release, automated continuous integration tests run the pipeline on a full-si

## Pipeline summary

<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->

1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
1. Generate a list of genomic feature annotations using the input GTF file.
2. Cross-check matrices, sample annotations, feature set and contrasts to ensure consistency.
3. Run differential analysis over all contrasts specified.
4. Generate exploratory and differential analysis plots for interpretation.
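
As a rough illustration of step 2 above, the cross-check can be thought of along the following lines. This is a minimal Python sketch, not the validator the pipeline actually calls; the function name, the `sample` ID column and the dictionary keys are assumptions made for the example.

```python
def check_consistency(samples, matrix_columns, contrasts, id_column="sample"):
    """Collect readable problems rather than stopping at the first one.

    samples: list of dicts, one per sample sheet row (hypothetical layout).
    matrix_columns: column names found in the abundance matrix.
    contrasts: list of dicts with 'variable', 'reference' and 'target' keys.
    """
    problems = []

    # Every sample in the sheet should have a column in the matrix.
    missing = sorted({row[id_column] for row in samples} - set(matrix_columns))
    if missing:
        problems.append("samples missing from matrix: " + ", ".join(missing))

    for contrast in contrasts:
        variable = contrast["variable"]
        levels = {row.get(variable) for row in samples}
        # The contrast variable must be a sample sheet column.
        if levels == {None}:
            problems.append(f"unknown contrast variable: {variable}")
            continue
        # Both sides of the comparison must occur in that column.
        for key in ("reference", "target"):
            if contrast[key] not in levels:
                problems.append(f"{key} level '{contrast[key]}' not found in '{variable}'")
    return problems


samples = [
    {"sample": "s1", "treatment": "control"},
    {"sample": "s2", "treatment": "treated"},
]
contrasts = [
    {"variable": "treatment", "reference": "control", "target": "treated"},
    {"variable": "batch", "reference": "a", "target": "b"},
]
problems = check_consistency(samples, ["gene_id", "s1"], contrasts)
print(problems)
```

Returning a list of problems, instead of raising on the first, lets a user fix a sample sheet and a contrasts file in one pass.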

## Quick Start

@@ -51,10 +51,8 @@ On release, automated continuous integration tests run the pipeline on a full-si

4. Start running your own analysis!

<!-- TODO nf-core: Update the example "typical command" below used to run the pipeline -->

```bash
nextflow run nf-core/differentialabundance --input samplesheet.csv --outdir <OUTDIR> --genome GRCh37 -profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
nextflow run nf-core/differentialabundance --input samplesheet.csv --contrasts contrasts.csv --matrix assay_matrix.tsv --outdir <OUTDIR> --genome GRCh37 -profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
```

## Documentation
4 changes: 1 addition & 3 deletions conf/base.config
@@ -10,7 +10,6 @@

process {

// TODO nf-core: Check the defaults for all processes
cpus = { check_max( 1 * task.attempt, 'cpus' ) }
memory = { check_max( 6.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
@@ -24,7 +23,6 @@ process {
// These labels are used and recognised by default in DSL2 files hosted on nf-core/modules.
// If possible, it would be nice to keep the same label naming convention when
// adding in your local modules too.
// TODO nf-core: Customise requirements for specific processes.
// See https://www.nextflow.io/docs/latest/config.html#config-process-selectors
withLabel:process_single {
cpus = { check_max( 1 , 'cpus' ) }
@@ -38,7 +36,7 @@ process {
}
withLabel:process_medium {
cpus = { check_max( 6 * task.attempt, 'cpus' ) }
memory = { check_max( 36.GB * task.attempt, 'memory' ) }
memory = { check_max( 2.GB * task.attempt, 'memory' ) }
time = { check_max( 8.h * task.attempt, 'time' ) }
}
withLabel:process_high {
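
To make the effect of the `process_medium` memory change concrete, here is a small Python paraphrase of how a request such as `check_max( 2.GB * task.attempt, 'memory' )` grows across retries. The `check_max` below is a simplified stand-in for the nf-core template helper, and the 6 GB ceiling is an assumed value for `params.max_memory`, not something taken from this diff.

```python
def check_max(requested, limit):
    """Simplified stand-in: return the request, capped at the configured limit."""
    return min(requested, limit)


def memory_for_attempt(base_gb, attempt, max_memory_gb):
    """Mirror `check_max( base.GB * task.attempt, 'memory' )` in whole gigabytes."""
    return check_max(base_gb * attempt, max_memory_gb)


# process_medium now starts at 2 GB; the 6 GB cap is an assumption for this example.
requests = [memory_for_attempt(2, attempt, max_memory_gb=6) for attempt in (1, 2, 3, 4)]
print(requests)
```

Under these assumptions the request doubles on the second attempt and then stays at the cap, which is why a lower base value mostly matters for the first attempt.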
63 changes: 56 additions & 7 deletions conf/modules.config
@@ -13,26 +13,75 @@
process {

publishDir = [
path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" },
path: { "${params.outdir}/${params.study_name}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]

withName: SAMPLESHEET_CHECK {
withName: GUNZIP_GTF {
publishDir = [
path: { "${params.outdir}/pipeline_info" },
enabled: false
]
}

withName: GTF_TO_TABLE {
publishDir = [
enabled: false
]
ext.args = "--feature-type transcript"
}

withName: VALIDATOR {
publishDir = [
enabled: false
]
}

withName: DESEQ2_DIFFERENTIAL {
publishDir = [
[
path: { "${params.outdir}/${params.study_name}/tables/differential" },
mode: params.publish_dir_mode,
pattern: '*.deseq2.results.tsv'
],
[
path: { "${params.outdir}/${params.study_name}/tables/processed_counts" },
mode: params.publish_dir_mode,
pattern: '*.{normalised_counts,vst,rlog}.tsv'
],
[
path: { "${params.outdir}/${params.study_name}/plots/qc" },
mode: params.publish_dir_mode,
pattern: '*.png'
],
[
path: { "${params.outdir}/${params.study_name}/tables/deseq2_other" },
mode: params.publish_dir_mode,
pattern: '*.{rds,sizefactors.tsv,sessionInfo.log}'
]
]
ext.args = { "--vst_nsub 500 --contrast_variable $meta.variable --reference_level $meta.reference --treatment_level $meta.target --blocking_variables $meta.blocking" }
}
withName: PLOT_EXPLORATORY {
publishDir = [
path: { "${params.outdir}/${params.study_name}/plots/exploratory" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
memory = { check_max( 12.GB * task.attempt, 'memory' ) }
ext.args = "--assay_names raw,normalised,variance_stabilised --final_assay variance_stabilised"
}

withName: FASTQC {
ext.args = '--quiet'
withName: PLOT_DIFFERENTIAL {
publishDir = [
path: { "${params.outdir}/${params.study_name}/plots/differential" },
mode: params.publish_dir_mode,
]
ext.args = { "--reference_level $meta.reference --treatment_level $meta.target" }
}

withName: CUSTOM_DUMPSOFTWAREVERSIONS {
publishDir = [
path: { "${params.outdir}/pipeline_info" },
path: { "${params.outdir}/${params.study_name}/pipeline_info" },
mode: params.publish_dir_mode,
pattern: '*_versions.yml'
]
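
The publishing rules in this file can be sketched in Python to show where outputs would land. This is an approximation for illustration only: `tokenize` imitates Groovy's `String.tokenize`, the brace alternatives in the DESeq2 patterns are expanded by hand, and the process and file names used below are hypothetical.

```python
from fnmatch import fnmatchcase


def tokenize(text, delimiter):
    """Approximate Groovy's String.tokenize: split and drop empty tokens."""
    return [token for token in text.split(delimiter) if token]


def default_publish_dir(outdir, study_name, process_name):
    """Mirror the default path closure: last ':' segment, first '_' token, lower-cased."""
    simple_name = tokenize(process_name, ":")[-1]
    return f"{outdir}/{study_name}/{tokenize(simple_name, '_')[0].lower()}"


# Patterns copied from the DESEQ2_DIFFERENTIAL block; '{a,b}' alternatives are
# written out one by one because fnmatch does not understand brace syntax.
DESEQ2_RULES = [
    ("tables/differential", ["*.deseq2.results.tsv"]),
    ("tables/processed_counts", ["*.normalised_counts.tsv", "*.vst.tsv", "*.rlog.tsv"]),
    ("plots/qc", ["*.png"]),
    ("tables/deseq2_other", ["*.rds", "*.sizefactors.tsv", "*.sessionInfo.log"]),
]


def deseq2_publish_dirs(filename):
    """Return every subdirectory whose pattern matches; rules are checked independently."""
    return [
        subdir
        for subdir, patterns in DESEQ2_RULES
        if any(fnmatchcase(filename, pattern) for pattern in patterns)
    ]


# Hypothetical fully qualified process name and output file names.
print(default_publish_dir("results", "study", "EXAMPLE_PIPELINE:EXAMPLE_WORKFLOW:MYTOOL_SUMMARISE"))
for name in ("treated_vs_control.deseq2.results.tsv", "all.vst.tsv", "versions.yml"):
    print(name, deseq2_publish_dirs(name))
```

A file that matches no pattern, such as `versions.yml` here, is simply not published by any of the four rules.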
12 changes: 7 additions & 5 deletions conf/test.config
@@ -20,10 +20,12 @@ params {
max_time = '6.h'

// Input data
// TODO nf-core: Specify the paths to your test data on nf-core/test-datasets
// TODO nf-core: Give any required params for the test so that command line flags are not needed
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv'

// Genome references
genome = 'R64-1-1'
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/rnaseq_expression/SRP254919.samplesheet.csv'
matrix = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/rnaseq_expression/SRP254919.salmon.merged.gene_counts.top1000cov.tsv'
contrasts = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/rnaseq_expression/SRP254919.contrasts.csv'

// To do: replace this with a cut-down mouse GTF matching the matrix for testing
gtf = 'https://ftp.ensembl.org/pub/release-81/gtf/mus_musculus/Mus_musculus.GRCm38.81.gtf.gz'

}
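
For orientation, the new `contrasts` parameter feeds the `$meta.variable`, `$meta.reference`, `$meta.target` and `$meta.blocking` values used in the `DESEQ2_DIFFERENTIAL` arguments. The Python sketch below shows one way a contrasts row could expand into that argument string. The CSV column names and the example row are assumptions inferred from the `meta` keys, not taken from the test file itself.

```python
import csv
import io

# Same flags as the ext.args string for DESEQ2_DIFFERENTIAL in conf/modules.config.
ARGS_TEMPLATE = (
    "--vst_nsub 500 --contrast_variable {variable} --reference_level {reference} "
    "--treatment_level {target} --blocking_variables {blocking}"
)


def deseq2_args(contrasts_csv_text):
    """Build one argument string per contrast row (column names are assumed)."""
    reader = csv.DictReader(io.StringIO(contrasts_csv_text))
    # str.format ignores unused keys such as 'id'.
    return [ARGS_TEMPLATE.format(**row) for row in reader]


# Hypothetical contrasts file with a single comparison.
example = (
    "id,variable,reference,target,blocking\n"
    "treated_vs_control,treatment,control,treated,batch\n"
)
args = deseq2_args(example)
print(args[0])
```

Each row yields its own argument string, which matches the idea of running the differential step once per contrast.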
8 changes: 4 additions & 4 deletions lib/WorkflowDifferentialabundance.groovy
@@ -13,10 +13,10 @@ class WorkflowDifferentialabundance {
genomeExistsError(params, log)


if (!params.fasta) {
log.error "Genome fasta file not specified with e.g. '--fasta genome.fa' or via a detectable config file."
System.exit(1)
}
// if (!params.fasta) {
// log.error "Genome fasta file not specified with e.g. '--fasta genome.fa' or via a detectable config file."
// System.exit(1)
// }
}

//
2 changes: 1 addition & 1 deletion main.nf
@@ -18,7 +18,7 @@ nextflow.enable.dsl = 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

params.fasta = WorkflowMain.getGenomeAttribute(params, 'fasta')
params.gtf = WorkflowMain.getGenomeAttribute(params, 'gtf')

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
18 changes: 15 additions & 3 deletions modules.json
@@ -5,25 +5,37 @@
"https://github.com/nf-core/modules.git": {
"modules": {
"nf-core": {
"atlasgeneannotationmanipulation/gtf2featureannotation": {
"branch": "master",
"git_sha": "b5f51344caec9cbb7c0b650623069d5a254897a4"
},
"custom/dumpsoftwareversions": {
"branch": "master",
"git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
},
"deseq2/differential": {
"branch": "master",
"git_sha": "72900e24112957e65c53763f59ade5aca6322ea7"
"git_sha": "46e4580ae2e6d361c2c026bb7466e807093a4e18"
},
"fastqc": {
"gunzip": {
"branch": "master",
"git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
},
"multiqc": {
"branch": "master",
"git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
},
"shinyngs/staticdifferential": {
"branch": "master",
"git_sha": "36c30c95b8d3fa485939fdeca7325c98cdd9c73e"
},
"shinyngs/staticexploratory": {
"branch": "master",
"git_sha": "bf69d2f911a8aae8d06a6d3fa7ca8bf462c41e9e"
"git_sha": "6d26aab1ab4fdeaabac866924dcfb4370ff5eeac"
},
"shinyngs/validatefomcomponents": {
"branch": "master",
"git_sha": "72695ee4b283a8f479fe503ee91d19fde160508f"
}
}
}
Empty file added modules/local/.gitkeep
Empty file.
27 changes: 0 additions & 27 deletions modules/local/samplesheet_check.nf

This file was deleted.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions modules/nf-core/deseq2/differential/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.