Healx starting differential abundance workflow #11

pinin4fjords · 2022-10-28T22:16:54Z

As a demonstration and a basis for discussion, this WIP PR shows my take on a differential abundance workflow built from various recently developed modules.

One or two modules are still awaiting approval and others are awaiting PR approvals for fixes, but all should work fine as committed here. I run it locally like this:

nextflow run -resume -profile mamba main.nf \
    --input $(pwd)/testdata/SRP254919.samplesheet.csv \
    --gtf $(pwd)/testdata/genes.gtf.gz \
    --contrasts $(pwd)/testdata/SRP254919.contrasts.csv \
    --matrix $(pwd)/testdata/SRP254919.salmon.merged.gene_counts.top1000cov.tsv \
    --outdir $(pwd)/testdata/output

All test files are available in the test data repo, apart from the GTF, which I retrieved from iGenomes for mouse.

The steps are:

Make a feature annotation table from a GTF.
Make a feature/observation/matrix composite from the features, samples and input matrix (I see this as being transferable to features other than genes in future).
Run a validation to check the internal consistency of features, samples, matrix and contrasts.
Run differential expression analysis per contrast with DESeq2.
Run an exploratory analysis on matrix outputs, with separate coloring for each unique variable used to define contrasts (see notes in workflow comment).
Generate volcano plots per contrast.
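
To make the validation step more concrete, here is a minimal Python sketch of the kind of cross-checks it implies. It is not the code the pipeline's validation module runs. The column names (`sample`, `gene_id`, `variable`, `reference`, `target`, `blocking`), the inline example data and the helper names `read_table` and `validate_inputs` are all assumptions made for illustration.

```python
import csv
import io


def read_table(text, delimiter=","):
    """Parse delimited text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def validate_inputs(samples, matrix, contrasts,
                    sample_col="sample", feature_col="gene_id"):
    """Return a list of consistency problems (empty when everything agrees).

    Hypothetical helper: column names are assumed, not taken from the pipeline.
    """
    problems = []
    sample_ids = [row[sample_col] for row in samples]
    if len(set(sample_ids)) != len(sample_ids):
        problems.append("duplicate sample IDs in sample sheet")

    # Every sample in the sheet needs a matching column in the matrix.
    matrix_cols = set(matrix[0].keys()) - {feature_col} if matrix else set()
    for sid in sample_ids:
        if sid not in matrix_cols:
            problems.append(f"sample '{sid}' has no column in the matrix")

    # Contrast variables, levels and blocking factors must exist in the sheet.
    sample_vars = set(samples[0].keys()) if samples else set()
    for contrast in contrasts:
        var = contrast["variable"]
        if var not in sample_vars:
            problems.append(f"contrast variable '{var}' not in sample sheet")
            continue
        levels = {row[var] for row in samples}
        for key in ("reference", "target"):
            if contrast[key] not in levels:
                problems.append(f"level '{contrast[key]}' not found in '{var}'")
        blocking = contrast.get("blocking") or ""
        if blocking and blocking not in sample_vars:
            problems.append(f"blocking variable '{blocking}' not in sample sheet")
    return problems


# Tiny made-up inputs, shaped loosely like the test data described above.
samples = read_table(
    "sample,treatment,sample_number\ns1,mCherry,1\ns2,hND6,1\n")
matrix = read_table("gene_id\ts1\ts2\ng1\t10\t20\n", delimiter="\t")
contrasts = read_table(
    "id,variable,reference,target,blocking\n"
    "t1,treatment,mCherry,hND6,\n"
    "t2,treatment,mCherry,hND6,sample_number\n")

print(validate_inputs(samples, matrix, contrasts))  # prints []
```

Returning a list of problems rather than raising on the first one lets a user fix all input issues in one pass, which seems a reasonable design for a pre-flight check.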

The output file structure is like:

testdata/output
├── pipeline_info
│   ├── execution_report_2022-10-28_23-00-55.html
│   ├── execution_timeline_2022-10-28_23-00-55.html
│   ├── execution_trace_2022-10-28_23-00-55.txt
│   ├── pipeline_dag_2022-10-28_23-00-55.html
│   └── software_versions.yml
├── plots
│   ├── differential
│   │   ├── treatment_mCherry_hND6_
│   │   │   ├── html
│   │   │   └── png
│   │   ├── treatment_mCherry_hND6_sample_number
│   │   │   ├── html
│   │   │   └── png
│   │   └── versions.yml
│   └── exploratory
│       ├── treatment
│       │   ├── html
│       │   └── png
│       └── versions.yml
└── tables
    └── differential
        ├── treatment-mCherry-hND6-sample_number.R_sessionInfo.log
        ├── treatment-mCherry-hND6-sample_number.dds.rld.rds
        ├── treatment-mCherry-hND6-sample_number.deseq2.dispersion.png
        ├── treatment-mCherry-hND6-sample_number.deseq2.results.tsv
        ├── treatment-mCherry-hND6-sample_number.deseq2.sizefactors.tsv
        ├── treatment-mCherry-hND6-sample_number.normalised_counts.tsv
        ├── treatment-mCherry-hND6-sample_number.vst.tsv
        ├── treatment-mCherry-hND6.R_sessionInfo.log
        ├── treatment-mCherry-hND6.dds.rld.rds
        ├── treatment-mCherry-hND6.deseq2.dispersion.png
        ├── treatment-mCherry-hND6.deseq2.results.tsv
        ├── treatment-mCherry-hND6.deseq2.sizefactors.tsv
        ├── treatment-mCherry-hND6.normalised_counts.tsv
        ├── treatment-mCherry-hND6.vst.tsv
        └── versions.yml

15 directories, 22 files
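
The naming pattern visible in the listing above can be reproduced with a short Python sketch. This only mirrors the names shown in the tree; it is not how the pipeline builds them. The function names are hypothetical, and treating `mCherry` as the reference level and `hND6` as the target is an assumption, since the listing does not say which is which.

```python
# File suffixes copied from the tables/differential listing above.
TABLE_SUFFIXES = [
    "R_sessionInfo.log",
    "dds.rld.rds",
    "deseq2.dispersion.png",
    "deseq2.results.tsv",
    "deseq2.sizefactors.tsv",
    "normalised_counts.tsv",
    "vst.tsv",
]


def table_prefix(variable, reference, target, blocking=""):
    """Hyphen-joined prefix, with the blocking factor appended only if set."""
    parts = [variable, reference, target]
    if blocking:
        parts.append(blocking)
    return "-".join(parts)


def plot_dir(variable, reference, target, blocking=""):
    """Underscore-joined directory name; an empty blocking factor leaves a
    trailing underscore, matching 'treatment_mCherry_hND6_' in the listing."""
    return "_".join([variable, reference, target, blocking])


def expected_tables(prefix):
    """List the per-contrast table files for a given prefix."""
    return [f"{prefix}.{suffix}" for suffix in TABLE_SUFFIXES]


# Assumed argument order: variable, reference level, target level, blocking.
print(table_prefix("treatment", "mCherry", "hND6", "sample_number"))
print(plot_dir("treatment", "mCherry", "hND6"))
```

With two contrasts, this gives 14 per-contrast files under tables/differential, which agrees with the listing once versions.yml is counted separately.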

To do

Gather feedback
Get the actual CI up and running using the available test data
Start to account for non-RNA-seq data
Better integrated reporting (MultiQC integration?)

PR checklist

This comment contains a description of changes (with reason).
If you've fixed a bug or added code that should be tested, add tests!
If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
If necessary, also make a PR on the nf-core/differentialabundance branch on the nf-core/test-datasets repository.
Make sure your code lints (nf-core lint).
Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
Usage Documentation in docs/usage.md is updated.
Output Documentation in docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).

pinin4fjords · 2022-10-29T22:32:26Z

nextflow.config

@@ -91,6 +93,7 @@ profiles {
        params.enable_conda    = true
        conda.useMamba         = true
        docker.enabled         = false
+        conda.enabled          = true


(note that this is because nf-core/tools#1952 won't have been in the release version of the tools Oskar used)

github-actions · 2022-10-31T14:44:07Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 916de39

✅ 154 tests passed
❔   3 tests were ignored
❗  11 tests had warnings

❗ Test warnings:

pipeline_todos - TODO string in README.md: Add full-sized test dataset and amend the paragraph below if applicable
pipeline_todos - TODO string in README.md: If applicable, make list of people who have also contributed
pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
pipeline_todos - TODO string in WorkflowMain.groovy: Add Zenodo DOI for pipeline after first release
pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your prefered methods description, e.g. add publication citation for this pipeline
pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.

❔ Tests ignored:

files_unchanged - File ignored due to lint config: assets/email_template.html
files_unchanged - File ignored due to lint config: assets/email_template.txt
files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-differentialabundance_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-differentialabundance_logo_light.png
files_exist - File found: docs/images/nf-core-differentialabundance_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreSchema.groovy
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: lib/WorkflowDifferentialabundance.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-differentialabundance_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.show_hidden_params
nextflow_config - Config variable found: params.schema_ignore_params
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.version
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: '1.0dev'
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-differentialabundance_logo_light.png matches the template
files_unchanged - docs/images/nf-core-differentialabundance_logo_light.png matches the template
files_unchanged - docs/images/nf-core-differentialabundance_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - lib/NfcoreSchema.groovy matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
files_unchanged - pyproject.toml matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 21.10.3, Config: 21.10.3
readme - README Nextflow minimum version in Quick Start section matched config. README: 21.10.3, Config: 21.10.3
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (79 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: awstest.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains a matching 'report_comment'.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

Run details

nf-core/tools version 2.6
Run at 2022-11-11 17:06:49

pinin4fjords · 2022-11-01T10:31:29Z

The README, docs etc. are not quite complete, but this is working in minimal form, including the basic tests. I realise you have your own approach @WackerO @ggabernet, so you may well prefer to do things a different way, but hopefully some of what's here is useful.

The modules are all as in the nf-core modules repo, except for nf-core/modules#2399, which is still in review.

I did encounter nextflow-io/nextflow#3328 when using this workflow in Tower, just FYI in case you're also using Tower. Hopefully it will be fixed at some point - Paolo is aware.

WackerO

Other than MultiQC LGTM

workflows/differentialabundance.nf

pinin4fjords · 2022-11-14T09:13:04Z

Thanks for review @WackerO

Jonathan Manning added 6 commits October 28, 2022 22:31

Hacky install of validatefomcomponents pending review of PR

d723382

Hacky install of validatefomcomponents pending review of PR

2dd9ebd

Basic starting version of a differential abundance workflow

045e566

Make plot dir output paths more friendly

80648f1

Correct selectors

8758ae3

Add a study name (may not need this in the end)

cbeefbf

pinin4fjords marked this pull request as draft October 28, 2022 22:18

Jonathan Manning added 2 commits October 29, 2022 23:24

Apply DESeq2 versions channel fix (currently in PR)

99ae45a

Apply other module fixes currently in PR

4a25bf5

Jonathan Manning added 10 commits October 31, 2022 10:21

Apply prettier to schema

cb8f9e6

appease eclint

d607135

Add .nf-core.yml

e300932

Don't need local input_check

6b97e7a

Fix test profile

d1b15af

appease eclint

6514378

Add starting Tower config

c252e96

Fix modules.json for fixes in modules repo

232bc1e

Install shinyngs/validatefomcomponents from nf-core

ea93358

Dummy file to keep the local subdir

01fbd1e

Jonathan Manning added 9 commits October 31, 2022 14:45

Add missing module to config (wrong SHA pending PR)

bbcc161

Don't need genome fasta

347e5b9

Add missing params to nextflow.config

6e781c6

Handle GTF input consistently with other pipelines

eb223be

Try issue template update

4055a63

Do the easy TODOs

bc8016d

Try again with Tower config

7acc7c1

Try more useful publishing for DESeq2

7b39168

fix syntax issue

d378ad7

pinin4fjords marked this pull request as ready for review November 1, 2022 10:25

This was referenced Nov 1, 2022

Add input checking #4

Closed

Add differential analysis #5

Closed

Add differential plotting #7

Closed

Add exploratory plotting #6

Closed

This was linked to issues Nov 1, 2022

Add input checking #4

Closed

Add differential analysis #5

Closed

Add exploratory plotting #6

Closed

Add differential plotting #7

Closed

Jonathan Manning added 15 commits November 3, 2022 09:11

Fix mime types for Tower

ad64822

Make gzipping on input GTF optional

2b7c8b3

Conditional gzip version mixing

b449476

Re-prettify schema after UI building

45a6a9e

fix up gunzip conditionality

6070916

(hopefully) last gunzip fix

18182a2

Deal with possible NAs in blocking column

5d1e9ed

Allow retries for exploratory

6761a65

Bump resources for exploratory plotting

2fe6757

Revert "Bump resources for exploratory plotting"

41b5922

This reverts commit 2fe6757.

Revert "Allow retries for exploratory"

7fa6377

This reverts commit 6761a65.

See if we can relabel resource from nextflow.config

f8742c0

Labels not working, specify memory directly

04f791a

install staticdifferential from nf-core

8cd6607

Update DESeq2 module and configuration

7b24450

WackerO approved these changes Nov 10, 2022

Jonathan Manning added 2 commits November 11, 2022 16:55

Strip multiqc until we figure out how to integrate it

134bd80

Update plotting modules to exclude _files dirs

916de39

pinin4fjords merged commit 6f646e6 into nf-core:dev Nov 14, 2022
