Update fastqc to produce multi-version versions.yml #665

Merged
merged 19 commits into from
Sep 24, 2021
Changes from 8 commits
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -20,7 +20,7 @@ Closes #XXX <!-- If this PR fixes an issue, please link it here! -->
- [ ] If you've added a new tool - have you followed the module conventions in the [contribution docs](https://github.com/nf-core/modules/tree/master/.github/CONTRIBUTING.md)
- [ ] If necessary, include test data in your PR.
- [ ] Remove all TODO statements.
- [ ] Emit the `<SOFTWARE>.version.txt` file.
- [ ] Emit the `versions.yml` file.
- [ ] Follow the naming conventions.
- [ ] Follow the parameters requirements.
- [ ] Follow the input/output options guidelines.
3 changes: 3 additions & 0 deletions .gitignore
@@ -7,3 +7,6 @@ output/
*.code-workspace
.screenrc
.*.sw?
__pycache__
*.pyo
*.pyc
16 changes: 14 additions & 2 deletions README.md
@@ -411,10 +411,22 @@ using a combination of `bwa` and `samtools` to output a BAM file instead of a SA
- `*.fastq.gz` and NOT `*.fastq`
- `*.bam` and NOT `*.sam`

- Where applicable, each module command MUST emit a file `<SOFTWARE>.version.txt` containing a single line with the software's version in the format `<VERSION_NUMBER>` or `0.7.17` e.g.
- Where applicable, each module command MUST emit a file `versions.yml` containing the version number of each tool executed by the module, e.g.

```bash
echo \$(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*\$//' > ${software}.version.txt
cat <<-END_VERSIONS > versions.yml
${getProcessName(task.process)}:
    fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
    samtools: \$( samtools --version 2>&1 | sed 's/^.*samtools //; s/Using.*\$//' )
END_VERSIONS
```

resulting in, for instance,

```yaml
FASTQC:
    fastqc: 0.11.9
    samtools: 1.12
```

If the software cannot report its version number on the command line, a variable called `VERSION` can be specified manually to create this file, e.g. [homer/annotatepeaks module](https://github.com/nf-core/modules/blob/master/modules/homer/annotatepeaks/main.nf).
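
To make the heredoc in the README change concrete, the following is a minimal Python sketch of the same extraction; it is not part of this PR. The raw strings are assumed sample outputs (`FastQC v0.11.9` for `fastqc --version`, and the first lines of `samtools --version` collapsed onto one line), and `extract_fastqc`, `extract_samtools`, and `render_versions_yml` are hypothetical helper names used only for illustration.

```python
import re


def extract_fastqc(raw: str) -> str:
    # Mirrors: sed -e "s/FastQC v//g" (strip() drops leftover whitespace)
    return re.sub(r"FastQC v", "", raw).strip()


def extract_samtools(raw: str) -> str:
    # Mirrors: sed 's/^.*samtools //; s/Using.*$//'
    without_prefix = re.sub(r"^.*samtools ", "", raw)
    return re.sub(r"Using.*$", "", without_prefix).strip()


def render_versions_yml(process_name: str, versions: dict) -> str:
    # One top-level key per process, one indented entry per tool.
    lines = [f"{process_name}:"]
    lines += [f"    {tool}: {version}" for tool, version in versions.items()]
    return "\n".join(lines) + "\n"


# Assumed sample outputs; real output depends on the installed versions.
fastqc_raw = "FastQC v0.11.9"
samtools_raw = "samtools 1.12 Using htslib 1.12"

versions = {
    "fastqc": extract_fastqc(fastqc_raw),
    "samtools": extract_samtools(samtools_raw),
}
print(render_versions_yml("FASTQC", versions), end="")
```

With these assumed inputs the sketch prints the same three lines as the YAML example in the README change. The nested entries must be indented under the process name, otherwise each tool becomes a separate top-level key.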
11 changes: 10 additions & 1 deletion modules/fastqc/functions.nf
@@ -9,6 +9,13 @@ def getSoftwareName(task_process) {
    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
}

//
// Extract name of module from process name using $task.process
//
def getProcessName(task_process) {
    return task_process.tokenize(':')[-1]
}
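
For readers less familiar with Groovy, the difference between `getSoftwareName` and the new `getProcessName` can be sketched in Python as follows. This is an illustration rather than code from the PR: the `tokenize` helper only approximates Groovy's `String.tokenize` for a single-character delimiter, and the fully qualified process name is a made-up example.

```python
def tokenize(text: str, delimiter: str) -> list:
    # Approximates Groovy's String.tokenize for a single-character
    # delimiter: split on it and drop empty tokens.
    return [token for token in text.split(delimiter) if token]


def get_software_name(task_process: str) -> str:
    # Last workflow segment, first word before "_", lower-cased.
    return tokenize(tokenize(task_process, ":")[-1], "_")[0].lower()


def get_process_name(task_process: str) -> str:
    # Last workflow segment, returned unchanged.
    return tokenize(task_process, ":")[-1]


# Hypothetical fully qualified process name, for illustration only.
qualified = "NFCORE_RNASEQ:RNASEQ:SAMTOOLS_SORT"
print(get_software_name(qualified))  # samtools
print(get_process_name(qualified))   # SAMTOOLS_SORT
```

The process name keeps the subcommand (`SAMTOOLS_SORT`), so two modules that wrap the same tool still get distinct top-level keys in `versions.yml`.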

//
// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules
//
@@ -37,7 +44,9 @@ def getPathFromList(path_list) {
// Function to save/publish module results
//
def saveFiles(Map args) {
    if (!args.filename.endsWith('.version.txt')) {
    // TODO better way to detect that we are in a test and want to emit `versions.yml`
    //      or at least use a dedicated, unique environment variable.
    if (!args.filename.equals('versions.yml') || System.getenv("PROFILE")) {
        def ioptions = initOptions(args.options)
        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
        if (ioptions.publish_by_meta) {
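
The new guard in `saveFiles` can be read as a small predicate. The Python sketch below mirrors only that condition, under the assumption that Groovy treats a missing or empty `PROFILE` variable as false; `should_publish` is a hypothetical name, and the environment is passed in as a plain dict so the behaviour is easy to check.

```python
import os


def should_publish(filename: str, environ=None) -> bool:
    # Mirrors: !args.filename.equals('versions.yml') || System.getenv("PROFILE")
    # A missing or empty PROFILE counts as false, as in Groovy truthiness.
    env = os.environ if environ is None else environ
    return filename != "versions.yml" or bool(env.get("PROFILE"))


print(should_publish("sample_fastqc.html", {}))            # True
print(should_publish("versions.yml", {}))                  # False
print(should_publish("versions.yml", {"PROFILE": "test"}))  # True
```

In other words, `versions.yml` is published only when `PROFILE` is set, which is how the test run gets access to the file; every other output is published as before.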
17 changes: 12 additions & 5 deletions modules/fastqc/main.nf
@@ -1,5 +1,5 @@
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
include { initOptions; saveFiles; getSoftwareName; getProcessName } from './functions'

params.options = [:]
options = initOptions(params.options)
@@ -24,24 +24,31 @@ process FASTQC {
    output:
    tuple val(meta), path("*.html"), emit: html
    tuple val(meta), path("*.zip") , emit: zip
    path "*.version.txt" , emit: version
    path "versions.yml" , emit: version

    script:
    // Add soft-links to original FastQs for consistent naming in pipeline
    def software = getSoftwareName(task.process)
    def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}"
    if (meta.single_end) {
        """
        [ ! -f ${prefix}.fastq.gz ] && ln -s $reads ${prefix}.fastq.gz
        fastqc $options.args --threads $task.cpus ${prefix}.fastq.gz
        fastqc --version | sed -e "s/FastQC v//g" > ${software}.version.txt

        cat <<-END_VERSIONS > versions.yml
        ${getProcessName(task.process)}:
            fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
        END_VERSIONS
        """
    } else {
        """
        [ ! -f ${prefix}_1.fastq.gz ] && ln -s ${reads[0]} ${prefix}_1.fastq.gz
        [ ! -f ${prefix}_2.fastq.gz ] && ln -s ${reads[1]} ${prefix}_2.fastq.gz
        fastqc $options.args --threads $task.cpus ${prefix}_1.fastq.gz ${prefix}_2.fastq.gz
        fastqc --version | sed -e "s/FastQC v//g" > ${software}.version.txt

        cat <<-END_VERSIONS > versions.yml
        ${getProcessName(task.process)}:
            fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
        END_VERSIONS
        """
    }
}
2 changes: 1 addition & 1 deletion modules/fastqc/meta.yml
@@ -43,7 +43,7 @@ output:
  - version:
      type: file
      description: File containing software version
      pattern: "*.{version.txt}"
      pattern: "versions.yml"
authors:
  - "@drpatelh"
  - "@grst"
2 changes: 2 additions & 0 deletions tests/__init__.py
@@ -0,0 +1,2 @@
# Makes the tests directory a python module
# (required to obtain resources from this folder from within Python)
21 changes: 21 additions & 0 deletions tests/test_versions_yml.py
@@ -0,0 +1,21 @@
from pathlib import Path
import pytest
import yaml


def _get_workflow_names():
    here = Path(__file__).parent.resolve()
    pytest_workflow_files = here.glob("**/test.yml")
    for f in pytest_workflow_files:
        test_config = yaml.safe_load(f.read_text())
        for workflow in test_config:
            yield workflow["name"]


@pytest.mark.workflow(*_get_workflow_names())
def test_ensure_valid_version_yml(workflow_dir):
    workflow_dir = Path(workflow_dir)
    software_name = workflow_dir.name.split("_")[0].lower()
    versions_yml = (workflow_dir / f"output/{software_name}/versions.yml").read_text()
    assert "END_VERSIONS" not in versions_yml
    yaml.safe_load(versions_yml)
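
The test above only checks that the file parses and that no heredoc terminator leaked into it. As a possible extension, and purely as a sketch that is not part of this PR, the parsed result could also be checked for the expected two-level shape. `validate_versions` is a hypothetical helper; it takes the already parsed data (for example the return value of `yaml.safe_load`) so that the snippet itself needs only the standard library.

```python
def validate_versions(raw_text: str, parsed) -> list:
    """Return a list of problems; an empty list means the file looks valid."""
    problems = []
    # If the shell never recognises the terminator, it ends up in the file.
    if "END_VERSIONS" in raw_text:
        problems.append("heredoc terminator leaked into file")
    if not isinstance(parsed, dict) or not parsed:
        problems.append("top level must be a non-empty mapping")
        return problems
    for process, tools in parsed.items():
        if not isinstance(tools, dict) or not tools:
            problems.append(f"{process}: expected a mapping of tool to version")
            continue
        for tool, version in tools.items():
            # YAML loaders read 1.12 as a float and 0.11.9 as a string.
            if isinstance(version, bool) or not isinstance(version, (str, int, float)):
                problems.append(f"{process}.{tool}: missing or non-scalar version")
    return problems


# Stand-in for yaml.safe_load(raw_text) on the README example.
good = {"FASTQC": {"fastqc": "0.11.9", "samtools": 1.12}}
print(validate_versions("FASTQC:\n    fastqc: 0.11.9\n", good))  # []

# A file that lost its indentation still parses, but the nesting is gone.
flat = {"FASTQC": None, "fastqc": "0.11.9"}
print(validate_versions("FASTQC:\nfastqc: 0.11.9\n", flat))
```

The second call reports two problems, one per top-level key, which is the kind of silent structural error that `yaml.safe_load` alone would not catch.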