Add cd-hit module #2676

Merged
merged 31 commits into from
Dec 20, 2022
Changes from 20 commits
Commits
94cb605
Create module and tests
Dec 14, 2022
17fe039
Add biocontainer and galaxy project container
Dec 14, 2022
e303fc2
Changes to main.nf
Dec 15, 2022
9b0932c
Add cdhit/cdhit sub-folder as cdhit has many sub-functions with diffe…
Dec 15, 2022
5faee04
Add cd-hit command line and print version number
Dec 15, 2022
047bdec
Remove TODOs from main.nf
Dec 15, 2022
04567fd
Write meta.yml
Dec 15, 2022
0a6b54e
Edit meta.yml
Dec 15, 2022
68ffcab
Update tests for cdhit/cdhit directory
Dec 16, 2022
b72c04f
Get test working and yaml written
Dec 16, 2022
a9e94ef
Fix cdhit/cdhit in pytest_modules
Dec 16, 2022
fd39d6b
Merge branch 'master' into cdhitest
timslittle Dec 16, 2022
b33f450
Remove trailing whitespace
timslittle Dec 16, 2022
78b7833
Update modules/nf-core/cdhit/cdhit/meta.yml, '|' to ','
timslittle Dec 19, 2022
9b8a33a
Change CDHIT to CDHIT/CDHIT and indent the cd-hit command line options
timslittle Dec 19, 2022
1cf0bae
Memory to mega and remove parameters to be specified in task.ext.args
timslittle Dec 19, 2022
eb1c633
Fix trailing ) in the cdhit version output
timslittle Dec 19, 2022
43d5fe8
Remove comment hint
timslittle Dec 19, 2022
307b154
Update modules/nf-core/cdhit/cdhit/meta.yml
timslittle Dec 19, 2022
39dc6d3
Update modules/nf-core/cdhit/cdhit/main.nf
timslittle Dec 19, 2022
7446906
CDHIT to CDHIT_CDHIT Update tests/modules/nf-core/cdhit/cdhit/main.nf
timslittle Dec 19, 2022
954b87d
CDHIT to CDHIT_CDHIT Update tests/modules/nf-core/cdhit/cdhit/main.nf
timslittle Dec 19, 2022
d9bb99e
Update modules/nf-core/cdhit/cdhit/meta.yml
timslittle Dec 19, 2022
daa5504
Simplify version reporting - Update modules/nf-core/cdhit/cdhit/main.nf
timslittle Dec 19, 2022
88f066d
Merge branch 'master' into cdhitest
timslittle Dec 19, 2022
5e1beb2
Update conda dependencie - Update modules/nf-core/cdhit/cdhit/main.nf
timslittle Dec 20, 2022
373d4e0
cdhit to cdhit/cdhit - Update tests/config/pytest_modules.yml
timslittle Dec 20, 2022
3903b3c
Fix versions output - Update modules/nf-core/cdhit/cdhit/main.nf
timslittle Dec 20, 2022
ed7785c
Merge branch 'master' into cdhitest
timslittle Dec 20, 2022
e43b6a4
More cdhit to cdhit/cdhit - Update modules/nf-core/cdhit/cdhit/meta.yml
timslittle Dec 20, 2022
2623360
Merge branch 'master' into cdhitest
muffato Dec 20, 2022
36 changes: 36 additions & 0 deletions modules/nf-core/cdhit/cdhit/main.nf
@@ -0,0 +1,36 @@
process CDHIT_CDHIT {
    tag "$meta.id"
    label 'process_medium'

    conda (params.enable_conda ? "bioconda::cd-hit=4.8.1" : null)
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/cd-hit%3A4.8.1--h5b5514e_7':
        'quay.io/biocontainers/cd-hit:4.8.1--h5b5514e_7' }"

    input:
    tuple val(meta), path(sequences)

    output:
    tuple val(meta), path("*.fasta")    ,emit: fasta
    tuple val(meta), path("*.clstr")    ,emit: clusters
    path "versions.yml"                 ,emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    """
    cd-hit \\
        -i $sequences \\
        -o ${prefix}.fasta \\
        -M $task.memory.mega \\
        -T $task.cpus \\
        $args

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        cdhit: \$(echo \$(cd-hit -h|head -n 1 2>&1) | sed 's/====== CD-HIT version //;s/ (built on .*) ======//' )
    END_VERSIONS
    """
}
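The `cdhit:` entry written to `versions.yml` is derived from the first line of `cd-hit -h`. As a minimal sketch of what the two `sed` substitutions in the module do, the snippet below applies them to a hard-coded banner. The banner text, including its build date, is an assumed sample shaped to match the patterns in the module rather than captured output, and `parse_cdhit_version` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: apply the module's sed expressions to a sample banner line.
# ASSUMPTION: the banner below is an illustrative sample, not captured cd-hit output.

parse_cdhit_version() {
    # Strip the leading "====== CD-HIT version " and the trailing " (built on ...) ======".
    sed 's/====== CD-HIT version //;s/ (built on .*) ======//'
}

sample_banner='====== CD-HIT version 4.8.1 (built on Oct 26 2021) ======'

# Feed the sample line through the same substitutions and print what remains.
printf '%s\n' "$sample_banner" | parse_cdhit_version
```

In the module, the banner passes through an unquoted `echo \$( ... )` first, which collapses any surrounding whitespace before `sed` sees the line.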
49 changes: 49 additions & 0 deletions modules/nf-core/cdhit/cdhit/meta.yml
@@ -0,0 +1,49 @@
name: "cdhit_cdhit"
description: Cluster protein sequences using sequence similarity
keywords:
  - cluster
  - protein
  - alignment
  - fasta
tools:
  - "cdhit":
      description: "Clusters and compares protein or nucleotide sequences"
      homepage: "https://sites.google.com/view/cd-hit/home"
      documentation: "https://github.com/weizhongli/cdhit/wiki"
      tool_dev_url: "https://github.com/weizhongli/cdhit"
      doi: "10.1093/bioinformatics/btl158"
      licence: "['GPL v2']"

input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - sequences:
      type: file
      description: fasta file of sequences to be clustered
      pattern: "*.{fa,fasta}"

output:
  #Only when we have meta
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
  - fasta:
      type: file
      description: fasta file of the representative sequences for each cluster
      pattern: "*.{fasta}"
  - clusters:
      type: file
      description: List of clusters
      pattern: "*.{clstr}"

authors:
  - "@timslittle"
4 changes: 4 additions & 0 deletions tests/config/pytest_modules.yml
@@ -529,6 +529,10 @@ cat/fastq:
  - modules/nf-core/cat/fastq/**
  - tests/modules/nf-core/cat/fastq/**

cdhit/cdhit:
  - modules/nf-core/cdhit/cdhit/**
  - tests/modules/nf-core/cdhit/cdhit/**

cellranger/count:
  - modules/nf-core/cellranger/count/**
  - tests/modules/nf-core/cellranger/count/**
15 changes: 15 additions & 0 deletions tests/modules/nf-core/cdhit/cdhit/main.nf
@@ -0,0 +1,15 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { CDHIT_CDHIT } from '../../../../../modules/nf-core/cdhit/cdhit/main.nf'

workflow test_cdhit {

    input = [
        [ id:'test', single_end:false ], // meta map
        file(params.test_data['proteomics']['database']['yeast_ups'], checkIfExists: true)
    ]

    CDHIT_CDHIT ( input )
}
5 changes: 5 additions & 0 deletions tests/modules/nf-core/cdhit/cdhit/nextflow.config
@@ -0,0 +1,5 @@
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }

}
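The `publishDir` closure above reduces the process name to a lower-case directory name. A rough shell equivalent of `tokenize(':')[-1].tokenize('_')[0].toLowerCase()` is sketched below to show why a process called `CDHIT_CDHIT` publishes into a `cdhit` subdirectory. The fully qualified process name used here is a made-up example, `publish_subdir` is a hypothetical helper, and the sketch only covers simple names (Groovy's `tokenize` also skips empty tokens, which this does not).

```shell
#!/usr/bin/env bash
# Sketch: mimic tokenize(':')[-1].tokenize('_')[0].toLowerCase() with parameter expansion.
# ASSUMPTION: the process name passed below is an illustrative example.

publish_subdir() {
    last_scope="${1##*:}"             # keep the text after the last ':'
    first_token="${last_scope%%_*}"   # keep the text before the first '_'
    printf '%s\n' "$first_token" | tr '[:upper:]' '[:lower:]'
}

publish_subdir "test_cdhit:CDHIT_CDHIT"
```

The result matches the `cdhit` path component that the test expectations in `test.yml` look for.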
11 changes: 11 additions & 0 deletions tests/modules/nf-core/cdhit/cdhit/test.yml
@@ -0,0 +1,11 @@
- name: cdhit cdhit test_cdhit
  command: nextflow run ./tests/modules/nf-core/cdhit/cdhit -entry test_cdhit -c ./tests/config/nextflow.config -c ./tests/modules/nf-core/cdhit/cdhit/nextflow.config
  tags:
    - cdhit
    - cdhit/cdhit
  files:
    - path: output/cdhit/test.fasta
      md5sum: 0d4dda84911f7ffa3237c82391160c45
    - path: output/cdhit/test.fasta.clstr
      md5sum: 07b5af3b377f05c6970e6f3d2bb75ef2
    - path: output/cdhit/versions.yml
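
Each `md5sum` entry above pins the exact content of an output file. To reproduce that comparison by hand, something along these lines should work. This is only a sketch: it assumes GNU `md5sum` is on the path, `check_md5` is a hypothetical helper rather than part of the test framework, and the demonstration uses a throwaway file instead of real cd-hit output.

```shell
#!/usr/bin/env bash
# Sketch: compare a file's MD5 digest with an expected value, as a test.yml entry does.
# ASSUMPTION: md5sum (GNU coreutils) is available; check_md5 is a hypothetical helper.

check_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<digest>  <path>"; keep only the digest.
    actual="$(md5sum "$file" | cut -d ' ' -f 1)"
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file: expected $expected, got $actual"
        return 1
    fi
}

# Demonstration on a temporary file; a real check would point at output/cdhit/test.fasta.
tmp_file="$(mktemp)"
printf 'hello\n' > "$tmp_file"
check_md5 "$tmp_file" "$(md5sum "$tmp_file" | cut -d ' ' -f 1)"
rm -f "$tmp_file"
```

The `versions.yml` entry lists only a path, so no digest is pinned for that file.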