diamond blastx subworkflow #74

Closed
28 commits
b2e2e06
added chunk_busco module
May 23, 2023
a833404
added diamond_blastx module
alxndrdiaz Jun 2, 2023
94b44c9
added blastx_cols and blastx_outext
alxndrdiaz Jun 2, 2023
46ae681
fix DIAMOND_BLASTX path
alxndrdiaz Jun 2, 2023
b5594c4
added module BLOBTOOLKIT_UNCHUNK
alxndrdiaz Jun 3, 2023
4c3e944
diamond_blastx data for test_full
alxndrdiaz Jun 3, 2023
40c212a
removed log files
alxndrdiaz Jun 8, 2023
0ae6480
fix path to uniprot_blastx
alxndrdiaz Jun 8, 2023
a7502f0
renamed files in RUN_BLASTX subworkflow
alxndrdiaz Jun 8, 2023
8de106a
use meta and meta2
alxndrdiaz Jun 8, 2023
5466144
use names in RUN_BLASTX input channels
alxndrdiaz Jun 8, 2023
d667473
added uniprot_blastx
alxndrdiaz Jun 8, 2023
49cb8bf
merge blastx results in BlobDir
Jun 8, 2023
abc2b97
minimum Nextflow version 23.04.1
alxndrdiaz Jun 8, 2023
7bd7873
update uniprot databases
alxndrdiaz Jun 8, 2023
cec0022
updated paths to uniprot databases
alxndrdiaz Jun 9, 2023
ddb9645
Update conf/test.config
alxndrdiaz Jun 19, 2023
d814e5f
use new names for uniprot databases
alxndrdiaz Jun 19, 2023
3bc14aa
Update modules/local/blobtoolkit/chunk.nf
alxndrdiaz Jun 19, 2023
72f040c
update names for uniprot databases
alxndrdiaz Jun 19, 2023
0dbc9b6
fix description
alxndrdiaz Jun 19, 2023
c8bef6c
update uniprot database names
alxndrdiaz Jun 19, 2023
1c9f9c4
check params.blastp and params.blastx
alxndrdiaz Jun 19, 2023
8f3f719
independent channels in RUN_BLASTX
alxndrdiaz Jun 19, 2023
d21e11d
RUN_BLASTX subworkflow description
alxndrdiaz Jun 27, 2023
6c5e052
added RUN_BLASTX versions
alxndrdiaz Jun 27, 2023
524f81b
Revert "merge blastx results in BlobDir"
priyanka-surana Sep 17, 2023
751c07c
add update blobdir for blastx
priyanka-surana Sep 17, 2023
Binary file added assets/test/mMelMel3.1.buscoregions.dmnd
Binary file added assets/test_full/gfLaeSulp1.1.buscoregions.dmnd
18 changes: 17 additions & 1 deletion conf/modules.config
@@ -37,12 +37,20 @@ process {
         ext.args = "--evalue 1.0e-25 --max-target-seqs 10 --max-hsps 1"
     }

+    withName: "DIAMOND_BLASTX" {
+        ext.args = "--evalue 1.0e-25 --max-target-seqs 10 --max-hsps 1"
+    }
+
     withName: "BLOBTOOLKIT_WINDOWSTATS" {
         ext.args = "--window 0.1 --window 0.01 --window 1 --window 100000 --window 1000000"
     }

-    withName: "BLOBTOOLKIT_BLOBDIR" {
+    withName: "BLOBTOOLKIT_CREATEBLOBDIR" {
         ext.args = "--evalue 1.0e-25 --hit-count 10"
+    }
+
+    withName: "BLOBTOOLKIT_UPDATEBLOBDIR" {
+        ext.args = "--evalue 1.0e-25 --hit-count 10 --update-plot"
         publishDir = [
             path: { "${params.outdir}/" },
             mode: params.publish_dir_mode,
@@ -66,6 +74,14 @@ process {
         ]
     }

+    withName: "BLOBTOOLKIT_CHUNK" {
+        ext.args = "--chunk 100000 --overlap 0 --max-chunks 10 --min-length 1000"
+    }
+
+    withName: "BLOBTOOLKIT_UNCHUNK" {
+        ext.args = "--count 10"
+    }
+
     withName: "CUSTOM_DUMPSOFTWAREVERSIONS" {
         publishDir = [
             path: { "${params.outdir}/blobtoolkit_info" },
7 changes: 4 additions & 3 deletions conf/test.config
@@ -30,7 +30,8 @@ params {
     taxon = "Meles meles"

     // Databases
-    taxdump = "/lustre/scratch123/tol/teams/grit/geval_pipeline/btk_databases/taxdump"
-    busco = "/lustre/scratch123/tol/resources/nextflow/busco_2021_06_reduced/"
-    uniprot = "${projectDir}/assets/test/mCerEla1.1.buscogenes.dmnd"
+    taxdump = "/lustre/scratch123/tol/teams/grit/geval_pipeline/btk_databases/taxdump"
+    busco = "/lustre/scratch123/tol/resources/nextflow/busco_2021_06_reduced/"
+    blastp = "${projectDir}/assets/test/mMelMel3.1.buscogenes.dmnd"
+    blastx = "${projectDir}/assets/test/mMelMel3.1.buscoregions.dmnd"
 }
8 changes: 5 additions & 3 deletions conf/test_full.config
@@ -27,7 +27,9 @@ params {
     taxon = "Laetiporus sulphureus"

     // Databases
-    taxdump = "/lustre/scratch123/tol/teams/grit/geval_pipeline/btk_databases/taxdump"
-    busco = "/lustre/scratch123/tol/resources/busco/v5/"
-    uniprot = "${projectDir}/assets/test_full/gfLaeSulp1.1.buscogenes.dmnd"
+    taxdump = "/lustre/scratch123/tol/teams/grit/geval_pipeline/btk_databases/taxdump"
+    busco = "/lustre/scratch123/tol/resources/busco/v5/"
+    blastp = "${projectDir}/assets/test_full/gfLaeSulp1.1.buscogenes.dmnd"
+    blastx = "${projectDir}/assets/test_full/gfLaeSulp1.1.buscoregions.dmnd"
+
 }
5 changes: 5 additions & 0 deletions modules.json
@@ -21,6 +21,11 @@
                         "git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
                         "installed_by": ["modules"]
                     },
+                    "diamond/blastx": {
+                        "branch": "master",
+                        "git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
+                        "installed_by": ["modules"]
+                    },
                     "fastawindows": {
                         "branch": "master",
                         "git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
36 changes: 36 additions & 0 deletions modules/local/blobtoolkit/chunk.nf
@@ -0,0 +1,36 @@
+process BLOBTOOLKIT_CHUNK {
+    tag "$meta.id"
+    label 'process_single'
+
+    if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
+        exit 1, "BLOBTOOLKIT_CHUNK module does not support Conda. Please use Docker / Singularity / Podman instead."
+    }
+    container "genomehubs/blobtoolkit:4.1.5"
+
+    input:
+    tuple val(meta) , path(fasta)
+    tuple val(meta2), path(busco_table)
+
+    output:
+    tuple val(meta), path("*.chunks.fasta"), emit: chunks
+    path "versions.yml"                    , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    """
+    btk pipeline chunk-fasta \\
+        --in ${fasta} \\
+        --busco ${busco_table} \\
+        --out ${prefix}.chunks.fasta \\
+        $args
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        blobtoolkit: \$(btk --version | cut -d' ' -f2 | sed 's/v//')
+    END_VERSIONS
+    """
+}
@@ -1,4 +1,4 @@
-process BLOBTOOLKIT_BLOBDIR {
+process BLOBTOOLKIT_CREATEBLOBDIR {
     tag "$meta.id"
     label 'process_medium'

34 changes: 34 additions & 0 deletions modules/local/blobtoolkit/unchunk.nf
@@ -0,0 +1,34 @@
+process BLOBTOOLKIT_UNCHUNK {
+    tag "$meta.id"
+    label 'process_single'
+
+    if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
+        exit 1, "BLOBTOOLKIT_UNCHUNK module does not support Conda. Please use Docker / Singularity / Podman instead."
+    }
+    container "genomehubs/blobtoolkit:4.1.5"
+
+    input:
+    tuple val(meta), path(blast_table)
+
+    output:
+    tuple val(meta), path("*.blastx.out"), emit: blastx
+    path "versions.yml"                  , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    """
+    btk pipeline unchunk-blast \\
+        --in ${blast_table} \\
+        --out ${prefix}.blastx.out \\
+        $args
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        blobtoolkit: \$(btk --version | cut -d' ' -f2 | sed 's/v//')
+    END_VERSIONS
+    """
+}
42 changes: 42 additions & 0 deletions modules/local/blobtoolkit/updateblobdir.nf
@@ -0,0 +1,42 @@
+process BLOBTOOLKIT_UPDATEBLOBDIR {
+    tag "$meta.id"
+    label 'process_medium'
+
+    if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
+        exit 1, "BLOBTOOLKIT_UPDATEBLOBDIR module does not support Conda. Please use Docker / Singularity / Podman instead."
+    }
+    container "genomehubs/blobtoolkit:4.1.5"
+
+    input:
+    tuple val(meta), path(input)
+    tuple val(meta1), path(blastx)
+    path(taxdump)
+
+    output:
+    tuple val(meta), path(prefix), emit: blobdir
+    path "versions.yml"          , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    prefix = task.ext.prefix ?: "${meta.id}"
+    def hits_blastx = blastx ? "--hits ${blastx}" : ""
+    """
+    blobtools replace \\
+        --taxdump ${taxdump} \\
+        --taxrule bestdistorder=buscoregions \\
+        ${hits_blastx} \\
+        --bedtsvdir windowstats \\
+        --threads ${task.cpus} \\
+        $args \\
+        ${input}
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        blobtoolkit: \$(btk --version | cut -d' ' -f2 | sed 's/v//')
+    END_VERSIONS
+    """
+}
68 changes: 68 additions & 0 deletions modules/nf-core/diamond/blastx/main.nf

Some generated files are not rendered by default.

81 changes: 81 additions & 0 deletions modules/nf-core/diamond/blastx/meta.yml

Some generated files are not rendered by default.
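
The commit messages refer to a RUN_BLASTX subworkflow, but that file does not appear in this diff. As a rough illustration of how the three processes above could be chained, here is a minimal sketch. It is not the pull request's actual subworkflow. The include paths, the channel names and the `DIAMOND_BLASTX` call signature (four inputs, `txt` and `versions` outputs) are assumptions, since the nf-core module's `main.nf` is not rendered here. Only the `chunks`, `blastx` and `versions` outputs of the local modules come from the diff.

```nextflow
// Hypothetical sketch of a RUN_BLASTX subworkflow, NOT the file from this pull request.
// Include paths assume the file sits two directories below the pipeline root.
include { BLOBTOOLKIT_CHUNK   } from '../../modules/local/blobtoolkit/chunk'
include { BLOBTOOLKIT_UNCHUNK } from '../../modules/local/blobtoolkit/unchunk'
include { DIAMOND_BLASTX      } from '../../modules/nf-core/diamond/blastx/main'

workflow RUN_BLASTX {
    take:
    fasta    // channel: [ val(meta), path(fasta) ]
    table    // channel: [ val(meta), path(busco_table) ]
    blastx   // channel: path(DIAMOND database), e.g. params.blastx
    outext   // channel: val(output extension), e.g. params.blastx_outext
    cols     // channel: val(output columns), e.g. params.blastx_cols

    main:
    ch_versions = Channel.empty()

    // Run `btk pipeline chunk-fasta` on the assembly and BUSCO table
    BLOBTOOLKIT_CHUNK ( fasta, table )
    ch_versions = ch_versions.mix ( BLOBTOOLKIT_CHUNK.out.versions.first() )

    // ASSUMPTION: DIAMOND_BLASTX takes (tuple, db, out_ext, blast_columns)
    // and exposes `txt` and `versions` outputs
    DIAMOND_BLASTX ( BLOBTOOLKIT_CHUNK.out.chunks, blastx, outext, cols )
    ch_versions = ch_versions.mix ( DIAMOND_BLASTX.out.versions.first() )

    // Run `btk pipeline unchunk-blast` on the chunk-level hits
    BLOBTOOLKIT_UNCHUNK ( DIAMOND_BLASTX.out.txt )
    ch_versions = ch_versions.mix ( BLOBTOOLKIT_UNCHUNK.out.versions.first() )

    emit:
    blastx_out = BLOBTOOLKIT_UNCHUNK.out.blastx   // channel: [ val(meta), path(blastx_out) ]
    versions   = ch_versions                      // channel: [ versions.yml ]
}
```

In this sketch the `blastx_out` channel would be the natural second input for `BLOBTOOLKIT_UPDATEBLOBDIR`, alongside the existing BlobDir and the taxdump path.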

6 changes: 5 additions & 1 deletion nextflow.config
@@ -23,9 +23,13 @@ params {
     // Databases and related options
     taxdump = null
     busco = null
-    uniprot = null
+    blastp = null
+    blastx = null
     blastp_outext = 'txt'
     blastp_cols = 'qseqid staxids bitscore qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore'
+    blastx_outext = 'txt'
+    blastx_cols = 'qseqid staxids bitscore qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore'
+

     // MultiQC options
     multiqc_config = null
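
Since `uniprot` is replaced by separate `blastp` and `blastx` parameters, a run outside the bundled test profiles would need both set. The fragment below is a minimal sketch of a user-supplied config. The file name `custom.config` and every path in it are placeholders, not values from this pull request. The `withName` block only restates the default from `conf/modules.config` to show where an override would go.

```nextflow
// custom.config: hypothetical file name; all paths below are placeholders
params {
    taxdump = "/path/to/taxdump"
    busco   = "/path/to/busco_lineages"
    blastp  = "/path/to/blastp_db.dmnd"   // DIAMOND database for the blastp search
    blastx  = "/path/to/blastx_db.dmnd"   // DIAMOND database for the blastx search
}

process {
    // Edit ext.args here to change the DIAMOND_BLASTX command-line arguments
    withName: "DIAMOND_BLASTX" {
        ext.args = "--evalue 1.0e-25 --max-target-seqs 10 --max-hsps 1"
    }
}
```

Such a file can be passed to Nextflow with the `-c` option, alongside the pipeline's other required parameters.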