Issue 1:
Despite supplying a pre-built decoy-aware Salmon index for transcripts, both Genome fasta and GTF files are still needed.
It is not clear why this is needed.
Genome fasta file not specified with e.g. '--fasta genome.fa' or via a detectable config file.
No GTF or GFF3 annotation specified! The pipeline requires at least one of these files.
Issue 2:
The fq subsample step is run; I am not sure whether this is necessary for Salmon to infer strandedness.
Issue 3:
At some point in the pipeline, there is a failure due to an RSEM error.
It is not clear why RSEM is being called for the Reference Genome, when it is not part of Steps 1 and 3.
process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:MAKE_TRANSCRIPTS_FASTA (rsem/GRCh38.primary_assembly.genome.fa) [ 0%] 0 of 1
Issue 4:
The pipeline does not stop at Salmon quantification and tries to continue to unexpected next steps.
[78/cab4d8] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_PSEUDO_ALIGNMENT:SALMON_QUANT (ERR2179089) [100%] 1 of 1 ✔
[78/c6d1b3] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_PSEUDO_ALIGNMENT:TX2GENE (gencode.v46.primary_assembly.annotation.gtf) [100%] 1 of 1 ✔
[8d/caed0e] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_PSEUDO_ALIGNMENT:TXIMPORT [100%] 1 of 1, failed: 1 ✘
It would be very helpful to know what switches need to be toggled to only execute Steps 1–7.
Thank you for your consideration.
So, as a general point, you need to treat the flow chart as a qualitative guide to what's going on. The workflow doesn't provide you with absolute control over the modules that are run; for that you'll need to make your own workflow (which is definitely an option for you to get exactly what you want).
We have a related feature request to reduce the genome requirements in this context, but haven't got to it yet.
Further:
This step is necessary: we don't need to examine all reads to infer the strandedness, so we down-sample first.
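To illustrate the idea of down-sampling before inference, here is a minimal Python sketch that draws a fixed-size random sample of FASTQ records in a single pass. It is not the `fq` tool's implementation and not what the pipeline runs; the function names (`read_fastq`, `subsample_fastq`), the sample size and the toy reads are all assumptions made for the example.

```python
import io
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (header, sequence, plus, quality)."""
    while True:
        record = tuple(line.rstrip("\n") for line in islice(handle, 4))
        if len(record) < 4:  # end of file (or a truncated trailing record)
            return
        yield record


def subsample_fastq(handle, n_records, seed=1):
    """Reservoir-sample up to n_records reads without loading the whole file."""
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(read_fastq(handle)):
        if i < n_records:
            # Fill the reservoir with the first n_records reads.
            reservoir.append(record)
        else:
            # Replace an existing entry with probability n_records / (i + 1).
            j = rng.randint(0, i)
            if j < n_records:
                reservoir[j] = record
    return reservoir


# Toy input: 1,000 synthetic reads (hypothetical data, not taken from the issue).
toy = io.StringIO("".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(1000)))
sample = subsample_fastq(toy, n_records=100)
print(len(sample))
```

A small sample like this is enough to estimate a library-wide property such as strandedness, which is why the full read set does not need to be examined.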
This is just using a utility from the RSEM suite to generate a transcriptome. We may be able to remove that dependency if and when we tackle the issue above.
We use tximport to construct matrices from the output of Salmon, and we don't have any plans to remove that.
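As a rough picture of what "constructing matrices from the output of Salmon" involves, the Python sketch below sums transcript-level `NumReads` from a Salmon `quant.sf` table into gene-level totals using a transcript-to-gene mapping, then lays the samples out as a matrix. This is a simplified stand-in, not tximport itself, which also handles abundance and length summarisation. The function names, transcript IDs, gene IDs and values are hypothetical; only the `quant.sf` column names and the sample ID from the log above are real. It also shows why a transcript-to-gene mapping (derived from the GTF by the TX2GENE step in the log) is needed even when a Salmon index is supplied.

```python
import csv
import io
from collections import defaultdict


def gene_level_counts(quant_sf_handle, tx2gene):
    """Sum Salmon's transcript-level NumReads into per-gene totals."""
    totals = defaultdict(float)
    for row in csv.DictReader(quant_sf_handle, delimiter="\t"):
        gene = tx2gene.get(row["Name"])
        if gene is not None:  # skip transcripts absent from the mapping
            totals[gene] += float(row["NumReads"])
    return dict(totals)


def build_matrix(per_sample):
    """Combine per-sample gene totals into a genes x samples table."""
    samples = sorted(per_sample)
    genes = sorted({g for counts in per_sample.values() for g in counts})
    rows = [[g] + [per_sample[s].get(g, 0.0) for s in samples] for g in genes]
    return ["gene_id"] + samples, rows


# Hypothetical mapping and quantification values, for illustration only.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
quant_sf = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t800.0\t10.0\t20.0\n"
    "tx2\t1500\t1300.0\t5.0\t30.0\n"
    "tx3\t2000\t1800.0\t2.0\t7.0\n"
)

per_sample = {"ERR2179089": gene_level_counts(io.StringIO(quant_sf), tx2gene)}
header, rows = build_matrix(per_sample)
print(header)
for row in rows:
    print(row)
```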
To summarise:
Reducing dependencies when using pseudo-aligners is a valid point we will try to address as priorities allow.
But you don't have absolute control of the specific modules used. For that, I would encourage you to build your own workflow using the pre-built nf-core modules and subworkflows that are available.
I'm closing this as not being a bug, and we're already tracking the feature request elsewhere.
Description of the bug
Dear Researchers and Developers,
Thank you for developing this pipeline.
I am trying to go from FASTQ files to Salmon pseudo-alignment and quantification, as per the flow chart (Phase 1 and 3 only): https://raw.githubusercontent.com/nf-core/rnaseq/3.14.0//docs/images/nf-core-rnaseq_metro_map_grey.png
Specifically, trying to achieve:
Issue 1:
Despite supplying a pre-built decoy-aware Salmon index for transcripts, both Genome fasta and GTF files are still needed.
It is not clear why this is needed.
Issue 2:
The fq subsample step is run; I am not sure whether this is necessary for Salmon to infer strandedness.
Issue 3:
At some point in the pipeline, there is a failure due to an RSEM error.
It is not clear why RSEM is being called for the Reference Genome, when it is not part of Steps 1 and 3.
Issue 4:
The pipeline does not stop at Salmon quantification and tries to continue to unexpected next steps.
It would be very helpful to know what switches need to be toggled to only execute Steps 1–7.
Thank you for your consideration.
Command used and terminal output
System information
Nextflow: 24.04.2
Hardware: Desktop
Executor: local
Container: conda
OS: Ubuntu 22.04.4 LTS
nf-core/rnaseq v3.14.0-gb89fac3