Allow tabs in fasta header when creating decoys for salmon index #878
Comments
Hi @paoloAngelino! Thanks for reporting. I am unable to reproduce this if I run:
This is essentially the bit of code in the module that splits on spaces and only takes the first value. As you can see above, this prints the contig name without the space-delimited comment.
Closing this for now, but feel free to re-open if the issue persists or if you are able to identify another reason for this failure.
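To illustrate the splitting behaviour described here, the following is a minimal sketch, not the module's actual code. It assumes the module does something equivalent to `cut -d ' ' -f1`; the helper names `first_by_space` and `first_by_whitespace` are hypothetical. It compares a header whose comment is separated by a space with one separated by a tab, the case named in the issue title.

```shell
# Hypothetical headers: one with a space-delimited comment, one with a tab.
space_header='>HLA-DRB1*16:02:01 HLA00878'
tab_header=$(printf '>HLA-DRB1*16:02:01\tHLA00878')

# Split on a single space only (assumed to mirror the module's behaviour).
first_by_space() { printf '%s\n' "$1" | cut -d ' ' -f1; }

# Split on any run of spaces or tabs (awk's default field separator).
first_by_whitespace() { printf '%s\n' "$1" | awk '{print $1}'; }

first_by_space "$space_header"      # comment removed
first_by_space "$tab_header"        # comment kept: there is no space to split on
first_by_whitespace "$tab_header"   # comment removed
```

Under these assumptions, a space-only split handles the first header but leaves a tab-delimited comment attached to the contig name.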
@drpatelh Thanks for your answer.
This is the reason why, in my case,
Cool. Will re-open and fix in the next release. In the meantime, you will have to clean the fasta before passing it to the pipeline.
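As a stopgap, cleaning the fasta could look something like the sketch below. It is not part of the pipeline: the function name `clean_fasta_headers` and the file names are hypothetical, and it assumes that everything after the first space or tab on a header line is a comment that can be dropped.

```shell
# clean_fasta_headers IN OUT
# Keep only the first whitespace-delimited token of each header line;
# sequence lines are copied unchanged.
clean_fasta_headers() {
    awk '/^>/ { print $1; next } { print }' "$1" > "$2"
}

# Usage with hypothetical file names:
# clean_fasta_headers genome.fa genome.clean.fa
```

Because awk's default field separator treats both spaces and tabs as delimiters, this handles either style of header comment. Headers with whitespace directly after `>` would need extra handling.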
Description of the bug
A fasta header can contain comments together with the name of the contig. Example:
>HLA-DRB1*16:02:01 HLA00878
The corresponding line in the decoys.txt file to be passed to salmon index would be:
HLA-DRB1*16:02:01 HLA00878
The problem is that salmon interprets the comment as an extra decoy while creating the index, and stops with an error. In my case:
An additional cleaning step in rnaseq/modules/nf-core/modules/salmon/index/main.nf would fix the issue. What I propose is to replace line 31:
sed -i.bak -e 's/>//g' decoys.txt
with
Command used and terminal output
No response
Relevant files
No response
System information
No response