seqkit split with regexp does not respect letter case overwriting file output #462
Comments
Not related, but I installed seqkit via mamba, which reports the version as
Supported now. Added a flag
seqkit_darwin_amd64.tar.gz
Yes, I forgot to bump the version number in the tool.
Thank you for the super quick response. For my use case this solves the issue, but I think the bug still exists. SeqKit reports that 4 files are created, but only 2 are written because of the case conflict. The seqkit split message needs to reflect the files that are actually written.
Where is it?
Seqkit prints this:
But it writes out only 2 files. It needs to print either this:
or alternatively actually write case-sensitive files (the preferred option in my opinion).
I just read this issue again and found that SeqKit already does so in a case-sensitive way. seqkit 2.8.1 works as you expect.
Only adding
This is likely related to the case insensitivity of the macOS file system: file names that differ only in letter case refer to the same file, so the later write overwrites the earlier one.
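Whether a given output directory behaves this way can be checked by creating a file with a lowercase name and testing whether the uppercase spelling resolves to it. A minimal Python sketch, where the function name `is_case_insensitive` is made up for illustration and is not part of SeqKit:

```python
import os
import tempfile


def is_case_insensitive(directory="."):
    """Return True if `directory` treats names differing only in case as the same file."""
    # Create a uniquely named probe file; the prefix guarantees lowercase letters.
    with tempfile.NamedTemporaryFile(prefix="casecheck_", dir=directory) as probe:
        head, name = os.path.split(probe.name)
        # If the upper-cased spelling also exists, the file system folds case.
        return os.path.exists(os.path.join(head, name.upper()))


if __name__ == "__main__":
    print(is_case_insensitive("."))
```

The probe file is deleted when the `with` block exits, so the check leaves nothing behind in the directory.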
OMG, I just learned this.
That's exactly what's happening. Thanks @botond-sipos. @shenwei356, do you think it would help to add a disclaimer recommending that macOS users use the ignore-case flag? This will hopefully prevent future issues.
Added.
Updated:
Dear ShenWei,
Thank you again for creating and maintaining seqkit. Congrats on seqkit2 publication!
I need to split a fasta file from gisaid. An example fasta looks like this
The goal is to split the FASTA file into 2 files. The pattern is essentially hRSV/A/ and hRSV/B/. However, the FASTA file contains both uppercase and lowercase A/a and B/b in the designation name.
The command I am using:
Terminal output looks correct:
Folder output:
Note that there are only 2 files when there should be 4. The contents are also incorrect: they contain the uppercase designations.
I suspect the code which writes out the fasta files is ignoring the letter case, resulting in the overwriting of files.
Using seqkit v2.8.1 on macOS (x86, Rosetta), installed via conda.
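The suspected overwrite can be illustrated without SeqKit. The sketch below groups sequence IDs by a regular expression and then reports which group keys differ only in letter case, that is, which ones would map to the same file name on a case-insensitive file system. The IDs, the regular expression, and the function names are invented for illustration; they are not taken from the GISAID file or from SeqKit's code.

```python
import re
from collections import defaultdict


def group_ids(ids, pattern):
    """Group sequence IDs by the first capture group of `pattern` (case-sensitive)."""
    regex = re.compile(pattern)
    groups = defaultdict(list)
    for seq_id in ids:
        match = regex.search(seq_id)
        if match:
            groups[match.group(1)].append(seq_id)
    return dict(groups)


def case_collisions(keys):
    """Return lists of keys that differ only in letter case."""
    folded = defaultdict(list)
    for key in keys:
        folded[key.casefold()].append(key)
    return [sorted(names) for names in folded.values() if len(names) > 1]


# Hypothetical IDs, not real GISAID records.
ids = [
    "hRSV/A/Example/001/2020",
    "hRSV/a/Example/002/2020",
    "hRSV/B/Example/003/2021",
    "hRSV/b/Example/004/2021",
]
groups = group_ids(ids, r"^hRSV/([ABab])/")
print(len(groups))              # 4 case-sensitive groups
print(case_collisions(groups))  # [['A', 'a'], ['B', 'b']]
```

On a case-insensitive file system only one file per colliding pair can exist, which is consistent with seeing 2 output files where 4 groups were reported.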