Duplicate sequences in the _decont.fasta #32

Open
sivasubramanics opened this issue Jun 6, 2023 · 2 comments

Comments

@sivasubramanics

Dear team,

It's a great tool with a lot of potential, especially now that the pan-transcriptome concept is coming of age. I tried the pipeline, and after it completed I noticed that some sequences are duplicated in the _decont.fasta output.

Could you please clarify how this can happen?

Thanks,

Siva

@Lafond-LapalmeJ
Owner

Are you talking about duplicate sequences or duplicate sequence names?
Are you using the AST parameters?

In the tests on my side there are no duplicated sequences.
The _decont.fasta file is simply a merge of all 'good' clusters. Each sequence belongs to only one cluster at each clustering level, so duplicates should not be possible.
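
To tell the two cases apart (duplicate names versus duplicate sequence content), a quick check along these lines may help. This is a minimal sketch and not part of the pipeline: `read_fasta` and `duplicate_report` are hypothetical helper names, and it assumes that the record name is the first whitespace-delimited token of the header and that sequences should be compared case-insensitively.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Assumption: the name is the first token of the header line.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def duplicate_report(records):
    """Return ({name: count}, {sequence: count}) for entries seen more than once."""
    records = list(records)
    name_counts = Counter(name for name, _ in records)
    # Assumption: case differences in the sequence are not meaningful.
    seq_counts = Counter(seq.upper() for _, seq in records)
    dup_names = {n: c for n, c in name_counts.items() if c > 1}
    dup_seqs = {s: c for s, c in seq_counts.items() if c > 1}
    return dup_names, dup_seqs
```

Passing an open file handle for the _decont.fasta output to `read_fasta` and the result to `duplicate_report` gives two dictionaries: a non-empty first one means repeated names, a non-empty second one means repeated sequence content.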

@isovaline2230

isovaline2230 commented Dec 27, 2024

I've encountered the same issue. Somehow, sequences with the same name and the same content ended up in *_decont.fasta (2 to 3 copies, depending on the case).
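
As a stopgap while the cause is unknown, records that are exact copies (identical header and identical sequence) could be filtered out after the run. The following is only a sketch of that idea, not something the pipeline provides: `read_fasta`, `drop_exact_duplicates` and `to_fasta` are hypothetical helper names, and the 60-column line width is an arbitrary choice. It does not explain why the copies appear in the first place.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs; the header is kept whole, without '>'."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_exact_duplicates(records):
    """Keep the first occurrence of each (header, sequence) pair, in input order."""
    seen = set()
    for header, seq in records:
        key = (header, seq)
        if key in seen:
            continue  # same name and same content already emitted
        seen.add(key)
        yield header, seq


def to_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequences at `width` columns."""
    out = []
    for header, seq in records:
        out.append(f">{header}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)
```

Records that share a name but differ in content are deliberately kept, since those would point to a different problem than exact copies.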
