Multiple sequences in a single file #37

Open
Pigrenok opened this issue Feb 13, 2023 · 0 comments

Comments


Pigrenok commented Feb 13, 2023

Hello!

It is not clear from the specification whether a GFF3 file should be sorted by seqid when multiple seqids are present in the file.

I received a file where this is not the case: first come the lines of type gene for multiple seqids, followed by multiple mRNA lines for the same set of seqids, whose parents are the genes described above.

The reader I use (the scikit-bio read function) treats each occurrence of a seqid as a new name. If a specific sequence ID is provided, it reads only the first record (I presume because it encounters a different seqid after that).
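To make the expected grouping concrete, here is a minimal sketch of collecting feature lines by seqid regardless of the order in which they appear. It uses only the Python standard library and is not scikit-bio's API; the function name `group_gff3_by_seqid` and the example records (`chr1`, `chr2`, `gene1`, and so on) are hypothetical and only mimic the layout described above.

```python
from collections import defaultdict


def group_gff3_by_seqid(lines):
    """Group GFF3 feature lines by seqid (column 1), ignoring line order."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives (##...) and comments (#...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        groups[fields[0]].append(fields)
    return dict(groups)


# Hypothetical records: all genes first, then all mRNAs, seqids interleaved.
example = [
    "##gff-version 3",
    "chr1\t.\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr2\t.\tgene\t50\t700\t.\t-\t.\tID=gene2",
    "chr1\t.\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr2\t.\tmRNA\t50\t700\t.\t-\t.\tID=mrna2;Parent=gene2",
]

grouped = group_gff3_by_seqid(example)
for seqid, records in grouped.items():
    # Print each seqid with the feature types collected for it.
    print(seqid, [r[2] for r in records])
```

With this kind of grouping, both the gene and the mRNA records end up under the same seqid even though they are not adjacent in the file, which is the behaviour I would have expected from a reader that does not require sorted input.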

So my problem is that, because this is not specified, I cannot tell whether the reader's behaviour is incorrect, or whether the reader is being strict and correct and the file itself is formatted incorrectly.

Thank you very much for the clarification.
