Not a HASH reference at gottcha_db.pl line 1573. #3

Open
playerra opened this issue Jun 2, 2015 · 6 comments

playerra commented Jun 2, 2015

I'm currently stuck at the gottcha_db.pl step while building a custom database. It fails with the following error:

...
Tues 06/02/2015(15:51:17) ------------------------------------------------------------------
Tues 06/02/2015(15:51:17) Importing XML file...done.
Not a HASH reference at gottcha_db.pl line 1573.
Command exited with non-zero status 255

Any idea what that 'Not a HASH reference' means? I checked in the actual code and can't make much sense of it.

Thanks,
Robert

poeli (Member) commented Jun 2, 2015

It seems that the script can't find the corresponding taxonomy for a given GI number. Can you double-check it?

playerra (Author) commented Jun 3, 2015

You are correct. Looking back at my mkGottchaTaxTree.log I found:

Indexing GIs by TAXID...
Warning: GI "300693042" has no TAXID!
Warning: GI "409246494" has no TAXID!
Warning: GI "479208076" has no TAXID!
and so on...

Suggestions?
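
The warning lines quoted above follow a regular pattern, so the affected GIs can be collected programmatically rather than by hand. A minimal Python sketch, assuming the log format is exactly as quoted (`Warning: GI "<number>" has no TAXID!`); the function name `gis_without_taxid` is illustrative and not part of GOTTCHA:

```python
import re

# Matches lines such as: Warning: GI "300693042" has no TAXID!
# (format assumed from the log excerpt quoted in this thread)
WARNING_RE = re.compile(r'Warning: GI "(\d+)" has no TAXID!')


def gis_without_taxid(log_lines):
    """Return the flagged GI numbers as strings, in order of first appearance."""
    seen = set()
    gis = []
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            gis.append(match.group(1))
    return gis


# Sample input copied from the log excerpt above.
sample_log = [
    "Indexing GIs by TAXID...",
    'Warning: GI "300693042" has no TAXID!',
    'Warning: GI "409246494" has no TAXID!',
    'Warning: GI "479208076" has no TAXID!',
]
for gi in gis_without_taxid(sample_log):
    print(gi)
```

To run it on the real log, pass an open file handle for mkGottchaTaxTree.log instead of `sample_log`; a file object iterates line by line, so the function works unchanged.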

playerra (Author) commented Jun 3, 2015

When I check the first GI on NCBI, it returns:
"Record removed. This RefSeq genome was suppressed because updated RefSeq validation criteria identified problems with the assembly or annotation."

There are 300 or so of these warnings in mkGottchaTaxTree.log. The GIs flagged by these warnings should not be looked up for a TAXID during the gottcha_db.pl database build (the lookup that caused my initial error at the top of this thread), right?

poeli (Member) commented Jun 3, 2015

Yes, you are right. You will need to remove them before you run the database-generating scripts. Thanks.


playerra (Author) commented Jun 4, 2015

Paul, thanks for the update.
Could you please clarify from what file I need to remove those 300 GIs? Do they need to be removed from the 'genomes.txt' file containing the path to all the .gbk files of Bacteria and Viruses? Or from the SpeciesTreeGI.dmp that is created? I'm pretty lost here.

poeli (Member) commented Jun 4, 2015

Hi Robert,

You will need to remove the paths of those 300 gbk files from the genomes.txt file.

We did a good deal of curation on the reference genomes up front in order to generate correct signatures, so cleaning out removed or erroneous genomes is required at the moment. The database-building scripts can be troublesome if the input genomes are downloaded directly from NCBI.

Thanks,
Paul
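
The cleanup described above can be sketched in Python. This is not part of GOTTCHA; it rests on three assumptions: genomes.txt lists one .gbk path per line, each GenBank file carries its GI on the `VERSION` line (as GenBank flat files of that period did), and that GI is the same number the warnings report. All function names are hypothetical.

```python
import re

# Older GenBank flat files carry the GI on the VERSION line, in the form:
#   VERSION     <accession.version>  GI:<number>
GI_RE = re.compile(r"^VERSION\s+\S+\s+GI:(\d+)")


def gis_in_genbank(path):
    """Return the set of GI numbers found on VERSION lines of one GenBank file."""
    gis = set()
    with open(path) as handle:
        for line in handle:
            match = GI_RE.match(line)
            if match:
                gis.add(match.group(1))
    return gis


def filter_genome_list(paths, bad_gis):
    """Split paths into (kept, dropped).

    A file is dropped if any record in it carries a flagged GI.
    """
    kept, dropped = [], []
    for path in paths:
        if gis_in_genbank(path) & bad_gis:
            dropped.append(path)
        else:
            kept.append(path)
    return kept, dropped


def write_filtered_list(list_path, out_path, bad_gis):
    """Write a cleaned copy of the genome list; return the dropped paths."""
    with open(list_path) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    kept, dropped = filter_genome_list(paths, bad_gis)
    with open(out_path, "w") as out:
        out.writelines(path + "\n" for path in kept)
    return dropped
```

Calling `write_filtered_list("genomes.txt", "genomes.clean.txt", bad_gis)` with `bad_gis` set to the GIs from the warnings writes a new list and leaves the original untouched, so the dropped paths can be reviewed before rerunning the database build. The output file name here is only an example.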

