Automatic drop of genome leads to error in downstream modules of classify_wf #312
Labels: enhancement (proposed feature or change to GTDB-Tk), next version (upcoming feature/fix in staging branch)
Hello!
I am running gtdbtk classify_wf on some of my genomes. During the identify step, one genome was automatically dropped by Prodigal:
Skipping 1 of 202 genomes as no genes were called by Prodigal. Check the genome quality (see gtdb.warnings.log).
Skipping: 3300001969
This resulted in an error when the script reached the align step:
[2021-03-22 14:09:01] ERROR: [] are not present in the input list of genome to process.
[2021-03-22 14:09:01] ERROR: Controlled exit resulting from an unrecoverable error or warning.
================================================================================
EXCEPTION: InconsistentGenomeBatch
MESSAGE: You are attempting to run GTDB-Tk on a non-empty directory that contains extra genomes not present in your initial identify directory. Remove them, or run GTDB-Tk on a new directory.
I tried to rerun in a fresh directory, but that did not work. Only after I removed the genome from both the input FASTA folder and the intermediate identify folder did the script continue and finish successfully.
I don't know whether this is really what caused the exception, but if it is, I would suggest making gtdbtk classify_wf aware of the genomes it drops itself, so that later steps are not confused by them.
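To illustrate the suggestion, here is a rough Python sketch of a consistency check that takes self-dropped genomes into account. I do not know how GTDB-Tk performs this check internally, so this is not its actual code: the function names `parse_skipped_genomes` and `unexpected_genomes`, the example ID `genome_a`, and the idea of reading the IDs from the `Skipping:` lines are all assumptions made for illustration.

```python
import re


def parse_skipped_genomes(log_text):
    """Collect genome IDs from 'Skipping: <id>' lines (hypothetical helper).

    The summary line 'Skipping 1 of 202 genomes ...' has no colon after
    'Skipping', so it is not matched.
    """
    pattern = re.compile(r"^\s*Skipping:\s*(\S+)\s*$", re.MULTILINE)
    return {match.group(1) for match in pattern.finditer(log_text)}


def unexpected_genomes(align_ids, identified_ids, skipped_ids):
    """Return genomes seen at the align step that were neither identified
    nor knowingly skipped earlier (hypothetical check, not GTDB-Tk's own)."""
    return set(align_ids) - set(identified_ids) - set(skipped_ids)


# Log excerpt copied from the identify step above.
log_text = (
    "Skipping 1 of 202 genomes as no genes were called by Prodigal. "
    "Check the genome quality (see gtdb.warnings.log).\n"
    "Skipping: 3300001969\n"
)

skipped = parse_skipped_genomes(log_text)

# 'genome_a' is a made-up ID standing in for the genomes that passed identify.
extra = unexpected_genomes(
    align_ids=["3300001969", "genome_a"],
    identified_ids=["genome_a"],
    skipped_ids=skipped,
)

if extra:
    print("Genomes not seen during identify:", sorted(extra))
else:
    print("All genomes are accounted for, including the skipped ones.")
```

With a check along these lines, a genome dropped during identify would be treated as already known instead of as an extra genome in the directory.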