error when running bakta with --skip-trna option #79

joyn-sromero · 2021-09-26T03:35:19Z

Describe the bug
I'm running Bakta on a dataset that stalls with the default options: it never gets past tRNA detection (after 12 hours it is still on that step). I therefore ran this dataset with the --skip-trna option, but then Bakta ends the run with this error message:

write JSON output...
Traceback (most recent call last):
File "/opt/conda/bin/bakta", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.9/site-packages/bakta/main.py", line 472, in main
json.write_json(genome, features, json_path)
File "/opt/conda/lib/python3.9/site-packages/bakta/io/json.py", line 20, in write_json
feat.pop('aa_digest') # remove binary aa digest before JSON serialization
KeyError: 'aa_digest'

As a result, none of the Bakta outputs are generated, only the log file.
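
For context on the traceback above: `dict.pop(key)` raises `KeyError` when the key is absent, while `dict.pop(key, None)` does not. The snippet below is only a minimal sketch of that Python behavior. The `strip_digests` helper and the sample feature dicts are hypothetical and are not taken from Bakta's code, and a tolerant `pop` would hide the symptom rather than explain why a feature lacks `aa_digest`.

```python
def strip_digests(features, key="aa_digest"):
    """Remove a non-serializable key from each feature dict, if present.

    Hypothetical helper for illustration; not part of Bakta.
    """
    for feat in features:
        # A default of None makes pop() a no-op when the key is missing,
        # instead of raising KeyError.
        feat.pop(key, None)
    return features


# Hypothetical sample data: one feature carries a digest, one does not.
features = [
    {"type": "cds", "aa_digest": b"\x00\x01"},
    {"type": "tRNA"},
]
cleaned = strip_digests(features)
```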

Therefore, please provide us with at least the following information:

what exactly happened
See above
provide the detailed logs (execute Bakta via --verbose)
-the logs I get are these

parse genome sequences...
imported: 193
filtered & revised: 193
contigs: 193
skip tRNA prediction...
predict tmRNAs...
found: 1
predict rRNAs...
found: 8
predict ncRNAs...
found: 9
predict ncRNA regions...
found: 54
predict CRISPR arrays...
found: 0
predict & annotate CDSs...
predicted: 359
predict & annotate CDSs... [50/927]
predicted: 359
discarded spurious: 0
detected IPSs: 119
found PSCs: 215
found PSCCs: 15
lookup annotations...
conduct expert systems...
amrfinder: 1
protein sequences: 1
combine annotations and mark hypotheticals...
analyze hypothetical proteins: 36
detected Pfam hits: 9
calculated proteins statistics
extract sORF...
potential: 91445
discarded due to overlaps: 55998
discarded spurious: 3
detected IPSs: 0
found PSCs: 0
lookup annotations...
filter and combine annotations...
filtered sORFs: 0
detect gaps...
found: 0
detect oriCs/oriVs...
found: 0
detect oriTs...
found: 0
apply feature overlap filters...
select features and create locus tags...
selected: 81253

genome statistics:
Genome size: 7,524,442 bp
Contigs/replicons: 193
GC: 53.0 % [14/927]
N50: 91,849
N ratio: 0.0 %
coding density: 891.9 %

annotation statistics:
tRNAs: 0
tmRNAs: 193
rRNAs: 1544
ncRNAs: 1737
ncRNA regions: 10229
CRISPR arrays: 0
CDSs: 67550, hypotheticals: 6948
sORFs: 0
gaps: 0
oriCs/oriVs: 0
oriTs: 0

write JSON output...
Traceback (most recent call last):
File "/opt/conda/bin/bakta", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.9/site-packages/bakta/main.py", line 472, in main
json.write_json(genome, features, json_path)
File "/opt/conda/lib/python3.9/site-packages/bakta/io/json.py", line 20, in write_json
feat.pop('aa_digest') # remove binary aa digest before JSON serialization
KeyError: 'aa_digest'

what installation of Bakta did you use: BioConda, Pip
- I'm using Bakta Docker v1.1.1


joyn-sromero · 2021-09-26T03:36:56Z

I get exactly the same problem with Bakta Docker v1.0.4.

oschwengers · 2021-09-27T08:24:13Z

Hi @joyn-sromero, thanks for reporting. Could you please upload the corresponding log file so I can take a deeper look at this? If possible, the genome FASTA file would help as well.

oschwengers · 2021-10-12T10:56:54Z

Also, could you please update to the latest version v1.2.1 and check if the issue remains?

joyn-sromero · 2021-10-12T14:15:49Z

Hi Oliver! Thanks a lot for getting back to me. I checked again a few days ago, and the problem turned out to be related to the formatting of the FASTA sequences of the contigs: some of the contigs contained blank characters that went unnoticed when the contig files were preformatted. Apparently some component of Bakta either cannot deal with those blank characters, or the blanks caused some contigs to end up with duplicated IDs (say some contigs were called >STRAIN_A[BLANK]CONTIG1 and >STRAIN_A[BLANK]CONTIG2), and that kept the program from working properly.

Running a preformatting step that accounts for every type of blank space in the contigs fixed the problem. I had already closed the issue on GitHub. Thanks a lot for staying on top of this until now!
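
A preformatting step of the kind described above could look roughly like the following. This is a minimal sketch, not the script that was actually used: the `sanitize_fasta` name is hypothetical, and replacing whitespace runs with underscores is an assumed convention. In Python 3, `\s` and `str.split()` also cover non-ASCII blanks such as non-breaking spaces.

```python
import re


def sanitize_fasta(lines):
    """Yield FASTA lines with blanks in headers replaced by underscores.

    Hypothetical helper for illustration; the underscore replacement
    is an assumed convention.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            header = line[1:].strip()
            # Collapse every run of whitespace (spaces, tabs, NBSP, ...)
            # into a single underscore so the full header becomes the ID.
            yield ">" + re.sub(r"\s+", "_", header)
        elif line.strip():
            # Drop stray blanks inside sequence lines; skip empty lines.
            yield "".join(line.split())


# Hypothetical input mirroring the headers mentioned above.
raw = [">STRAIN_A CONTIG1\n", "ACGT \n", "\n", ">STRAIN_A\tCONTIG2\n", "GG TT\n"]
clean = list(sanitize_fasta(raw))
```

To apply it to a file, one could iterate over an opened input file and write each yielded line plus a newline to the output file.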


oschwengers · 2021-10-12T14:35:55Z

Oh I see. Yes, that indeed will cause some severe issues within the pipeline.
I added a quick check for unique contig IDs. Thanks for pointing that out!
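
For readers who want to check their own input before a run, a uniqueness test can be sketched as below. This is not Bakta's actual check; the `duplicate_contig_ids` name is hypothetical, and it assumes the common convention that the contig ID is the first whitespace-delimited token of the header line.

```python
from collections import Counter


def duplicate_contig_ids(lines):
    """Return the sorted contig IDs that occur more than once.

    Hypothetical helper; assumes the ID is the first
    whitespace-delimited token after '>'.
    """
    ids = [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]
    return sorted(cid for cid, count in Counter(ids).items() if count > 1)


# Hypothetical headers: the blank makes both contigs share the ID 'STRAIN_A'.
fasta = [">STRAIN_A CONTIG1", "ACGT", ">STRAIN_A CONTIG2", "GGTT", ">STRAIN_B"]
dupes = duplicate_contig_ids(fasta)
```

An empty result means every contig ID is unique under that convention.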

joyn-sromero added the bug label Sep 26, 2021

oschwengers self-assigned this Sep 27, 2021

joyn-sromero closed this as completed Oct 12, 2021
