Question about GBFF output reporting all 0 annotations (vs. txt and gff3 file) #354

patriciatran · 2024-12-13T17:47:35Z

Hello,

I am running bakta v.1.10.2 this way:

# $1 = sample name, $2 = number of threads
bakta --db /path/hidden/db \
        "${1}_assembly.fasta" \
        --output "bakta_${1}" \
        --threads "$2"

It outputs all the files, but I have a question about the gbff output file.

Could you explain why the annotation counts are all reported as 0 in the gbff file, while the .txt and gff3 files report the expected counts?
I am comparing this with a bakta output from another analysis run in the past using a previous version:

Current output:
The gbff file looks like this:

LOCUS       contig_1             2902592 bp    DNA     linear   UNK 13-DEC-2024
DEFINITION  contig_1, whole genome shotgun sequence.
ACCESSION   contig_1
VERSION     contig_1
KEYWORDS    .
SOURCE      None
  ORGANISM  .
            .
COMMENT     Annotated with Bakta
            Software: v1.10.2
            Database: v5.1, full
            DOI: 10.1099/mgen.0.000685
            URL: github.com/oschwengers/bakta
            
            ##Genome Annotation Summary:##
            Annotation Date                :: 12/13/2024, 15:06:35
            CDSs                           ::     0
            tRNAs                          ::     0
            tmRNAs                         ::     0
            rRNAs                          ::     0
            ncRNAs                         ::     0
            regulatory ncRNAs              ::     0
            CRISPR Arrays                  ::     0
            oriCs/oriVs                    ::     0
            oriTs                          ::     0
            gaps                           ::     0
            pseudogenes                    ::     0

However, the sample.txt output looks like this:

Sequence(s):
Length: 2902592
Count: 1
GC: 32.8
N50: 2902592
N90: 2902592
N ratio: 0.0
coding density: 85.5

Annotation:
tRNAs: 60
tmRNAs: 1
rRNAs: 16
ncRNAs: 90
ncRNA regions: 25
CRISPR arrays: 0
CDSs: 2704
pseudogenes: 4
hypotheticals: 39
sORFs: 16
gaps: 0
oriCs: 4
oriVs: 0
oriTs: 1

Bakta:
Software: v1.10.2
Database: v5.1, full
DOI: 10.1099/mgen.0.000685
URL: github.com/oschwengers/bakta

For comparison, here is the output from a previous run with another bakta version.
Note: Ignore the actual numbers; this was not run on the same genome. I am just pasting the output here as an example.

LOCUS       contig_1             2921883 bp    DNA     linear   UNK 17-OCT-2024
DEFINITION  contig_1, whole genome shotgun sequence.
ACCESSION   contig_1
VERSION     contig_1
KEYWORDS    .
SOURCE      None
  ORGANISM  .
            .
COMMENT     Annotated with Bakta
            Software: v1.6.1
            Database: v4.0
            DOI: 10.1099/mgen.0.000685
            URL: github.com/oschwengers/bakta
            
            ##Genome Annotation Summary:##
            Annotation Date                :: 10/17/2024, 22:36:47
            Annotation Pipeline            :: Bakta
            Annotation Software version    ::  v1.6.1
            Annotation Database version    ::  v4.0
            CDSs                           :: 2,733
            tRNAs                          ::    61
            tmRNAs                         ::     1
            rRNAs                          ::    19
            ncRNAs                         ::    88
            regulatory ncRNAs              ::    25
            CRISPR Arrays                  ::     0
            oriCs/oriVs                    ::     2
            oriTs                          ::     0
            gaps                           ::     0
            pseudogenes                    ::     5

Sequence(s):
Length: 3011165
Count: 3
GC: 32.7
N50: 2921883
N ratio: 0.0
coding density: 85.3

Annotation:
tRNAs: 61
tmRNAs: 1
rRNAs: 19
ncRNAs: 92
ncRNA regions: 25
CRISPR arrays: 0
CDSs: 2825
pseudogenes: 7
hypotheticals: 171
signal peptides: 0
sORFs: 11
gaps: 0
oriCs: 3
oriVs: 0
oriTs: 1

Bakta:
Software: v1.6.1
Database: v4.0
DOI: 10.1099/mgen.0.000685
URL: github.com/oschwengers/bakta

Thank you in advance for your explanation.

Best,
Patricia


manalcric · 2024-12-16T09:50:28Z

Hello,
I am running bakta 1.10.2 and I have the same problem, but I don't know why. I think that the INSDC export to EMBL and GBFF is not working, since the rest of the annotation files are correct.
I have renamed the contigs with really short names, but this is not the problem.
Could it be related to the NumPy warning? UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero

Thanks in advance,
BW,
Manuel

Dx-wmc · 2024-12-17T06:22:41Z

I have encountered the same problem.

simone-pignotti · 2024-12-17T10:52:59Z

I am also running into this bug!

EDIT: downgrading to 1.10.1 works

oschwengers · 2024-12-17T15:59:44Z

Hi, yes, this is obviously a severe bug that I could reproduce. I'm working on it.

oschwengers · 2024-12-18T11:43:05Z

OK, so this is now fixed by https://github.com/oschwengers/bakta/releases/tag/v1.10.3.

@manalcric In fact, this was just a critical typo and is not related to the numpy warnings.
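
For anyone who wants to check whether files produced with v1.10.2 show this symptom, here is a minimal sketch in Python. It assumes the GBFF COMMENT summary and the .txt "Annotation:" block are laid out as in the outputs pasted above; the function names (parse_gbff_summary, parse_txt_summary, looks_affected) are made up for this example and are not part of Bakta. Since the two summaries do not necessarily agree exactly even in a working version, the check only flags the case where every shared counter is 0 in the GBFF while the .txt reports non-zero counts.

```python
def parse_gbff_summary(text):
    """Collect integer counts from the first '##Genome Annotation Summary:##' block."""
    counts = {}
    in_summary = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_summary:
            in_summary = stripped.startswith("##Genome Annotation Summary")
            continue
        if "::" not in stripped:
            break  # first line without '::' ends the summary block
        label, _, value = stripped.partition("::")
        value = value.strip().replace(",", "")  # older versions print e.g. '2,733'
        if value.isdigit():  # skips the date and version entries
            counts[label.strip()] = int(value)
    return counts


def parse_txt_summary(text):
    """Collect integer counts from the 'Annotation:' block of the .txt summary."""
    counts = {}
    in_annotation = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_annotation:
            in_annotation = stripped == "Annotation:"
            continue
        if not stripped:
            break  # a blank line ends the block
        label, _, value = stripped.partition(":")
        if value.strip().isdigit():
            counts[label.strip()] = int(value.strip())
    return counts


def looks_affected(gbff_counts, txt_counts):
    """True if every shared counter is 0 in the GBFF but the .txt has non-zero counts."""
    shared = gbff_counts.keys() & txt_counts.keys()
    return (
        bool(shared)
        and all(gbff_counts[key] == 0 for key in shared)
        and any(txt_counts[key] > 0 for key in shared)
    )
```

To use it, read the .gbff and .txt files of one sample into strings (for example with pathlib.Path(...).read_text()) and pass the two parsed dictionaries to looks_affected. Only labels spelled identically in both files (such as CDSs, tRNAs, rRNAs) are compared.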

patriciatran added the bug label Dec 13, 2024

patriciatran mentioned this issue Dec 13, 2024

Question about GBFF output reporting all 0 annotations (vs. txt and gff3 file) #355

Closed

oschwengers self-assigned this Dec 17, 2024

oschwengers added this to the c1.10.3 milestone Dec 17, 2024

oschwengers added a commit that referenced this issue Dec 17, 2024

fix wrong feature numbers in INSDC output formats #354

5769eb3

oschwengers closed this as completed Dec 18, 2024
