[BUG] Failed to serialize event #419

Closed
andrea-matsec opened this issue Jan 3, 2024 · 1 comment

andrea-matsec commented Jan 3, 2024

Describe the bug
Strelka fails to serialize events. I believe this is happening only when there's a pdf_load_error but I'm not 100% certain.

Environment details

  • Operating System: N/A
  • Architecture: Kubernetes

Steps to reproduce
Unfortunately I can't share the file to reproduce this error, but this is the event that can't be serialized.

{
    'file': {
        'depth': 0, 
        'flavors': {
            'mime': ['application/pdf']
        }, 
        'scanners': ['ScanEntropy', 'ScanExiftool', 'ScanOcr', 'ScanPdf', 'ScanYara'], 
        'size': 220710, 
        'tree': {
            'node': '6d343886-98a2-4258-8d99-9b0be8d4f63a', 
            'root': '6d343886-98a2-4258-8d99-9b0be8d4f63a'
        }
    }, 
    'scan': {
        'entropy': {
            'elapsed': 0.000211, 
            'entropy': 7.9965489035155235
        }, 
        'exiftool': {
            'elapsed': 7.299652,
            'sourcefile': '/dev/shm/tmpp5fr8i53',
            'exiftoolversion': 12.6,
            'filename': 'tmpp5fr8i53',
            'directory': '/dev/shm',
            'filesize': '221 kB',
            'filemodifydate': '2024:01:03 22:49:24+00:00',
            'fileaccessdate': '2024:01:03 22:49:24+00:00',
            'fileinodechangedate': '2024:01:03 22:49:24+00:00',
            'filepermissions': '-rw-------',
            'filetype': 'PDF',
            'filetypeextension': 'pdf',
            'mimetype': 'application/pdf',
            'pdfversion': 1.7,
            'linearized': 'Yes',
            'encryption': 'Standard V5.6 (256-bit)',
            'warning': '[minor] Decryption is very slow for encryption V5.6 or higher',
            'useraccess': 'Print, Modify, Copy, Annotate, Fill forms, Extract, Print high-res'
        },
        'ocr': {
            'elapsed': 0.026397,
            'flags': ['uncaught_exception'],
            'exception': 'Traceback (most recent call last):\\n\\n  File \\"/usr/local/lib/python3.10/dist-packages/strelka-0.0.0-py3.10.egg/strelka/strelka.py\\", line 779, in scan_wrapper\\n    self.scan(data, file, options, expire_at)\\n\\n  File \\"/usr/local/lib/python3.10/dist-packages/strelka-0.0.0-py3.10.egg/strelka/scanners/scan_ocr.py\\", line 29, in scan\\n    data = doc.get_page_pixmap(0).tobytes(\\"png\\")\\n\\n  File \\"/usr/local/lib/python3.10/dist-packages/fitz/utils.py\\", line 922, in get_page_pixmap\\n    return doc[pno].get_pixmap(\\n\\n  File \\"/usr/local/lib/python3.10/dist-packages/fitz/fitz.py\\", line 5447, in __getitem__\\n    raise IndexError(\\"page not in document\\")\\n\\nIndexError: page not in document\\n'
        },
        'pdf': {
            'elapsed': 0.025871,
            'flags': ['pdf_load_error'],
            'images': 0,
            'lines': 0,
            'words': 0,
            'xref_object': set()
        },
        'yara': {
            'elapsed': 0.000709,
            'rules_loaded': 1,
            'matches': ['test']
        }
    }
}

'xref_object': set() looks suspicious to me.
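
For anyone reading along, here is a minimal sketch of why an empty set could break serialization. It assumes the event is encoded with Python's standard json module, which this issue does not confirm. The trimmed event dict and the serialize_event helper are hypothetical illustrations, not Strelka's code or the maintainer's fix.

```python
import json

# Trimmed, hand-copied version of the 'pdf' block from the event above.
event = {
    "scan": {
        "pdf": {
            "flags": ["pdf_load_error"],
            "images": 0,
            "xref_object": set(),
        }
    }
}


def serialize_event(obj) -> str:
    """Hypothetical helper: JSON-encode an event, converting sets to lists."""

    def encode_extra(value):
        # json.dumps calls this only for values it cannot encode itself.
        if isinstance(value, (set, frozenset)):
            return sorted(value, key=str)
        raise TypeError(
            f"Object of type {type(value).__name__} is not JSON serializable"
        )

    return json.dumps(obj, default=encode_extra)


# The plain encoder rejects the set, even though it is empty.
try:
    json.dumps(event)
    plain_error = None
except TypeError as exc:
    plain_error = str(exc)

# With the fallback in place, the same event encodes cleanly.
serialized = serialize_event(event)
```

The other option is to convert the set to a list in the scanner before the event is built, which keeps the encoder untouched.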

Expected behavior
The event can be serialized.

Release

  • Release: 0.23.11.10

andrea-matsec added the bug label Jan 3, 2024

phutelmyer (Contributor) commented Jan 3, 2024

Thanks for reporting this @andrea-matsec.
I have a fix for this (to be honest, I thought I had already implemented it - must have been a dream).

I'll push it out, along with a new release, tomorrow.
