Bug: assert self._pageseq != 0 #48
Comments
That's exactly the intention -- it should be just a warning, but it shouldn't prevent extraction of any other annotations. From your description it sounds like pdfannots failed to produce any output until you removed the problematic annotation. Is that correct? Are you able to share the affected PDF?
Sure, here is the warning message and the PDF:
WARNING: Unsupported annotation subtype: /'Popup'
WARNING: Unsupported annotation subtype: /'Popup'
Traceback (most recent call last):
  File "/opt/homebrew/bin/pdfannots", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/cli.py", line 141, in main
    doc = process_file(
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/__init__.py", line 472, in process_file
    page.annots.sort()
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/types.py", line 226, in __lt__
    return self.pos < other.pos
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/types.py", line 182, in __lt__
    assert self._pageseq != 0
AssertionError
OK, so the warning messages are a red herring here. The real issue is the assertion failure: it looks like one of the (supported) annotations wasn't encountered during the text traversal. I'll have a closer look on the weekend. Thanks for providing the sample!
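For readers following along, here is a minimal sketch of why a single unsequenced annotation can abort the whole sort. It is not the pdfannots code: `Pos`, `pageseq`, and `sort_tolerant` are hypothetical stand-ins, and the "place unsequenced items last" fallback is just one possible way to tolerate the case, not the fix that was applied.

```python
from dataclasses import dataclass


@dataclass
class Pos:
    """Hypothetical stand-in for an annotation position (not the pdfannots class)."""
    page: int
    pageseq: int = 0  # assumption: 0 means "never seen during text traversal"

    def __lt__(self, other: "Pos") -> bool:
        # Mirrors the assertion in the traceback: ordering needs a sequence number.
        assert self.pageseq != 0
        return (self.page, self.pageseq) < (other.page, other.pageseq)


def sort_tolerant(positions: list[Pos]) -> list[Pos]:
    """Sort by page, then sequence, placing unsequenced positions last on their page."""
    return sorted(positions, key=lambda p: (p.page, p.pageseq == 0, p.pageseq))


positions = [Pos(1, 3), Pos(1, 0), Pos(1, 1)]

# The plain sort calls __lt__, so one unsequenced position raises for the whole list.
try:
    sorted(positions)
    failed = False
except AssertionError:
    failed = True

# The key-based sort never calls __lt__, so it cannot trip the assertion.
ordered = sort_tolerant(positions)
```

The point of the sketch is only that the comparison runs inside `sort()`, so one bad element takes down the extraction for every annotation on the page.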
…mponents issue #48 demonstrates a PDF where all text is chars within a figure, and there are no lines/boxes
Hi, thanks again for pdfannots! I recently encountered a small issue where an unsupported annotation type completely shut down the annotation extraction. While it's understandable that not every fancy annotation type can be extracted, pdfannots shouldn't completely abort, but rather simply skip the annotation.
It took me a while to find the problematic annotation, which was made harder by the fact that it wasn't visible in normal PDF readers and was probably the result of a poor-quality OCR scan of the PDF.
The error was:
WARNING: Unsupported annotation subtype: /'Popup'
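To make the request concrete, the "warn and skip" behaviour I have in mind looks roughly like the sketch below. Everything here is illustrative: `SUPPORTED_SUBTYPES`, `extract_supported`, and the dict-based annotations are made-up names for the example and not pdfannots internals.

```python
import logging

logger = logging.getLogger("example")

# Assumption: an illustrative subset of subtypes, not the tool's real list.
SUPPORTED_SUBTYPES = {"Highlight", "Underline", "Text"}


def extract_supported(raw_annots: list[dict]) -> list[dict]:
    """Keep supported annotations; warn about and skip everything else."""
    kept = []
    for annot in raw_annots:
        subtype = annot.get("Subtype")
        if subtype not in SUPPORTED_SUBTYPES:
            # Log and move on instead of aborting the whole extraction.
            logger.warning("Unsupported annotation subtype: %s", subtype)
            continue
        kept.append(annot)
    return kept


raw = [
    {"Subtype": "Highlight", "Contents": "key point"},
    {"Subtype": "Popup"},
    {"Subtype": "Text", "Contents": "note"},
]
result = extract_supported(raw)
```

With this pattern the `Popup` entry only produces a warning, and the two supported annotations are still returned.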