Bounding boxes are incorrect for rotated layout items #1067

Open
dhdaines opened this issue Nov 27, 2024 · 1 comment


pdfminer.six reports the boundaries of glyphs in LTChar like this:

(x0, y0) = apply_matrix_pt(self.matrix, bbox_lower_left)
(x1, y1) = apply_matrix_pt(self.matrix, bbox_upper_right)
# ... normalize them
LTComponent.__init__(self, (x0, y0, x1, y1))

This gives the two opposite corners of the glyph, but these do not define the bounding box when the text matrix (or the CTM) contains rotation, in which case it's necessary to use all four corners. To illustrate, take this PDF (please) and visualize what pdfminer.six gives (yes, the code is very long! it's at the bottom...):
image

It should probably look like this:
image

This is also true for images (and maybe other things). You can fairly easily detect whether the bounding box needs to be calculated with all four corners, because in that case either b or c in the transformation matrix is negative. This means that when Y increases, X decreases, or vice versa, and thus the upper left / bottom right corners will be further left / further down. For example: https://github.com/dhdaines/playa/blob/main/playa/utils.py#L272
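For illustration, here is a minimal sketch of the four-corner computation described above. It is not the pdfminer.six or playa code: `transform_point`, `two_corner_bbox` and `four_corner_bbox` are hypothetical names, and the only assumption is the usual PDF matrix convention (a, b, c, d, e, f), where x' = a*x + c*y + e and y' = b*x + d*y + f.

```python
import math
from typing import Tuple

Matrix = Tuple[float, float, float, float, float, float]
Rect = Tuple[float, float, float, float]


def transform_point(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Apply a PDF-style matrix (a, b, c, d, e, f) to the point (x, y)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def two_corner_bbox(m: Matrix, bbox: Rect) -> Rect:
    """Transform only the lower-left and upper-right corners, then normalize.

    This mirrors the approach reported in this issue.
    """
    x0, y0, x1, y1 = bbox
    ax, ay = transform_point(m, x0, y0)
    bx, by = transform_point(m, x1, y1)
    return (min(ax, bx), min(ay, by), max(ax, bx), max(ay, by))


def four_corner_bbox(m: Matrix, bbox: Rect) -> Rect:
    """Transform all four corners and take the extremes on each axis."""
    x0, y0, x1, y1 = bbox
    corners = [
        transform_point(m, x, y)
        for x, y in ((x0, y0), (x0, y1), (x1, y0), (x1, y1))
    ]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Example: a unit square rotated by 45 degrees about the origin.
cos45, sin45 = math.cos(math.pi / 4), math.sin(math.pi / 4)
rot45: Matrix = (cos45, sin45, -sin45, cos45, 0.0, 0.0)
print("two corners: ", two_corner_bbox(rot45, (0.0, 0.0, 1.0, 1.0)))
print("four corners:", four_corner_bbox(rot45, (0.0, 0.0, 1.0, 1.0)))
```

With this 45-degree rotation the two-corner box collapses to roughly zero width, because both transformed corners land on the Y axis, while the four-corner box covers the whole rotated square.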

Code to generate samples above:

from pdfminer.converter import PDFPageAggregator
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfparser import PDFParser
from pdfminer.layout import LTChar
import pdfplumber
from pdfplumber.utils import bbox_to_rect

# Run pdfminer.six layout analysis on the first page
rsrc = PDFResourceManager()
agg = PDFPageAggregator(rsrc, pageno=1)
interp = PDFPageInterpreter(rsrc, agg)
with open("samples/rotated.pdf", "rb") as fh:
    pdf = PDFDocument(PDFParser(fh))
    pdfpage = next(PDFPage.create_pages(pdf))
    interp.process_page(pdfpage)
height = pdfpage.mediabox[3]
layout = agg.result

# Collect the glyph boxes, flipping Y so the origin is at the top of the page
glyphs = []
for item in layout:
    if isinstance(item, LTChar):
        x0, x1, y0, y1 = item.x0, item.x1, height - item.y0, height - item.y1
        if y1 < y0:
            y0, y1 = y1, y0
        glyphs.append(bbox_to_rect((x0, y0, x1, y1)))

# Draw the boxes over a rendering of the page with pdfplumber
pp = pdfplumber.open("samples/rotated.pdf")
img = pp.pages[0].to_image()
img.draw_rects(glyphs)

dhdaines commented Jan 2, 2025

After proof and testing, note that the condition you actually need to test is b * d < 0 or a * c < 0. If either holds, the two corners of the bbox are no longer valid after transformation and all four corners are needed.
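As a rough sketch, the test could be written as a small predicate. `needs_all_corners` is a hypothetical name, not a function from pdfminer.six or playa, and the matrix is assumed to be the usual PDF 6-tuple (a, b, c, d, e, f), where x' = a*x + c*y + e and y' = b*x + d*y + f.

```python
from typing import Tuple

Matrix = Tuple[float, float, float, float, float, float]


def needs_all_corners(m: Matrix) -> bool:
    """Return True if the two opposite corners may not bound the result.

    x' = a*x + c*y + e reaches its extremes at the lower-left and
    upper-right corners only when a and c do not have opposite signs.
    The same holds for y' = b*x + d*y + f with b and d.
    """
    a, b, c, d, _e, _f = m
    return a * c < 0 or b * d < 0


# A few example matrices (exact values, so no floating-point noise)
identity: Matrix = (1, 0, 0, 1, 0, 0)
rot90: Matrix = (0, 1, -1, 0, 0, 0)
rot45ish: Matrix = (1, 1, -1, 1, 0, 0)  # 45-degree rotation, scaled by sqrt(2)
for name, m in (("identity", identity), ("rot90", rot90), ("rot45ish", rot45ish)):
    print(name, needs_all_corners(m))
```

One caveat: a rotation matrix built with floating-point trigonometry for an exact multiple of 90 degrees can contain tiny non-zero entries, which may make the predicate return True. That only costs the extra two corners, since the four-corner box is always correct.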
