There is an issue with the image generated by the page.get_pixmap() function #2964

Closed
1339503169 opened this issue Jan 3, 2024 · 6 comments
Labels: fix developed (release schedule to be determined), Fixed in next release, upstream bug (bug outside this package)

Comments

@1339503169

Description of the bug

img_test.pdf
The image produced by page.get_pixmap() contains characters that are not present in the original PDF. The source file shows text that appears to read 'From (Shipper) 发件人', but the rendered image does not match it. The converted image is shown below, with the red box marking the error; compare it with img_test.pdf.

image

How to reproduce the bug

Here is the code I used to generate the image:

```python
import fitz  # PyMuPDF

document = fitz.open('./data/img_test.pdf')
page = document.load_page(0)  # first page

rotate = 0  # rotation in degrees
zoom_x, zoom_y = 2, 2  # 2x zoom in each direction
trans = fitz.Matrix(zoom_x, zoom_y).prerotate(rotate)

# Render the page without an alpha channel and save it as PNG
pix = page.get_pixmap(matrix=trans, alpha=False)
pix.save('data/img_test.png')
```

What should I do to get the correct picture?

PyMuPDF version

1.23.7 or earlier

Operating system

Windows

Python version

3.8

@JorjMcKie added the "upstream bug" (bug outside this package) label on Jan 4, 2024
@JorjMcKie
Collaborator

Submitted a bug report at https://bugs.ghostscript.com/show_bug.cgi?id=707451.

@cbm755
Contributor

cbm755 commented Jan 5, 2024

Just FYI, that file renders incorrectly in Evince on Fedora GNU/Linux (which is completely independent of PyMuPDF).

image

@JorjMcKie
Collaborator

> Just FYI, that file renders incorrectly in Evince on Fedora GNU/Linux (which is completely independent of PyMuPDF).

Thanks for this, Colin. Yeah, maybe there is a general issue with these files. I am sure we will soon hear from our friends at MuPDF.

@robinwatts
Collaborator

The file does indeed look broken. We have a fix in 1.24 that improves it.

The text now says "1 Front(Shipper)", albeit with dodgy spacing.

Essentially, it's a broken file, and we're doing as well with it as we can.

The commit in question is:

https://git.ghostscript.com/?p=mupdf.git;a=commitdiff;h=0a5b60420

I'll see about pulling this back to 1.23.x so you can get access to it soon.

@JorjMcKie added the "fix developed" (release schedule to be determined) label on Jan 31, 2024
@JorjMcKie
Collaborator

The MuPDF team has developed a fix that will at least improve the rendering of this type of page.

@julian-smith-artifex-com
Collaborator

Fixed in 1.24.0.
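
For anyone landing here later, a minimal sketch for checking whether an installed version is at least 1.24.0 before relying on the improved rendering. The helper names `parse_version` and `has_rendering_fix` are made up for illustration and are not part of PyMuPDF; the sketch only compares version strings and does not inspect the PDF or the rendering itself.

```python
import re

# Release that this thread reports as containing the fix.
FIX_VERSION = (1, 24, 0)


def parse_version(text):
    """Turn a version string such as '1.23.7' into a tuple of ints (1, 23, 7)."""
    nums = [int(part) for part in re.findall(r"\d+", text)[:3]]
    # Pad short versions such as '1.24' to three components.
    while len(nums) < 3:
        nums.append(0)
    return tuple(nums)


def has_rendering_fix(version_text):
    """Return True if version_text is 1.24.0 or later (hypothetical helper)."""
    return parse_version(version_text) >= FIX_VERSION
```

With PyMuPDF installed, the version string could be passed in as `has_rendering_fix(fitz.VersionBind)`; that attribute name is an assumption about the installed release, so check what your version exposes. Note that even with the fix, the text is reported above to render with odd spacing, because the file itself is broken.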
