Consider removing PyMuPDF for dependency that is not AGPL licensed #486
Comments
Hi @madisonmay, thanks for pointing this out!
Quick update here, we're looking for suggestions from everyone 🙏 We have two main options:
That is indeed a tricky issue. Could you explain the first point, @fg-mindee? How can we make the license work if the dependency is optional?
@charlesmindee I have to check, but the idea was to put all our PDF features in an extra build (not the core/default one).
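The extra-build idea could be sketched roughly as follows: the core package avoids importing the PDF backend at module import time, and a small helper raises a clear error pointing at the extra when the backend is missing. The helper name `require_optional`, the package name `mypackage`, and the extra name `pdf` are hypothetical placeholders, not names taken from this project.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str, package: str = "mypackage") -> ModuleType:
    """Import an optional dependency, or explain which extra provides it.

    `package` and `extra` are illustrative placeholders only.
    """
    try:
        # Imported lazily, so the core install never needs this module.
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install '{package}[{extra}]'"
        ) from exc
```

On the packaging side, this would pair with an `extras_require` entry in `setup.py` so that the PDF backend is only installed on request. Whether that is enough to address the licensing concern is a separate question that the code does not answer.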
For reference, here are some properly licensed alternatives:
pdf2image seems easy to use and to integrate. What do you think, @fg-mindee?
We should definitely give it a try then. It's MIT-licensed, so it's as permissive as it can get 👌
Hi @madisonmay, after a lot of investigation, it seems that there is no Python library that renders PDFs to images without using poppler ( cc @fg-mindee
Hi @charlesmindee, thank you for the ping. I think we may have recently found a decent option for this niche (at least for PDF-to-image rendering)! Check out PyPDFium2 (PyPI, GitHub). I haven't done a full benchmark, but anecdotally it seems quite fast, and it is Apache-2.0. You'd still need something else for text extraction. That being said, the extras option seems like a fine resolution to me!
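For anyone who wants to try the suggestion, here is a minimal sketch of PDF-to-image rendering with pypdfium2. It assumes a recent pypdfium2 release (the 4.x-style API with `PdfDocument`, `page.render`, and `bitmap.to_pil`) plus Pillow; older releases exposed a different interface, so treat the calls as assumptions to check against the installed version. The function names `dpi_to_scale` and `render_pdf_pages` are made up for illustration.

```python
PDF_POINTS_PER_INCH = 72  # PDF page units are 1/72 inch


def dpi_to_scale(dpi: float) -> float:
    """Convert a target DPI into a scale factor relative to 72 DPI."""
    return dpi / PDF_POINTS_PER_INCH


def render_pdf_pages(path: str, dpi: float = 144) -> list:
    """Render every page of a PDF to a PIL image (hypothetical helper)."""
    # Imported lazily so the module loads even without the optional backend.
    import pypdfium2 as pdfium

    pdf = pdfium.PdfDocument(path)
    images = []
    for index in range(len(pdf)):
        page = pdf[index]
        bitmap = page.render(scale=dpi_to_scale(dpi))
        images.append(bitmap.to_pil())
    return images
```

Calling `render_pdf_pages("sample.pdf", dpi=144)` would render at twice the base resolution. Text extraction is not covered here, as noted above.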
Thanks a lot for the suggestion @madisonmay 🙏
FYI, pypdfium is incompatible with Python 3.8.1 (but no problem with 3.7, or 3.8.10+ so far) |
Hi folks! Love the project, but the dependency on PyMuPDF can pose a problem for commercial use because of its AGPL license (I think use of PyMuPDF may technically require the project itself to be GPL licensed, although I'm no expert here). Would you consider integrating a PR that replaces the PyMuPDF dependency with an alternative?