This repository has been archived by the owner on Dec 9, 2018. It is now read-only.
The output that pdf2htmlEX produces is great and is well suited to replacing Acrobat Reader. One feature that still makes Acrobat preferable to the browser output is the ability to search within the document.
Feature request: add a search API to the library, so that text searches can be performed in the document.
Features of the API could be:
search (iterate through results / search direction)
search & replace
case (in)sensitive search
regular expression search
mark a selection
search in PDF bookmarks
add bookmarks to search results
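To make a few of these features concrete, here is a minimal sketch of plain, case-insensitive, regular-expression, and directional search plus search & replace, working on per-page text. It is not part of pdf2htmlEX: the names `Match`, `search`, and `replace` are hypothetical, and it assumes the text of each page is already available as a list of strings.

```python
import re
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Match:
    page: int   # 1-based page number
    start: int  # offset of the match within the page text
    end: int    # offset one past the last matched character


def _compile(query: str, ignore_case: bool, regex: bool) -> "re.Pattern[str]":
    # A plain query is escaped so that characters like "." match literally.
    source = query if regex else re.escape(query)
    return re.compile(source, re.IGNORECASE if ignore_case else 0)


def search(pages: List[str], query: str, *, ignore_case: bool = False,
           regex: bool = False, backwards: bool = False) -> Iterator[Match]:
    """Return an iterator over all matches, in reading order or reversed."""
    pattern = _compile(query, ignore_case, regex)
    hits = [Match(number, m.start(), m.end())
            for number, text in enumerate(pages, start=1)
            for m in pattern.finditer(text)]
    return iter(reversed(hits) if backwards else hits)


def replace(pages: List[str], query: str, replacement: str, *,
            ignore_case: bool = False, regex: bool = False) -> List[str]:
    """Return new page texts with every match replaced.

    The replacement is always inserted literally (no group references).
    """
    pattern = _compile(query, ignore_case, regex)
    return [pattern.sub(lambda _m: replacement, text) for text in pages]
```

A caller could step through results with `next()` on the returned iterator, passing `backwards=True` to walk from the last hit to the first. Marking a selection and bookmark handling depend on the rendered HTML and are left out of this sketch.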
Once this API works, a next step could be to implement a GUI that makes use of it. I will open a separate issue for that.
A possible solution would be either to search the text nodes in the DOM and highlight the matches, or to generate an inverted index to search against (using https://github.com/fagbokforlaget/pdfiijs, or pdftotext with its output fed into an indexing system).
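The inverted-index option can be sketched roughly as follows, assuming the input is raw pdftotext output in which pages are separated by form-feed characters (its default behaviour). The helper names `split_pages`, `build_index`, and `lookup` are made up for illustration, and this is not how pdfiijs itself is implemented; there is no stemming or ranking here.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def split_pages(pdftotext_output: str) -> List[str]:
    """Split raw pdftotext output into pages on form-feed characters."""
    pages = pdftotext_output.split("\f")
    # A trailing form feed leaves an empty last entry; drop it.
    if pages and not pages[-1].strip():
        pages.pop()
    return pages


def build_index(pages: List[str]) -> Dict[str, Set[int]]:
    """Map each lower-cased word to the set of 1-based pages containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for number, text in enumerate(pages, start=1):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(number)
    return dict(index)


def lookup(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return the sorted pages that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))
```

The index only answers "which pages contain these words"; highlighting the hits in the rendered page would still need a pass over the DOM text nodes of those pages.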
@iapain the library you're proposing sounds great, especially since I've got both a PDF file and its pdftotext output. Does snowball-js support the following use case?
I've got fragments from the pdftotext output that I would like to show/mark in the original PDF with its original markup. It would be awesome if I could use pdf2htmlEX for this, so that the markup from the PDF is preserved.
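As a rough illustration of that use case, the sketch below finds where a fragment occurs in per-page text while ignoring differences in whitespace and line breaks. The function name `locate_fragment` is hypothetical, and the sketch assumes the page text extracted from the HTML output contains the same words in the same order as the pdftotext fragment, which may not hold for every PDF. Mapping the returned offsets back to elements in the HTML is not covered.

```python
import re
from typing import List, Optional, Tuple


def locate_fragment(pages: List[str],
                    fragment: str) -> Optional[Tuple[int, int, int]]:
    """Return (page, start, end) of the first occurrence of the fragment.

    Any run of whitespace in the fragment matches any run of whitespace
    in the page text, so differing line breaks do not prevent a match.
    """
    words = fragment.split()
    if not words:
        return None
    pattern = re.compile(r"\s+".join(re.escape(word) for word in words))
    for number, text in enumerate(pages, start=1):
        found = pattern.search(text)
        if found:
            return (number, found.start(), found.end())
    return None
```

The offsets refer to the original page text, so `pages[page - 1][start:end]` gives back the matched span with its original spacing.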
I've been digging through the changelog, release notes, and blogspot posts, and found out that it is possible to search the output and to compare the HTML as diffs.
Can you elaborate a bit more on those features? I could not find any documentation about them.