First of all, thanks for this great functionality: it simplifies using tesseract from R and makes it possible to download the language files with a single line of R code. This is powerful!
It would also be really nice if an OCR'ed document could be output in hOCR format or as a searchable PDF directly. This would make the package even simpler to use for people (like me) who don't have the skills to configure or override the settings in tesseract.
With an additional parameter "output" to the ocr function that could be one of {"text", "hocr" or "pdf"} it could look like this:
out <- ocr("test.tif", engine = tesseract("swe"), output = "hocr")
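To make the idea a bit more concrete, here is a rough sketch of how such an argument could be validated and dispatched. `ocr_with_output` is a hypothetical wrapper name used only for illustration; it is not part of the package. Only the "text" branch calls the existing `ocr()` function, and the other two branches are placeholders for the requested feature.

```r
# Hypothetical wrapper, NOT part of the tesseract package: it only illustrates
# how the proposed `output` argument could be validated and dispatched.
ocr_with_output <- function(image,
                            engine = tesseract::tesseract("eng"),
                            output = c("text", "hocr", "pdf")) {
  # match.arg() picks "text" by default and rejects unknown values
  output <- match.arg(output)
  switch(output,
    # existing behaviour: plain text from the package's ocr() function
    text = tesseract::ocr(image, engine = engine),
    # placeholders for the output formats requested in this issue
    hocr = stop("hOCR output is not implemented in this sketch"),
    pdf  = stop("PDF output is not implemented in this sketch")
  )
}
```

With this shape, the default call behaves like today's `ocr()`, and an unsupported value such as `output = "docx"` fails early with a clear error from `match.arg()`.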
I think this would make this R package very strong in terms of how widely it could be used.
Again, thanks for the great work!