Unable to extract text from pdf using pdfplumber #1268
OIDeveloper started this conversation in Ask for help with specific PDFs
Replies: 1 comment
-
Hi @OIDeveloper, the (cid:#) texts appear because the PDF is missing a mapping from the font's character codes to Unicode, so the extractor can only report the raw character IDs. If you're curious, you can read more here:
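For anyone who wants to check how badly a file is affected, here is a minimal sketch that measures how much of the extracted text consists of `(cid:N)` placeholders. The helper names `cid_ratio` and `pages_with_cid_placeholders` and the `0.2` threshold are made up for illustration and are not part of pdfplumber. Only `pdfplumber.open`, `pdf.pages`, and `page.extract_text()` are library calls.

```python
import re
from typing import List

# Matches the placeholders emitted when a glyph has no Unicode mapping,
# e.g. "(cid:12)".
CID_TOKEN = re.compile(r"\(cid:\d+\)")


def cid_ratio(text: str) -> float:
    """Return the fraction of characters in `text` that are (cid:N) placeholders."""
    if not text:
        return 0.0
    cid_chars = sum(len(match.group(0)) for match in CID_TOKEN.finditer(text))
    return cid_chars / len(text)


def pages_with_cid_placeholders(pdf_path: str, threshold: float = 0.2) -> List[int]:
    """Return 1-based page numbers whose extracted text is mostly placeholders.

    The threshold is an arbitrary illustrative value, not a pdfplumber default.
    """
    import pdfplumber  # imported here so cid_ratio() works without pdfplumber installed

    flagged = []
    with pdfplumber.open(pdf_path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            text = page.extract_text() or ""
            if cid_ratio(text) >= threshold:
                flagged.append(number)
    return flagged
```

Calling `pages_with_cid_placeholders("example.pdf")` would list the pages where text extraction is unreliable. A common workaround for such pages is to render them to images and run OCR, though that is an assumption on my part and not something confirmed in this thread.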
-
Hi Team,
We are encountering an issue while processing the attached PDF using pdfplumber. Instead of extracting the actual text, we are receiving "cid" values. Could you please advise on how we can resolve this issue?
I am attaching the PDF along with the text extracted through pdfplumber.
example.pdf