janvogt/repro-weasyprint-cmap-utf8

Invalid CMap table generated

This is a repro for a UTF-8 character that cannot be recovered from a PDF generated by WeasyPrint.

Steps for reproduction:

  1. weasyprint example.html example.pdf

  2. cp example.pdf example_fixed.pdf

  3. Use a hex editor to fix the CMap table of example_fixed.pdf by changing

<10006c5b> <6c5b>

to

<00006c5b> <6c5b>

  4. ./p2t.py

  5. See that the character has been recovered in example_fixed.txt but not in example.txt
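
Step 3 can also be scripted instead of done in a hex editor. The following is a minimal sketch, not part of this repository. It assumes the CMap stream is stored uncompressed, so that the entry appears literally in the file bytes. The names `patch_cmap_entry` and `patch_file` are hypothetical helpers introduced here for illustration. Because the replacement has the same length as the original entry, the byte offsets recorded elsewhere in the PDF are left untouched.

```python
from pathlib import Path

# The broken entry as written into the PDF, and the corrected entry from step 3.
BROKEN = b"<10006c5b> <6c5b>"
FIXED = b"<00006c5b> <6c5b>"


def patch_cmap_entry(data: bytes) -> bytes:
    """Return a copy of data with the broken CMap entry replaced.

    Raises ValueError if the entry is not found literally, which would be
    the case if the stream were compressed (an assumption of this sketch
    is that it is not).
    """
    if BROKEN not in data:
        raise ValueError("broken CMap entry not found in the given bytes")
    return data.replace(BROKEN, FIXED)


def patch_file(src: str, dst: str) -> None:
    """Read src, patch the CMap entry, and write the result to dst."""
    Path(dst).write_bytes(patch_cmap_entry(Path(src).read_bytes()))
```

Under these assumptions, calling `patch_file("example.pdf", "example_fixed.pdf")` would stand in for steps 2 and 3 together.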

Step 3 will no longer be necessary once Kozea/WeasyPrint#1571 (comment) is fixed.

Requirements

This requires the Python packages weasyprint (version 54.1) and pdftotext.

If you have a working nix setup, use the provided default.nix by calling

nix-shell

If you have direnv and nix, just use

direnv allow
