Fix content data stream concatenation mangling output in cfFilterPDFToPDF #56

sergio-gdr · 2024-06-11T16:25:49Z

Running the following command on example.pdf
./pdftopdf 1 1 1 1 "" example.pdf > output.pdf

produces a mangled output.pdf.

This is better explained in this bug report.

It seems that the logic that collects a page's content data streams and concatenates them into a single stream in an XObject assumes that successive streams are already correctly separated, i.e. that each stream ends in whitespace.
To illustrate: in example.pdf above (note that the sample PDFs have been run through qpdf --qdf --recompress-flate --compress-streams=n --object-streams=disable for illustration purposes), page 1's contents are

 /Contents [9 0 R 11 0 R 13 0 R 15 0 R]

and looking at objects 9-11:

%% Contents for page 1
%% object 9
9 0 obj
<</Length 10 0 R>>
stream
q
endstream
endobj

%% object 10
10 0 obj
1
endobj

%% object 11
11 0 obj
<</Length 12 0 R>>
stream
q 0.1 0 0 0.1 0 0 cm
%% only the beginning of object 11 shown

This results in the following in output.pdf above:

%% resultant XObject. irrelevant info excluded
11 0 obj
<</Subtype /Form /Type /XObject /Length 12 0 R>>
stream
qq 0.1 0 0 0.1 0 0 cm

so the concatenation of streams 9 and 11 results in the (invalid) operator 'qq', which confuses PDF parsers and mangles the output.
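The failure can be reproduced outside of the filter with a minimal sketch. This is not the code in qpdf-xobject.cxx; the function names splitTokens and concatNaive are hypothetical, and the tokenizer is deliberately simplified (it splits on PDF whitespace only and ignores delimiters, strings, and comments). It only shows that joining the payloads of streams 9 and 11 back to back fuses the two 'q' operators into one token.

```cpp
#include <string>
#include <vector>

// Simplified tokenizer (assumption: whitespace-only splitting is enough
// for this example). A real content stream parser also handles
// delimiters, strings, and comments.
std::vector<std::string> splitTokens(const std::string &data)
{
  std::vector<std::string> tokens;
  std::string current;
  for (char c : data)
  {
    // The six PDF whitespace characters: NUL, HT, LF, FF, CR, SP
    bool is_space = (c == '\0' || c == '\t' || c == '\n' ||
                     c == '\f' || c == '\r' || c == ' ');
    if (is_space)
    {
      if (!current.empty())
      {
        tokens.push_back(current);
        current.clear();
      }
    }
    else
      current += c;
  }
  if (!current.empty())
    tokens.push_back(current);
  return tokens;
}

// Naive concatenation: stream payloads are joined back to back,
// with nothing inserted between them.
std::string concatNaive(const std::vector<std::string> &streams)
{
  std::string out;
  for (const std::string &s : streams)
    out += s;
  return out;
}
```

Feeding it the one-byte payload of object 9 ("q") followed by the start of object 11 ("q 0.1 0 0 0.1 0 0 cm") gives "qq" as the first token instead of two separate "q" operators.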

With the patch applied, the output becomes

%% resultant XObject. irrelevant info excluded
11 0 obj
<</Subtype /Form /Type /XObject /Length 12 0 R>>
stream
q
q 0.1 0 0 0.1 0 0 cm
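The idea behind the patch can be sketched as follows. This is an illustration under assumptions, not the actual change to qpdf-xobject.cxx: the function name concatWithNewlines is hypothetical, and the stream payloads are modelled as plain strings rather than qpdf buffers.

```cpp
#include <string>
#include <vector>

// Join content stream payloads, appending a newline after each one so
// that the last token of one stream can never fuse with the first token
// of the next.
std::string concatWithNewlines(const std::vector<std::string> &streams)
{
  std::string out;
  for (const std::string &s : streams)
  {
    out += s;
    out += '\n';
  }
  return out;
}
```

With the payloads "q" and "q 0.1 0 0 0.1 0 0 cm", the result starts with "q", a newline, and then "q 0.1 0 0 0.1 0 0 cm", matching the output shown above. The newline is added unconditionally, which is harmless when a stream already ends in whitespace, since extra whitespace between tokens does not change the meaning of a content stream.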

I'm not sure this is the best solution to the problem, but hopefully the analysis at least points toward one.

…eStreamData

When concatenating the data streams for the page's contents, add a new line at the end of each data stream to avoid cases where the concatenation might result in a corruption.

Eg (extracted from a real pdf):

%% Contents for page 1
%% Stream 1
9 0 obj
<< /Length 10 0 R >>
stream
q
endstream
endobj

10 0 obj
1
endobj

%% Stream 2
11 0 obj
<< /Length 12 0 R >>
stream
q 0.1 0 0 0.1 0 0 cm

the output pdf results in

qq 0.1 0 0 0.1 0 0 cm

with the effect that 'qq' is not being parsed correctly, effectively mangling the contents.

Signed-off-by: Sergio Gómez <[email protected]>

sergio-gdr and others added 2 commits June 11, 2024 10:44

qpdf-xobject.cxx: Correct coding style

f074ad2

tillkamppeter merged commit dd698ec into OpenPrinting:master Jun 11, 2024

sergio-gdr mentioned this pull request Jun 24, 2024

Backport data stream concatenation fix from libcupsfilters OpenPrinting/cups-filters#587

Merged

tillkamppeter mentioned this pull request Dec 11, 2024

pdftopdf scaling regression in 1.28.x and 2.0.0 OpenPrinting/cups-filters#549

Closed
