Fix parsing of larger streams.
Using ".{$length}" in the regular expression causes an error while
compiling the regular expression if the declared length of the stream is
over 65534 bytes: "Quantifier in {,} bigger than 65534 in regex"

This is a limitation of Perl's regular expression engine. Use a direct
substring operation instead, so that large streams can be parsed
successfully.
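
The limitation described above can be reproduced in isolation. The following is a minimal sketch, not code from this commit: the 70,000-byte payload and the variable names are made up for illustration, and the exact quantifier limit depends on how the local perl was built, so the script only reports whether the pattern compiled rather than assuming it failed.

```perl
use strict;
use warnings;

# Hypothetical data: a payload longer than the quantifier limit named in
# the commit message, followed by an "endstream" terminator.
my $length = 70_000;
my $data   = ("x" x $length) . "\nendstream\n";

# The pattern is compiled at run time because $length is interpolated.
# eval traps the exception that perl throws if the quantifier is too big.
my $matched     = eval { $data =~ /\A(.{$length})\s*endstream/s };
my $regex_error = $@;

# substr has no such limit: take exactly $length bytes, then check
# that the terminator follows.
my $payload = substr($data, 0, $length);
my $rest    = substr($data, $length);
my $ok      = $rest =~ /\A\s*endstream\s/ ? 1 : 0;

print "regex raised an error: ", ($regex_error ? "yes" : "no"), "\n";
print "substr length: ", length($payload), ", terminator found: $ok\n";
```

The substring route also avoids asking the regex engine to walk the whole payload just to copy a span whose length is already known.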
deven committed Jun 23, 2022
1 parent 639b28b commit 043f98f
Showing 1 changed file with 17 additions and 5 deletions.
22 changes: 17 additions & 5 deletions lib/PDF/Data.pm
@@ -819,12 +819,24 @@ sub parse_objects {
 defined(my $length = $stream->{Length})
     or warn join(": ", $self->{-file} || (), "Byte offset $offset: Object #$id: Stream length not found in metadata!\n");
 $length //= 0;
-s/\A\r?\n(.{$length})\s*endstream$ws//s
-    or s/\A\r?\n((?>(?:[^e]+|(?!endstream\s)e)*))\s*endstream$ws//s
-    or croak join(": ", $self->{-file} || (), "Byte offset $offset: Invalid stream definition!\n");
-$stream->{-data} = $1;
+s/\A\r?\n//;
+
+# If the declared stream length is missing or invalid, determine the shortest possible length to make the stream valid.
+unless (substr($_, $length) =~ /\A(\s*endstream$ws)/) {
+    if (/\A((?>(?:[^e]+|(?!endstream\s)e)*))\s*endstream$ws/) {
+        $length = length($1);
+    } else {
+        croak join(": ", $self->{-file} || (), "Byte offset $offset: Invalid stream definition!\n");
+    }
+}
+
+$stream->{-data} = substr($_, 0, $length);
 $stream->{-id} = $id;
-$stream->{Length} //= length $1;
+$stream->{Length} //= $length;
+
+$_ = substr($_, $length);
+s/\A\s*endstream$ws//;
+
 $self->filter_stream($stream) if $stream->{Filter};
 } elsif ($token eq "endobj") { # Indirect object definition: 999 0 obj ... endobj
 my ($id, $object) = splice @objects, -2;
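
The new extraction logic in the hunk above can be exercised on its own. This is a rough sketch under stated assumptions, not the module's code: `extract_stream` is a hypothetical helper, `$ws` is a simplified stand-in for the module's whitespace pattern (which this diff does not show), and the bounds check before `substr` is an addition so that an oversized declared length falls through to the fallback without a warning.

```perl
use strict;
use warnings;

# Assumption: simplified stand-in for the module's $ws pattern.
my $ws = qr/\s*/;

# Hypothetical helper mirroring the new logic.
# Takes the input that follows the "stream" keyword and the declared length;
# returns (stream data, remaining input).
sub extract_stream {
    my ($input, $length) = @_;
    $length //= 0;

    # Drop the line ending that follows the "stream" keyword.
    $input =~ s/\A\r?\n//;

    # Trust the declared length only if "endstream" follows the data.
    # (The bounds check is an addition of this sketch.)
    my $tail = $length <= length($input) ? substr($input, $length) : "";
    unless ($tail =~ /\A\s*endstream$ws/) {
        # Otherwise fall back to the shortest length that makes the stream valid.
        if ($input =~ /\A((?>(?:[^e]+|(?!endstream\s)e)*))\s*endstream$ws/) {
            $length = length($1);
        } else {
            die "Invalid stream definition!\n";
        }
    }

    my $data = substr($input, 0, $length);
    my $rest = substr($input, $length);
    $rest =~ s/\A\s*endstream$ws//;
    return ($data, $rest);
}

# Usage: a correct declared length, then a wrong one that triggers the fallback.
my ($data, $rest) = extract_stream("\nhello\nendstream\nendobj", 5);
my ($data2)       = extract_stream("\nhello\nendstream\nendobj", 2);
```

Because the fallback pattern stops only at the `endstream` keyword itself, the recovered data keeps any whitespace that precedes the keyword, so the second call returns the payload with its trailing newline.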