pcre2posix: additional updates for recent changes (#338)
* pcre2posix: make code warning free and update ChangeLog

Somehow the previous fix was not amended to include this code change,
so take the opportunity to update ChangeLog and do other cleanup to
make this at least worth a PR.

Those found responsible have been sacked

* pcre2posix: fix crash on recent regerror code

Since 0710ce2 (pcre2posix: avoid snprintf quirks in regerror (#333),
2023-11-15), a call to snprintf was replaced by a pair of strncpy
and buf[errbuf_size - 1] = 0, but that change didn't account for the
case where errbuf_size == 0.

Make the code conditional to mimic the original logic and avoid
crashing.
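
As a minimal sketch of the guarded copy described above (the helper name
copy_message is hypothetical and does not exist in pcre2posix.c; the body
mirrors the logic of the fix rather than reproducing regerror itself):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only, not part of pcre2posix.c.
   Copies message into errbuf, truncating and NUL-terminating when the
   buffer is too small, and never touches errbuf when errbuf_size is 0.
   Returns the full length of message, excluding the terminating NUL. */
static size_t copy_message(char *errbuf, size_t errbuf_size,
  const char *message)
{
size_t len = strlen(message);

if (errbuf_size != 0)
  {
  strncpy(errbuf, message, errbuf_size);
  /* strncpy does not terminate on truncation, so do it by hand.
     Without the errbuf_size != 0 guard, errbuf_size - 1 would wrap
     around to SIZE_MAX and write far outside the buffer. */
  if (errbuf_size <= len) errbuf[errbuf_size - 1] = '\0';
  }
return len;
}
```

With a zero-sized buffer the helper only reports the message length, so a
caller can pass errbuf_size == 0 to learn how much room is needed.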

* doc: document that the POSIX interface is not POSIX compatible

POSIX 1003.1-2008 requires that regoff_t be at least as large as
ssize_t or ptrdiff_t, but we use int, so any match offset is
restricted to what an int can hold, even on 64-bit architectures.
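
To illustrate the limit, here is a rough sketch of a check a caller might
run before handing a subject to the POSIX wrapper. The helper name
fits_in_int_offset is an assumption made for this example and is not part
of PCRE2:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper for illustration only, not part of PCRE2.
   Returns 1 if every offset into a subject of the given length can be
   represented in an int (the type used for regoff_t here), else 0. */
static int fits_in_int_offset(size_t subject_length)
{
return subject_length <= (size_t)INT_MAX;
}
```

On platforms with a 32-bit int this rejects subjects longer than INT_MAX
bytes, which is where the 2GB limit mentioned in the documentation comes
from.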
carenas authored Nov 16, 2023
1 parent 13a933e commit c06a4b8
Showing 4 changed files with 23 additions and 11 deletions.
11 changes: 7 additions & 4 deletions ChangeLog
@@ -142,16 +142,19 @@ above because \b and \B are defined in terms of \w.
option, and (?aP) also sets (?aT) so that (?-aP) disables all ASCII
restrictions on POSIX classes.

37. If PCRE2_FIRSTLINE was set on an anchored pattern, pcre2_match() and
pcre2_dfa_match() misbehaved. PCRE2_FIRSTLINE is now ignored for anchored
37. If PCRE2_FIRSTLINE was set on an anchored pattern, pcre2_match() and
pcre2_dfa_match() misbehaved. PCRE2_FIRSTLINE is now ignored for anchored
patterns.

38. Add a test for ridiculous ovector offset values to the substring extraction
38. Add a test for ridiculous ovector offset values to the substring extraction
functions.

39. Make OP_REVERSE use IMM2_SIZE for its data instead of LINK_SIZE, for
39. Make OP_REVERSE use IMM2_SIZE for its data instead of LINK_SIZE, for
consistency with OP_VREVERSE.

40. In some legacy environments with a pre C99 snprintf, pcre2_regerror could
return an incorrect value when the provided buffer was too small.


Version 10.42 11-December-2022
------------------------------
2 changes: 1 addition & 1 deletion HACKING
@@ -742,7 +742,7 @@ different (but fixed) length.
Variable-length backward assertions whose maximum matching length is limited
are also supported. For such assertions, the first opcode inside each branch is
OP_VREVERSE, followed by the minimum and maximum lengths for that branch,
unless these happen to be equal, in which case OP_REVERSE is used. These
unless these happen to be equal, in which case OP_REVERSE is used. These
IMM2_SIZE values occupy two code units each in 8-bit mode, and 1 code unit in
16/32 bit modes.

9 changes: 7 additions & 2 deletions doc/pcre2posix.3
@@ -32,7 +32,8 @@ documentation for a description of PCRE2's native API, which contains much
additional functionality.
.P
\fBIMPORTANT NOTE\fP: The functions described here are NOT thread-safe, and
should not be used in multi-threaded applications. Use the native API instead.
should not be used in multi-threaded applications. They are also limited to
processing subjects that are not bigger than 2GB. Use the native API instead.
.P
These functions are wrapper functions that ultimately call the PCRE2 native
API. Their prototypes are defined in the \fBpcre2posix.h\fP header file, and
@@ -74,7 +75,7 @@ captured substrings. It also defines some constants whose names start with
.sp
Note that these functions are just POSIX-style wrappers for PCRE2's native API.
They do not give POSIX regular expression behaviour, and they are not
thread-safe.
thread-safe or even POSIX compatible.
.P
Those POSIX option bits that can reasonably be mapped to PCRE2 native options
have been implemented. In addition, the option REG_EXTENDED is defined with the
@@ -298,6 +299,10 @@ entire portion of \fIstring\fP that was matched; subsequent elements relate to
the capturing subpatterns of the regular expression. Unused entries in the
array have both structure members set to -1.
.P
\fIregmatch_t\fP as well as the \fIregoff_t\fP typedef it uses are defined in
\fBpcre2posix.h\fP and are not warranted to have the same size or layout as other
similarly named types from other libraries that provide POSIX-style matching.
.P
A successful match yields a zero return; various error codes are defined in the
header file, of which REG_NOMATCH is the "expected" failure code.
.
12 changes: 8 additions & 4 deletions src/pcre2posix.c
@@ -168,7 +168,7 @@ static int message_len(const char *message, int offset)
char buf[12];

/* 11 magic number comes from the format below */
return strlen(message) + 11 + snprintf(buf, sizeof(buf), "%d", offset);
return (int)strlen(message) + 11 + snprintf(buf, sizeof(buf), "%d", offset);
}

/*************************************************
@@ -198,9 +198,13 @@ if (preg != NULL && (int)preg->re_erroffset != -1)
}
else
{
ret = len = strlen(message);
strncpy(errbuf, message, errbuf_size);
if (errbuf_size <= len) errbuf[errbuf_size - 1] = '\0';
len = strlen(message);
if (errbuf_size != 0)
{
strncpy(errbuf, message, errbuf_size);
if (errbuf_size <= len) errbuf[errbuf_size - 1] = '\0';
}
ret = (int)len;
}

do {