Linking in Oniguruma (-lonig) breaks POSIX regex #233
Comments
I am sorry for any inconvenience this may cause.
#include <regex.h>
#include <string.h>
#include <stdio.h>
#include <dlfcn.h>

/* MacOS X */
#define POSIX_REGEX_PATH "libSystem.B.dylib"

typedef int regcomp_type(regex_t* preg, const char* pattern, int cflags);
typedef int regexec_type(const regex_t* preg, const char* string,
                         size_t nmatch, regmatch_t pmatch[], int eflags);
typedef void regfree_type(regex_t *preg);

int main(void)
{
    void* handle;
    regcomp_type* regcomp_func;
    regexec_type* regexec_func;
    regfree_type* regfree_func;
    regmatch_t pmatch[1];
    regex_t regex = {0};
    const char str[] = "1234:abcd\n";

    /* Get a handle to the already-loaded system library without loading it again. */
    if ((handle = dlopen(POSIX_REGEX_PATH, RTLD_NOLOAD)) == NULL) {
        char* e = dlerror();
        fprintf(stderr, "ERROR: dlopen: %s\n", e);
        return -1;
    }

    regcomp_func = dlsym(handle, "regcomp");
    if (regcomp_func == NULL) {
        char* e = dlerror();
        fprintf(stderr, "ERROR: %s\n", e);
        return -1;
    }

    /* If the global regcomp is the system one, nothing has overridden it. */
    if (regcomp_func == regcomp) {
        fprintf(stdout, "Use POSIX regex as is.\n");
        regexec_func = regexec;
        regfree_func = regfree;
    }
    else {
        /* Otherwise take regexec/regfree from the system library as well. */
        fprintf(stdout, "Escape Oniguruma regex.\n");
        regexec_func = dlsym(handle, "regexec");
        if (regexec_func == NULL) {
            char* e = dlerror();
            fprintf(stderr, "ERROR: %s\n", e);
            return -1;
        }
        regfree_func = dlsym(handle, "regfree");
        if (regfree_func == NULL) {
            char* e = dlerror();
            fprintf(stderr, "ERROR: %s\n", e);
            return -1;
        }
    }

    /* BRE pattern (cflags == 0): a run of 1 to 20 digits. */
    if (regcomp_func(&regex, "[0-9]\\{1,20\\}", 0) != 0) {
        fprintf(stderr, "ERROR: regcomp failed.\n");
        return -1;
    }

    if (regexec_func(&regex, str, 1, pmatch, 0) == 0) {
        char buf[64];
        size_t len = (size_t)(pmatch[0].rm_eo - pmatch[0].rm_so);
        fprintf(stdout, "Matched len: %zu.\n", len);
        memcpy(buf, str + pmatch[0].rm_so, len);
        buf[len] = '\0';
        fprintf(stdout, "Matched str: '%s'.\n", buf);
    } else {
        fprintf(stderr, "Failed to match.\n");
    }

    regfree_func(&regex);
    return 0;
}
It looks like the regmatch_t/regoff_t types you got from libc's definitions don't match the ones Oniguruma is using, so accessing the pmatch members invokes undefined behavior. I'd suggest using the pmatch-less test you developed to check whether an old Oniguruma is loaded, and then doing one of the following two things:
Another alternative is to always use the onig_ prefixed version and types (using the Oniguruma header rather than regex.h) and make Oniguruma a hard dependency.
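A pmatch-less check of the kind mentioned above could look roughly like the following. This is only a sketch: the helper name posix_ere_works is hypothetical, the pattern and test strings are arbitrary, and it assumes that a broken build fails to compile or match a simple ERE. It uses REG_NOSUB so that no regmatch_t values are ever read.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of any library): returns 1 if the
 * regcomp()/regexec() symbols visible to this object handle a simple
 * ERE as POSIX requires, 0 otherwise. No regmatch_t is touched.
 */
int posix_ere_works(void)
{
    regex_t re;
    int ok;

    /* A simple ERE; REG_NOSUB asks for a yes/no answer only. */
    if (regcomp(&re, "^[0-9]+:[a-z]+$", REG_EXTENDED | REG_NOSUB) != 0)
        return 0;

    /* Expect a match on the first input and no match on the second. */
    ok = regexec(&re, "1234:abcd", 0, NULL, 0) == 0 &&
         regexec(&re, "abcd:1234", 0, NULL, 0) == REG_NOMATCH;

    regfree(&re);
    return ok;
}
```

A shared library could call this once at load time and switch to a fallback path when it returns 0. It is a heuristic for misbehaving regex functions, not a proof that Oniguruma is the cause.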
Thank you both for the advice and suggestions. In our case, since we're doing some pretty basic pattern matching on small strings, the workaround we ended up going with is to match using BRE and
So just to summarize for anyone else who maintains a shared lib running into this issue, some workaround options are:
Thanks again for the help!
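For readers who want to see what a BRE-only match can look like in code, here is a minimal sketch. It reuses the BRE pattern from the test program earlier in this thread; the helper name has_digit_run and the use of REG_NOSUB (no match offsets requested) are assumptions made for illustration, not necessarily what was actually shipped.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if s contains a run of 1 to 20 digits,
 * 0 if it does not, and -1 if the pattern fails to compile.
 */
int has_digit_run(const char *s)
{
    regex_t re;
    int rc;

    /* No REG_EXTENDED, so the pattern is a BRE; \{m,n\} is the BRE
     * interval syntax. REG_NOSUB means pmatch is never filled in. */
    if (regcomp(&re, "[0-9]\\{1,20\\}", REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Because only a yes/no result is requested, this avoids reading regmatch_t at all, which is where the type mismatch discussed above would bite.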
This issue relates to #210 and it has been fixed since 6.9.5-rc1. But I'm a bit stuck on how to accommodate older versions of Oniguruma from the perspective of shared libraries.

The core issue is that Oniguruma <= 6.9.4 will break POSIX regex when it is linked in (-lonig). For example:

The bug does not seem to affect BRE, so if I change the syntax to BRE and remove REG_EXTENDED, the above code will work as expected when linking in Oniguruma. However, BRE returns unexpected results when using pmatch.

In my case, I maintain a PHP extension that is always built as a shared library and is dynamically linked into a build of PHP that is often built with -lonig since it was unbundled from ext/mbstring in PHP 7.4. The part of the extension I'm working on is decoupled from the Zend Engine so there is no way to detect if Oniguruma has been loaded or not. I can still detect "broken regex behavior", but I don't know what the workaround for this issue would be.

Do you have any advice for shared libraries to get POSIX regex working when they are dynamically loaded into a build with -lonig?