Heap-Buffer-Overflow in pcre2(src/pcre2test.c:2945:7 in utf82ord) #235
Comments
A somewhat minimized reproduction is (note that skipping UTF validation and providing an invalid subject is not supported, but that doesn't seem to be what triggers the bug):
but only when doing a partial match and is triggered by a bug in a function that is only used internally by
This is obviously just another instance of using no_utf_check with invalid UTF-8, which is documented to give undefined behaviour.
I am afraid it is actually a real bug in the utf82ord() function inside pcre2test.c, which needs updating to deal with invalid UTF (which, as you said, was not expected and normally shouldn't happen unless [...]).
I added a way to know the size of the buffer it is using, so that it can better detect when it has been passed invalid UTF and not read past its buffer to decode a UTF character that is not really there. There seem to be other issues with it that will need addressing independently, though, as shown in:
The partial match bug might already be there as well, as shown by:
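The idea of giving the decoder the size of its buffer, described in the comment above, can be sketched as follows. This is only an illustration and not the code in pcre2test.c: the function name `decode_utf8_bounded`, its signature, and its return convention are hypothetical, and it deliberately skips checks a full validator would need (overlong forms, surrogates, code points above U+10FFFF).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the pcre2test.c implementation): decode one UTF-8
   character starting at p without ever reading at or beyond end.
   Returns the number of bytes consumed and stores the code point in *out,
   or returns -1 if the sequence is invalid or truncated. */
static int decode_utf8_bounded(const uint8_t *p, const uint8_t *end,
                               uint32_t *out)
{
    uint32_t c;
    int extra;   /* number of continuation bytes the lead byte announces */
    int i;

    if (p >= end) return -1;              /* nothing left to read */
    c = *p;

    if (c < 0x80) { *out = c; return 1; } /* plain ASCII */
    else if ((c & 0xE0) == 0xC0) { extra = 1; c &= 0x1F; }
    else if ((c & 0xF0) == 0xE0) { extra = 2; c &= 0x0F; }
    else if ((c & 0xF8) == 0xF0) { extra = 3; c &= 0x07; }
    else return -1;                       /* stray continuation or bad lead byte */

    /* The sequence needs extra + 1 bytes; refuse to decode if the buffer
       ends before that, instead of reading past it. */
    if ((size_t)(end - p) <= (size_t)extra) return -1;

    for (i = 1; i <= extra; i++) {
        if ((p[i] & 0xC0) != 0x80) return -1;  /* not a continuation byte */
        c = (c << 6) | (p[i] & 0x3F);
    }

    *out = c;
    return extra + 1;
}
```

With a subject such as the bytes `X X 0x80`, a call positioned on the final byte returns -1 rather than trying to decode a character that is not there. A lead byte such as `0xE2` at the very end of the buffer is likewise rejected before any continuation byte is read.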
When match_invalid_utf is enabled, invalid UTF-8 data can't match, but it was mistakenly getting printed as part of a partial match even though the ovector correctly didn't include it, as shown by:

  PCRE2 version 10.34 2019-11-21
  re> /(?<=..)X/match_invalid_utf,allvector
  data> XX\x80\=ph,ovector=1
  Partial match: \x{80}
  ** ovector[1] is not equal to the subject length: 2 != 3
   0: 2 2

Fix the logic to print instead the empty match that was returned and, as a side effect, avoid a buffer overread when trying to decode UTF-8 that was missing code units.

Fixes: PCRE2Project#235
We found a heap-buffer-overflow in pcre2-10.43-DEV (src/pcre2test.c:2945:7 in utf82ord), which can also be reproduced on pcre2-10.42.
Command Input
pcre2test -d poc_file /dev/null
The poc_file is attached.
Sanitizer Dump
Environment
We built pcre2 with AddressSanitizer (ASan).
poc_file.zip