Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix integer overflow when parsing Perl-extended named backrefs #246

Merged
merged 1 commit into from
Mar 4, 2025

Conversation

cmazakas
Copy link
Member

This issue comes to us as: https://issues.oss-fuzz.com/issues/42512790

Note that Perl seems to reject the regex better than us, flagging \k-- as invalid in the first place.

We seem to treat \g and \k as literally the same, while Perl doesn't. Perl rejects \g-- as unterminated token.

Our code seems to parse \k- appropriately, with the assumption that it's a valid synonym for \g, but then we run into overflow issues when parsing the number generated by the fuzzer which is LONG_MIN.

I'm not going to change the behavior of \k here but instead I simply just tried my hand at squashing the overflows.

@cmazakas cmazakas force-pushed the cve-42512790 branch 2 times, most recently from 54f580b to 7485c7c Compare February 27, 2025 17:48
auto int_min = (std::numeric_limits<std::intmax_t>::min)();
auto int_max = (std::numeric_limits<std::intmax_t>::max)();

if ((i < 0) && (mark_count < int_min - i)) { i = -1; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct: i < 0 implies an error occurred when parsing the number (negative indexes are always invalid right?), plus mark_count comes from an unsigned value, so is always > 0. Also if you debug the first test case in the new tests, then i ends up being set to 2, and only fails later "by chance" because there are no subsequent marked sub-expressions.

So I would be inclined to remove this line, and instead put the subsequent overflow checks inside a if(i >=0) block. Actually, I'm not sure what should happen if i = 0 either, but it's not handled at all at present, so I'm assuming that's definitely wrong ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, in this case, a negative i should always be invalid because we do the negative checking up-front.

Because the fuzzer found the case \k--1234 (in essence). We correctly parse the first -, then we feed -1234 into .toi() so yeah, we get back a -1234 from this.

Maybe we should just remove these checks or at least, only handle the positive case here because we can still feed \k-LONG_MAX or \kLONG_MAX into the regex parser here.

So what we can do is...

         std::intmax_t i = this->m_traits.toi(pc, m_end, 10);
         if((i < 0) && syn_end)
         {
           // ...
         }
         if((i < 0) && !syn_end) { return fail(....); }
         if(negative) { /* insert checked arithmetic here, if we still need it */ }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Traits::toi only handles unsigned numbers, so toi sees the '-' and immediately fails with a -1 result. Any negative result is an error here. The next if((i < 0) && syn_end) branch is a "we have an error looking for a number, lets look for a name instead", so we leave that alone in case a name starts with a literal '-' (I don't think they can, but that's another issue). So after a bit of experimenting I think we can simplify right down to:

         if(negative && (i > 0))
         {
            auto mark_count = static_cast<std::intmax_t>(m_mark_count);

            i = (mark_count - i) + 1;  // no overflow since both mark_count and i are positive
         }

There can be no overflow as long as (mark_count - i) is evaluated first since:

INT_MIN <= mark_count - i < INT_MAX

So both the subtraction, and then the adding of 1 are safe.

Note, only tested on Windows msvc/gcc.

@jzmaddock jzmaddock merged commit f851a08 into boostorg:develop Mar 4, 2025
51 checks passed
@cmazakas cmazakas deleted the cve-42512790 branch March 4, 2025 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants