RegexOptions.NonBacktracking isn't always matching semantics of backtracking engines #65607

Closed
stephentoub opened this issue Feb 19, 2022 · 2 comments

Comments

@stephentoub
Member

stephentoub commented Feb 19, 2022

We recently tried to instill the NonBacktracking engine with the ability to match the semantics of the backtracking engine, in terms of returning the same matches it would under the same circumstances. We've missed some circumstances, though. Here's an example from @olsaarik:

var r = new Regex(".{4}x|ab");
var r2 = new Regex(".{4}x|ab", RegexOptions.NonBacktracking);
Assert.Equal(r.Match("aabax").Value, r2.Match("aabax").Value); // fails: the backtracking engine returns "aabax"

"the problem here is that the first phase stops when it matches on ab, then the reverse second phase extends the match backwards from there, but since there's no x it can't go all the way back to the beginning and then the third phase will start from the wrong point"
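To make the quoted explanation concrete, here is a rough sketch that emulates the three phases with brute-force span checks on top of the ordinary backtracking engine. It is not the NonBacktracking implementation; the `MatchesSpan` helper and the loop structure are assumptions made purely for illustration, and the sketch assumes the input contains at least one match.

```csharp
using System;
using System.Text.RegularExpressions;

const string Pattern = ".{4}x|ab";
const string Input = "aabax";

// Hypothetical helper: true when the pattern matches exactly Input[spanStart..spanEnd).
var whole = new Regex(@"\A(?:" + Pattern + @")\z");
bool MatchesSpan(int spanStart, int spanEnd) =>
    whole.IsMatch(Input.Substring(spanStart, spanEnd - spanStart));

// Phase 1 (forward): earliest end position of any match. Here "ab" ends at index 3.
int end = -1;
for (int e = 0; e <= Input.Length && end < 0; e++)
    for (int s = 0; s <= e; s++)
        if (MatchesSpan(s, e)) { end = e; break; }

// Phase 2 (reverse): earliest start that can reach that end.
// "aab" is not a match (no x), so the start stays at index 1 instead of 0.
int start = -1;
for (int s = 0; s <= end; s++)
    if (MatchesSpan(s, end)) { start = s; break; }

// Phase 3 (forward): match anchored at the start found in phase 2.
Match third = new Regex(@"\G(?:" + Pattern + ")").Match(Input, start);

// What the backtracking engine returns for the same input.
Match expected = new Regex(Pattern).Match(Input);

Console.WriteLine($"emulated three phases: {third.Index}: {third.Value}");   // 1: ab
Console.WriteLine($"backtracking engine:   {expected.Index}: {expected.Value}"); // 0: aabax
```

Under these assumptions the emulation commits to index 1 after phase 2, so phase 3 can never find the ".{4}x" alternative that the backtracking engine matches starting at index 0.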

@ghost

ghost commented Feb 19, 2022

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: stephentoub
Assignees: -
Labels:

area-System.Text.RegularExpressions

Milestone: 7.0.0

dotnet-issue-labeler bot added the untriaged label (New issue has not been triaged by the area owner) Feb 19, 2022
jeffschwMSFT removed the untriaged label (New issue has not been triaged by the area owner) Mar 28, 2022
@stephentoub
Member Author

@olsaarik, this is fixed by #68199 and can be closed, yes?

@ghost ghost locked as resolved and limited conversation to collaborators May 20, 2022