-
Notifications
You must be signed in to change notification settings - Fork 452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
incorrect parsing of regex? #280
Comments
Regulex seems to confirm my understanding of this pattern, unless I'm misunderstanding the diagram too. |
Gah. This is probably a bug in the literal optimizer being too aggressive. I'll check it out tomorrow. |
No worries, I would appreciate it. |
When a pattern is partially anchored and literal prefixes are detected, we would scan for those prefixes like normal. On a match, we'd verify whether the text starting at that prefix matched the rest of the regex. This process doesn't work that well for partially anchored regexes, since the prefix match presupposes that the regex is allowed to match at that position. But if, say, a regex like `^z|a` finds a `z` in the middle of the string, then it has lost the fact that `z` needs to appear at the beginning of the string, and can therefore falsely report a match. We could spend some effort and make this case work, but the literal optimizer is already too complex. We need to simplify it to make future optimizations like this possible. Fixes rust-lang#280.
@blaenk Interestingly, running In any case, the underlying cause was indeed an aggressive literal optimization. #281 should fix it! |
Ahhh. Thank you very much for fixing this! |
No rush but do you have any idea when a new release will be out on crates.io with this fix? Or is there some set release schedule? |
It's uploading. :-) |
Thank you very much once again!! |
No problem. |
I hadn't worked with rust in months so I made major updates of both rustc and my dependencies (including regex). I noticed that my program was now failing in a weird way and I finally tracked it down to this bug (regression?), either that or I'm doing something very wrong here:
That assertion is failing even though it shouldn't AFAIK. It certainly didn't before.
The regex is supposed to match if the string begins with
.
or ends with~
. Even without the groupings/capture groups it fails. Somehow it is matching on"favicon.png"
.I can't imagine something like this got through though so I'm thinking it may be a misunderstanding I have of how that regex should work. I tried this just now in the chrome console with javascript and I don't see this problem there, though I am aware that regex implementations tend to differ, I wasn't aware that they differ in something like this.
The text was updated successfully, but these errors were encountered: