Inconsistent matching with repeated backreferences and match_unset_backref #335

Closed
addisoncrump opened this issue Nov 14, 2023 · 1 comment

addisoncrump (Contributor) commented Nov 14, 2023

Discovered by #322.

The following regex demonstrates the issue: the JIT matches "a" via group 1, while the interpreter (no_jit) returns an empty match:

  re> /(a)|\1+/match_unset_backref
data> ba
 0: a
 1: a
data> ba\=no_jit
 0: 

I believe the following case is related; here the JIT reports no match at all:

  re> /(a)|\1+/match_unset_backref
data> bbbb
No match
data> bbbb\=no_jit
 0: 

Curiously, the discrepancy does not appear without the repetition:

  re> /(a)|\1/match_unset_backref
data> ba
 0: 
data> ba\=no_jit
 0:
data> a
 0: a
 1: a
data> a\=no_jit
 0: a
 1: a

Finally, it appears with fixed repetitions, but not with range repetitions:

  re> /(a)|\1{128}/match_unset_backref
data> ba
 0: a
 1: a
data> ba\=no_jit
 0:
data>
  re> /(a)|\1{,128}/match_unset_backref
data> ba
 0: 
data> ba\=no_jit
 0: 
data> 

This suggests to me that the JIT mishandles repetitions of unset backreferences, which should match the empty string under match_unset_backref.
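As a rough cross-check of the expected results: the PCRE2 documentation describes this option as making PCRE2 behave more like ECMAScript, where a backreference to an unset group matches the empty string. Assuming that equivalence holds for these patterns, a JavaScript engine can act as a reference. The sketch below is TypeScript; firstMatch is a hypothetical helper written for this comment, and the {,128} case is left out because JavaScript does not treat that form as a quantifier.

```typescript
// Patterns from the report, minus the {,128} form (not a JavaScript quantifier).
const patterns: RegExp[] = [/(a)|\1+/, /(a)|\1/, /(a)|\1{128}/];

// Hypothetical helper: summarize the first match roughly as pcre2test would,
// i.e. the match offset, group 0, and group 1 (undefined when unset).
function firstMatch(
  re: RegExp,
  subject: string
): { index: number; whole: string; group1: string | undefined } | null {
  const m = re.exec(subject);
  return m === null ? null : { index: m.index, whole: m[0], group1: m[1] };
}

// Print one line per pattern/subject pair for comparison with the transcripts.
for (const re of patterns) {
  for (const subject of ["ba", "bbbb", "a"]) {
    console.log(re.source, JSON.stringify(subject), JSON.stringify(firstMatch(re, subject)));
  }
}
```

In a JavaScript engine, the ba and bbbb subjects give an empty match at offset 0 with group 1 unset for all three patterns, which agrees with the no_jit output above rather than the JIT output.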

zherczeg (Collaborator) commented

Fixed in 936fef2
