regexp_match skips first match when returning match #3803
Comments
I'm not sure if I'm missing something but
If parity with Postgres is desired, then this would be considered a bug. Relevant extract:
From: https://www.postgresql.org/docs/current/functions-matching.html
Also, it might be somewhat confusing: returning a non-null value in the output ListArray indicates a match was found (else it would be null instead of a StringArray), but the resulting StringArray itself is empty, without the match. The behaviour seems somewhat inconsistent?
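To make the inconsistency concrete, here is a minimal sketch of the three states one output row can be in. It uses only the standard library: an `Option` of a string slice stands in for one ListArray entry, and `describe_row` is a hypothetical helper for illustration, not part of arrow-rs.

```rust
/// Hypothetical stand-in for one row of the list output:
/// `None` models a null entry (no match); `Some(values)` models a
/// non-null entry holding the strings returned for that row.
fn describe_row(row: Option<&[&str]>) -> &'static str {
    match row {
        // Null entry: the pattern did not match at all.
        None => "no match",
        // Non-null but empty: a match is signalled, yet no text comes back.
        Some([]) => "match reported, but no text returned",
        // Non-null and non-empty: the matched text is available.
        Some(_) => "match with text",
    }
}
```

With pattern foo and input foo, the report describes the middle state, whereas the expected result is the last one.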
Makes sense, we should probably update the function's docs to match
Describe the bug
In some cases regexp_match will skip the first and only match. E.g. if the pattern is foo and the string to match is foo, then it should return a single match, foo. It currently returns an empty array for the match (it correctly finds that there is a match, but doesn't return the match itself).
To Reproduce
Example test in arrow-string/src/regexp.rs
Will panic with:
Can see the right (actual) has an empty StringArray[], whereas the expected value contains the match: StringArray["foo"]
Expected behavior
Test should succeed.
Additional context
It seems this is because the first entry of the captures is skipped by default:
arrow-rs/arrow-string/src/regexp.rs
Lines 210 to 218 in 79518cf
Where in the test example above, caps has value:
Relevant regex doc: https://docs.rs/regex/latest/regex/struct.Regex.html#method.captures
Specifically:
Original issue: apache/datafusion#5479
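As a rough illustration of the skipping described under Additional context, the sketch below models a capture list as a slice whose index 0 holds the whole match, which is how the regex crate numbers capture groups. It uses only the standard library and is not the arrow-rs code: `groups_skipping_first` and `groups_with_fallback` are hypothetical names, and the fallback is just one possible fix, assuming the Postgres-style behavior raised in the comments is the goal.

```rust
/// Models the reported behavior: always drop index 0 (the whole match)
/// and keep only the explicit capture groups.
fn groups_skipping_first<'a>(caps: &[Option<&'a str>]) -> Vec<Option<&'a str>> {
    caps.iter().skip(1).copied().collect()
}

/// One possible fix (an assumption, not the actual patch): when the
/// pattern has no explicit groups, return the whole match instead of
/// an empty list; otherwise return the explicit groups as before.
fn groups_with_fallback<'a>(caps: &[Option<&'a str>]) -> Vec<Option<&'a str>> {
    if caps.len() == 1 {
        // Only the implicit whole-match group exists.
        vec![caps[0]]
    } else {
        groups_skipping_first(caps)
    }
}
```

For a pattern such as foo, which has no parentheses, the capture list holds only the whole match, so the first function returns an empty list while the second returns the match itself.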