-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of regexp_count
#13011
Comments
I'm not sure I can handle this but want to try. I can unassign if somebody is already working on this |
take |
I am wondering if this is not really caused by doing more work - regexp_match only needs to find the first match and can return early, while regexp_count needs to find all matches and return the count. |
Possible @Dandandan however I don't think it would explain the difference with regexp_replace which I would think would have to do even more work. |
Yeah that is suspicious I agree, I think it is worth checking whether regexp_replace in the benchmark does replace all matches or just the first value. |
I added the global flags for regexp_replace so it actually replaces all the matches:
There is definitely a change compared to the benchmark taken by @xinlifoobar however the count function is still marginally slower. |
So is the conclusion we can't reproduce the difference in benchmark numbers, and thus we should close this ticket? |
It's still 50% slower than replace so I think it may be worth investigating at some point. Is it a huge blocker? Unlikely unless this is a hotspot in someone's processing pipeline |
Is your feature request related to a problem or challenge?
@Omega359 and @xinlifoobar added the
regexp_count
function in #12970However,
regexp_count
seems to be significantly slower thanregexp_match
andregexp_like
. (see #12080 (comment) for a benchmark)Describe the solution you'd like
I would like to review the benchmark for
regexp_count
and figure out why it is slower and if so try to improve itDescribe alternatives you've considered
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: