-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regexp simplification #1787
Regexp simplification #1787
Conversation
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
git push Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is beautiful!
It took me a while to understand simplifyConcat
and simplifyConcatAlternate
, but with many tests with good coverage, it wasn't difficult to see what it is doing.
Well done, and I look forward to see more optimizations here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good, I just have a few comments and questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job!
Signed-off-by: Cyril Tovena <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #1787 +/- ##
==========================================
+ Coverage 64.01% 64.34% +0.32%
==========================================
Files 121 122 +1
Lines 9027 9143 +116
==========================================
+ Hits 5779 5883 +104
- Misses 2844 2853 +9
- Partials 404 407 +3
|
This PRs introduces simplification of regexp filter by parsing regexp and replacing them with combination of
bytes.Contains
aka literal filter.The reasoning behind is that many regexp can be simplified to execute faster than actual regexp that requires backtracking and string capturing.
Currently we only support recursively those regexp operations:
Obviously we can support more (I have a lot of ideas) but I wanted to started with this for now and focus on what is next based on usage.
Now regarding benchmark I'm really impressed see below:
Basically the smallest improvement is 5x and the biggest one is 333x 😮 !
Fixes #1542 and more !