-
Notifications
You must be signed in to change notification settings - Fork 932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvement for some libcudf regex functions for long strings #13322
Performance improvement for some libcudf regex functions for long strings #13322
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of disparate changes here. I can't review them in detail, but I wonder what is the relationship between them all? Is there something common between the optimizations?
Yes. There are two main changes here. One is the change to the signatures of the two |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice speedups!
Adds a `move_to()` function the `cudf::string_view::const_iterator` class to help minimize character counting when creating and incrementing the iterator on multi-byte UTF8 characters. The function simply moves the iterator from the current character position to the given one. This is just a shortcut for the form ``` itr += (new_position - itr.position()); ``` This pattern is repeated many times in #13322 and likely future PRs that require the same behavior. The PR also includes an update to the `string_view::begin()` to set the byte-offset directly rather than waste instructions calculating it. Authors: - David Wendt (https://github.com/davidwendt) - Karthikeyan (https://github.com/karthikeyann) Approvers: - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) URL: #13428
I'm reviewing this right now. Please pardon my prior absence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of clarifications and nitpicks. Looks good to me.
/merge |
Description
Changes the internal regex logic to minimize character counting to help performance with longer strings. The improvement applies mainly to libcudf regex functions that return strings (i.e. extract, replace, split). The changes here also improve the internal device APIs for clarity to improve maintenance. The most significant change makes the position variables input-only and returning an optional pair to indicate a successful match.
There are some more optimizations that are possible here where character positions are passed back and forth that could be replaced with byte positions to further reduce counting. Initial measurements showed this noticeably slowed down small strings so more analysis is required before continuing this optimization.
Reference: #13048
More Detail
First, there is a change to some internal regex function signatures. Notable the
reprog_device::find()
andreprog_device::extract()
member functions declared incpp/src/strings/regex/regex.cuh
that are used by all the libcudf regex functions. The in/out parameters are now input-only parameters (pass by value) and the return is an optional pair that includes the match result. Also, thebegin
parameter is now an iterator and theend
parameter now has a default. This change requires updating all the definitions and uses of thefind
andextract
member functions.Using an iterator as the
begin
parameter allows for some optimizations in the calling code to minimize character counting that may be needed for processing multi-byte UTF-8 characters. Rather than using thecudf::string_view::byte_offset()
member function to convert character positions to byte positions, an iterator can be incremented as we traverse through the string which helps reduce some character counting. So the changes here involve removing some calls tobyte_offset()
and incrementing (really moving) iterators with a pattern likeitr += (new_pos - itr.position());
There is another PR #13428 to make amove_to
iterator member function.It is possible to reduce the character counting even more as mentioned above but further optimization requires some deeper analysis.
Checklist