Add a .str.find_many with AhoCorasick (similar to extract_many) #19923

Closed
jpfeuffer opened this issue Nov 22, 2024 · 2 comments · Fixed by #19952
Labels: accepted (Ready for implementation), enhancement (New feature or an improvement of an existing feature)

Comments


jpfeuffer commented Nov 22, 2024

Description

I love your new-ish .str.extract_many.
However, I would have expected a find_many (one that returns the start position of each match) to be the more useful and more general method.
I have barely ever had a use case where I needed my (string literal!) patterns returned, but I have had many use cases where I wanted to know the positions of those matches.

In any case, I am not saying that extract_many should go, but I would love to see a find_many equivalent.
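
To make the difference between the two methods concrete, here is a minimal sketch in plain Rust. It is not the Polars implementation: a naive leftmost scan stands in for Aho-Corasick, the names `scan`, `extract_many_naive` and `find_many_naive` are hypothetical, and reporting byte offsets as `u32` is an assumption.

```rust
/// Hypothetical helper: leftmost, non-overlapping scan over literal patterns.
/// Returns (start byte offset, index of the matched pattern) pairs.
/// This is a naive stand-in for an Aho-Corasick automaton.
fn scan(text: &str, patterns: &[&str]) -> Vec<(usize, usize)> {
    let mut hits = Vec::new();
    let mut pos = 0;
    while pos < text.len() {
        // Earliest match at or after `pos`; ties go to the pattern listed first.
        let next = patterns
            .iter()
            .enumerate()
            .filter(|(_, p)| !p.is_empty())
            .filter_map(|(i, p)| text[pos..].find(*p).map(|off| (pos + off, i)))
            .min();
        match next {
            Some((start, i)) => {
                hits.push((start, i));
                // Continue after the match so that matches do not overlap.
                pos = start + patterns[i].len();
            }
            None => break,
        }
    }
    hits
}

/// What an extract_many-style method hands back: the matched literals themselves.
fn extract_many_naive(text: &str, patterns: &[&str]) -> Vec<String> {
    scan(text, patterns)
        .into_iter()
        .map(|(_, i)| patterns[i].to_string())
        .collect()
}

/// What the requested find_many would hand back: where each match starts.
fn find_many_naive(text: &str, patterns: &[&str]) -> Vec<u32> {
    scan(text, patterns)
        .into_iter()
        .map(|(start, _)| start as u32)
        .collect()
}
```

For the text "hello world hello" and the patterns "hello" and "world", `extract_many_naive` returns the literals hello, world, hello (which the caller already knew), while `find_many_naive` returns the offsets 0, 6 and 12.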

It actually sounds relatively easy, and I might take it on as a first Rust project with some help, if you consider it a worthy addition.

I have not found a similar issue/request so far.

jpfeuffer added the enhancement label Nov 22, 2024
jpfeuffer (Author) commented

Similar to: #17304
The change would just be to return the start indices instead of using the "push" method:

fn push(val: &str, builder: &mut ListStringChunkedBuilder, ac: &AhoCorasick, overlapping: bool) {
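
A rough sketch of that change, under loud assumptions: `push_starts` is a hypothetical name, a plain `Vec<Vec<u32>>` stands in for a list builder, and a naive prefix check replaces the `AhoCorasick` automaton so the example needs no external crate. It only illustrates collecting start offsets and honoring the `overlapping` flag; the match order in overlapping mode may differ from what a real automaton reports.

```rust
/// Hypothetical counterpart to `push`: appends one list of start offsets per
/// input string instead of the matched literals.
fn push_starts(val: &str, out: &mut Vec<Vec<u32>>, patterns: &[&str], overlapping: bool) {
    let mut starts = Vec::new();
    let mut pos = 0;
    while pos < val.len() {
        // Only try to match at valid UTF-8 character boundaries.
        if !val.is_char_boundary(pos) {
            pos += 1;
            continue;
        }
        let rest = &val[pos..];
        let mut advance = 1;
        for p in patterns.iter().filter(|p| !p.is_empty()) {
            if rest.starts_with(*p) {
                starts.push(pos as u32);
                if !overlapping {
                    // Non-overlapping: take the first listed pattern and skip past it.
                    advance = p.len();
                    break;
                }
                // Overlapping: keep checking the remaining patterns at this offset.
            }
        }
        pos += advance;
    }
    out.push(starts);
}
```

With the patterns "abc", "bc" and "cd" on the input "abcd", the non-overlapping mode yields the single offset 0, while the overlapping mode yields 0, 1 and 2.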

jpfeuffer (Author) commented

That's awesome. Thank you so much for the quick implementation, @ritchie46.

c-peters added the accepted label Nov 25, 2024
c-peters moved this to Done in Backlog Nov 25, 2024