Sub-match extraction #14
Comments
Thanks for the issue! So my response to this is kinda complex. Part of it is because the design space and use cases are both really quite large, so it's hard to interpret what it is you actually want. But to get the simple question out of the way first:
The only thing I'm aware of in this space is maybe Ragel, but definitely re2c. Neither of those things are Rust, but I believe they can be made to work in a
OK, so here's the thing. When I wrote that text in the README, I was thinking about submatch extraction for fully compiled DFAs. That's what re2c provides AIUI. But it cannot actually be done with a DFA. It isn't powerful enough. Instead, you need a computationally more expressive structure called tagged automata. I personally don't plan on implementing that in this crate. What I'm working on right now is moving the various regex engines from the
Popping up a level, I would be genuinely curious to hear more about your use case if you could share it.
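To make the DFA-versus-tags distinction above a bit more concrete, here is a minimal sketch. It is not how re2c or any crate mentioned in this thread works. It is a hand-written state machine for one fixed, fully anchored pattern, `([a-z]+)=([0-9]+)`, where some transitions also record the current byte offset into a fixed-size array of "tag" slots. The names `State` and `captures` are made up for illustration. This pattern was picked because every group boundary is decided by the current byte alone; patterns with ambiguity need the more general tagged-automata machinery.

```rust
// Hypothetical illustration only: a hand-rolled matcher for the anchored
// pattern `([a-z]+)=([0-9]+)`. It uses only slices and a fixed-size array.

#[derive(Clone, Copy)]
enum State {
    Start, // nothing consumed yet
    Key,   // inside `[a-z]+`
    Eq,    // just consumed `=`
    Value, // inside `[0-9]+` (the only accepting state)
}

/// Returns `[key_start, key_end, value_start, value_end]` as byte offsets
/// (end offsets are exclusive), or `None` if the whole input does not match.
pub fn captures(input: &[u8]) -> Option<[usize; 4]> {
    let mut tags = [0usize; 4];
    let mut state = State::Start;
    for (i, &b) in input.iter().enumerate() {
        state = match (state, b) {
            // Entering group 1: record where it starts.
            (State::Start, b'a'..=b'z') => {
                tags[0] = i;
                State::Key
            }
            (State::Key, b'a'..=b'z') => State::Key,
            // Leaving group 1: record where it ends.
            (State::Key, b'=') => {
                tags[1] = i;
                State::Eq
            }
            // Entering group 2: record where it starts.
            (State::Eq, b'0'..=b'9') => {
                tags[2] = i;
                State::Value
            }
            (State::Value, b'0'..=b'9') => State::Value,
            // Any other (state, byte) pair is a dead transition.
            _ => return None,
        };
    }
    match state {
        State::Value => {
            // Group 2 ends where the input ends.
            tags[3] = input.len();
            Some(tags)
        }
        _ => None,
    }
}
```

Without the `tags` writes, the same loop is an ordinary DFA that can only answer whether the input matched. The offsets come entirely from the extra actions attached to transitions.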
First of all, the easy one: I meant "works on
Sadly I can't (yet) get into all the use case details, but I can say my use case involves code running on a variety of embedded devices (hence
I've only briefly looked at Ragel and re2c, so maybe they do have a solution. However, I'm going to have many, many different regexes over time and want to run on many different architectures. Given this, compiling each regex to architecture-specific assembly doesn't seem like the best approach.
I certainly appreciate all your work, and it sounds like you're likely headed towards implementing what I need (except the dreaming parts 😄) with a
Hyperscan might be another one, but it is quite large.
Out of curiosity, when you say "no-std" here, do you also mean "no-alloc"? The regex crate will eventually support
If I'm in dream land then no
The
First of all, thank you for all your work on this crate (and others!)!
I just wanted to throw an issue in here to express an interest in having support for sub-match extraction. I understand this is probably unlikely (as stated) but perhaps others might chime in on this ticket if there's broader interest in this feature.
In the meantime, do you have suggestions / recommendations for byte-oriented, no-std-compatible regex processing supporting sub-match extraction?