Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a regular expression library #265

Merged
merged 7 commits into from
Jan 16, 2024
Merged

Add a regular expression library #265

merged 7 commits into from
Jan 16, 2024

Conversation

jasal82
Copy link
Contributor

@jasal82 jasal82 commented Dec 15, 2023

This adds basic regular expression functionality to Koto. It uses the regex crate and mimics its interface as far as possible. The following operations are supported:

  • is_match
  • find
  • find_all (iterable)
  • captures (index and named)
  • replace_all

I tried to reuse the original Match object but it didn't work out due to lifetime issues, so I replaced it with my own representation. Maybe there is an easier way, feel free to change.

@irh
Copy link
Contributor

irh commented Dec 15, 2023

Thanks for the PR, a regex module is going to be super useful 👍

A couple of changes I'd like to make before landing this:

  • I'd like to rename the library to regex, I know Python uses re, but regex as a name seems more explicit/discoverable/googleable.
  • I'd like to add some example code under koto/test/libs, maybe you have some tests you built the lib against? No worries if not.
  • Would it make sense to give Captures an iterable interface (like Matches) rather than making it a container? If map access is desirable then maybe .to_map() could be used on the iterator output?

I can take care of moving this forward so don't worry about making changes (unless you feel like it), although it won't be until after the weekend.

@jasal82
Copy link
Contributor Author

jasal82 commented Dec 15, 2023

Thanks for your comments. Here are my thoughts:

  • Rename to regex sounds good, I tend to keep namespaces short, especially for things that are used very often. But I agree that it might be clearer. Then I'd also suggest to rename the regex() method to pattern() or new(). Not sure what the Koto convention is here.
  • I have some basic tests that I can add.
  • Users usually expect random access to the capture groups and map access on the named capture groups. Not every capture group is named, so I think the interface you suggest would feel rather awkward. There's also a capture_all method in the original regex crate (which is currently missing from the Koto module) which returns an iterator over the captures for all individual matches. There I would keep the iterator in Koto and have it dereference to a Captures.

@irh
Copy link
Contributor

irh commented Jan 2, 2024

Hi @jasal82, just pinging to say I haven't forgotten about this PR, and I'll get around to it as soon as I can.

@irh irh force-pushed the regex-library branch 2 times, most recently from 42f8945 to cc1f029 Compare January 9, 2024 20:30
@irh
Copy link
Contributor

irh commented Jan 9, 2024

@jasal82 I've updated the regex lib now to the point where I'd be happy to land it, if you have a moment could you take a look at the API changes I made? No problem if you don't have time, we can always revisit it later if you have notes.

@irh irh merged commit c01d5d1 into koto-lang:main Jan 16, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants