Question about Oniguruma implementation #2297

Open
Truncated opened this issue Jun 19, 2024 · 2 comments

Comments

@Truncated

Flavor Request

Apologies, as I'm honestly not sure how to ask this question, but here's what I'm trying to figure out:

I'm trying to write a TextMate grammar-based language for VSCode, and I'm having a heck of a time figuring out the regex implementation differences.

I found that VSCode uses its own mod of TextMate grammars and pulls in Oniguruma directly via a dedicated binding. The binding is updated, but not frequently. It's at 6.9.8 right now, which is only one patch release (.1) behind the latest.

So Ruby and PHP both use Oniguruma, but they tend to be a lot further back in versions. https://rubular.com/ uses Ruby 2.5.9, which I believe has Oniguruma v6.1.3. PCRE is in the ballpark, but it's got enough differences to be problematic.

I'd like to see a raw version of Oni directly implemented, but selfishly I'd like to see the version most compatible with the VSCode implementation. Is this even a flavor request, or does it need to be some kind of implementation through an intermediary?

@slevithan

slevithan commented Sep 5, 2024

Yes, this is a flavor request. Oniguruma is a powerful regex engine/flavor but has quite a few differences from PCRE, including lots of edge cases that work differently.
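
One small, well-known example of such a difference is `\h`: Oniguruma treats it as a hexadecimal digit, while PCRE treats it as horizontal whitespace. The sketch below does not run either engine. It only uses native JavaScript `RegExp` stand-ins for the two readings, and the PCRE side is simplified to ASCII space and tab (an assumption made for brevity; PCRE's real set also covers several Unicode spaces). The `classify` helper is a hypothetical name used purely for illustration.

```typescript
// Illustrative stand-ins only: neither Oniguruma nor PCRE is invoked here.
const onigurumaH = /[0-9A-Fa-f]/; // Oniguruma reading: \h is a hexadecimal digit
const pcreH = /[ \t]/;            // PCRE reading: \h is horizontal whitespace (ASCII subset, simplified)

// Hypothetical helper: reports whether a single character would match `\h`
// under each flavor's interpretation.
function classify(ch: string): { oniguruma: boolean; pcre: boolean } {
  return { oniguruma: onigurumaH.test(ch), pcre: pcreH.test(ch) };
}

console.log(classify("f"));  // matches under the Oniguruma reading only
console.log(classify("\t")); // matches under the PCRE reading only
```

So a grammar pattern that relies on `\h` and is tested in a PCRE-based tool can pass or fail for the wrong reason.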

Note that, although you're right that TextMate grammars use Oniguruma, Ruby doesn't. Or rather, Ruby 1.9 did. Ruby 1.8 used its own, different flavor, and Ruby 2.0+ uses Onigmo by default. Onigmo is an Oniguruma fork that is very similar in syntax and behavior, but it has made enough changes and extensions (and has fallen far enough behind newer versions of Oniguruma) to be considered a different flavor.

@slevithan

slevithan commented Dec 29, 2024

@firasdib regex101 could support Oniguruma using only JavaScript (without running a new server backend for it) by using either of the following libraries:

  • oniguruma-to-es, an incredibly robust/accurate Oniguruma to native JavaScript RegExp transpiler that I maintain.
  • vscode-oniguruma, which gives access to the real Oniguruma C library compiled to WASM.
    • Although there are upsides to this, the downside is it requires downloading a 450+KB WASM file. Additionally, vscode-oniguruma (at least as of v2.0.1) doesn't offer access to named subpattern matches on match results (only subpattern start/end positions by subpattern index).
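
If only index-based subpattern positions are available, names can be mapped back on by hand. The following is a minimal sketch of that idea, not vscode-oniguruma's API: the `CaptureIndex` and `IndexOnlyMatch` shapes and both helper functions are hypothetical, and the match result is written out by hand rather than produced by an engine. It also assumes the pattern uses only uniquely named capturing groups, numbered left to right from 1, and that `(?<name>` never appears escaped or inside a character class.

```typescript
// Hypothetical shapes modelled on an index-only match result. These are
// assumptions for illustration, not types imported from any library.
interface CaptureIndex { start: number; end: number; }
interface IndexOnlyMatch { captureIndices: CaptureIndex[]; }

// Naive scan for `(?<name>` openers. Lookbehinds such as `(?<=` and `(?<!`
// are skipped because `=` and `!` cannot start a name.
function namedGroupIndices(pattern: string): Record<string, number> {
  const indices: Record<string, number> = {};
  const opener = /\(\?<([A-Za-z_]\w*)>/g;
  let groupNumber = 0;
  let m: RegExpExecArray | null;
  while ((m = opener.exec(pattern)) !== null) {
    groupNumber += 1; // assumes every capturing group in the pattern is named
    indices[m[1]] = groupNumber;
  }
  return indices;
}

// Slice the input by each named group's start/end positions.
function namedCaptures(
  pattern: string,
  input: string,
  match: IndexOnlyMatch,
): Record<string, string> {
  const indices = namedGroupIndices(pattern);
  const result: Record<string, string> = {};
  for (const groupName of Object.keys(indices)) {
    const span = match.captureIndices[indices[groupName]];
    if (span !== undefined) {
      result[groupName] = input.slice(span.start, span.end);
    }
  }
  return result;
}

// Usage with a hand-written match result (no regex engine is run here).
const pattern = "(?<year>\\d{4})-(?<month>\\d{2})";
const input = "on 2024-06";
const fakeMatch: IndexOnlyMatch = {
  captureIndices: [
    { start: 3, end: 10 }, // whole match
    { start: 3, end: 7 },  // group 1: year
    { start: 8, end: 10 }, // group 2: month
  ],
};
console.log(namedCaptures(pattern, input, fakeMatch));
```

A real implementation would need a proper pattern parser, since patterns that mix named and unnamed groups or reuse a name break the simple left-to-right numbering assumed here.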
