support inlined nodes in grammar #500

Merged: 1 commit, Jun 17, 2023

@OmarTawfik OmarTawfik commented Jun 12, 2023

Closes #323

  • Created a `ProductionDefinition` enum so that the parent `Production` holds the common properties, similar to how `ParserDefinition`, `ScannerDefinition`, etc. work today.
  • Added a `Production::inlined` boolean property, which defaults to `false`.
  • Inlined productions no longer produce a `ProductionKind`, `TokenKind`, or `RuleKind`.
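
As a rough illustration of the shape described in the list above, here is a minimal sketch. It is not the code from this PR: the field layout, the `String` payloads, the `into_inlined` helper, and the `production_kind_names` function are all assumptions made for the example. Only the names `Production`, `ProductionDefinition`, and `inlined` come from the description.

```rust
/// Hypothetical stand-in for the per-kind definitions; the real variants
/// hold richer data than a placeholder `String`.
#[derive(Debug, Clone, PartialEq)]
pub enum ProductionDefinition {
    Scanner(String),
    Parser(String),
    PrecedenceParser(String),
}

/// The parent type holds the properties shared by every kind of production.
#[derive(Debug, Clone, PartialEq)]
pub struct Production {
    pub name: String,
    /// Inlined productions get no kind of their own.
    pub inlined: bool,
    pub definition: ProductionDefinition,
}

impl Production {
    /// `inlined` defaults to `false`, as stated in the description.
    pub fn new(name: &str, definition: ProductionDefinition) -> Self {
        Self {
            name: name.to_owned(),
            inlined: false,
            definition,
        }
    }

    /// Hypothetical builder-style helper that marks a production as inlined.
    pub fn into_inlined(mut self) -> Self {
        self.inlined = true;
        self
    }
}

/// Names that would be emitted as kinds: every production that is not inlined.
pub fn production_kind_names(productions: &[Production]) -> Vec<&str> {
    productions
        .iter()
        .filter(|production| !production.inlined)
        .map(|production| production.name.as_str())
        .collect()
}
```

With this shape, a code generator only has to filter on one flag on the parent type, regardless of whether the production is a scanner, a parser, or a precedence parser.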

To test this end-to-end, I inlined a `Scanner`, a `Parser`, and a `PrecedenceParser` in the current grammar.

Will follow up in the next few PRs with:

  • inlining additional nodes.
  • adding validation to make sure references between inlined nodes are valid, and that they are not leaked to public APIs.

changeset-bot bot commented Jun 12, 2023

🦋 Changeset detected

Latest commit: 25d9ede

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name | Type |
| --- | --- |
| changelog | Minor |


@OmarTawfik force-pushed the add-inlined-productions branch from 3d9d62b to b79f40c on June 12, 2023 12:35
@OmarTawfik marked this pull request as ready for review on June 12, 2023 14:32
@OmarTawfik requested a review from a team as a code owner on June 12, 2023 14:32
@OmarTawfik linked an issue on Jun 12, 2023 that may be closed by this pull request
OmarTawfik added a commit that referenced this pull request Jun 16, 2023
- move token parsing into a common function
- remove the need to pass token names as static strings everywhere,
since we already derive `strum::AsRefStr`

This decreases nesting, improves the readability of the parser code, and
shrinks the generated code by roughly 30%.
Most importantly, it unblocks #498 and #500 by making it easier to
generate named or unnamed nodes at each parser root.
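
A minimal sketch of the idea in that commit message, under stated assumptions: the `TokenKind` variants, the `Token` struct, and the `parse_token` signature are hypothetical, and the hand-written `AsRef<str>` impl stands in for what `#[derive(strum::AsRefStr)]` would generate so that the example needs no external crate.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenKind {
    OpenParen,
    Semicolon,
}

// Stand-in for `#[derive(strum::AsRefStr)]`: the kind supplies its own name,
// so callers never pass a separate static string.
impl AsRef<str> for TokenKind {
    fn as_ref(&self) -> &str {
        match self {
            TokenKind::OpenParen => "OpenParen",
            TokenKind::Semicolon => "Semicolon",
        }
    }
}

#[derive(Debug, PartialEq)]
pub struct Token<'a> {
    pub kind: TokenKind,
    pub text: &'a str,
}

/// Hypothetical shared entry point for all tokens. `scan` returns the byte
/// length of the match at the start of its input, or `None` on failure; it is
/// assumed to return a length within bounds and on a character boundary.
pub fn parse_token<'a>(
    input: &'a str,
    pos: usize,
    kind: TokenKind,
    scan: impl Fn(&str) -> Option<usize>,
) -> Result<Token<'a>, String> {
    // An out-of-range position is treated as empty remaining input.
    let rest = input.get(pos..).unwrap_or("");
    match scan(rest) {
        Some(len) => Ok(Token { kind, text: &rest[..len] }),
        None => {
            // The name comes from the kind itself rather than from an argument.
            let name: &str = kind.as_ref();
            Err(format!("expected {} at offset {}", name, pos))
        }
    }
}
```

Because every token goes through one function, the generated parser only emits a call per token instead of repeating the matching and error-reporting logic inline.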
@OmarTawfik force-pushed the preserve-empty-nodes branch from 0f613cc to 7ff35c7 on June 17, 2023 02:11
Base automatically changed from preserve-empty-nodes to main June 17, 2023 02:28
@OmarTawfik force-pushed the add-inlined-productions branch from b79f40c to 25d9ede on June 17, 2023 04:25
@OmarTawfik enabled auto-merge on June 17, 2023 04:25
@OmarTawfik added this pull request to the merge queue on Jun 17, 2023
Merged via the queue into main with commit 73ddac9 Jun 17, 2023
@OmarTawfik deleted the add-inlined-productions branch on June 17, 2023 04:43
@github-actions github-actions bot mentioned this pull request Jun 17, 2023
github-merge-queue bot pushed a commit that referenced this pull request Jul 7, 2023
This PR was opened by the [Changesets
release](https://github.com/changesets/action) GitHub action. When
you're ready to do a release, you can merge this and publish to npm
yourself or [setup this action to publish
automatically](https://github.com/changesets/action#with-publishing). If
you're not ready to do a release yet, that's fine, whenever you add more
changesets to main, this PR will be updated.


# Releases
## @nomicfoundation/[email protected]

### Minor Changes

- [#502](#502)
[`c383238`](c383238)
Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Added error
recovery i.e. a CST is _always_ produced, even if there are errors. The
erroneous/skipped text is in the CST as a `TokenKind::SKIPPED` token.

- [#501](#501)
[`cb221fe`](cb221fe)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - generate
typescript string enums for CST kinds

- [#517](#517)
[`8bd5446`](8bd5446)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - extract inlined
and sub-expressions in language grammar

- [#518](#518)
[`b3b562b`](b3b562b)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - fill in missing
CST node names

- [#515](#515)
[`f24e873`](f24e873)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - switch over the
NPM package to use CommonJS modules instead of ES modules.

- [#498](#498)
[`44f1ff7`](44f1ff7)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - flatten unnamed
CST nodes into parent nodes

- [#502](#502)
[`c383238`](c383238)
Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Use the Rowan
model for the CST i.e. TokenNodes contain the string content, and
RuleNodes contain only the combined _length_ of their children's text.

- [#499](#499)
[`1582d60`](1582d60)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - preserve correct
ranges on empty rule nodes

- [#500](#500)
[`73ddac9`](73ddac9)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - inlining CST
nodes that offer no additional syntactic information

- [#512](#512)
[`72dc3d3`](72dc3d3)
Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Expression
productions now correctly wrap the recursive 'calls' in a rule node
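
To make the Rowan-style CST entry in the list above more concrete, here is a small sketch, assuming a simplified node type. The `Node` enum, its constructors, and the string kinds are hypothetical and are not the library's actual API. The sketch only illustrates that token nodes own their text while rule nodes store the combined length of their children's text, and that skipped input can stay in the tree as an ordinary token.

```rust
use std::rc::Rc;

/// Simplified, hypothetical CST node.
#[derive(Debug)]
pub enum Node {
    /// Leaf: owns its piece of the source text.
    Token { kind: &'static str, text: String },
    /// Interior: stores only the summed text length of its children.
    Rule {
        kind: &'static str,
        text_len: usize,
        children: Vec<Rc<Node>>,
    },
}

impl Node {
    pub fn token(kind: &'static str, text: &str) -> Rc<Node> {
        Rc::new(Node::Token { kind, text: text.to_owned() })
    }

    pub fn rule(kind: &'static str, children: Vec<Rc<Node>>) -> Rc<Node> {
        // The length is computed once, when the rule node is built.
        let text_len: usize = children.iter().map(|child| child.text_len()).sum();
        Rc::new(Node::Rule { kind, text_len, children })
    }

    /// Length in bytes of the text covered by this node.
    pub fn text_len(&self) -> usize {
        match self {
            Node::Token { text, .. } => text.len(),
            Node::Rule { text_len, .. } => *text_len,
        }
    }

    /// Rebuilds the covered text by concatenating the tokens in order.
    pub fn unparse(&self) -> String {
        match self {
            Node::Token { text, .. } => text.clone(),
            Node::Rule { children, .. } => {
                children.iter().map(|child| child.unparse()).collect()
            }
        }
    }
}
```

If erroneous input is kept as a token (here a kind named `"SKIPPED"`, mirroring `TokenKind::SKIPPED`), the tree still covers every byte of the source, so lengths and ranges stay consistent even when parsing fails partway through.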

## [email protected]

### Minor Changes

- [#502](#502)
[`c383238`](c383238)
Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Added error
recovery i.e. a CST is _always_ produced, even if there are errors. The
erroneous/skipped text is in the CST as a `TokenKind::SKIPPED` token.

- [#501](#501)
[`cb221fe`](cb221fe)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - generate
typescript string enums for CST kinds

- [#517](#517)
[`8bd5446`](8bd5446)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - extract inlined
and sub-expressions in language grammar

- [#518](#518)
[`b3b562b`](b3b562b)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - fill in missing
CST node names

- [#515](#515)
[`f24e873`](f24e873)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - switch over the
NPM package to use CommonJS modules instead of ES modules.

- [#498](#498)
[`44f1ff7`](44f1ff7)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - flatten unnamed
CST nodes into parent nodes

- [#502](#502)
[`c383238`](c383238)
Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Use the Rowan
model for the CST i.e. TokenNodes contain the string content, and
RuleNodes contain only the combined _length_ of their children's text.

- [#499](#499)
[`1582d60`](1582d60)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - preserve correct
ranges on empty rule nodes

- [#500](#500)
[`73ddac9`](73ddac9)
Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - inlining CST
nodes that offer no additional syntactic information

- [#512](#512)
[`72dc3d3`](72dc3d3)
Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Expression
productions now correctly wrap the recursive 'calls' in a rule node

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

add Production::inlined: bool flag
2 participants