Implement support for contextual keywords #598

Xanewok · 2023-09-19T11:26:38Z

Fixes #568.

The first few commits are small cleanups to the PG code; the last one implements the support along with a new CST test.

This introduces a new `ScannerDefinitionNode::ContextualKeyword` that includes the literal and the underlying parser.
The identifier parser is used so that its token kind can be referenced in the parser generator, as we intentionally don't specify any built-in Identifier parsers (well, we do hardcode `Identifier` in one place, but it's a hack I'd rather not spread). Do we need to handle `YulIdentifier` as well/separately?

For the version ranges in which it's not a contextual keyword, it's still included in the trie as usual, so the old machinery works as expected and the lexer prioritises that match over the catch-all `Identifier` we use now.

However, for the version ranges in which it is a contextual keyword, the version check is moved out to a new `Lexer::as_contextual_keyword`, which attempts to promote a given token to a contextual keyword and eats it if it's the expected token kind.

This way, we can eat `TokenKind::SomeContextualKeyword` if necessary: `parse_token` attempts to promote the token to a keyword, but for a version where the keyword is contextual, the lexer doesn't return it early, so we lex an `Identifier` by default.
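To make the promotion step concrete, here is a minimal sketch of the idea, not the code in this PR. The type shapes, the `RevertKeyword` variant, the tuple-based `Version`, and the signatures of `next_token`, `as_contextual_keyword` and `parse_token` are all assumptions made for illustration; the generated lexer is driven by the grammar definition and will look different.

```rust
// Hypothetical, simplified token kinds; the real set is generated from the grammar.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenKind {
    Identifier,
    RevertKeyword,
}

// Assumption: a (major, minor, patch) tuple stands in for a real version type.
// Tuples compare lexicographically, which is enough for this sketch.
type Version = (u32, u32, u32);

struct Lexer {
    version: Version,
}

impl Lexer {
    /// Lexes `text` as a whole token. Contextual keywords are never returned
    /// early from here; they come out as a plain `Identifier`.
    fn next_token(&self, text: &str) -> Option<TokenKind> {
        let mut chars = text.chars();
        let first = chars.next()?;
        let is_start = first.is_ascii_alphabetic() || first == '_' || first == '$';
        if is_start && chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$') {
            Some(TokenKind::Identifier)
        } else {
            None
        }
    }

    /// Attempts to promote an `Identifier` to a contextual keyword.
    /// The version check lives here rather than in the keyword trie.
    fn as_contextual_keyword(&self, kind: TokenKind, text: &str) -> TokenKind {
        match (kind, text) {
            (TokenKind::Identifier, "revert") if self.version >= (0, 8, 4) => {
                TokenKind::RevertKeyword
            }
            _ => kind,
        }
    }

    /// Accepts the token if it is the expected kind, either as lexed or
    /// after promotion to a contextual keyword.
    fn parse_token(&self, text: &str, expected: TokenKind) -> bool {
        match self.next_token(text) {
            Some(kind) => {
                kind == expected || self.as_contextual_keyword(kind, text) == expected
            }
            None => false,
        }
    }
}
```

With this shape, `revert` is accepted both where an `Identifier` is expected and, from 0.8.4 onwards, where the keyword is expected, which is the behaviour the description above is after.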

changeset-bot · 2023-09-19T11:26:41Z

🦋 Changeset detected

Latest commit: d14285e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
@nomicfoundation/slang	Minor

AntonyBlakey

As a hack, this works, but I don't like the way it confuses the model re keywords in general, and my discomfort is telling me we should do it properly. Will discuss f2f.

Non-controversial bits separated from #598. This updates the versions of two contextual keywords, `revert` (introduced in 0.8.4 together with the associated statement) and `global` (introduced in 0.8.13 for the using directive), in both the new DSL and our old YAML spec.

Co-authored-by: Omar Tawfik <[email protected]>

Xanewok · 2023-09-24T20:11:40Z

We decided to hold this off until we migrate to a DSL v2, as that will change how keywords will be defined and identifier<>keyword will interact.

Closes #568

There is still one outstanding issue where we return a `Vec<TokenKind>` from `next_token`; I'd like to return a more specialized type and ideally pass it on the stack (2x2 bytes), rather than on the heap (extra 3x8 bytes for the Vec handle + indirection). We should name it better and properly show that we can return at most 2 token kinds (single token kind or identifier + kw combo).

To do:
- [x] Return tokens from `next_token` via stack

Apart from that, I think this is a more correct approach than #598, especially accounting for the new keyword definition format in DSL v2.

The main change is that we check the keyword trie, and additionally the (newly introduced) compound keyword scanners, only after the token has been lexed as an identifier. For each context, we collect the Identifier scanners used by the keywords and attempt promotion there.

The existing lexing performance is not impacted from what I've seen when running the sanctuary tests, and I can verify (incl. CST tests) that we now properly parse source that uses contextual keywords (e.g. `from`) and that the compound keywords (e.g. `ufixedMxN`) are properly versioned.

This adapts the existing `codegen_grammar` interface that's a leftover from DSLv1; I did that to work on finishing #638. Once this is merged and we properly parse contextual keywords, I'll move on to clean it up and reduce the parser codegen indirection (right now we go from v2 -> v1 model -> code generator -> Tera templates; I'd like to at least cut out the v1 model and/or simplify visiting v2 from the existing `CodeGenerator`).

Please excuse the WIP comments in the middle; the first and the last ones should make sense when reviewing. I can simplify this a bit for review, if needed.
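The "at most 2 token kinds" return value could be sketched as a small `Copy` enum instead of a `Vec<TokenKind>`. This is only an illustration under assumptions: the name `LexedToken`, its variants, the `accepts` helper, the `#[repr(u16)]` token kind, and this toy `next_token` are hypothetical and are not taken from the merged code.

```rust
// Hypothetical token kinds; `repr(u16)` mirrors the "2 bytes per kind" estimate above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
enum TokenKind {
    Identifier,
    FromKeyword,
    Semicolon,
}

/// Hypothetical result of lexing one token: either a single kind, or an
/// identifier that can also be read as a keyword. Being `Copy` and having no
/// heap allocation, it can be returned by value on the stack.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LexedToken {
    Single(TokenKind),
    IdentifierOrKeyword { identifier: TokenKind, keyword: TokenKind },
}

impl LexedToken {
    /// True if the parser may consume this token as `expected`.
    fn accepts(self, expected: TokenKind) -> bool {
        match self {
            LexedToken::Single(kind) => kind == expected,
            LexedToken::IdentifierOrKeyword { identifier, keyword } => {
                identifier == expected || keyword == expected
            }
        }
    }
}

/// Checks a simplified identifier shape (ASCII letters, digits, `_`, `$`).
fn is_identifier(text: &str) -> bool {
    let mut chars = text.chars();
    match chars.next() {
        Some(first) if first.is_ascii_alphabetic() || first == '_' || first == '$' => {
            chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$')
        }
        _ => false,
    }
}

/// Toy lexer over a whole-token input: keyword promotion is attempted only
/// for text that already lexes as an identifier.
fn next_token(text: &str) -> Option<LexedToken> {
    match text {
        ";" => Some(LexedToken::Single(TokenKind::Semicolon)),
        "from" => Some(LexedToken::IdentifierOrKeyword {
            identifier: TokenKind::Identifier,
            keyword: TokenKind::FromKeyword,
        }),
        _ if is_identifier(text) => Some(LexedToken::Single(TokenKind::Identifier)),
        _ => None,
    }
}
```

The type makes the "single kind or identifier + keyword" invariant explicit, which a `Vec` cannot express, and it avoids the allocation and pointer indirection on every token.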

Xanewok added 5 commits September 19, 2023 13:09

cleanup: Use BTreeMap for CodeGenerator::{scanner,parser}_functions

389faeb

cleanup: Use BTreeMap for CodeGenerator::scanner_contexts

2c711ef

cleanup: Mark top_level_scanner_names as unused in templates

f110aeb

cleanup: Hoist the Identifier hack in PG trie code

2afccfe

feat: Implement contextual keywords

e61468c

Xanewok requested a review from a team as a code owner September 19, 2023 11:26

Xanewok added 11 commits September 19, 2023 13:33

Add a changeset file

e71778b

Remove an outdated note

65979c3

feat: Support definitions of only contextual keyword scanners

d955cdc

fix: Allow for multiple contextual keywords

861176b

I knew I forgot about something important...

feat: Support the from contextual keyword

637577c

tests: Add a contextual_keywords CST test

08f4272

feat: Support contextual error/revert/global keywords

59c30dd

fix: Mark RevertStatement as introduced in 0.8.4

510d8cf

It was introduced alongside the `revert` contextual keyword, see ethereum/solidity#11037.

fix: Mark using global binding as introduced in 0.8.13

5ffc74f

See ethereum/solidity#12288

Add changeset file

a912a5d

fixup: Re-generate language.rs for the npm build

d14285e

AntonyBlakey reviewed Sep 21, 2023

Xanewok mentioned this pull request Sep 21, 2023

Update keyword versions #599

Merged

Xanewok closed this Sep 24, 2023

Xanewok mentioned this pull request Dec 27, 2023

Implement (contextual) keywords and use their versioning from v2 #723

Merged

1 task

Xanewok deleted the feat-contextual-keywords branch January 4, 2024 17:27
