Lexer: say that lifetime-like tokens can't be immediately followed by ' #1479
…d by ' Forms like 'ab'c are rejected, so we need some way to explain why they don't tokenise as two consecutive LIFETIME_OR_LABEL tokens. Address this by adding "not immediately followed by `'`" to each of the lexer rules for the lifetime-like tokens. This also means there can be no ambiguity between CHAR_LITERAL and these tokens (at present we don't say how such ambiguities are resolved).
Thanks! I'll go ahead and merge, but for the most part the reference has not done a good job of handling ambiguity and precedence in the lexer or grammar. I'm not sure this is the ultimate approach to take, since I think there are several other rules that have ambiguity.
For example, there is nothing that clarifies whether `x'a'` is a RESERVED_TOKEN_SINGLE_QUOTE or an IDENTIFIER followed by a CHAR_LITERAL. One possibility is to have a disambiguation rule that prefers the "longest match" in case of ambiguity. So `'a'a` would be a CHAR_LITERAL because that is a longer match than two LIFETIME_TOKENs (or `'a'1` is a CHAR INTEGER, because CHAR is longer than LIFETIME). Then we wouldn't need to explicitly state these kinds of rules. But I don't know if that is the best approach.
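To make the "longest match" idea concrete, here is a minimal sketch of such a disambiguation rule. It is not the Reference's grammar or rustc's lexer: the names `Kind`, `longest_match`, and `tokenize` are hypothetical, identifiers are simplified to ASCII, character literals are limited to a single unescaped character, and reserved-prefix tokens such as RESERVED_TOKEN_SINGLE_QUOTE are not modelled at all.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Kind { CharLiteral, Lifetime, Ident, Integer }

// ASCII-only simplification of identifier characters (an assumption).
fn is_ident_start(c: char) -> bool { c.is_ascii_alphabetic() || c == '_' }
fn is_ident_continue(c: char) -> bool { c.is_ascii_alphanumeric() || c == '_' }

/// Byte length of the leading run of characters satisfying `pred`.
fn run_len(s: &str, pred: fn(char) -> bool) -> usize {
    s.chars().take_while(|&c| pred(c)).map(char::len_utf8).sum()
}

/// Matches `'x'` where x is one character other than `'`, `\` or newline.
fn char_literal_len(s: &str) -> Option<usize> {
    let mut it = s.chars();
    if it.next()? != '\'' { return None; }
    let c = it.next()?;
    if c == '\'' || c == '\\' || c == '\n' { return None; }
    if it.next()? != '\'' { return None; }
    Some(2 + c.len_utf8())
}

/// Matches `'` followed by an identifier, with no lookahead restriction.
fn lifetime_len(s: &str) -> Option<usize> {
    let rest = s.strip_prefix('\'')?;
    if !is_ident_start(rest.chars().next()?) { return None; }
    Some(1 + run_len(rest, is_ident_continue))
}

fn ident_len(s: &str) -> Option<usize> {
    if !is_ident_start(s.chars().next()?) { return None; }
    Some(run_len(s, is_ident_continue))
}

fn integer_len(s: &str) -> Option<usize> {
    let n = run_len(s, |c| c.is_ascii_digit());
    if n == 0 { None } else { Some(n) }
}

/// Tries every rule at the start of `s` and keeps the longest match.
fn longest_match(s: &str) -> Option<(Kind, usize)> {
    let candidates = [
        (Kind::CharLiteral, char_literal_len(s)),
        (Kind::Lifetime, lifetime_len(s)),
        (Kind::Ident, ident_len(s)),
        (Kind::Integer, integer_len(s)),
    ];
    let mut best: Option<(Kind, usize)> = None;
    for (kind, len) in candidates {
        if let Some(len) = len {
            if best.map_or(true, |(_, b)| len > b) {
                best = Some((kind, len));
            }
        }
    }
    best
}

/// Repeatedly applies `longest_match`; returns None if no rule matches.
fn tokenize(mut s: &str) -> Option<Vec<Kind>> {
    let mut out = Vec::new();
    while !s.is_empty() {
        let (kind, len) = longest_match(s)?;
        out.push(kind);
        s = &s[len..];
    }
    Some(out)
}
```

Under these assumptions, `tokenize("'a'a")` yields a character literal followed by an identifier, and `tokenize("'a'1")` yields a character literal followed by an integer. Note that `tokenize("'ab'c")` yields two lifetime tokens in this sketch, so a longest-match rule on its own would not explain why that form is rejected.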
Update books
## rust-lang/reference
3 commits in 3417f866932cb1c09c6be0f31d2a02ee01b4b95d..5afb503a4c1ea3c84370f8f4c08a1cddd1cdf6ad
2024-03-06 21:29:54 UTC to 2024-02-28 04:06:45 UTC
- Input format (rust-lang/reference#1459)
- Lexer: say that lifetime-like tokens can't be immediately followed by ' (rust-lang/reference#1479)
- Patterns and enums (rust-lang/reference#1460)
## rust-lang/rust-by-example
2 commits in 57f1e708f5d5850562bc385aaf610e6af14d6ec8..e093099709456e6fd74fecd2505fdf49a2471c10
2024-03-08 23:30:57 UTC to 2024-02-26 21:10:20 UTC
- While-Let Unable to compile code example on page (rust-lang/rust-by-example#1819)
- Update new_types.md wording (rust-lang/rust-by-example#1823)
## rust-lang/rustc-dev-guide
14 commits in 7b0ef5b..8a5d647
2024-03-11 10:37:18 UTC to 2024-02-29 09:46:28 UTC
- update rustc-driver-interacting-with-the-ast.md (rust-lang/rustc-dev-guide#1930)
- Update rustc-driver-getting-diagnostics.md (rust-lang/rustc-dev-guide#1931)
- Document that test names cannot contain dots (rust-lang/rustc-dev-guide#1927)
- Update overview.md (rust-lang/rustc-dev-guide#1898)
- actually need to fix two occurances (rust-lang/rustc-dev-guide#1925)
- fix broken links (rust-lang/rustc-dev-guide#1924)
- next-solver: document caching (rust-lang/rustc-dev-guide#1923)
- Add compiletest docs for FileCheck prefixes and `//@ filecheck-flags:` (rust-lang/rustc-dev-guide#1914)
- Use different type in an example (rust-lang/rustc-dev-guide#1908)
- Update run-make test description (rust-lang/rustc-dev-guide#1920)
- Add some more details on feature gating (rust-lang/rustc-dev-guide#1891)
- make shell.nix better (rust-lang/rustc-dev-guide#1858)
- opaque types in new solver (rust-lang/rustc-dev-guide#1918)
- add implied bounds doc (rust-lang/rustc-dev-guide#1915)
Rollup merge of rust-lang#122339 - rustbot:docs-update, r=ehuss: Update books
Forms like `'ab'c` are rejected, so we need some way to explain why they don't tokenise as two consecutive LIFETIME_OR_LABEL tokens.
I think the best way to do this, given the Reference's current approach, is simply to add "not immediately followed by `'`" to the lexer rules for the lifetime-like tokens.
That matches what the implementation (`lifetime_or_char()`) is doing, so it's not likely to be wrong, and this chapter already has some cases of lookahead of this sort.
It also means there can be no ambiguity between CHAR_LITERAL and these tokens (I think the intent is that we have a traditional "longest matching token wins" rule, which would give the right result here, but that isn't explicitly stated anywhere).
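As an illustration of the "not immediately followed by `'`" restriction, the following is a minimal sketch of a matcher for a lifetime-like token with one character of lookahead. It is not the code of `lifetime_or_char()`: the function name `lifetime_like_len` is hypothetical, and identifiers are simplified to ASCII letters, digits, and underscores.

```rust
/// Hypothetical helper: returns the byte length of a lifetime-like token at
/// the start of `input`, or None if the rule does not match.
fn lifetime_like_len(input: &str) -> Option<usize> {
    let mut chars = input.char_indices();

    // The token must start with a single quote.
    match chars.next() {
        Some((_, '\'')) => {}
        _ => return None,
    }

    // Next must come an identifier start (ASCII-only simplification).
    match chars.next() {
        Some((_, c)) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return None,
    }

    // Consume identifier-continue characters, remembering the first
    // character that does not belong to the identifier (the lookahead).
    let mut end = input.len();
    let mut lookahead = None;
    for (i, c) in chars {
        if !(c.is_ascii_alphanumeric() || c == '_') {
            end = i;
            lookahead = Some(c);
            break;
        }
    }

    // The added restriction: the token may not be immediately followed by `'`.
    if lookahead == Some('\'') {
        None
    } else {
        Some(end)
    }
}
```

With this sketch, `lifetime_like_len("'ab'c")` returns `None`, so the input cannot be split into two lifetime-like tokens, and `lifetime_like_len("'a'")` also returns `None`, which leaves `'a'` to the character literal rule. An input such as `'ab c` still matches, with length 3.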