Split up diagnostics in uncommon_codepoints (potentially splitting up the lint as well) #120228
Comments
Hello, I'm interested in taking a go at this. Could anyone mentor me on this?
@rustbot claim
I'm too busy in the coming weeks to fully mentor but I can answer questions. Please make a thread in the diagnostics channel on rust-lang.zulipchat.org and ask questions there, ccing me ("Manish Goregaokar").
I would start by implementing the checks for 2-5 using the existing APIs. The relevant code is here:
You'll want that to emit a different lint message based on context. The lint messages are pulled from a diagnostics type (https://github.com/rust-lang/rust/blob/master/compiler/rustc_lint/src/lints.rs#L1111), which links to rust/compiler/rustc_lint/messages.ftl (line 243 in 021861a).
I think the first change to make would actually be to make this diagnostic type contain a vector of characters, which it prints out as a list. Once we have that done, we should add more versions of it that have different messages, for Technical, Exclusion, etc.
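As a rough illustration of that first step, here is a minimal standalone sketch of a diagnostic that carries a vector of characters and prints them as a list. The `UncommonCodepoints` struct and its `Display` impl are hypothetical stand-ins, not the real rustc diagnostic type (which goes through the Fluent message machinery), and the plural wording is an assumption.

```rust
use std::fmt;

/// Hypothetical stand-in for the lint's diagnostic type; NOT the real rustc struct.
struct UncommonCodepoints {
    /// Every offending character found in the identifier.
    codepoints: Vec<char>,
}

impl fmt::Display for UncommonCodepoints {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Choose singular or plural wording (the plural form is an assumed phrasing).
        let (article, noun) = if self.codepoints.len() == 1 {
            ("an ", "codepoint")
        } else {
            ("", "codepoints")
        };
        // Quote each character and join them into a comma-separated list.
        let list = self
            .codepoints
            .iter()
            .map(|c| format!("'{c}'"))
            .collect::<Vec<_>>()
            .join(", ");
        write!(f, "identifier contains {article}uncommon Unicode {noun}: {list}")
    }
}
```

Calling `to_string()` on a value holding a single character yields a message in the same shape as the single-codepoint diagnostic, while several characters are listed together in one message rather than reported one by one.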
Rollup merge of rust-lang#120259 - HTGAzureX1212:HTGAzureX1212/split-diagnostics-uncommon-codepoints, r=Manishearth

Split Diagnostics for Uncommon Codepoints: Add List to Display Characters Involved

This Pull Request adds a list of the uncommon codepoints involved in the `uncommon_codepoints` lint, as outlined as a first step in rust-lang#120228.

Example rendered diagnostic:

```
error: identifier contains an uncommon Unicode codepoint: 'µ'
  --> $DIR/lint-uncommon-codepoints.rs:3:7
   |
LL | const µ: f64 = 0.000001;
   |       ^
   |
note: the lint level is defined here
  --> $DIR/lint-uncommon-codepoints.rs:1:9
   |
LL | #![deny(uncommon_codepoints)]
   |         ^^^^^^^^^^^^^^^^^^^
```

(Retrying rust-lang#120258.)
Rollup merge of rust-lang#120840 - HTGAzureX1212:HTGAzureX1212/unicode-identifier-types, r=fmease,Manishearth

Split Diagnostics for Uncommon Codepoints: Add Individual Identifier Types

This pull request further modifies the `uncommon_codepoints` lint, adding the individual identifier types of `Technical`, `Not_NFKC`, `Exclusion` and `Limited_Use` to the diagnostic message.

Example rendered diagnostic:

```
error: identifier contains a Unicode codepoint that is not used in normalized strings: 'ĳ'
  --> $DIR/lint-uncommon-codepoints.rs:6:4
   |
LL | fn dĳkstra() {}
   |    ^^^^^^^
   = note: this character is included in the Not_NFKC Unicode general security profile
```

Second step of rust-lang#120228.
Currently we have the `uncommon_codepoints` lint, which lints on anything which is `Identifier_Status=Restricted`.

It may be worth improving the diagnostics there by splitting it into multiple different specialized diagnostics. In the long run, some of these might be something that should be promoted to a separate lint so that they can individually be allowed.
The diagnostics I can think of are:
The first one can be implemented by taking the set of Rust syntax characters, expanding that to their confusables set, and then winnowing it down to the set of characters that is allowed in an identifier. This could belong in a separate check in the unicode-security crate.
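The three steps described above (start from syntax characters, expand to confusables, keep only what can appear in an identifier) can be sketched as follows. Everything here is illustrative: `SAMPLE_CONFUSABLES` is a tiny hand-picked sample rather than the full Unicode confusables data, `syntax_lookalikes` is a hypothetical helper, and `roughly_ident_char` is only a crude approximation of the XID rules that Rust identifiers actually follow.

```rust
use std::collections::BTreeSet;

/// Tiny illustrative sample of (syntax character, look-alike) pairs.
/// A real check would use the full Unicode confusables data instead.
const SAMPLE_CONFUSABLES: &[(char, char)] = &[
    (';', '\u{037E}'), // GREEK QUESTION MARK
    ('/', '\u{2215}'), // DIVISION SLASH
    ('!', '\u{01C3}'), // LATIN LETTER RETROFLEX CLICK
    (':', '\u{02D0}'), // MODIFIER LETTER TRIANGULAR COLON
];

/// Hypothetical helper: expand the given syntax characters to their
/// look-alikes, then keep only those that may appear in an identifier.
fn syntax_lookalikes(
    syntax: &[char],
    can_appear_in_ident: impl Fn(char) -> bool,
) -> BTreeSet<char> {
    SAMPLE_CONFUSABLES
        .iter()
        // Step 1: restrict to the syntax characters we care about.
        .filter(|(target, _)| syntax.contains(target))
        // Step 2: expand each one to its confusable counterpart.
        .map(|&(_, lookalike)| lookalike)
        // Step 3: winnow down to characters allowed in identifiers.
        .filter(|&c| can_appear_in_ident(c))
        .collect()
}

/// Crude approximation of "allowed in an identifier" (assumption:
/// the real rule is XID_Start / XID_Continue, not `is_alphanumeric`).
fn roughly_ident_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}
```

With this sample table, `syntax_lookalikes(&[';', '/', '!', ':'], roughly_ident_char)` keeps the two letter-like look-alikes (U+01C3 and U+02D0) and drops the punctuation and symbol ones, which could never appear in an identifier anyway.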
The others can be implemented by checking the `identifier_type()` of characters in the ident.

I might be able to mentor this, I can provide diagnostic text for these when needed.
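A minimal sketch of the per-character check, under stated assumptions: `IdentType` is a hypothetical local enum mirroring the identifier types named in this thread, not the type from the unicode-security crate, and the classifier is passed in as a closure so that no real Unicode data is invented here. The note wording follows the `Not_NFKC` example from the pull request above; applying the same wording to the other types is an assumption.

```rust
/// Hypothetical mirror of the identifier types discussed in this issue.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum IdentType {
    Technical,
    NotNfkc,
    Exclusion,
    LimitedUse,
    /// Anything that should not trigger a specialized diagnostic.
    Other,
}

impl IdentType {
    /// Profile name to show in the note, or `None` if nothing should be reported.
    fn profile_name(self) -> Option<&'static str> {
        match self {
            IdentType::Technical => Some("Technical"),
            IdentType::NotNfkc => Some("Not_NFKC"),
            IdentType::Exclusion => Some("Exclusion"),
            IdentType::LimitedUse => Some("Limited_Use"),
            IdentType::Other => None,
        }
    }
}

/// Walk the identifier and build one note per flagged character.
/// `classify` stands in for a real `identifier_type()` lookup.
fn notes_for(ident: &str, classify: impl Fn(char) -> IdentType) -> Vec<String> {
    ident
        .chars()
        .filter_map(|c| {
            classify(c).profile_name().map(|name| {
                format!(
                    "'{c}': this character is included in the {name} Unicode general security profile"
                )
            })
        })
        .collect()
}
```

Keeping the classification separate from the message selection means each identifier type can later get its own diagnostic text, or its own lint, without touching the loop over the identifier.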