Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update string rules when inside a forward reference #10586

Open
dhruvmanila opened this issue Mar 25, 2024 · 5 comments
Open

Update string rules when inside a forward reference #10586

dhruvmanila opened this issue Mar 25, 2024 · 5 comments
Labels
bug Something isn't working help wanted Contributions especially welcome

Comments

@dhruvmanila
Copy link
Member

dhruvmanila commented Mar 25, 2024

This is happening because the checker is analyzing the string inside the annotation (three strings inside Literal). The ranges aren't correct in that case. We need to find a holistic solution to this because it's happening to all rules working on strings.

For example, here UP025 is triggered for the u'\\n' which is inside the string type annotation.

/Users/dhruv/playground/ruff/src/UP025.py:1:12: UP025 [*] Remove unicode literals from strings
  |
1 | NEWLINE_SEQUENCE: "te.Literal[u'\\n', '\\r\\n', '\\r']" = "\n"
  |            ^^^^^ UP025
  |
  = help: Remove unicode prefix

ℹ Safe fix
1   |-NEWLINE_SEQUENCE: "te.Literal[u'\\n', '\\r\\n', '\\r']" = "\n"
  1 |+NEWLINE_SEQENCE: "te.Literal[u'\\n', '\\r\\n', '\\r']" = "\n"

Found 1 error.
[*] 1 fixable with the --fix option.

We could potentially fix the range for the rules to be useful in the context of type annotation or ignore them completely. Nevertheless, we should completely avoid rules which might update the quotes because otherwise we need the context of the quote which contains the actual type annotation.

Originally posted by @dhruvmanila in #10546 (comment)

@dhruvmanila dhruvmanila added bug Something isn't working linter Related to the linter labels Mar 25, 2024
@dhruvmanila
Copy link
Member Author

dhruvmanila commented Apr 10, 2024

Hey @RaulWCosta, sorry for the late reply but yeah it is available.

I think the first step would be to fix the range. I was able to find out the root cause of the problem and it's here:

// Otherwise, consider this a "complex" annotation.
let mut expr = parse_expression(value)?;
relocate_expr(&mut expr, range);
Ok((expr, AnnotationKind::Complex))

The example mentioned in the PR description contains escaped characters which makes this a complex annotation. This means that parse_type_annotation takes the else path which parses it as an expression and then relocates the parsed expression and any of it's child expressions to the same range. This is where the bug is. The relocation logic isn't correct as it doesn't relocate any of the sub-expressions relative to the root expression. cc @charliermarsh I think there was some discussion around this logic.

For example,

import typing as t

NEWLINE_SEQUENCE: "t.Literal[u'\\n', '\\r\\n', '\\r']" = "\n"

Along with the following patch to print the AST:

diff --git a/crates/ruff_python_parser/src/typing.rs b/crates/ruff_python_parser/src/typing.rs
index 18efe4041..09507977c 100644
--- a/crates/ruff_python_parser/src/typing.rs
+++ b/crates/ruff_python_parser/src/typing.rs
@@ -38,7 +38,9 @@ pub fn parse_type_annotation(
     } else {
         // Otherwise, consider this a "complex" annotation.
         let mut expr = parse_expression(value)?;
+        println!("{:#?}", expr);
         relocate_expr(&mut expr, range);
+        println!("{:#?}", expr);
         Ok((expr, AnnotationKind::Complex))
     }
 }

We can see that the relocated ranges aren't correct as they're set to the same range regardless of the position of the node:

diff --git a/a.txt b/b.txt
index bb13b780d..26b45c862 100644
--- a/a.txt
+++ b/b.txt
@@ -1,12 +1,13 @@
    ⋮  1 │+
  1 ⋮  2 │ Subscript(
  2 ⋮  3 │     ExprSubscript {
  3 ⋮    │-        range: 0..30,
    ⋮  4 │+        range: 38..74,
  4 ⋮  5 │         value: Attribute(
  5 ⋮  6 │             ExprAttribute {
  6 ⋮    │-                range: 0..9,
    ⋮  7 │+                range: 38..74,
  7 ⋮  8 │                 value: Name(
  8 ⋮  9 │                     ExprName {
  9 ⋮    │-                        range: 0..1,
    ⋮ 10 │+                        range: 38..74,
 10 ⋮ 11 │                         id: "t",
 11 ⋮ 12 │                         ctx: Load,
 12 ⋮ 13 │                     },
@@ -20,11 +21,11 @@ Subscript(
 20 ⋮ 21 │         ),
 21 ⋮ 22 │         slice: Tuple(
 22 ⋮ 23 │             ExprTuple {
 23 ⋮    │-                range: 10..29,
    ⋮ 24 │+                range: 38..74,
 24 ⋮ 25 │                 elts: [
 25 ⋮ 26 │                     StringLiteral(
 26 ⋮ 27 │                         ExprStringLiteral {
 27 ⋮    │-                            range: 10..15,
    ⋮ 28 │+                            range: 38..74,
 28 ⋮ 29 │                             value: StringLiteralValue {
 29 ⋮ 30 │                                 inner: Single(
 30 ⋮ 31 │                                     StringLiteral {
@@ -42,7 +43,7 @@ Subscript(
 42 ⋮ 43 │                     ),
 43 ⋮ 44 │                     StringLiteral(
 44 ⋮ 45 │                         ExprStringLiteral {
 45 ⋮    │-                            range: 17..23,
    ⋮ 46 │+                            range: 38..74,
 46 ⋮ 47 │                             value: StringLiteralValue {
 47 ⋮ 48 │                                 inner: Single(
 48 ⋮ 49 │                                     StringLiteral {
@@ -60,7 +61,7 @@ Subscript(
 60 ⋮ 61 │                     ),
 61 ⋮ 62 │                     StringLiteral(
 62 ⋮ 63 │                         ExprStringLiteral {
 63 ⋮    │-                            range: 25..29,
    ⋮ 64 │+                            range: 38..74,
 64 ⋮ 65 │                             value: StringLiteralValue {
 65 ⋮ 66 │                                 inner: Single(
 66 ⋮ 67 │                                     StringLiteral {

The next step would be to go through the rules which work on strings (search for StringLiteral, BytesLiteral, FString, StringLike) and decide on the following:

  1. Is it ok to highlight this violation? If not, skip this rule through checker.semantic().in_type_definition() check. For example,
    pub(crate) fn runtime_string_union(checker: &mut Checker, expr: &Expr) {
    if !checker.semantic().in_type_definition() {
    return;
    }
  2. Is it safe to fix this violation? If not, avoid creating the fix for it. For example,
    /// UP007
    pub(crate) fn use_pep604_annotation(
    checker: &mut Checker,
    expr: &Expr,
    slice: &Expr,
    operator: Pep604Operator,
    ) {
    // Avoid fixing forward references, types not in an annotation, and expressions that would
    // lead to invalid syntax.
    let fixable = checker.semantic().in_type_definition()
    && !checker.semantic().in_complex_string_type_definition()
    && is_allowed_value(slice);

That said, I don't think any of the violations that diagnosis string quotes should be highlighted or fixed as it requires context of the surrounding quote of the annotation itself.

@dhruvmanila
Copy link
Member Author

Thanks for the update! I'll assign the issue to you but do not feel any pressure to complete it :)

@RaulWCosta RaulWCosta removed their assignment Apr 26, 2024
@RaulWCosta
Copy link

Hey @dhruvmanila, thank you for the clear explanation, but I believe I won't have time to do the fix

@dhruvmanila
Copy link
Member Author

@RaulWCosta Hey, no worries, thanks for informing :)

@dhruvmanila
Copy link
Member Author

We'd appreciate any contributions here. Refer to the issue description to understand what's happening and this comment on the root cause analysis and pointers on how to resolve it. Feel free to ping me for any questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Contributions especially welcome
Projects
None yet
Development

No branches or pull requests

2 participants