Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[perflint] fix invalid hoist in perf401 #14369

Merged
merged 22 commits into from
Dec 12, 2024

Conversation

w0nder1ng
Copy link
Contributor

@w0nder1ng w0nder1ng commented Nov 15, 2024

This should fix #14362. This new fix currently deletes lines like this:

- tmp = 1; result = []
- for i in range(10):
-   result.append(i+1)
+ result = [i+1 for i in range(10)]

Is there a convenient way to get every statement within a given TextRange to detect when this is happening?

Copy link
Contributor

github-actions bot commented Nov 15, 2024

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check detected linter changes. (+18 -20 violations, +0 -0 fixes in 5 projects; 50 projects unchanged)

apache/airflow (+6 -6 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3074:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3074:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use a list comprehension to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use `list.extend` to create a transformed list
+ docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use a list comprehension to create a transformed list
- providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use `list.extend` with an async comprehension to create a transformed list
+ providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use an async list comprehension to create a transformed list
- scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use `list.extend` to create a transformed list
+ scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use a list comprehension to create a transformed list

apache/superset (+8 -8 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- superset/db_engine_specs/base.py:107:9: PERF401 Use `list.extend` to create a transformed list
+ superset/db_engine_specs/base.py:107:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:245:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:245:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:251:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:251:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:261:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:261:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:281:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:281:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:293:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:293:9: PERF401 Use a list comprehension to create a transformed list
+ superset/tasks/cache.py:208:13: PERF401 Use `list.extend` to create a transformed list
- superset/tasks/cache.py:208:13: PERF401 Use a list comprehension to create a transformed list
+ tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use `list.extend` to create a transformed list
- tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use a list comprehension to create a transformed list

bokeh/bokeh (+1 -3 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- src/bokeh/plotting/_figure.py:479:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/plotting/_figure.py:485:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/server/contexts.py:310:17: PERF401 Use `list.extend` to create a transformed list
+ src/bokeh/server/contexts.py:310:17: PERF401 Use a list comprehension to create a transformed list

latchbio/latch (+3 -3 violations, +0 -0 fixes)

- src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use `list.extend` to create a transformed list
+ src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use a list comprehension to create a transformed list
- src/latch/registry/utils.py:70:13: PERF401 Use `list.extend` to create a transformed list
+ src/latch/registry/utils.py:70:13: PERF401 Use a list comprehension to create a transformed list
- src/latch_cli/services/cp/utils.py:54:9: PERF401 Use `list.extend` to create a transformed list
+ src/latch_cli/services/cp/utils.py:54:9: PERF401 Use a list comprehension to create a transformed list

pandas-dev/pandas (+0 -0 violations, +0 -0 fixes)


Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
PERF401 38 18 20 0 0

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+18 -20 violations, +0 -0 fixes in 4 projects; 51 projects unchanged)

apache/airflow (+6 -6 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3074:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3074:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use a list comprehension to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use `list.extend` to create a transformed list
+ docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use a list comprehension to create a transformed list
- providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use `list.extend` with an async comprehension to create a transformed list
+ providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use an async list comprehension to create a transformed list
- scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use `list.extend` to create a transformed list
+ scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use a list comprehension to create a transformed list

apache/superset (+8 -8 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- superset/db_engine_specs/base.py:107:9: PERF401 Use `list.extend` to create a transformed list
+ superset/db_engine_specs/base.py:107:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:245:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:245:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:251:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:251:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:261:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:261:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:281:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:281:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:293:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:293:9: PERF401 Use a list comprehension to create a transformed list
+ superset/tasks/cache.py:208:13: PERF401 Use `list.extend` to create a transformed list
- superset/tasks/cache.py:208:13: PERF401 Use a list comprehension to create a transformed list
+ tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use `list.extend` to create a transformed list
- tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use a list comprehension to create a transformed list

bokeh/bokeh (+1 -3 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- src/bokeh/plotting/_figure.py:479:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/plotting/_figure.py:485:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/server/contexts.py:310:17: PERF401 Use `list.extend` to create a transformed list
+ src/bokeh/server/contexts.py:310:17: PERF401 Use a list comprehension to create a transformed list

latchbio/latch (+3 -3 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use `list.extend` to create a transformed list
+ src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use a list comprehension to create a transformed list
- src/latch/registry/utils.py:70:13: PERF401 Use `list.extend` to create a transformed list
+ src/latch/registry/utils.py:70:13: PERF401 Use a list comprehension to create a transformed list
- src/latch_cli/services/cp/utils.py:54:9: PERF401 Use `list.extend` to create a transformed list
+ src/latch_cli/services/cp/utils.py:54:9: PERF401 Use a list comprehension to create a transformed list

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
PERF401 38 18 20 0 0

@charliermarsh charliermarsh added bug Something isn't working fixes Related to suggested fixes for violations labels Nov 16, 2024
@Skylion007
Copy link
Contributor

Skylion007 commented Nov 17, 2024

@w0nder1ng Another fun edge case that PR #14369 doesn't currently fix. Scoping rules are different between list comprehensions and for loops:

another example from pytorch

     if kwargs is None:
         kwargs = {}
-    impl_args = []
-    for a in args:
-        impl_args.append(_helper(a, map_fn))
+    impl_args = [_helper(a, map_fn) for a in args]
     impl_kwargs = {}
     for k in kwargs.keys():
         impl_kwargs[k] = _helper(a, map_fn)

the a on the line impl_kwargs[k] = _helper(a, map_fn) is now undefined (and will error with F821) since the list comprehension was converted from a loop. RUFF immeaditely flags this with an F821 error, and I test and confirm the temporary variable a does not leave the listcomp scope, while it does leave the for loop one.

@w0nder1ng If you want a good test bed, just run the autofix on PyTorch (or another large codebase) and see after applying the fixes if any other ruff rule violations are immediately detected.

@Skylion007
Copy link
Contributor

Skylion007 commented Nov 17, 2024

Also another really minor nit is that it can duplicate comments.

-        for dtype in ["f16", "bf16"]:
-            kernels.append(
-                cls(
-                    aligned=True,
-                    dtype=dtype,
-                    sm_range=(80, SM[SM.index(80) + 1]),
-                    apply_dropout=False,
-                    preload_mmas=True,
-                    block_i=128,
-                    block_j=64,
-                    max_k=96,
-                    # Sm80 has a faster kernel for this case
-                    dispatch_cond="cc == 86 || cc == 89",
-                )
+        # Sm80 has a faster kernel for this case
+        kernels.extend(
+            cls(
+                aligned=True,
+                dtype=dtype,
+                sm_range=(80, SM[SM.index(80) + 1]),
+                apply_dropout=False,
+                preload_mmas=True,
+                block_i=128,
+                block_j=64,
+                max_k=96,
+                # Sm80 has a faster kernel for this case
+                dispatch_cond="cc == 86 || cc == 89",
             )
+            for dtype in ["f16", "bf16"]
+        )

See here.

@w0nder1ng
Copy link
Contributor Author

Another fun edge case that PR #14369 doesn't currently fix

I think in this case, the lint shouldn't apply at all. If applying the fix breaks the code, then someone manually doing the same thing also wouldn't work. It should probably check that all references to the loop variable are inside the for loop before reporting the lint.

I'll also try to fix the duplicate comments. I'm starting to see why this didn't have a fix before :)

@w0nder1ng
Copy link
Contributor Author

The comment duplication happened when the comment was inside the append body, it should be fixed now.

// ```
let for_loop_target = checker
.semantic()
.lookup_symbol(id.as_str())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, resolve_name returns None for the for-loop target. Also, references to it outside of the for-loop are not included in its list of references; e.g.

def f():
    result = []
    for val in range(5):
        result.append(val * 2)
    print(val) # this reference is not included in the list of references

Copy link
Member

@MichaReiser MichaReiser Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem here is that we build the semantic model lazily as we traverse the AST for checking and it hasn't reached the point after the for loop yet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charliermarsh any suggestions on how to best find all usages of target?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a working version would look something like this:

let for_loop_read = checker.semantic().resolve_load(for_loop_target_name_expr);
let for_loop_binding = match for_loop_read {
    ReadResult::Resolved(id) => checker.semantic().binding(id),
    _ => unreachable!("for loop target must exist"),
};

Unfortunately, resolve_load requires an &mut SemanticModel and I don't see a way to get one from the checker. Am I missing something here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't look deeply, but often in those cases you need to make it a deferred rule, so it runs after the semantic model has been built (e.g., run it in deferred_for_loops).

@MichaReiser
Copy link
Member

Could you take a look why https://github.com/apache/superset/blob/e528cb48c44543c14c1ac9a93528b147bcaecfde/scripts/benchmark_migration.py#L128 is no longer reported. I might have missed something obvious but it isn't clear to me why the diagnostic isn't reported anymore.

@w0nder1ng
Copy link
Contributor Author

w0nder1ng commented Nov 18, 2024

I suspect the regressions are because of the resolve_name currently being a lookup_symbol. In the example you gave, I think it's finding a different foreign_key binding on line 113 and concluding that the symbol is used outside of the for loop. Once the binding has the right scope, the only regressions should be ones where the lint shouldn't have applied.

@Skylion007
Copy link
Contributor

Could you take a look why https://github.com/apache/superset/blob/e528cb48c44543c14c1ac9a93528b147bcaecfde/scripts/benchmark_migration.py#L128 is no longer reported. I might have missed something obvious but it isn't clear to me why the diagnostic isn't reported anymore.

Because the lambda arg here https://github.com/apache/superset/blob/e528cb48c44543c14c1ac9a93528b147bcaecfde/scripts/benchmark_migration.py#L131C28-L131C33 is shadowing the variable "model" in the for loop. In this specific case, it's probably okay since it's in a lambda arg, but in general it shouldn't apply the fix there. I suppose only references rvalues are problematic outside of the forloop, not lvalues (if all they do is immediately get overwritten after all).

@Skylion007
Copy link
Contributor

Okay hopefully last nit:

     reduction_axes: List[int] = []
-    for i in range(input_rank):
-        if i != axis:
-            reduction_axes.append(i)
+    reduction_axes.extend(i for i in range(input_rank) if i != axis)

Doesn't seem to like to hoist if there are any type_hints on the original [] instantiation. Is this intentional?

@w0nder1ng
Copy link
Contributor Author

I'll take a look

@w0nder1ng
Copy link
Contributor Author

w0nder1ng commented Nov 18, 2024

Annotated assigns have a different statement type than normal assigns, and I wasn't handling it. The fix should work on type-annotated lists now.

@Skylion007
Copy link
Contributor

@w0nder1ng Another fun edge case that PR #14369 doesn't currently fix. Scoping rules are different between list comprehensions and for loops:

another example from pytorch

     if kwargs is None:
         kwargs = {}
-    impl_args = []
-    for a in args:
-        impl_args.append(_helper(a, map_fn))
+    impl_args = [_helper(a, map_fn) for a in args]
     impl_kwargs = {}
     for k in kwargs.keys():
         impl_kwargs[k] = _helper(a, map_fn)

the a on the line impl_kwargs[k] = _helper(a, map_fn) is now undefined (and will error with F821) since the list comprehension was converted from a loop. RUFF immeaditely flags this with an F821 error, and I test and confirm the temporary variable a does not leave the listcomp scope, while it does leave the for loop one.

@w0nder1ng If you want a good test bed, just run the autofix on PyTorch (or another large codebase) and see after applying the fixes if any other ruff rule violations are immediately detected.

This one is still occurring sadly. Maybe because it references another loop lol?

@Skylion007
Copy link
Contributor

Skylion007 commented Nov 19, 2024

Despite this minor false positive, looks like all the other fixes worked well on the PyTorch codebase (for the torch/torchgen folders) pytorch/pytorch#140980. 👍

@w0nder1ng
Copy link
Contributor Author

I'm still stuck on two things.

  1. How can I determine whether the binding statement has another statement on the line?
result = []; test = [] # deleting this whole line is wrong, but only deleting the binding statement also is wrong
# since it would leave the comment if there were no other statement
for ...
  1. How can I get the semantic model to find references after the for loop?
result = []
for i in range(10):
  result.append(i+1)
print(i) # "fixing" this for loop is invalid because something relies on the loop variable, but the semantic model doesn't see this reference yet

@Skylion007
Copy link
Contributor

Skylion007 commented Nov 25, 2024

@diceroll123 @MichaReiser Any idea how to tackle these edge cases?

@MichaReiser
Copy link
Member

@Skylion007 no, not from the top of my head and I haven't had time yet to look into it more closely

@diceroll123
Copy link
Contributor

diceroll123 commented Nov 25, 2024

I did not look into it any further after my last comment on #14551.

This PR and my PR contain a bunch of overlapping changes (while addressing different bugs), and I don't want to step on any toes or spend more time on it if it's not going to be considered, so if one or the other is merged, or some kind of plan is made, we can go from there I suppose! No strong feelings on my part. 😄

@Skylion007
Copy link
Contributor

One other place this autofix breaks (but ruff does appropriately catch it!). Apply it to sympy/tensor/index_methods.py in sympy. Fixing this loop breaks: https://github.com/sympy/sympy/blob/ba68537e73dd40fb34fa9c5aaf46c5a6e1ef12c7/sympy/tensor/index_methods.py#L427 . I suspect because it's using unusual python syntax to iterate over an implicit tuple.

for d in dbase, dexp:

transforming that to a list comprehension likely breaks

I will say that even it's current form, this autofix is extremely useful, and the encountered edge cases are very rare in real code, and usually caught by other checks.

@Skylion007
Copy link
Contributor

This unaccepted rule would likely have to tackle many of the same problems. Did they solve this edge case? #11769

@w0nder1ng
Copy link
Contributor Author

I think that's all of the edge cases, except for the one involving the loop variable. I'd be happy to try to convert the rule into a binding rule, but I think that might be outside of the scope of this PR.

@w0nder1ng
Copy link
Contributor Author

Even after moving the lint into deferred_for_loops, it still doesn't properly find the loop variable:

def f():
    result = []
    for i in range(5):
    #   ^ running checker.semantic().resolve_name() on this ExprName returns None.
        result.append(i * 2)
        #             ^ when I run binding.statement() on this binding, 
        #               it returns the "i = 1" statement, even though that is clearly wrong
    i = 1
  # ^ lookup_symbol returns this i
    print(i) 

Am I missing something here, or is this behavior incorrect? See the last commit for the code I used to get this result.

@dhruvmanila
Copy link
Member

    for i in range(5):
    #   ^ running checker.semantic().resolve_name() on this ExprName returns None.

That might be because i is itself the definition. What I see in the semantic model is that an entry in the resolved_names is created only by the resolve_load function which would work on the ExprName with the ExprContext::Load context. But, here the i variable that is the for target would have the ExprContext::Store context (https://play.ruff.rs/44c3909e-4f46-4a46-bd6e-36ea78223322) which is why I think you're getting None there.

let binding_string = checker
.locator()
.slice(for_loop_target.statement(checker.semantic()).unwrap());
dbg!(binding_string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dbg!(binding_string);

@MichaReiser
Copy link
Member

MichaReiser commented Dec 9, 2024

^ lookup_symbol returns this i

This is because the semantic model now captures the state after running the entire program. It's complete, and running lookup_symbol gives you the most recent binding for i after evaluating the entire function scope. That means you now have to do some extra work to get the for-loop target binding:

    let last_target_binding = checker
        .semantic()
        .lookup_symbol(id)
        .expect("For loop to exist");

    let mut bindings = [last_target_binding].into_iter().chain(
        checker
            .semantic()
            .shadowed_bindings(checker.semantic().scope_id, last_target_binding)
            .filter_map(|shadowed| shadowed.same_scope().then_some(shadowed.shadowed_id())),
    );

    let target_binding = bindings
        .find_map(|binding_id| {
            let binding = checker.semantic().binding(binding_id);

            if dbg!(binding.statement(checker.semantic())).and_then(Stmt::as_for_stmt)
                == Some(for_stmt)
            {
                Some(binding)
            } else {
                None
            }
        })
        .expect("for target binding must exist");

    drop(bindings);

There might be a better way to detect if the binding belongs to the target that e.g. avoids picking up named expressions in the iterator part.

})
.expect("for target binding must exist");
let target_binding = checker.semantic().binding(target_binding_id);
// TODO: should this be a HashMap?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this a vector, but if there's enough references, it might make more sense for it to be a HashMap instead. What do you think?

@MichaReiser
Copy link
Member

MichaReiser commented Dec 10, 2024

It seems we're now skipping over the two result.append usages in https://github.com/bokeh/bokeh/blob/829b2a75c402d0d0abd7e37ff201fbdfd949d857/src/bokeh/plotting/_figure.py#L485

Is this intentional? Should we change the implementation not to provide a fix but still the usage?

Edit: I think this is intentional because kw is shadowed.

Comment on lines 259 to 278
let shadowed_references: Vec<_> = checker
.semantic()
.shadowed_bindings(checker.semantic().scope_id, target_binding_id)
.flat_map(|shadowed| {
let shadowed_binding = checker.semantic().binding(shadowed.shadowed_id());
shadowed_binding.references()
})
.collect();

drop(bindings);

if target_binding
.references()
.filter(|r_id| !shadowed_references.contains(r_id))
.map(|reference| checker.semantic().reference(reference))
.any(|r| !for_stmt.range.contains_range(r.range()))
{
return;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this code is doing. Would you mind adding a comment explaining why looking at the shadowed_references is necessary?

@w0nder1ng
Copy link
Contributor Author

It seems we're now skipping over the two result.append usages in https://github.com/bokeh/bokeh/blob/829b2a75c402d0d0abd7e37ff201fbdfd949d857/src/bokeh/plotting/_figure.py#L485

Is this intentional? Should we change the implementation not to provide a fix but still the usage?

Edit: I think this is intentional because kw is shadowed.

It would be shadowed if there weren't a return. A smaller example:

def f(switch):
    i = 1
#   ^ this is the `i` which actually gets returned
    if switch:
        items = [1, 2, 3, 4]
        result = []
        #   v if it weren't for the return, this `i` would be the binding
        for i in items:
            result.append(i + 1)
        # v unconditional return means that the binding is not actually relevant
        return result
    #       v fix thinks this `i` is the loop variable
    return [i]

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uff, this rule is complicated. Thanks for pushing through it

I made it through the detection logic and pushed a few smaller simplifications. I'm now about to review the fix and noticed the parse call. I suggest to either not provide a fix in the multi assignment case (it's not that common) or to use the SimpleTokenizer to avoid using the full parser here. It might even simplify some of your code because it allows e.g. searching for the ;

Comment on lines 462 to 474
let binding_text = locator
.slice(locator.full_lines_range(binding_stmt_range))
.trim_whitespace_start();
let tokens = parse(binding_text, Mode::Module).map_err(|err| {
anyhow!(
"Failed to tokenize binding statement: {err} {}",
binding_text
)
})?;
tokens
.tokens()
.iter()
.any(|token| matches!(token.kind(), TokenKind::Semi))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards not providing a fix in this case to avoid the extra complexity.

Either way, we should avoid parsing here. We can use the SimpleTokenizer if we need tokenization and can't just use the AST

@w0nder1ng
Copy link
Contributor Author

w0nder1ng commented Dec 11, 2024

I've removed the tokenization without changing the rest of the code. The code now iterates over the characters directly because I was having trouble with the SimpleTokenizer not properly tokenizing some lines.

Is it worth also removing the code to handle multiple statements to simplify the logic?

@MichaReiser
Copy link
Member

The code now iterates over the characters directly because I was having trouble with the SimpleTokenizer not properly tokenizing some lines.

I pushed a change that uses the SimpleTokenizer and BackwardsTokenizer instead. That also revealed that we should remove any leading ;.

But I think we're good now and can land this change. Thank you for working on this very involved. rule

@MichaReiser MichaReiser merged commit 2eac00c into astral-sh:main Dec 12, 2024
21 checks passed
@Skylion007
Copy link
Contributor

Beautiful job landing this rule @w0nder1ng . Any plans on tackling the dict comprehension version since the logic is very similar?

@w0nder1ng
Copy link
Contributor Author

Sure, when I have the opportunity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fixes Related to suggested fixes for violations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PERF401 new preview fixes invalidly hoists extend to list compre
6 participants