Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ruff --fix] (v0.0.239) When working with numpy arrays mask == True is not the same as mask is True #2443

Closed
FrancescElies opened this issue Feb 1, 2023 · 14 comments
Labels
bug Something isn't working type-inference Requires more advanced type inference.

Comments

@FrancescElies
Copy link

After running ruff --fix the last line of the following script gets reformatted as np.where(mask is True) but in this context is not the same.

import numpy as np

a = np.arange(10)
mask = a > 5
np.where(mask == True)  # equivalent to `mask.nonzero()

Maybe in that context one should not be using np.where(mask == True) but it's equivalent mask.nonzero() instead, at anyrate we have some old code using that pattern.

See below an interactive session showing is not the same.

In [2]: import numpy as np
In [13]: a = np.arange(10)

In [14]: mask = a > 5

In [15]: np.where(mask == True)
Out[15]: (array([6, 7, 8, 9], dtype=int64),)

In [18]: np.where(mask is True)
Out[18]: (array([], dtype=int64),)

What do you think about this corner case?

@charliermarsh charliermarsh added the bug Something isn't working label Feb 1, 2023
@charliermarsh
Copy link
Member

Ah yeah, this has come up before. Unfortunately it's really hard to avoid without disabling the rule entirely, though I'm open to suggestions on heuristics.

My current suggestion has been to disable that rule if you're working in any context that requires boolean comparisons like that.

@spaceone
Copy link
Contributor

spaceone commented Feb 1, 2023

This could potentially also happen with sqlalchemy I guess.
There you have query.where(foo==bar) queries.

@FrancescElies
Copy link
Author

FrancescElies commented Feb 2, 2023

Ah yeah, this has come up before. Unfortunately it's really hard to avoid without disabling the rule entirely, though I'm open to suggestions on heuristics.

Can you point me out in the direction where this code lives?
Which kind of info do we have in that context, what could we use? regex? do we have maybe the variable type available (like pyright or mypy does)?

It looks like {something}.where({avar} == {True|False}) might help but for sure there will be other cases too, thus ... not sure what to suggest.

My current suggestion has been to disable that rule if you're working in any context that requires boolean comparisons like that.

On the other hand at least in our case we have many people coming from the c-embedded world and writing python in a c-ish way and in general I find it also helpful.

@conbrad
Copy link

conbrad commented Feb 16, 2023

This could potentially also happen with sqlalchemy I guess. There you have query.where(foo==bar) queries.

Verified today this is the case.

@conbrad
Copy link

conbrad commented Feb 16, 2023

Potentially something that could be defined as a Trivia: https://github.com/charliermarsh/ruff/blob/main/crates/ruff_python_formatter/src/trivia.rs#L46

E.g. a FilterFunction.

Not sure if it would make more sense to do the analysis at the AST level though, and check for sqlalchemy or numpy parent nodes.

@charliermarsh
Copy link
Member

We could consider ignoring these expressions inside of all .where calls, regardless of whether it's NumPy, SQLAlchemy, etc. We'd be trading off false negatives for false positives.

@conbrad
Copy link

conbrad commented Mar 14, 2023

We could consider ignoring these expressions inside of all .where calls, regardless of whether it's NumPy, SQLAlchemy, etc. We'd be trading off false negatives for false positives.

I think this would wind up chasing a lot of stragglers. E.g. .filter, .having are just a couple of other predicate functions in SQLAlchemy that expect == over is I believe.

@JonathanPlasse
Copy link
Contributor

Maybe this should wait until we can determine the type of objects.

@zanieb
Copy link
Member

zanieb commented Apr 24, 2023

To share another real world example, I encountered this using ruff in a project that uses SQLAlchemy. Ruff fixed comparisons with None to use is/is_not instead of ==/!= but this broke our SQLAlchemy filters. As a solution, I've switched to use of explicit .is_ and .is_not SQLAlchemy operators when doing comparisons with None which seems like better practice anyway.

PrefectHQ/prefect@1b64587

Our usage seems harder to detect as a special case since we construct the list of filters outside of the call. It only seems feasible to avoid by detecting the type of the object in the comparison.

@conbrad
Copy link

conbrad commented Apr 24, 2023

To share another real world example, I encountered this using ruff in a project that uses SQLAlchemy. Ruff fixed comparisons with None to use is/is_not instead of ==/!= but this broke our SQLAlchemy filters. As a solution, I've switched to use of explicit .is_ and .is_not SQLAlchemy operators when doing comparisons with None which seems like better practice anyway.

Wonder if those operators were influenced by the recent language/tooling co-design. I seem to remember having issues with the is operators not behaving the same as the == ones but that may have just been a PEBKAC issue.

@zanieb
Copy link
Member

zanieb commented Apr 25, 2023

#1852 has some previous discussion as well.

@charliermarsh if you'd like I can open a new issue that tracks the general problem replacing == with is and enumerate the datatypes this is an issue for so you can close out duplicates that are for specific cases?

@JonathanPlasse
Copy link
Contributor

Could be labeled type-inference

@zanieb
Copy link
Member

zanieb commented May 21, 2023

Here's the aforementioned tracking issue #4560

@charliermarsh
Copy link
Member

Thanks tons @madkinsz

@charliermarsh charliermarsh closed this as not planned Won't fix, can't repro, duplicate, stale May 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working type-inference Requires more advanced type inference.
Projects
None yet
Development

No branches or pull requests

6 participants