Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flake8-bugbear] Implement class-as-data-structure (B903) #9601

Merged
merged 18 commits into from
Jan 7, 2025

Conversation

diceroll123
Copy link
Contributor

@diceroll123 diceroll123 commented Jan 21, 2024

Summary

Adds class-as-data-structure rule (B903).

Took some creative liberty with this by allowing the class to have any decorators or base classes. There are years-old issues on pylint that don't approve of the strictness when it comes to these things.

Especially considering that dataclass is a decorator and namedtuple can be a base class. I feel ignoring those explicitly is redundant all things considered, but it's not a hill I'm willing to die on!

See: #970

Test Plan

cargo test

Copy link
Contributor

github-actions bot commented Jan 21, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+40 -0 violations, +0 -0 fixes in 10 projects; 45 projects unchanged)

PlasmaPy/PlasmaPy (+4 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ tests/particles/test_decorators.py:437:5: B903 Class could be dataclass or namedtuple
+ tests/particles/test_decorators.py:456:5: B903 Class could be dataclass or namedtuple
+ tests/particles/test_decorators.py:494:5: B903 Class could be dataclass or namedtuple
+ tests/particles/test_decorators_annotations.py:16:1: B903 Class could be dataclass or namedtuple

apache/airflow (+3 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ airflow/www/security_appless.py:27:1: B903 Class could be dataclass or namedtuple
+ providers/src/airflow/providers/celery/executors/celery_executor_utils.py:200:1: B903 Class could be dataclass or namedtuple
+ providers/src/airflow/providers/google/cloud/operators/dataflow.py:64:1: B903 Class could be dataclass or namedtuple

apache/superset (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ superset/views/base_api.py:133:1: B903 Class could be dataclass or namedtuple

langchain-ai/langchain (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ libs/core/tests/unit_tests/tracers/test_langchain.py:103:5: B903 Class could be dataclass or namedtuple

pandas-dev/pandas (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ pandas/io/sas/sas7bdat.py:91:1: B903 Class could be dataclass or namedtuple

pypa/pip (+2 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ tests/lib/__init__.py:1347:1: B903 Class could be dataclass or namedtuple
+ tests/unit/test_network_auth.py:331:5: B903 Class could be dataclass or namedtuple

python/mypy (+7 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ misc/perf_checker.py:14:1: B903 Class could be dataclass or namedtuple
+ mypy/nodes.py:659:1: B903 Class could be dataclass or namedtuple
+ mypy/plugins/attrs.py:96:1: B903 Class could be dataclass or namedtuple
+ mypy/stubgen.py:182:1: B903 Class could be dataclass or namedtuple
+ mypy/stubutil.py:326:1: B903 Class could be dataclass or namedtuple
+ mypy/test/teststubtest.py:189:1: B903 Class could be dataclass or namedtuple
+ mypyc/ir/class_ir.py:453:1: B903 Class could be dataclass or namedtuple

reflex-dev/reflex (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ tests/units/utils/test_serializers.py:53:5: B903 Class could be dataclass or namedtuple

tiangolo/fastapi (+18 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ docs_src/dependencies/tutorial002.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial002_an.py:12:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial002_an_py310.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial002_an_py39.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial002_py310.py:9:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial003.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial003_an.py:12:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial003_an_py310.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial003_an_py39.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial003_py310.py:9:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial004.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial004_an.py:12:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial004_an_py310.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial004_an_py39.py:11:1: B903 Class could be dataclass or namedtuple
+ docs_src/dependencies/tutorial004_py310.py:9:1: B903 Class could be dataclass or namedtuple
+ docs_src/python_types/tutorial010.py:1:1: B903 Class could be dataclass or namedtuple
+ tests/test_jsonable_encoder.py:17:1: B903 Class could be dataclass or namedtuple
+ tests/test_jsonable_encoder.py:22:1: B903 Class could be dataclass or namedtuple

zulip/zulip (+2 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ confirmation/models.py:252:1: B903 Class could be dataclass or namedtuple
+ zerver/lib/markdown/__init__.py:390:1: B903 Class could be dataclass or namedtuple

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
B903 40 40 0 0 0

@diceroll123 diceroll123 force-pushed the add-R0903 branch 2 times, most recently from 4894b8f to 366bd8e Compare January 21, 2024 22:57
@diceroll123 diceroll123 marked this pull request as draft January 21, 2024 23:47
@diceroll123 diceroll123 marked this pull request as ready for review January 22, 2024 00:30
@diceroll123 diceroll123 force-pushed the add-R0903 branch 2 times, most recently from 02b14f3 to 40bc3c0 Compare January 22, 2024 00:38
@zanieb
Copy link
Member

zanieb commented Jan 24, 2024

Hi! Thanks for implementing this.

I'm hesitant to include this rule in Ruff. It looks like "too few methods" is used as a proxy for "this should be a dataclass" — it seems like a rule like "it only has an __init__, it should be a dataclass" would make more sense? Are there more use cases for this rule?

Separately, there's a false positive here for nested classes used for configuration — perhaps those should be excluded from the rule?

@MichaReiser MichaReiser added the rule Implementing or modifying a lint rule label Jan 24, 2024
@diceroll123
Copy link
Contributor Author

Hi! Thanks for implementing this.

I'm hesitant to include this rule in Ruff. It looks like "too few methods" is used as a proxy for "this should be a dataclass" — it seems like a rule like "it only has an __init__, it should be a dataclass" would make more sense? Are there more use cases for this rule?

Separately, there's a false positive here for nested classes used for configuration — perhaps those should be excluded from the rule?

Tweaked accordingly!

Changing up the name of the rule slightly may be in order, I figure? 🤔 Please advise!

@diceroll123 diceroll123 force-pushed the add-R0903 branch 3 times, most recently from d51d701 to 514affe Compare January 25, 2024 02:06
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left an inline comment with a rule name suggestion. @AlexWaygood do you have ideas/opinions on a name for this rule?

@AlexWaygood
Copy link
Member

@AlexWaygood do you have ideas/opinions on a name for this rule?

Possibly something like consider-using-dataclass could work well... but there might be other reasons you might want to consider using a dataclass, of course (such as code generation for several useful methods like __repr__, __eq__, etc.)!

I think the current name is okay, albeit a bit verbose, and there is some advantage to it having the same name as the original pylint rule

@MichaReiser
Copy link
Member

MichaReiser commented Feb 26, 2024

Possibly something like consider-using-dataclass could work well... but there might be other reasons you might want to consider using a dataclass, of course (such as code generation for several useful methods like repr, eq, etc.)!

I like the name but it, unfortunately, doesn't fit our rule naming convention

I think the current name is okay, albeit a bit verbose, and there is some advantage to it having the same name as the original pylint rule

We can use the same rule code as pylint, but I would prefer another name that's more explicit about its motivation. Having a different name might also help to set expectations, considering that we're probably changing the semantics of the rule.

@AlexWaygood
Copy link
Member

Maybe something like NonDataclassSimpleClass?

@MichaReiser
Copy link
Member

This is interesting #10146 I haven't looked at how the two rules overlap but maybe there's an opportunity.

@AlexWaygood
Copy link
Member

This is interesting #10146 I haven't looked at how the two rules overlap but maybe there's an opportunity.

That is interesting, indeed... it's flagging a different antipattern, but one that has the same solution

@diceroll123
Copy link
Contributor Author

Let me know if there's any action items blocking this PR, not sure what to make of these comments tbh

@MichaReiser MichaReiser added packaging Package management related, i.e. uv accepted Ready for implementation and removed packaging Package management related, i.e. uv labels Apr 5, 2024
@MichaReiser
Copy link
Member

Note for reviewing: The rule itself is a style rule but we need to find a good name that expresses the motivation to use dataclasses/named tuples instead of enforcing a number of public methods. The name should read as allow(rule-name).

But the intent of the rule itself fits into ruff as a style rule.

@Rizhiy
Copy link

Rizhiy commented Sep 9, 2024

@zanieb @MichaReiser Is there any progress on merging this? I think it would be a beneficial rule to add.

In addition to suggesting data-classes this also covers the case where people create empty "base" classes, which is an antipattern in my opinion.

Ideally, I would also like to have a rule which flags classes with __init__ and one public classes (no decorators, no parent classes). I believe that is another antipattern, such classes are frequently better refactored to functions.

@MichaReiser
Copy link
Member

@zanieb @MichaReiser Is there any progress on merging this? I think it would be a beneficial rule to add.

Not much other than that we have to come up with a good rule name that captures the motivation. Do you have any suggestions for a good rule name?

@Rizhiy
Copy link

Rizhiy commented Sep 10, 2024

Not much other than that we have to come up with a good rule name that captures the motivation. Do you have any suggestions for a good rule name?

What is the naming convention/naming rules?

IMHO, there are three different rules here, they can be named:

  • For empty base class: unnecessary-base-class
  • For dataclass case (no public methods except __init__): prefer-dataclass
  • For class with __init__ and one public method: potential-obfuscated-function

@dylwil3 dylwil3 changed the title [pylint] - Implement too-few-public-methods rule (PLR0903) [flake8-bugbear] Implement class-as-data-structure rule (B903) Jan 3, 2025
@dylwil3 dylwil3 changed the title [flake8-bugbear] Implement class-as-data-structure rule (B903) [flake8-bugbear] Implement class-as-data-structure (B903) Jan 3, 2025
@dylwil3
Copy link
Collaborator

dylwil3 commented Jan 3, 2025

Just some updates here:

  1. I've made the rule rather strict, much closer to the bugbear implementation, since there seemed to be quite a few false positives in the ecosystem results. There were classes in the ecosystem results where the suggestion to replace with a namedtuple or dataclass didn't make any sense because there were things happening in the __init__ function that were too complicated to appear in either of these structures.
  2. With that in mind, I've moved the rule to B903 and changed the name to better reflect the behavior.

Gonna check out the ecosystem results and possibly update the documentation, but otherwise I think this is close to done unless there are any objections!

Update: Ecosystem results look much better to me now!

@dylwil3 dylwil3 merged commit 78e26ce into astral-sh:main Jan 7, 2025
21 checks passed
@dylwil3
Copy link
Collaborator

dylwil3 commented Jan 7, 2025

Thanks for implementing this @diceroll123 and sorry for the delay on getting it over the line!

dcreager added a commit that referenced this pull request Jan 7, 2025
* main:
  Use uv consistently throughout the documentation (#15302)
  [red-knot] Eagerly normalize `type[]` types (#15272)
  [`pyupgrade`] Split `UP007` to two individual rules for `Union` and `Optional` (`UP007`, `UP045`) (#15313)
  [red-knot] Improve symbol-lookup tracing (#14907)
  [red-knot] improve type shrinking coverage in red-knot property tests (#15297)
  [`flake8-return`] Recognize functions returning `Never` as non-returning (`RET503`) (#15298)
  [`flake8-bugbear`] Implement `class-as-data-structure` (`B903`) (#9601)
  Avoid treating newline-separated sections as sub-sections (#15311)
  Remove call when removing final argument from `format` (#15309)
  Don't enforce `object-without-hash-method` in stubs (#15310)
  Don't special-case class instances in binary expression inference (#15161)
  Upgrade zizmor to the latest version in CI (#15300)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation preview Related to preview mode features rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants