-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataclasses - Improve the performance of asdict/astuple for common types and default values #103000
Comments
) Co-authored-by: Carl Meyer <[email protected]> Co-authored-by: Alex Waygood <[email protected]>
Since the PR has been merged, presumably the issue can be closed? |
There was another possible optimisation that was originally included in the PR, but which we decided to postpone to a separate PR, since it was a distinct change that deserved to be considered separately. @DavidCEllis mentioned that he might not be interested in filing a followup PR anymore. But I'd still like to do some more benchmarks before closing this issue — if the second optimization provides a decent speedup, I might file my own PR (listing @DavidCEllis as a co-author). |
The implementation I had originally included the recursion skip for things that weren't 'atomic' if dict_factory is dict:
result = {}
for f in fields(obj):
value = getattr(obj, f.name)
if type(value) not in _ATOMIC_TYPES:
value = _asdict_inner(value, dict_factory)
result[f.name] = value
return result My thinking is that without the recursion skip while you could still write it in the same way: if dict_factory is dict:
result = {}
for f in fields(obj):
result[f.name] = _asdict_inner(getattr(obj, f.name), dict_factory))
return result You would probably write it as a comprehension instead: if dict_factory is dict:
return {f.name: _asdict_inner(value, dict_factory) for f in fields(obj)} This currently introduces additional overhead associated with the implementation of comprehensions. Based on testing a 'walrus' comprehension implementation1 with the recursion skip I would expect this to be slower than the regular loop, though I could be wrong. However if/when PEP-709 for inline comprehensions lands, then I would expect this to no longer matter and the comprehension is clearly a better fit. Given that I probably wouldn't submit a PR until it was possible to compare with PEP-709 implemented. I wouldn't want to reject it as an insignificant improvement if this was due to the comprehension overhead. Footnotes
|
…python#103005) Co-authored-by: Carl Meyer <[email protected]> Co-authored-by: Alex Waygood <[email protected]>
Yeah, I tried this patch out locally: diff --git a/Lib/dataclasses.py b/Lib/dataclasses.py
index 4026c8b779..079c28fae3 100644
--- a/Lib/dataclasses.py
+++ b/Lib/dataclasses.py
@@ -1317,11 +1317,18 @@ def _asdict_inner(obj, dict_factory):
if type(obj) in _ATOMIC_TYPES:
return obj
elif _is_dataclass_instance(obj):
- result = []
- for f in fields(obj):
- value = _asdict_inner(getattr(obj, f.name), dict_factory)
- result.append((f.name, value))
- return dict_factory(result)
+ # fast path for the common case
+ if dict_factory is dict:
+ return {
+ f.name: _asdict_inner(getattr(obj, f.name), dict_factory)
+ for f in fields(obj)
+ }
+ else:
+ result = []
+ for f in fields(obj):
+ value = _asdict_inner(getattr(obj, f.name), dict_factory)
+ result.append((f.name, value))
+ return dict_factory(result) And measured it with this benchmark: from dataclasses import dataclass, asdict
import timeit
@dataclass
class Foo:
x: int
@dataclass
class Bar:
x: Foo
y: Foo
z: Foo
@dataclass
class Baz:
x: Bar
y: Bar
z: Bar
foo = Foo(42)
bar = Bar(foo, foo, foo)
baz = Baz(bar, bar, bar)
print(timeit.timeit(lambda: asdict(baz), number=50_000)) And it does indeed seem to be slower than the current |
…python#103005) Co-authored-by: Carl Meyer <[email protected]> Co-authored-by: Alex Waygood <[email protected]>
Co-authored-by: David Ellis <[email protected]>
That's two great |
* main: pythonGH-102181: Improve specialization stats for SEND (pythonGH-102182) pythongh-103000: Optimise `dataclasses.asdict` for the common case (python#104364) pythongh-103538: Remove unused TK_AQUA code (pythonGH-103539) pythonGH-87695: Fix OSError from `pathlib.Path.glob()` (pythonGH-104292) pythongh-104263: Rely on Py_NAN and introduce Py_INFINITY (pythonGH-104202) pythongh-104010: Separate and improve docs for `typing.get_origin` and `typing.get_args` (python#104013) pythongh-101819: Adapt _io._BufferedIOBase_Type methods to Argument Clinic (python#104355) pythongh-103960: Dark mode: invert image brightness (python#103983) pythongh-104252: Immortalize Py_EMPTY_KEYS (pythongh-104253) pythongh-101819: Clean up _io windows console io after pythongh-104197 (python#104354) pythongh-101819: Harden _io init (python#104352) pythongh-103247: clear the module cache in a test in test_importlib/extensions/test_loader.py (pythonGH-104226) pythongh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (python#103849) pythongh-74895: adjust tests to work on Solaris (python#104326) pythongh-101819: Refactor _io in preparation for module isolation (python#104334) pythongh-90953: Don't use deprecated AST nodes in clinic.py (python#104322) pythongh-102327: Extend docs for "url" and "headers" parameters to HTTPConnection.request() pythongh-104328: Fix typo in ``typing.Generic`` multiple inheritance error message (python#104335)
* main: (27 commits) pythongh-87849: fix SEND specialization family definition (pythonGH-104268) pythongh-101819: Adapt _io.IOBase.seek and _io.IOBase.truncate to Argument Clinic (python#104384) pythongh-101819: Adapt _io._Buffered* methods to Argument Clinic (python#104367) pythongh-101819: Refactor `_io` futher in preparation for module isolation (python#104369) pythongh-101819: Adapt _io.TextIOBase methods to Argument Clinic (python#104383) pythongh-101117: Improve accuracy of sqlite3.Cursor.rowcount docs (python#104287) pythonGH-92184: Convert os.altsep to '/' in filenames when creating ZipInfo objects (python#92185) pythongh-104357: fix inlined comprehensions that close over iteration var (python#104368) pythonGH-90208: Suppress OSError exceptions from `pathlib.Path.glob()` (pythonGH-104141) pythonGH-102181: Improve specialization stats for SEND (pythonGH-102182) pythongh-103000: Optimise `dataclasses.asdict` for the common case (python#104364) pythongh-103538: Remove unused TK_AQUA code (pythonGH-103539) pythonGH-87695: Fix OSError from `pathlib.Path.glob()` (pythonGH-104292) pythongh-104263: Rely on Py_NAN and introduce Py_INFINITY (pythonGH-104202) pythongh-104010: Separate and improve docs for `typing.get_origin` and `typing.get_args` (python#104013) pythongh-101819: Adapt _io._BufferedIOBase_Type methods to Argument Clinic (python#104355) pythongh-103960: Dark mode: invert image brightness (python#103983) pythongh-104252: Immortalize Py_EMPTY_KEYS (pythongh-104253) pythongh-101819: Clean up _io windows console io after pythongh-104197 (python#104354) pythongh-101819: Harden _io init (python#104352) ...
Feature or enhancement
Improve the performance of asdict/astuple in common cases by making a shortcut for common types that are unaffected by deepcopy in the inner loop. Also special casing for the default
dict_factory=dict
to construct the dictionary directly.The goal here is to improve performance in common cases without significantly impacting less common cases, while not changing the API or output in any way.
Pitch
In cases where a dataclass contains a lot of data of common python types (eg: bool/str/int/float) currently the inner loops for
asdict
andastuple
require the values to be compared to check if they are dataclasses, namedtuples, lists, tuples, and then dictionaries before passing them todeepcopy
. This proposes to special case and shortcut objects of types wheredeepcopy
returns the object unchanged.It is much faster for these cases to instead check for them at the first opportunity and shortcut their return, skipping the recursive call and all of the other comparisons. In the case where this is being used to prepare an object to serialize to JSON this can be quite significant as this covers most of the remaining types handled by the stdlib
json
module.Note: Anything that skips deepcopy with this alteration is already unchanged as
deepcopy(obj) is obj
is always True for these types.Currently when constructing the
dict
for a dataclass, a list of tuples is created and passed to thedict_factory
constructor. In the case where thedict_factory
constructor is the default -dict
- it is faster to construct the dictionary directly.Previous discussion
Discussed here with a few more details and earlier examples: https://discuss.python.org/t/dataclasses-make-asdict-astuple-faster-by-skipping-deepcopy-for-objects-where-deepcopy-obj-is-obj/24662
Code Details
Types to skip deepcopy
This is the current set of types to be checked for and shortcut returned, ordered in a way that I think makes more sense for
dataclasses
than the original ordering copied from thecopy
module. These are known to be safe to skip as they are all sent to_deepcopy_atomic
(which returns the original object) in thecopy
module.Function changes
With that added the change is essentially replacing each instance of
inside
_asdict_inner
, withInstances of subclasses of these types are not guaranteed to have
deepcopy(obj) is obj
so this checks specifically for instances of the base types.Performance tests
Test file: https://gist.github.com/DavidCEllis/a2c2ceeeeda2d1ac509fb8877e5fb60d
Results on my development machine (not a perfectly stable test machine, but these differences are large enough).
Main
Current Main python branch:
Modified
Modified Branch:
Linked PRs
dataclasses.asdict
for the common case #104364The text was updated successfully, but these errors were encountered: