-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataclasses - Improve the performance of _dataclass_{get,set}state
#103032
Comments
Final timings:
But, since we now do more work during dataclass creation, I wanted to measure this effect as well. Here's my small benchmark: from dataclasses import dataclass
from timeit import timeit
def create_zero():
@dataclass(frozen=True, slots=True)
class ZeroFields:
pass
return ZeroFields
ZeroFields = create_zero()
def create_one():
@dataclass(frozen=True, slots=True)
class OneField:
foo: str
def create_eight():
@dataclass(frozen=True, slots=True)
class EightFields:
foo: str
bar: int
baz: int
spam: list[int]
eggs: dict[str, str]
x: bool
y: bool
z: bool
return EightFields
EightFields = create_eight()
def create_nested():
@dataclass(frozen=True, slots=True)
class NestedFields:
z1: ZeroFields
z2: ZeroFields
e1: EightFields
e2: EightFields
for f in [create_zero, create_one, create_eight, create_nested]:
print("Testing", f.__name__)
res = timeit(f, number=100)
print('Result', res)
print() Here are the results:
So, we are basically trading "startup time" with "runtime time". One more thing, notice Here's my final patch: cls.__getstate__, cls.__setstate__ = _dataclass_states(cls) and: def _dataclass_states(cls):
getters = []
setters = []
for index, f in enumerate(fields(cls)):
getters.append(f'self.{f.name}')
setters.append(f'object.__setattr__(self, "{f.name}", state[{index}])')
getstate = _create_fn(
'__getstate__',
('self',),
[f'return ({", ".join(getters)})'],
)
setstate = _create_fn(
'__setstate__',
('self', 'state'),
setters if setters else ['pass'],
)
return getstate, setstate Please, share your feedback and ideas. |
Maybe an
Note that Codefrom dataclasses import dataclass, fields
from timeit import timeit
from operator import attrgetter
from statistics import mean, stdev
import sys
@dataclass(frozen=True, slots=True)
class EightFields:
foo: str
bar: int
baz: int
spam: list[int]
eggs: dict[str, str]
x: bool
y: bool
z: bool
data = EightFields(
"a", 1, 2, [1, 2, 3, 4, 5], {'a': 'a', 'b': 'b', 'c': 'c'},
True, False, True,
)
def current(self):
return [getattr(self, f.name) for f in fields(self)]
def hardcoded(self):
return self.foo, self.bar, self.baz, self.spam, self.eggs, self.x, self.y, self.z
def attrgetter_on_the_fly(self):
return attrgetter(*map(attrgetter('name'), fields(self)))(self)
attrgetter_prebuilt = attrgetter(*map(attrgetter('name'), fields(data)))
def attrgetter_prebuilt_wrapped(self):
return attrgetter_prebuilt(self)
funcs = current, hardcoded, attrgetter_on_the_fly, attrgetter_prebuilt, attrgetter_prebuilt_wrapped
for f in funcs:
print(f(data))
print()
times = {f: [] for f in funcs}
def stats(f):
ts = [t * 1e9 for t in sorted(times[f])[:5]]
return f'{round(mean(ts)):4} ± {round(stdev(ts)):2} ns '
for _ in range(25):
for f in funcs:
number = 10000
t = timeit(lambda: f(data), number=number) / number
times[f].append(t)
for f in sorted(funcs, key=stats):
print(stats(f), getattr(f, '__name__', 'attrgetter_prebuilt'))
print()
print(sys.version) |
So, we need to avaluate Moreover, simple from operator import attrgetter
def __getstate__(self):
if self.__dataclass_attrgetter__ is None:
return ()
return self.__dataclass_attrgetter__(self)
if fields(cls):
cls.__dataclass_attrgetter__ = attrgetter(*map(attrgetter('name'), fields(cls)))
else:
cls.__dataclass_attrgetter__ = None
cls.__getstate__ = __getstate__ And here are the timings:
With only one generated method (we still have to do something similar to
So, as you can see the creation time with only one method is slower that the hardcode patch and dump times are also worth. Maybe there's something we can improve? But, in the current state is not worth it. |
My first instinct is: "Let's fix Second instinct: Don't check every time, use a function instead of from operator import attrgetter
def __getstate__(self):
return self.__dataclass_attrgetter__(self)
if fields_ := fields(cls):
cls.__dataclass_attrgetter__ = attrgetter(*map(attrgetter('name'), fields_))
else:
cls.__dataclass_attrgetter__ = lambda _: ()
cls.__getstate__ = __getstate__ Can you tell why the wrapper is needed?
That looks odd. Does that mean that creating the attrgetter took negative time? |
I think I've just messed something up while copy-pasting 🤔 |
Actually, without the extra attribute (and calling from operator import attrgetter
if fields_ := fields(cls):
getter = attrgetter(*map(attrgetter('name'), fields_))
cls.__getstate__ = lambda self: getter(self)
else:
cls.__getstate__ = lambda self: () |
Every dataclass must be created; many dataclasses are never pickled at all. So I am not super excited about trading creation time for pickling performance (even though the latter could happen many times per class.) Haven't followed all of the numbers or options here (looks like we don't have reliable numbers for the latest attrgetter option(s) yet?) but IMO for this to be worth it we should have very small impact on dataclass creation time. |
Feature or enhancement
I've noticed this comment yesterday:
cpython/Lib/dataclasses.py
Lines 1128 to 1131 in 1fd603f
So, out of curiosity I've decided to try this out. How fast can I make it?
The results are here:
Here's the simple benchmark that I was using:
Here's the very rough version of what I am planning to do:
Things to do:
__setstate__
similar supportDoes it look like a good enough speed up to make this change?
CC @ericvsmith and @carljm
The text was updated successfully, but these errors were encountered: