-
-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Subclassing support #312
Subclassing support #312
Conversation
CI fails, but I don't think this is due to the PR code AFAICT. |
Codecov Report
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. @@ Coverage Diff @@
## main #312 +/- ##
==========================================
+ Coverage 97.47% 97.55% +0.08%
==========================================
Files 20 21 +1
Lines 1582 1675 +93
==========================================
+ Hits 1542 1634 +92
- Misses 40 41 +1
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put some comments on what I think deserves some additional scrutiny.
tests/conftest.py
Outdated
@@ -13,7 +13,10 @@ def genconverter(request): | |||
|
|||
@pytest.fixture(params=(True, False)) | |||
def converter(request, converter_cls): | |||
return converter_cls(detailed_validation=request.param) | |||
args = dict(detailed_validation=request.param) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change means that in the parametrized tests for inheritance, we always test Converter(detailed_validation=True, include_subclasses=True)
and Converter(detailed_validation=False, include_subclasses=False)
. I don't think the two features (subclass support and detailed validation support) interact each other, even in the structuring side. If they were interacting, we should also test Converter(detailed_validation=False, include_subclasses=True)
and the other way around...
src/cattrs/gen.py
Outdated
if getattr(converter, "include_subclasses", False): | ||
cl_tree = tuple(_make_class_tree(cl)) | ||
lines += [ | ||
" if '__cattrs_already_recursed_subclasses__' in o:", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I am shoving my own structuring state in the dictionary to be structured with this hidden key. The goal is to make sure I pass in the structuring function or one of the subclasses function exactly twice for the given dictionary.
If there is no mechanism to keep the stack state this code will end in an infinite recursion. Unfortunately I didn't see a practical way to do it except by putting cattrs
private state in the data structure given by the user which sounds hacky...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replying to myself: this cannot stay as-is. If the structuring fails we pollute the original mapping with those weird looking keys (at least we know where it comes from...). This bit me in unit tests where failing tests polluted the reference unstructured dictionaries used for parametrized tests and in turn broke even more tests. Plus it obviously does not work if the mapping is immutable...
I am thinking to put the mapping in a custom class like:
@define
class _cattrsStackMsg:
msg: Mapping[str, Any]
origin: str
That is used to carry the mapping over. This requires a new predicate though, but I think I can make it work at the price of an additional complexity.
First question: would you be open to the possibility of limiting this new behavior to individual classes, instead of a flag on the Converter instead? So you'd do: from cattrs import Converter
c = Converter()
c.customize(Parent, _cattrs_include_subclasses=True) ( I'm trying not to overload the Converter |
Yes, no problem to have this feature opt-in by class type. I had thought about it myself before, but I wasn't sure how to do it so I went with the global flag approach. I did not mention it before, but I think the current unstructure behavior of EDIT This code is not demonstrating the problem, see my next message for the correct demonstration. from attrs import defin
from cattrs import Converter, BaseConverter
@define
class A:
a: int
@define
class B(A):
b: int
c = Converter()
c.unstructure(B(1, 2)) # {"a": 1} --> Silently unstructured as A
bc = BaseConverter()
bc.unstructure(B(1, 2)) # {"a": 1, "b": 2} --> OK. This is what I think everyone would expect. |
I'm not seeing that result though: from attrs import define
from cattrs import Converter
@define
class A:
a: int
@define
class B(A):
b: int
c = Converter()
print(c.unstructure(B(1, 2))) # Outputs `{'a': 1, 'b': 2}` |
Yes, sorry, you are perfectly right, this what happens when you write untested code from memory... Here is the code I should have posted instead which is less bad as the one I posted above: from attrs import defin
from cattrs import Converter, BaseConverter
@define
class A:
a: int
@define
class B(A):
b: int
c = Converter()
# Happily and exactly unstructured as A even if B was given. No warning, no error, but not fundamentally wrong either.
print(c.unstructure(B(1, 2), unstructure_as=A)) # {"a": 1}
bc = BaseConverter()
# BaseConverter on the other hand behaves in what I think several people (myself included) would expect.
bc.unstructure(B(1, 2), unstructure_as=A) # {"a": 1, "b": 2} Obviously the situation is kind made-up but the real-world situation is when you have a 3rd class: @define
class C:
a: A
print(c.unstructure(C(B(1, 2)))) # {"a": {"a": 1}}
print(bc.unstructure(C(B(1, 2)))) # {"a": {"a": 1, "b": 2}} In that situation I think a lot of people would agree that |
(Unrelated to the discussion: I've upgraded Poetry on the main branch to use version 1.2.2 which may (hopefully) fix the issue that's stopping the CI, so please rebase.) I feel like we're getting bogged down here discussing what is essentially an issue of run-time vs compile-time dispatch. The approach that the What you want here is the ability to selectively opt-in into runtime dispatch for a class. Here's a code snippet that works: from attrs import define
from cattrs import Converter
@define
class A:
a: int
@define
class B(A):
b: int
@define
class C:
a: A
c = Converter()
# This hook is for instances of A, but not instances of subclasses.
base_hook = c.gen_unstructure_attrs_fromdict(A)
def unstructure_a(val: A) -> dict:
"""
If val is an instance of `A`, use the hook.
If val is an instance of a subclass, dispatch on its exact
runtime type.
"""
if val.__class__ is A:
return base_hook(val)
return c.unstructure(val, unstructure_as=val.__class__)
# This needs to use function dispatch, using singledispatch will again
# match A and all subclasses, which is not what we want.
c.register_unstructure_hook_func(lambda cls: cls is A, unstructure_a)
print(c.unstructure(C(B(1, 2)))) # {'a': {'a': 1, 'b': 2}} I think this can be part of the solution we're looking for. |
Rebased.
Oh yeah, definitely, this is what I have actually implemented in this PR. I even limited the runtime dispatch to the subclasses of the parent type and I generate all of those unstructuring function at compile time of the parent class unless there is a RecursionError. I am wondering if this runtime dispatch unstructuring behavior for subclasses should always be on by default for backward compatibility reasons especially considering that the runtime impact is fairly minimal I think (but I did not measure it). But if you think it is not worth it, I am OK with that too. What I want to do in the end is to be able to structure / unstructure without having to create union types everywhere. |
Sorry, I'm somewhat busy at work but I'm thinking about this. |
No problem. Take your time and thanks for keeping me updated. |
Alright, coming back to this. I'm slowly cooling on the Now I'm not super sure what the exact interface for all of this would be yet so we'll have to play around with it. I also believe I've figured out a cleaner way to implement this. Put the following into """Strategies for customizing subclass behaviors."""
from gc import collect
from typing import Dict, Optional, Tuple, Type, Union
from ..converters import Converter
def include_subclasses(
cl: Type, converter: Converter, subclasses: Optional[Tuple[Type]] = None
) -> None:
"""
Modify the given converter so that the attrs/dataclass `cl` is
un/structured as if it was a union of itself and all its subclasses
that are defined at the time when this strategy is applied.
Subclasses are detected using the `__subclasses__` method, or
they can be explicitly provided.
"""
# Due to https://github.com/python-attrs/attrs/issues/1047
collect()
# This hook is for instances of A, but not instances of subclasses.
base_hook = converter.gen_unstructure_attrs_fromdict(cl)
def unstructure_a(val: cl, c=converter) -> Dict:
"""
If val is an instance of `A`, use the hook.
If val is an instance of a subclass, dispatch on its exact
runtime type.
"""
if val.__class__ is cl:
return base_hook(val)
return c.unstructure(val, unstructure_as=val.__class__)
# This needs to use function dispatch, using singledispatch will again
# match A and all subclasses, which is not what we want.
converter.register_unstructure_hook_func(lambda cls: cls is cl, unstructure_a)
if subclasses is not None:
subclass_union = Union[(cl, subclasses)]
else:
subclass_union = Union[(cl, *cl.__subclasses__())]
dis_fn = converter._get_dis_func(subclass_union)
base_struct_hook = converter.gen_structure_attrs_fromdict(cl)
def structure_a(val: dict, _, c=converter, cl=cl) -> cl:
dis_cl = dis_fn(val)
if dis_cl is cl:
return base_struct_hook(val, cl)
return c.structure(val, dis_cl)
converter.register_structure_hook_func(lambda cls: cls is cl, structure_a) Then, this is how it's used: from attrs import define
from cattrs import Converter
from cattrs.strategies.subclasses import include_subclasses
@define
class A:
a: int
@define
class B(A):
b: int
@define
class C:
a: A
c = Converter()
include_subclasses(A, c)
print(c.unstructure(C(B(1, 2)))) # {'a': {'a': 1, 'b': 2}}
print(c.structure({"a": {"a": 1, "b": 2}}, C)) Does this implementation work for your use case? If so, please incorporate it into this PR and update the docs and tests. Later in a different PR we can change it to allow for pluggable union strategies, like we discussed. |
Hi there, So I finally found some time to update this PR. Here what I did (and did not):
Overall I like this approach a lot. This is much cleaner and less intrusive than my first attempt. I think the strategy API is nice. Now I don't think this is finished yet. Here is what I think is missing and I'd like your input to get a good API:
|
I went ahead and I added the overrides argument in I will note though that I did not find a way to get this behavior outside of the |
Hello! Sorry for the pause, I was attending a retreat and dealing with some other projects. In the meantime, I've merged tagged union support into Before I left I was playing around with how to make the union strategy configurable for def include_subclasses(
cl: Type,
converter: Converter,
subclasses: Optional[Tuple[Type]] = None,
union_strategy: Optional[Callable[[Any, BaseConverter], Any]] = None,
) -> None:
"""
Modify the given converter so that the attrs/dataclass `cl` is
un/structured as if it was a union of itself and all its subclasses
that are defined at the time when this strategy is applied.
Subclasses are detected using the `__subclasses__` method, or
they can be explicitly provided.
"""
# Due to https://github.com/python-attrs/attrs/issues/1047
collect()
if subclasses is not None:
subclass_union = Union[(cl, subclasses)]
else:
subclass_union = Union[(cl, *cl.__subclasses__())]
if union_strategy is None:
# This hook is for instances of A, but not instances of subclasses.
base_hook = converter.gen_unstructure_attrs_fromdict(cl)
def unstruct_hook(val: cl, c=converter) -> Dict:
"""
If val is an instance of `A`, use the hook.
If val is an instance of a subclass, dispatch on its exact
runtime type.
"""
if val.__class__ is cl:
return base_hook(val)
return c.unstructure(val, unstructure_as=val.__class__)
dis_fn = converter._get_dis_func(subclass_union)
base_struct_hook = converter.gen_structure_attrs_fromdict(cl)
def struct_hook(
val: dict, _, c=converter, cl=cl, base_struct_hook=base_struct_hook
) -> cl:
dis_cl = dis_fn(val)
if dis_cl is cl:
return base_struct_hook(val, cl)
return c.structure(val, dis_cl)
else:
union_strategy(subclass_union, converter)
unstruct_hook = converter._unstructure_func.dispatch(subclass_union)
struct_hook = converter._union_struct_registry[subclass_union]
# This needs to use function dispatch, using singledispatch will again
# match A and all subclasses, which is not what we want.
converter.register_unstructure_hook_func(lambda cls: cls is cl, unstruct_hook)
converter.register_structure_hook_func(lambda cls: cls is cl, struct_hook) It's not as clean as I would have wanted it but probably good enough for now. Can you incorporate the pluggable union strategy changes too, while keeping support for subclasses that you added? |
aebf0d3
to
005bc24
Compare
Hello and happy new year! I finally took the time to get back at this. I made some progresses, but there are still rough edges and I think some clean up could be made. As proposed by you above Here is what is working:
The less good:
|
Thanks for the update and happy new year! I'm busy finishing up TypedDict support, after that's done I can come back to this and help out before we release the next version. Thanks for your patience ;) |
Gentle ping 😃 |
Gentle ack, this is next on my list, I promise ;) |
Alright, I have spare cycles now to help with this. Ok, I see the issue with circular references, will try to work on it a little. |
A closure factory should return only one inner function. If not, there is a risk of messing up all free variables like cl for instance here. All inner generated functions had the last iterated cl value which obviously did not work well...
Allows to propagate overrides easily to child classes.
* Adapted the code provided by Tin * New test parameters Some unstructuring tests with the union strategy fail: infinite recursion error.
Overrides still need to be implemented.
The overrides with union strategy still fail though. Restructured the code in add_subclasses: * Skipping the classes without subclasses broke the overrides (without union strategy) and also the unstructuring of unknown class. * The hook functions are created inline and the closure variables are passed as default argument to workaround the scoping problems encontered before.
Unfortunately it does not work...
So I rebased and I completed the documentation. Finally. |
Sweet, a release is long overdue so as soon as I'm finished with TypedDicts I'll come over to get this merged in. |
I pushed a commit to fix the linting error. I was using black 22.6.0 instead of 23.1.0. The later removed a line in _unions.py compared to the former. |
Great job @matmel ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might do an editing pass on the docs, but looks good.
Hi guys 👋 It's great that the subclass feature is finally coming in the next release. I've been hoping for it for quite a while, it allows me to get rid of some manual workarounds 👍 I do have one question, though. I'm currently playing around with it and trying to incorporate it into our codebase. In most places, it worked out-of-the-box, but I encountered problems with one of my class hierarchies. When I looked a bit deeper, I suddenly understood what is the problem: the subclass tree is constructed during the call of
where The fact that the classes are collected in I can imagine at least two use cases where a delayed lookup would be necessary:
Maybe I'm even misusing the feature, but then I'd be also happy about getting your input :D |
@AdrianSosic I might have a solution for you, but please can you make a separate issue (you can just copy/paste your text into it) so it's easier to find for other folks? |
Sure, will do. After such an enormous cliffhanger, I'm more than curious what you have to say :D |
Solves #272
Without this PR, the subclass support is as follows:
BaseConverter
: unstructure behaves as expected in face of anattrs
subclass given instead of theattrs
original classConverter
cannot unstructure a subclass when it has been typed with a parent class, some attributes will be missing.BaseConverter
andConverter
fail at structuring a dictionary that is structured as a parent class even if the attributes match the child class. Those attributes are lost in the structuring process.With this PR:
BaseConverter
Converter
can structure and unstructure subclasses when parent class is mentioned. This behavior is opt-in though and must be asked for at Converter instantiation withinclude_subclasses=True
parameter.