Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic dataclasses with Hashable typevar bound do not work #170

Closed
mishamsk opened this issue Sep 20, 2023 · 3 comments
Closed

Generic dataclasses with Hashable typevar bound do not work #170

mishamsk opened this issue Sep 20, 2023 · 3 comments
Labels
wontfix This will not be worked on

Comments

@mishamsk
Copy link
Contributor

  • mashumaro version: 3.10 (3.9.1 same stuff)
  • Python version: 3.11
  • Operating System: MacOS Ventura

Description

Apparently adding a bound to TypeVar breaks mashumaro:

from dataclasses import dataclass
from typing import Generic, Hashable, TypeVar

from mashumaro.mixins.dict import DataClassDictMixin

T = TypeVar("T")


@dataclass
class Foo(Generic[T], DataClassDictMixin):
    foo: T


T_b = TypeVar("T_b", bound=Hashable)


@dataclass
class Bar(
    Generic[T_b], DataClassDictMixin
):  # mashumaro.exceptions.UnserializableField: Field "bar" of type Hashable in Bar is not serializable
    bar: T_b

although the second one is obviously a narrower type, so if the first one is ok, the second should definitely be fine.

@Fatal1ty
Copy link
Owner

Unbounded TypeVar assumes bound to Any which is handled by mashumaro using pass-through strategy by default. This by-design thing was made for convenience long before pass_through helper was introduced. When you specify upper bound Hashable it will work the same way as if you were using Hashable without TypeVar. I'm not sure we can handle any hashable type out of the box because of building (de)serialization methods on the compilation time. Maybe you have some thoughts on how to improve it? Meanwhile you can register a custom (de)serialization method for Hashable that will work the way you need:

from dataclasses import dataclass
from typing import Generic, Hashable, TypeVar

from mashumaro import pass_through
from mashumaro.config import BaseConfig
from mashumaro.mixins.dict import DataClassDictMixin


T_b = TypeVar("T_b", bound=Hashable)


@dataclass
class Bar(
    Generic[T_b], DataClassDictMixin
):
    bar: T_b

    class Config(BaseConfig):
        serialization_strategy = {
            Hashable: pass_through
        }

@mishamsk
Copy link
Contributor Author

Got it. I see where this is coming from historically. Seems like either no one even bothered, or no one ever uses Generic base classes directly, instead specifying ser_strategy for different concrete versions...or no one ever used that with non serializable types...because the current logic does exhibit an "unexpected" behavior:

from dataclasses import dataclass
from datetime import datetime
from typing import Generic, Hashable, TypeVar

from mashumaro.mixins.json import DataClassJSONMixin

T = TypeVar("T")


@dataclass
class Foo(Generic[T], DataClassJSONMixin):
    foo: T


FooStr = Foo[str]
FooDict = Foo[datetime]

print(FooStr(foo="foo").to_json())
print(FooDict(foo=datetime.now()).to_json()) # TypeError: Object of type datetime is not JSON serializable

mashumaro is perfectly capable of handling datetime, but not in this case...

One way would be to stop making assumptions and only create serialization methods on concrete versions (I haven't played with it, but probably it can be handled by overriding __getitem__). But this can potentially break old code.

Another is to treat any unsupported bound as Any. Supported as in - out of the box, or via serialization strategy. I think this approach should not break any existing code, and unify the behavior.

Not doing anything is also an option. In my case using a config is perfectly fine. Thanks for the tip here.

@Fatal1ty
Copy link
Owner

Fatal1ty commented Sep 21, 2023

You're right. This library is based on pre-compilation of (de)serialization method for concrete models. I have written about generic dataclass usage patterns here, but basically you need a new class that will keep its from_* / to_* methods. If you have the following generic aliases:

FooStr = Foo[str]
FooDict = Foo[datetime]

there can't be different from_* / to_* methods for them, because they are aliases to the same one class Foo.

I haven't played with it, but probably it can be handled by overriding getitem

You can play with __class_getitem__ to return a new child class that will trigger compilation... but I might have a better solution for you. In this PR I'm working on decoder / encoder functionality that will allow you to work with dataclasses (even without mixins) or any other types. Your example could be written like this:

from dataclasses import dataclass
from datetime import datetime
from typing import Generic, TypeVar

from mashumaro.codecs.json import JSONEncoder, json_encode

T = TypeVar("T")


@dataclass
class Foo(Generic[T]):
    foo: T


FooDict = Foo[datetime]

json_encoder = JSONEncoder(FooDict)
print(json_encoder.encode(FooDict(datetime.now())))
# or for one-time encoding:
print(json_encode(FooDict(datetime.now()), FooDict))

Another is to treat any unsupported bound as Any. Supported as in - out of the box, or via serialization strategy. I think this approach should not break any existing code, and unify the behavior.

I like the idea of using serialization_strategy for registering "Unsupported" type to pass_through or some kind of "fallback" function. I'll think about it some more. The hardest part here will be coming up with the most suitable names :)

@Fatal1ty Fatal1ty closed this as not planned Won't fix, can't repro, duplicate, stale Oct 24, 2023
@Fatal1ty Fatal1ty added the wontfix This will not be worked on label Oct 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants