Add function to construct TOML dictionary for dataclass instance #12

Open
mthuurne opened this issue Jun 16, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@mthuurne
Member

mthuurne commented Jun 16, 2023

The Binder.bind() method creates a dataclass instance from the data in a TOML dictionary. It would be useful to support the opposite conversion as well, where we accept a dataclass instance and return the corresponding TOML dictionary.

We do support conversion from a dataclass instance to a textual TOML representation with format_template(). For parsing, it turned out to be useful to support dictionary input in addition to textual input, so when generating TOML, it's probably also useful to support dictionary output in addition to textual output.

This could either be a standalone function or a method on an instanced Binder. I think it would be more efficient to use an instanced Binder in the implementation, both to avoid code duplication and to avoid redundant checks on the dataclass definition. However, as we have a binder cache already, we could have a standalone function forward the request to an instanced binder, if that simplifies the interface.

Note that dataclasses.asdict() offers similar functionality, but it does not handle some conversions like timedelta, modules and dashes in key names. Perhaps we can use asdict() with a custom dictionary factory, but probably not, as there is no accompanying list factory.
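
To illustrate the gap, here is what plain `asdict()` returns for a small example. The `Config` and `Retry` classes are made up for this sketch.

```python
from dataclasses import asdict, dataclass
from datetime import timedelta


@dataclass
class Retry:
    max_delay: timedelta


@dataclass
class Config:
    max_connections: int
    retry: Retry


plain = asdict(Config(max_connections=4, retry=Retry(max_delay=timedelta(seconds=30))))

# Keys keep their Python spelling (underscores, not dashes) and the
# timedelta value is passed through unchanged rather than converted.
print(plain)
```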

mthuurne added the enhancement label Jun 16, 2023

@mthuurne
Member Author

mthuurne commented Jun 29, 2023

Now that Binder can be constructed from instances as well, maybe a method makes more sense than a function.

@mthuurne
Member Author

mthuurne commented Jul 26, 2023

> Now that Binder can be constructed from instances as well, maybe a method makes more sense than a function.

Although, if it is a method, calling that method when Binder was constructed from a class wouldn't work.

Maybe the original specialization syntax wasn't so bad after all: if Binder[DC] returns a specialized Binder class and Binder(data) returns an instanced Binder, then asdict()/to_dict() could be an instance method, so that type checkers know it can only be called on an instance.

You could also do things like type(data).parse_toml("other.toml") to parse a TOML file in the same format but without using the existing data as a default.

We'd have to check whether overloading a method (like parse_toml()) with a class and instance variant actually works both in Python itself and in mypy. I'm pretty confident that it can work in Python itself: even if it doesn't work directly, we could use descriptors instead. But I'm less confident about mypy.
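
As a rough illustration of the descriptor fallback, the sketch below dispatches to different functions depending on whether the attribute is looked up on the class or on an instance. `ClassOrInstanceMethod` and `Demo` are hypothetical names; this only shows the runtime behavior and says nothing about whether mypy accepts it.

```python
import types


class ClassOrInstanceMethod:
    """Hypothetical descriptor with separate class and instance variants."""

    def __init__(self, on_class, on_instance):
        self._on_class = on_class
        self._on_instance = on_instance

    def __get__(self, instance, owner=None):
        if instance is None:
            # Looked up on the class: bind the class variant to the class.
            return types.MethodType(self._on_class, owner)
        # Looked up on an instance: bind the instance variant to it.
        return types.MethodType(self._on_instance, instance)


class Demo:
    def __init__(self, value=0):
        self.value = value

    def _describe_class(cls):
        return f"class {cls.__name__}"

    def _describe_instance(self):
        return f"instance with value {self.value}"

    describe = ClassOrInstanceMethod(_describe_class, _describe_instance)


from_class = Demo.describe()
from_instance = Demo(3).describe()
```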

Perhaps having two differently named methods is better than overloading. One would be a class method that parses from scratch and the other an instance method that parses with existing data as defaults. That would fit better with the convention in Python that you can call class methods on instances as well.
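
A small sketch of the two-method variant, using a plain dataclass as a stand-in for a binder and dictionaries instead of TOML files. The names `from_toml_dict` and `updated_from_toml_dict` are placeholders chosen for this example, and key conversion is left out.

```python
from dataclasses import dataclass, replace


@dataclass
class Settings:
    host: str = "localhost"
    port: int = 8080

    @classmethod
    def from_toml_dict(cls, data):
        """Parse from scratch: missing keys fall back to the class defaults."""
        return cls(**data)

    def updated_from_toml_dict(self, data):
        """Parse with this instance's values as defaults for missing keys."""
        return replace(self, **data)


current = Settings(host="example.org", port=9000)

fresh = Settings.from_toml_dict({"port": 1})
merged = current.updated_from_toml_dict({"port": 1})
# A class method can also be called on an instance; it still ignores
# the instance's values, which matches the usual Python convention.
fresh_via_instance = current.from_toml_dict({"port": 1})
```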

@mthuurne
Member Author

> Note that dataclasses.asdict() offers similar functionality, but it does not handle some conversions like timedelta, modules and dashes in key names. Perhaps we can use asdict() with a custom dictionary factory, but probably not, as there is no accompanying list factory.

In theory we could post-process the dictionary returned by asdict() recursively and replace any custom types by native TOML types. However, if we're going to recursively process the TOML data, does using asdict() provide any benefits over recursively generating the TOML data ourselves?
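
For comparison, generating the data ourselves needs only a short recursive function. This is a sketch under assumptions, not the library's implementation: `to_toml_value` is a hypothetical name, module values are not handled, and encoding a timedelta as a number of seconds is just a placeholder choice.

```python
from collections.abc import Mapping
from dataclasses import dataclass, fields, is_dataclass
from datetime import timedelta


def to_toml_value(value):
    """Recursively convert a value to plain dicts, lists and scalars."""
    if is_dataclass(value) and not isinstance(value, type):
        # Dataclass instance: one key per field, underscores become dashes.
        return {
            f.name.replace("_", "-"): to_toml_value(getattr(value, f.name))
            for f in fields(value)
        }
    if isinstance(value, Mapping):
        return {key: to_toml_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_toml_value(item) for item in value]
    if isinstance(value, timedelta):
        # Placeholder encoding (assumption): total seconds as a float.
        return value.total_seconds()
    return value


@dataclass
class Retry:
    max_delay: timedelta


@dataclass
class Config:
    retry: Retry
    search_paths: list
    timeouts: dict


converted = to_toml_value(
    Config(
        retry=Retry(max_delay=timedelta(minutes=1)),
        search_paths=["a", "b"],
        timeouts={"connect": timedelta(seconds=2)},
    )
)
```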

When using a custom dictionary factory, the post-processing would see nested dataclasses multiple times: once when the nested dataclass is processed itself and once more for every parent level. We can't just skip recursion into nested dictionaries, because unlike dictionaries that were created from dataclasses, dictionaries created from mapping types do need recursive post-processing. Therefore, if we'd use asdict() at all, it would be more efficient to post-process the top-level asdict() output once.
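
The repeated visits can be made visible with a factory that post-processes recursively and counts how often it reaches a leaf value. The classes and helper names below are made up for the demonstration.

```python
from dataclasses import asdict, dataclass

visits = []


def post_process(value):
    """Recursive post-processing stand-in that records each leaf it sees."""
    if isinstance(value, dict):
        return {key: post_process(item) for key, item in value.items()}
    if isinstance(value, list):
        return [post_process(item) for item in value]
    visits.append(value)
    return value


def recursive_factory(pairs):
    """Dictionary factory that post-processes every value it is given."""
    return {name: post_process(value) for name, value in pairs}


@dataclass
class Leaf:
    name: str


@dataclass
class Middle:
    leaf: Leaf


@dataclass
class Top:
    middle: Middle


top = Top(Middle(Leaf("x")))

# Factory approach: the leaf is reached once per enclosing dataclass level.
asdict(top, dict_factory=recursive_factory)
visits_with_factory = len(visits)

# Post-processing the finished top-level dictionary reaches it once.
visits.clear()
post_process(asdict(top))
visits_with_single_pass = len(visits)
```

Here the leaf sits three dataclass levels deep, so the factory approach processes it three times, while a single pass over the top-level output processes it once.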
