Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow typing.NamedTuple to be serialized #534

Merged
merged 5 commits into from
Aug 29, 2018

Conversation

vmarkovtsev
Copy link
Contributor

@vmarkovtsev vmarkovtsev commented Aug 22, 2018

This allows to serialize typing.NamedTuple-s and other crazy inheritors from list and tuple which do not preserve the original constructor signature.

We catch TypeError in the constructor and use the auxiliary list.

@stsci-bot
Copy link

stsci-bot bot commented Aug 22, 2018

Hi there @vmarkovtsev 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog. I help make sure that this pull request is milestoned and labeled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃.

Everything looks good from my point of view! 👍

If there are any issues with this message, please report them here.

@vmarkovtsev vmarkovtsev changed the title Allow subclasses of list and tuple to be serialized Allowtyping.NamedTuple to be serialized Aug 22, 2018
@vmarkovtsev vmarkovtsev changed the title Allowtyping.NamedTuple to be serialized Allow typing.NamedTuple to be serialized Aug 22, 2018
@coveralls
Copy link

coveralls commented Aug 22, 2018

Coverage Status

Coverage increased (+0.02%) to 93.864% when pulling 975503f on vmarkovtsev:patch-3 into 317a4ee on spacetelescope:master.

@drdavella
Copy link
Contributor

Hi @vmarkovtsev, thanks for the PR. I am fully in favor of supporting serialization of namedtuple. I am a little wary of the fact that true roundtrip behavior is not preserved by this change. However I'm not sure whether there's any way around it since I don't know if it will be possible to write an actual tag for these types.

Copy link
Contributor

@drdavella drdavella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, modulo the requested changes and the philosophical considerations mentioned above. Ideally we could implement a tag for these cases, but I'm not sure it's possible.

Maybe @perrygreenfield has an opinion about whether it's okay to serialize these types without preserving true roundtrip behavior?

assert isinstance(asdf.tree['val'], list)
run_tuple_test(tree, tmpdir)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is extraneous whitespace on these two lines.


def test_named_tuple_typing(tmpdir):
# Ensure that we are able to serialize a typing.NamedTuple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is extraneous whitespace on this line as well.

run_tuple_test(tree, tmpdir)


def test_named_tuple_collections(tmpdir):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a test showing that recursive serialization of namedtuple members is handled correctly? For example, you could create a namedtuple where one of the members is an np.ndarray.

@vmarkovtsev
Copy link
Contributor Author

Added more tests and removed whitespaces.

@vmarkovtsev
Copy link
Contributor Author

@drdavella ping

@drdavella
Copy link
Contributor

@vmarkovtsev I'd like some input from @perrygreenfield before merging. I have some lingering concerns over whether this is okay since it does not preserve true round-trip behavior, although maybe it doesn't matter too much.

Just for the sake of argument: what would prevent a user from performing this conversion explicitly before passing to ASDF? I realize it's probably less convenient and/or elegant, but it would put the burden of information loss on the end user rather than requiring ASDF to strip away information implicitly under the hood.

@vmarkovtsev
Copy link
Contributor Author

Yep, I see the point. On the other side, the exact subtypes of namedtuple are always defined in the source code (namedtuple and NamedTuple are abstract) and it is impossible to recover them without saving the source code - thus we arrive at pickles. But the ultimate parent of those is tuple and it is possible to recover the value after serialization because the order of elements persists.

ASDF already strips items with None values btw.

@drdavella
Copy link
Contributor

the exact subtypes of namedtuple are always defined in the source code (namedtuple and NamedTuple are abstract) and it is impossible to recover them without saving the source code

This seems like a design flaw to me, but there's not much we can do about it. Unfortunately it means we can't write a tag for these types, which would be the ideal solution.

ASDF already strips items with None values btw.

Interesting point, although I think it's really a separate issue (e.g. #536 😉).

@perrygreenfield
Copy link
Contributor

perrygreenfield commented Aug 28, 2018 via email

@drdavella
Copy link
Contributor

@vmarkovtsev maybe as a compromise for now we could add a top-level option like implicit_type_conversion to enable this behavior (defaults to False)? Also it would be useful to emit a warning when this occurs.

@@ -17,7 +17,7 @@ open_files_ignore = test.fits asdf.fits
# Account for both the astropy test runner case and the native pytest case
asdf_schema_root = asdf-standard/schemas asdf/schemas
asdf_schema_skip_names = asdf-schema-1.0.0 draft-01
addopts = --doctest-rst
#addopts = --doctest-rst
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was changed by mistake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

@vmarkovtsev
Copy link
Contributor Author

How about implicit_lists? It is only used for those types which inherit from list or tuple and do not preserve the __init__ signature. -1 to warnings because when the user sets this flag to True they know what they are doing.

@perrygreenfield
Copy link
Contributor

If we support that, why not implicit_dicts as well? I tend to agree with not requiring the warning. These mechanisms would be advertised as an easy way to save python info, but only in one direction (without the user reconstructing the objects outside of ASDF).

@vmarkovtsev
Copy link
Contributor Author

Do we have an example in the wild when this code fails in treeutil.py:

if isinstance(tree, dict):
    result = tree.__class__()
    seen.add(id_tree)
    for key, val in tree.items():
        val = recurse(val)
        if val is not None:
            result[key] = val
    seen.remove(id_tree)

All dicts have items() and __setitem__, otherwise something is really screwed up on the user's side.
I guess there will be no problems afterwards...

@drdavella
Copy link
Contributor

Sorry to waffle, but here's one more proposal: perform the conversion by default, but with a warning, and add a top-level option to suppress the warnings like ignore_implicit_conversion=False or some such.

In this case I like the generality of "implicit conversion" vs "implicit list" or "implicit dict" since it lets us overlook the dict case and other possible cases for now (although I agree with @vmarkovtsev that namedtuple is probably fairly unique).

This also parallels existing top-level options ignore_version_mismatch and ignore_unrecognized_tag. (Btw, eventually it might be nice to have some other configuration mechanism for these kinds of options).

@vmarkovtsev
Copy link
Contributor Author

I like this proposal.

@drdavella
Copy link
Contributor

@vmarkovtsev cool. I can finish up this PR if you want, as long as I'm able to push to your branch.

@vmarkovtsev
Copy link
Contributor Author

I sent you the push permission invitation just in case.

@drdavella drdavella added this to the v2.1.0 milestone Aug 29, 2018
@drdavella drdavella merged commit c44d92a into asdf-format:master Aug 29, 2018
@drdavella
Copy link
Contributor

Thanks @vmarkovtsev!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants