Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make InlineTableDict inherit from OrderedDict #86

Closed
wants to merge 1 commit into from
Closed

Make InlineTableDict inherit from OrderedDict #86

wants to merge 1 commit into from

Conversation

ryanhiebert
Copy link
Collaborator

References my review comment from #75.

I first looked at creating an analog to _dict, that I was thinking to call _InlineTableDict. However, while I was doing that it became apparent that it would end up as a pretty infectious change, which would require change to at around 4 method signatures and 8 call sites. _dict is only a parameter for load and loads, so that just didn't seem like the best approach.

My thinking with this pull request is that there is a complexity trade-off from choosing to make InlineTableDict overridable, and that's it's not worth what it costs. OrderedDict has good performance characteristics, and Python 3.6 has even made the default dict order-preserving. Combined with the relatively limited use we'd expect to see with InlineTableDict over our standard use of dict, it seemed to me to be a reasonable choice.

If you disagree with my assessment, I'm willing to open another pull request with the required changes to add an _InlineTableDict parameter to load and loads for comparison.

@ryanhiebert
Copy link
Collaborator Author

Also included in this patch is the removal of the specialized __init__ method, which I think is both unnecessary, and quite possibly unwise given it's incompatibility with the normal dict signature.

@uiri
Copy link
Owner

uiri commented Apr 7, 2017

Hi Ryan,

Thanks for the patch. I am leaning towards accepting it but I would like to get @nateprewitt's thoughts on it since he has an actual use case for the feature that this affects (per #74).

-Will

@nateprewitt
Copy link
Contributor

Thanks for taking the time to pull me in @uiri. I meant to respond to your original comments @ryanhiebert but that somehow slipped off the checklist.

So I completely agree with the removal of the __init__ declaration that was an oversight.

As for using an OrderedDict for _InlineTableDict, this current implementation doesn't work with Python 2.6. We're maintaining support for 2.6 for the immediate future in pipenv unfortunately, so if this is merged, we'll either need to fork or find another solution for toml parsing.

This is also adding functionality, which is probably fine, but will change how this library has operated for most of its existence. I'm not entirely convinced that OrderedDict is a superior choice over dict when the user can pass any custom Mapping class to _dict.

I think if toml wants to support custom mappings for Inline Tables, the ideal option here is to have InlineTableDict inherit from the value of _dict. That's a much more complicated patch though, but would be consistent. Otherwise, this makes a breaking change to move from one static type to another, lesser used, static type.

That's just my 2 cents though, feel free to do what you feel is best for the library and we'll cope. Thanks for your work @uiri and @ryanhiebert!

@ryanhiebert
Copy link
Collaborator Author

we'll either need to fork

I certainly don't want that.

What would you think about adding a dependency on ordereddict for Python 2.6? If I were the one to have designed this library (though it's likely a good thing I wasn't) I would have just done OrderedDict everywhere. Because we expect TOML files to be edited by humans, it's a fact of life that the order of the keys does matter, at least to the human. We can dictate that it doesn't matter, and can force it on computers, but humans do care when their files change in useless ways.

If you think adding the dependency is unwise, then I'd be inclined to pursue creating a dynamic subclass using _dict as a base. It feels a bit hacky to me, and I think it'd still be pretty infectious, but I can see the merit over adding another function argument.

One possible wrench in that idea is whether we see InlineTableDict as an interface to the library. I'm inclined to think it is, rather than just being a detail of how things work when a toml file is opened and written both by this library. Overall, I just don't care for all the subclassing, and would rather eliminate the need for it (except internally. I don't see a better way then the sentinel InlineTableDict class for that purpose).

@uiri
Copy link
Owner

uiri commented Apr 9, 2017

As for using an OrderedDict for _InlineTableDict, this current implementation doesn't work with Python 2.6.

I want to maintain as much backwards compatibility as is reasonable. I have modified the .travis.yml file to that end.

Ryan, I rebased your PR commit as 48657a9 and, predictably, it fails on Python 2.6

What would you think about adding a dependency on ordereddict for Python 2.6?

I would rather avoid that if possible. Currently, the library has no dependencies outside of the Python standard library and I would prefer to keep it that way.

Using _dict as a base seems ideal (see 797caaa). The use of {} instead of _dict in _load_value was originally an oversight on my part when the code for inline tables was added. It is better to fix it and pass _dict throughout the code related to loads for consistency's sake.

Thank you Ryan for pointing out the issues with the existing InlineTableDict class and for your efforts to improve the code.

@ryanhiebert
Copy link
Collaborator Author

I'll work under the assumption that you already know it, but to be sure, I want to clarify that the dependency would only affect Python 2.6. It would not be installed for any other version.

What I'm thinking, then, is that InlineTableDict will end up being more of an abstract base class (not really, since it'll be fine for 3rd parties to use it directly, but we won't), and we'll use a dynamically created subclass of both _dict and InlineTableDict for inline tables. InlineTableDict needs to be a public interface, or else the power of InlineTableDict is only available to this library internally, which I think is less than ideal. It'll be a bit of a messy change, but it's certainly doable.

I'll pull out the uncontroversial fixup for InlineTableDict.__init__ method into a separate pull request.

@uiri
Copy link
Owner

uiri commented Apr 11, 2017

What I'm thinking, then, is that InlineTableDict will end up being more of an abstract base class (not really, since it'll be fine for 3rd parties to use it directly, but we won't), and we'll use a dynamically created subclass of both _dict and InlineTableDict for inline tables.

Just to note, I already made the changes relating to dynamic subclassing that you have suggested (in a separate branch).

InlineTableDict needs to be a public interface, or else the power of InlineTableDict is only available to this library internally, which I think is less than ideal.

My comments below reflect my understanding of what you mean by "the power of InlineTableDict" but any clarifications are most welcome.

I'm not really sure that InlineTableDict is (or needs to be) a public interface. I view it as an implementation detail of the preserve flag. We could just as easily dynamically subclass _dict to add an is_inline method which returns True/False accordingly. The details don't particularly matter so long as toml.dumps(toml.load(file_obj), True) faithfully prints out inline tables as they existed in the original TOML file.

I do think that downstream users of the library may wish to know whether or not a given dictionary is an inline table, but that exposing that in the form of internal implementation details (in this case, a sentinel class) is less than ideal. I'm not sure what the best way to expose it is. I would prefer to take a more functional approach but I'm also concerned about extensibility beyond simply exposing information about inline tables. #77 comes to mind as something which would be better handled on an as-needed basis by a user of the library.

The API exposed by the library right now is rather simple and functional (in the sense of not particularly object oriented): load, loads, dump and dumps. Just like the TomlDecodeError and TomlTz classes, the InlineTableDict class is an interface to the library only insofar as the library may expose these classes to its users as part of a returned value (or in raising an Exception). As a toml user, I would not rely too much on anything that exists in those classes but not their respective parent classes.

To take a cue from Python's json library, I think that creating TOMLEncoder and TOMLDecoder classes is one option to provide the flexibility that some users may need from this library. However, I think that such a change should be very well thought out. This is most likely outside of the scope of this particular PR.

@ryanhiebert
Copy link
Collaborator Author

The approach that I saw in that branch looks good to me. I'd recommend leaving dict as the base class, unless you have something specific driving the choice to make it something else.

In functional programming you'll tend to really good uses of data types. I think that InlineTableDict as a sentinel subclass is a good example of that. I think that I'd use something similar if I were writing this in Haskell, for example.

If this is not a public interface, then consumers of this library are not able to write a new toml file, without reading an old one, and have it use any inline tables. I think that there should be a way for users that don't need load or loads at all to still be able to make inline tables with dump and dumps. From there, it seems natural to expose InlineTableDict as the way for users to do it.

I like the functional approach, so the Encoder/Decoder classes don't jive well with my own thoughts. However, I've not used the json equivalents except in the most trivial of ways (just copy-pasting, basically), so I may not be a good judge of whether they are indeed the better approach.

I think at this point this particular PR is dead, and your work on that other branch should be how we implement what this desires, so I'm gonna go ahead and close it.

@ryanhiebert
Copy link
Collaborator Author

The approach that I saw in that branch looks good to me. I'd recommend leaving dict as the base class, unless you have something specific driving the choice to make it something else.

Oh, and even if you decide that you have valid case for not inheriting from dict, make sure it at least inherits from object, to ensure we're using a new-style class there in Python 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants