-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python API: Make the API Python 2 & 3 compatible #324
Conversation
This could be abstracted rather than omitting lxml.etree entirely. It is relatively simply to hide both lxml.etree and xml.etree.ElementTree behind an XML abstraction layer, at least for most typical operations such as building a tree. However lxml.etree provides much better xpath support, amongst other things, and as you say it is faster, and is most likely the future of the etree API. Can you elaborate on how installing the 'future' set might help here? |
@DavidAntliff I was originally unsure if introducing two different libraries in this case makes sense, I imagine the parsed/created XMLs are small enough for the differences to not matter, but I can do something along the lines of: try:
import lxml.etree as ElementTree
except ImportError:
import xml.etree.ElementTree as ElementTree The only issue that it introduces additional complexity of using two different libraries, but the APIs are compatible so this would work. However, that keeps lxml and gives a fallback in case it's unavailable. About the future package, that's needed for abstracting Python 2/3 differences. At the moment it's for only one case - iteritems is renamed to items in Python 3, however items in Python 2 behave differently than in Python 3. I imagine this will only grow as the API grows, hence the package. |
@mtusnio import of future.utils is failing - see travis-ci log. Does this need a change to |
@DavidAntliff Updated the PR with the ci/requirements.txt changes (as well as the import changes), I imagine the docker containers need to be rebuilt with this change though? |
@mtusnio the docker containers are rebuilt automatically by Jenkin's jobs. Travis, however, isn't happy because (perhaps by accidental omission) it's not using requirements.txt. I'll submit a PR to fix that. Once that's accepted, you can update this PR and the travis job should work better. |
Please wait for #326 to be merged. |
@mtusnio seems the "try,except" importing will introduce an issue of different api from the two libraries, such as in "etree.tostring(e_root, pretty_print=True)" , where xml.etree.ElementTree doesn't have a pretty_print ? |
This is a valid concern and why I suggested hiding them behind a wrapper. Then a consistent API can be presented that can gracefully handle missing functionality in the smaller library. |
The pretty print is gone in this version, it was just one call. I think in this case doing an abstraction layer seemed like a bit of an overkill, I wouldn't mind doing that if more discrepancies show up. |
I don't think it's overkill when you consider that it's intended to provide some discipline to what is an open-source project with potentially many contributors, as it will help ensure that contributors consider the implications of adding new functionality that may not be supported by one of those libraries. The wrapper can be really, really simple (just a delegation layer in most cases) and it really just ensures that all code using XML functions goes through a common layer, where any discrepancies, incompatibilities and adjustments can be made, now or later. If it's not done early, and used throughout, it can be a nightmare to add later as client code may rely on library-specific nuances. Better to even the playing field early and work with a consistent interface. Use case: someone wants to add a new feature that uses something from lxml (e.g. enhanced xpath). If this layer doesn't exist, they naturally implement their feature with lxml directly, discover that etree doesn't support it, and then write some code in-place to deal with the incompatibility in etree. Hopefully they realise that someone else may have already done this elsewhere, and go looking for it first. Hopefully they find it and don't rewrite such a translation. Hopefully. Otherwise maybe someone else will do that work for them later. Maybe. OR they see that all the XML code in the project is using a consistent wrapper, and look through there for an existing compatibility wrapper for "xpath", and just use that, knowing that it's tested ((non-trivial wrapper/compatibility code should definitely have tests). |
@DavidAntliff I definitely don't disagree with any of that, my point was only that it's a major piece of work to do and adding an abstraction layer at that point would be an overkill in the sense of work required to accommodate the pretty_print change. For future changes it definitely isn't, however there's in general work that can be done around the API to add responses/requests, clean up bits and pieces and add functionality. I'm hoping to do some of that, see about adding the abstraction layer, however, as always, time is the limit. :) In this case I wouldn't have been able to replace the whole thing with an abstraction layer and the question is whether it's better to then not do any changes, or enable Py2/Py3 compatibility as well as LXML/ElementTree support while keeping all of this in mind. To sum up, I don't disagree at all with what you are proposing, I just think that it was worth adding this functionality in this state and, if any PRs come expanding the API, start thinking about a switch. |
@mtusnio you could add the layer "as-you-go" - from memory at the moment there's only a handful of XML-related functions, so they are easily moved behind a wrapper, and then if you find yourself wanting to use something from lxml that isn't in etree, you can then write a more complex wrapper at that point. So you only need to wrap what is actually used - no need to analyse both libraries up-front and write a full wrapper. Thank you for taking the time to work on the project :) |
That sounds good, I'll give it a go once I start pushing my project forward again. :) |
Based on #305
Removes the requirement for lxml, while also adding a requirement for the future package (only for iteritems for now). ElementTree is significantly slower than lxml, however it makes for an easier use out of the box on the Ci40. Future package can be easily installed via 'pip install future'