-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-24132: Add pathlib._AbstractPath
#31085
bpo-24132: Add pathlib._AbstractPath
#31085
Conversation
The `Path` class is now an *implementation* of `_AbstractPath`; its methods call functions in `os.path`, `io`, `pwd`, etc, whereas the corresponding methods in `_AbstractPath` instead raise `NotImplementedError`.
Check the presence of the necessary `os`, `pwd` and `grp` functions at import time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love the overall idea of this PR, but I'm afraid I think trying to cram everything into one ABC is the wrong approach -- the interface still feels a little confused to me.
For example -- I'm a static-typing guy, and I have no idea how we'd type AbstractPath.absolute()
in typeshed. For a class that override AbstractPath.cwd()
, AbstractPath.absolute()
returns an instance of that class. But subclasses of AbstractPath
can't be guaranteed to override AbstractPath.cwd()
, because it's not an abstractmethod
, because it's not part of the core interface. Which means that we couldn't include this method in the typeshed stub for AbstractPath
at all, because it would be unsafe for an instance of an AbstractPath
subclass to ever call absolute()
. So, why is absolute()
in AbstractPath
at all?
I'm using cwd()
and absolute()
as an example, but I think this is a broader problem for all of the methods in AbstractPath
that raise NotImplementedError
but are not abstractmethods
(and, by extension, all of the mixin methods that call methods that might-or-might-not be implemented).
I would counsel splitting AbstractPath
up into several smaller ABCs. In typeshed we could pretend that these are PEP 544 protocols, in much the same way we do with os.PathLike
, which is an ABC at runtime but is considered a Protocol
by static type checkers.
Instead of having a structure like this:
class AbstractPath(PurePath, ABC):
# core interface
@abstractmethod
def iterdir(self): ...
@abstractmethod
def stat(self, *, follow_symlinks=True): ...
@abstractmethod
def open(self, mode='r', buffering=-1, encoding=None,
errors=None, newline=None): ...
@classmethod
def cwd(cls): ... # Not abstract, but raises NotImplementedError
def absolute(self): ... # Depends on cwd() being implemented
def resolve(self): ... # Not abstract, but raises NotImplementedError
class Path(AbstractPath): ...
You could instead have something like this:
class AbstractBasePath(PurePath, metaclass=ABCMeta):
"""I represent the minimum requirements for a class to implement a pathlib-esque interface"""
@abstractmethod
def iterdir(self): ...
@abstractmethod
def stat(self, *, follow_symlinks=True): ...
@abstractmethod
def open(self, mode='r', buffering=-1, encoding=None,
errors=None, newline=None): ...
class ResolveablePathMixin(metaclass=ABCMeta):
"""Mixin ABC for paths that can be resolved in some sense"""
@classmethod
@abstractmethod
def cwd(cls): ...
def absolute(self): ... # Concrete method that calls cwd()
@abstractmethod
def resolve(self): ...
class AbstractPath(AbstractBasePath, ResolveablePathMixin):
"""I represent the full interface for a pathlib Path"""
# This class body would be completely empty
# The only purpose of this class would be to accumulate
# all of the abstractmethods from AbstractBasePath
# and all of the mixin classes
class Path(AbstractPath):
# This class body would be filled with concrete implementations
# of all the abstract methods accumulated in `AbstractPath`
In this way, users could decide whether they want to implement just the core interface, by subclassing AbstractBaseClass
; the full pathlib
interface, by subclass AbstractPath
; or something custom, by using multiple inheritance to subclass AbstractBaseClass
and one of the smaller mixin classes at the same time.
(You're far more knowledgeable about pathlib
internals than I am, so: apologies if I've got any of the pathlib
-specific details wrong here!)
I would look to the
|
Thank you so much Alex, that's great advice. I need to give this idea some serious consideration and work out exactly which ABCs would be involved. My kneejerk concerns about this are as follows:
|
I've logged bpo-46733 to address the existing misuse of |
Update on this PR: I've reduced the number of abstract methods substantially! Now only On the python-dev mailing list, Brett Cannon suggested this could be a protocol rather than an ABC. My only concern there is that the protocol would be pretty extensive, because the PurePath + _AbstractPath APIs are pretty extensive, even with methods like |
I think the interface is much clearer now — thank you! I won't formally "approve" the PR, because I just don't know enough about pathlib, but the overall design of the class looks really strong to me now. W.r.t. ABCs versus protocols — I agree that it seems too big to be a protocol. Additionally, a protocol cannot inherit from a concrete class, so you'd have to change the inheritance structure if you wanted to make it a protocol ( The only other piece of feedback I have is that I agree with Ethan Furman's comments on python-dev — I'd make the class public (name it |
Thanks very much Alex. On making this
What I'll do instead is withdraw my recommendation for limited experimentation by third parties - I'll adjust the PR description and the |
The real question is whether the API will be changing at all, or does @barneygale have some other changes planned which would influence this? Once this is made public there's no going back without a lot of pain, so we should be thoughtful about how the API will look and what is going to be asked of users. |
I don't want to commit to freezing the interface just yet. But it's not too far away - I think we'll be able to do it in 3.12 or 3.13. Even without making |
That makes sense to me -- and it's good to have that clarified! :) |
My current thinking (while I work through my PR review backlog to reach this PR), is this should be used for |
Yep, will do! That helps clarify the order of things a lot actually. I think we need to do these things first (in this order):
I'm going to mark this PR as a draft for now, will circle back when I've solved those. We'll be able to drop the underscore prefix then, too! |
@barneygale do you still want to wait to have this PR reviewed until the API can go public? Or can it be reviewed as-is and the API kept private for now and consider this PR as cleaning up code? I'm trying to clean up my PR review queue and so if this is going to stay in draft because you only want this merged if the API can go public I will unassign myself until it's ready. |
Please go ahead and unassign yourself! I'd like to make |
@barneygale Do I understand correctly that the aim of this change is to make something similar to pathlab somewhat easier to do? I.e. Allowing pathlib capabilities to be used outside of regular system paths and extend it to any kinds of paths that share similar semantics. If so, I'd love to help with it, if you need any help. |
Yep that's exactly right! Could be archived files (e.g. in |
Once #100479 is resolved, I'll resume work on this PR. |
I'll resume work on this once #100481 lands. |
Contrary to my earlier comments, I'm going to abandon this PR and continue the work elsewhere. Specifically, I'm planning to add a I'm going to leave |
After a couple of years of mostly deleting things in pathlib, I'm glad to finally offer an addition!
This PR adds a new
_AbstractPath
class that sits betweenPurePath
andPath
in the class hierarchy._AbstractPath
objects have all the same methods asPath
objects, but their filesystem-accessing methods consistently raiseNotImplementedError
.(updated Feb 15th: now, only
stat()
,open()
anditerdir()
are abstract; other methods that directly access the filesystem are not part of the_AbstractPath
interface).This class will form the basis of a future public-facing abstract class that can be used by the likes of s3path and cloudpathlib. It could also be used to make
zipfile.Path
objects fully pathlib-compatible (no missing methods!)Why is this underscore-prefixed?
I think this needs some time to gestate in CPython before we write full docs/tests and remove the prefix. I'd make an appeal to the authors of pathlib-y packages on PyPI to try it in an experimental branch and let us know where the pain points are.Three or so roadblocks remain before we can document and recommend it. See this commenthttps://bugs.python.org/issue24132