-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
path/filepath: add IsLocal #56219
Comments
I like the idea but when I saw the issue title, it's purpose wasn't at all clear to me from the name "Sanitize". It's not about cleanliness (we already have "Clean") but security. Maybe something as simple as "Safe", but with a name like that perhaps it should consider symbolic links. Perhaps it should anyway. A conundrum. |
I'm not sold on the name, although "Sanitize" does have the connotation of being more than just clean. ("To reduce or eliminate pathogenic agents", says Merriam-Webster.) Symbolic links are an interesting question. If we do consider them, then we need to know the base path that the untrusted path is relative to. Not all users will want to avoid symbolic links, either; an archive extractor writing to an output file wants to avoid writing through symlinks, but a web server probably wants to follow links in the directory it is serving from. In some cases, there might be a race condition between checking for the presence of a link and writing to the file. |
I lean towards not handling symlinks. Besides the raciness that neild mentions, once the program opens the sanitized path, there's other edge cases that could possibly cause trouble, so it's hard to guarantee complete safety:
I think this API would still be safe for a fairly common use case: extracting files from an archive (like a txtar) into a new directory. In that case, you don't have to worry about symlinks. As for the name: I understand API as "the file path is relative and a child". I agree with Rob that I'm not a big fan of |
I'm not a fan of the implicit “under the current directory” behavior — there are lots of cases where I'd like to restrict a path to be within some other working directory (such as the temporary directory returned by I'm also not keen on the “retains a trailing Separator if any” caveat — it's inconsistent with Perhaps instead, we could have variants of // JoinInDir joins any number of path elements to dir provided that
// the result is lexically still within dir. Empty elements are ignored.
//
// If the resulting path is not lexically within dir, JoinInDir returns a non-nil error.
func JoinInDir(dir, elem ...string) (string, error)
// RelInDir returns a path that is lexically equivalent to targpath when joined to dir,
// provided that targpath is lexically within dir.
// That is, JoinInDir(dir, RelInDir(dir, targpath)) is equivalent to targpath itself.
//
// An error is returned if targpath can't be made relative to basepath, if knowing
// the current working directory would be necessary to compute it, or if
// the resulting path would not be within dir when joined to it.
func RelInDir(dir, targpath string) (string, error) Then, if needed, something very close to func Sanitize(path string) (string, error) {
if filepath.IsAbs(path) {
cwd, err := os.Getwd()
if _, err := filepath.RelInDir(cwd, path); err != nil {
return "", fmt.Errorf("%s is absolute", path)
}
return filepath.Clean(path)
}
return filepath.JoinInDir(".", path)
} |
To restrict a path to some other working directory, you join the sanitized path to that directory:
One distinction between this and But I agree that it's very common to want to safely append an untrusted path to a trusted one, so perhaps the right operation here is a safer join. I do think the function name should convey a sense that it's more secure than the alternatives. |
For relative paths, maybe there's a tree-reflexive property, where joining a relative path with a prefix only results in targeting the subset of nodes that are the child nodes of the prefix. The mental hiccup is assuming tree-reflexivity when that's not a general property of relative paths. I think coercing the tree-reflexive property (or an error) from a relative path can be seen as a unary operation on a relative path, or a binary operation on a relative path with nothing.
Still, |
ISTM, most (all?) cases where you would want to use this involve os paths (since an fs.FS is rooted anyway) and joining to a base. I'm not sure when you would use path.Sanitize and not immediately turn around and do filepath.Join. So maybe it should just be |
I'm not sold on the |
I think there are use cases where it's "eventually" or even "never", rather than "immediately". For example, a CLI tool taking a path as an argument might not |
Thinking more about symlinks, filename sanitization is the wrong time to check for links. Links need to be checked for atomically at file open time, to avoid TOCTOU bugs. Possibly this indicates that we should have a function or functions in the One issue with Would it be reasonable for
|
It should be an error if it escapes, regardless of how it does so. But |
What if the base directory is a secret for some reason, then an attacker sends absolute paths until it finds one that succeeds? ISTM, it's safer to just fail any absolute paths in the paths parameter. |
I believe we have two possible APIs at this point. One is a
The other possibility is a
Either of these functions may be trivially implemented in terms of the other.
Most use cases for sanitized paths will join an untrusted path to a trusted base directory, so I lean slightly towards |
Good summary. My preference is for SafeJoin. It feels like the more common case to me, and the names "Sanitize" vs. "Clean" are too similar and easy to get backwards. |
Could you give some more detail on that? It seems like the security property we want from (And, for that matter, users who are familiar with |
That retains this security property, but it introduces a silent rewrite of an unsafe path into a safe-but-different form. I think that it's much clearer to be explicit and to never attempt to transform a path. For example, I think it's much better for a naive tar extractor to produce an error when trying to extract a file named This also provides better behavior and consistency on Windows: If a user does want to convert absolute filenames to relative ones, this can be trivially achieved by prepending |
There's a well-summarized idea section from Kubernetes around a vulnerability in volume mount paths concerning subpaths - another version of a problem where It looks like their solution purely working off of path text, not considering symlinks, relies on a The solution for safely mounting with symlinks apparently justified an entire package itself. I think I prefer |
An interesting question is what should be done with a relative path of
I can think of arguments for either accepting or rejecting Right now, I lean towards rejecting |
I don't think I agree with that. Rejecting a path that the user might have wanted will encourage them to use a different function instead, which carries its own risks. I think the specification for the functions is clearer if they allow a relative path of |
Another possible API:
|
I like the functionality of |
I don't understand I'm more partial to |
There is no universal definition of "safe", but there is a category of paths which reference locations which are lexically contained within some root. Perhaps there's a better name for these paths than "safe" or "sanitized".
I think the name should convey a sense that this is a function one uses on untrusted paths to make them trustworthy (within limits).
I think there's a good argument to be made for a function like It's worth noting that of the CVEs stemming from the lack of filename sanitization in |
Yes, that is what I was trying to get at with |
UnrootedPath? |
That missing "not" explains a lot! I'll think about wording. |
As suggested in my comment on the CL, I feel NonLocal is a much clearer and evocative name for what it does. |
|
I've updated https://go.dev/cl/449239 to:
Edit: s/any/all/ |
s/any/all/ ? |
This looks like an improvement except for the mentions of "current directory". And I suggest changing the final paragraph to: // IsLocal is a purely lexical operation. In particular, it does not account for |
Now:
|
Now that there is "IsLocal is a purely lexical operation" you can remove "lexical" from the beginning:
|
No change in consensus, so accepted. 🎉 |
IsLocal reports whether a path lexically refers to a location contained within the directory in which it is evaluated. It identifies paths that are absolute, escape a directory with ".." elements, and (on Windows) paths that reference reserved device names. For #56219. Change-Id: I35edfa3ce77b40b8e66f1fc8e0ff73cfd06f2313 Reviewed-on: https://go-review.googlesource.com/c/go/+/449239 Run-TryBot: Damien Neil <[email protected]> Reviewed-by: Joseph Tsai <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> Reviewed-by: Joedian Reid <[email protected]>
Change https://go.dev/cl/451657 mentions this issue: |
CreateFile creates a handle to the console input or screen buffer when opening a file named CONIN$ or CONOUT$: https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea#consoles Detect these paths as non-local. For #56219. Change-Id: Ib09e76a110d6ec09aef8038074b9bcbae09d00d7 Reviewed-on: https://go-review.googlesource.com/c/go/+/451657 Run-TryBot: Damien Neil <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Bryan Mills <[email protected]>
IsLocal will be in 1.20. |
It is common for programs to accept filenames from untrusted sources. For example, an archive extractor might create files based on names in the archive, or a webserver may serve the content of local files identified by a URL path. In these cases, the program should usually sanitize the untrusted filenames before use. (See #55356 for a list of vulnerabilities caused by using unsanitized archive paths.)
We should provide a simple path sanitization function which accepts an untrusted input and returns something reasonably safe.
Examples:
https://go.dev/play/p/EDzG8D15Zed contains a sample implementation.
The text was updated successfully, but these errors were encountered: