-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
archive/zip: sanitize the FileHeader.Name to remove path traversal ("../../") from zip files? #25849
Comments
I'm of the opinion that it's the users responsibility to sanitize the path, as I have seen some crazy (and very rare) use-cases in the past actually depending on this behavior. We should document this though. As a data point, a popular package that wraps the standard archive package is |
Hmmm, as a counter-point, it seems that the logic in |
I think we should be safe by default but provide access to the unsanitized value for the "crazy" use cases you mention. |
One proposal is to return a distinguishable error when an invalid path is encountered, but still parse out (or write) the full header. Odd use cases can check for that specific semantic error and ignore it, but most users will be protected. |
+1 on sanitizing FileName and adding InsecureFileName. Looks like this applies to archive/tar as well. We'll also have to make absolute paths relative. However, note that it's not enough to prevent all path traversal attacks. For example, allowing symlinks with arbitrary targets (like See https://blog.filippo.io/so-i-lost-the-password-of-my-nas/ section 3 for example. |
And it was already reported against github.com/mholt/archiver: mholt/archiver#65 (comment) |
I like that. It'd mean only adding a new error variable (or type) to the package, and some docs on two functions.
This is more work, but not terrible. We could track them all and still detect when another name would reference a path element with a link. Without that, though, fortunately the number of affected callers is probably fewer because you need to go out of your way to support symlinks. If the caller program is, say, a webapp letting users upload a zip of JPEGs, they probably didn't add in the symlink creation support. Let's start with some more documentation for Go 1.11. |
Change https://golang.org/cl/118335 mentions this issue: |
I'm not comfortable with doing this for archive/tar. Absolute pathnames are common usage there and writing to /etc/passwd is often intended. Returning an error where there was none before is going to break existing programs. |
I probably should not have open-sourced my archiver package, since I only use it to compress, and only in non-adversarial environments. Sorry if it caused any trouble. (I don't have much time to maintain it lately, so if anyone wants to do so, I'll make you a collaborator.) Thanks for thinking of this aspect of security, though, from the standard library perspective. Admittedly it's hard to do path sanitization right, since there are some use cases, as others have mentioned, where "unsafe" sanitization may be desired. And I keep getting path cleanup wrong myself (did at least 3 times in Caddy too). Whatever you decide, I vote for sane defaults with the option to shoot yourself in the foot if need be, with strongly-worded warnings in the godocs. :) |
Updates #25849 Change-Id: I09ee928b462ab538a9d38c4e317eaeb8856919f2 Reviewed-on: https://go-review.googlesource.com/118335 Reviewed-by: Joe Tsai <[email protected]>
Updates golang#25849 Change-Id: I09ee928b462ab538a9d38c4e317eaeb8856919f2 Reviewed-on: https://go-review.googlesource.com/118335 Reviewed-by: Joe Tsai <[email protected]>
/cc @mikesamuel too |
It seems that there are 2 use cases:
IIUC, it seems rare that part of a tarball would be trusted more than another part. If so, we probably know this when we open the tarball. Instead of adding more file information to individual entries, could we treat this as an optional is self contained check? An individual entry might be treated as invalid if the tarball is supposed to be self contained and either
|
How hard is the symlink problem to disentangle? Resolving symlink targets might require isDirectory checks so it might require accessing multiple entries to tell whether a symlink entry reaches outside the directory to which entry That means that you have to detect and reject symlink cycles. Are isDirectory checks the only kind of transitive analysis required? isDirectory becomes easier to reason about if you also assume that symlink targets that don't map to any entry are not directories. Otherwise, you might have to do a worst-case analysis. I did some experimenting with symlink chains below. I think this means that we can do self-containedness checks even where an attacker can cause two or more tarballs to untar into the same containing directory since an attacker doesn't defeat any analysis by constructing a symlink chain where even elements are from one tarball and odd from another. A symlink chain that reaches past `.` must have one link that reaches past `.`Question: Given a tarball with a symlink structure like the below, would untarring into a/a/a/a/a -> ../../../etc/shadow The answer seems to be no. $ mkdir /tmp/ln-exp; \
cd /tmp/ln-exp/; \
mkdir -p a/a/a/a/; \
mkdir a/etc/; \
echo 'self-contained' > a/etc/shadow; \
cd a/a/a/a; \
ln -s ../../../etc/shadow a; \
cd -; \
ln -s a/a/a/a/ b; \
ln -s b/a c; \
cat c
self-contained
$ find . \( -type l -o -type f \) -exec ls -l {} \;
-rw-r--r-- 1 msamuel wheel 15 Jun 25 14:40 ./a/etc/shadow
lrwxr-xr-x 1 msamuel wheel 19 Jun 25 14:40 ./a/a/a/a/a -> ../../../etc/shadow
lrwxr-xr-x 1 msamuel wheel 3 Jun 25 14:40 ./c -> b/a
lrwxr-xr-x 1 msamuel wheel 8 Jun 25 14:40 ./b -> a/a/a/a/ |
I'd like to see an answer that covers archive/zip, archive/tar, and anything else in a consistent way. We can take our time. This is a very very very old "problem". |
How about this: archive/tar
archive/zip
|
Separate "NewTrustedArchiveReader" etc seems like overkill. A new InsecureName or RawName, as proposed above, combined with sanitizing the current Name field, seems like enough. But then you need to define what "sanitized" means. Is it just ".." or do we care about other things, like files named COM1 or with backslashes or ... ? |
I'm not a fan of duplicated header fields.
The more I think about this, I like my suggestion of a non-fatal distinguishable error (#25849 (comment)). |
This seems OK to leave until Go 1.13. |
Note: this issue was referenced in https://blog.ryotak.me/post/cdnjs-remote-code-execution-en/, and if my reading of the blog post is correct, the current archive/tar behaviour is what enabled the exploit. |
Anybody interested in this, please see the proposal at #55356. Please comment on that proposal if it would not help. Thanks. |
Change https://go.dev/cl/449937 mentions this issue: |
Go isn't directly affected by path traversal attacks in archives but programs written in Go might be.
In particular, if a Go program reads a malicious zip file, the archive/zip package will return the malicious filename in the
FileHeader.FileName
.Perhaps we should sanitize it?
If we really need the raw/unsanitized version, we could copy it to a new field
FileHeader.InsecureFileName
orRawFilename
or something./cc @dsnet @ianlancetaylor @FiloSottile @andybons @rsc
The text was updated successfully, but these errors were encountered: