archive/zip: sanitize the FileHeader.Name to remove path traversal ("../../") from zip files? #25849

bradfitz · 2018-06-12T17:59:35Z

Go isn't directly affected by path traversal attacks in archives but programs written in Go might be.

In particular, if a Go program reads a malicious zip file, the archive/zip package will return the malicious filename in FileHeader.Name.

Perhaps we should sanitize it?

If we really need the raw/unsanitized version, we could copy it to a new field FileHeader.InsecureFileName or RawFilename or something.

/cc @dsnet @ianlancetaylor @FiloSottile @andybons @rsc
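For programs that extract archives today, one common defense is to join each entry name onto the destination directory and reject any result that lands outside it. A minimal sketch, assuming a Unix-style destination and a hypothetical helper named safeJoin (it is not part of archive/zip):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin is a hypothetical helper, not part of archive/zip. It joins an
// untrusted entry name onto dest and returns an error if the cleaned result
// would fall outside dest.
func safeJoin(dest, name string) (string, error) {
	target := filepath.Join(dest, name) // Join also cleans "." and ".." elements
	rel, err := filepath.Rel(dest, target)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("entry %q escapes %q", name, dest)
	}
	return target, nil
}

func main() {
	for _, name := range []string{"photos/cat.jpg", "../../etc/passwd"} {
		p, err := safeJoin("/tmp/out", name)
		fmt.Printf("name=%q path=%q err=%v\n", name, p, err)
	}
}
```

This check is purely lexical. It does not follow symlinks that already exist under the destination, so it only addresses the "../" case raised in this comment.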

bradfitz · 2018-06-12T18:04:06Z

/cc @mholt too, author of github.com/mholt/archiver (pointed out by @dsnet), which properly sanitizes paths on its own. But maybe that shouldn't be its job.

dsnet · 2018-06-12T18:04:32Z

I'm of the opinion that it's the user's responsibility to sanitize the path, as I have seen some crazy (and very rare) use-cases in the past actually depending on this behavior. We should document this though.

As a data point, a popular package that wraps the standard archive package is github.com/mholt/archiver, which handles it properly.

dsnet · 2018-06-12T18:05:37Z

Hmmm, as a counter-point, it seems that the logic in archiver was added precisely because of zip-slip:
mholt/archiver#65

bradfitz · 2018-06-12T18:09:37Z

I think we should be safe by default but provide access to the unsanitized value for the "crazy" use cases you mention.

dsnet · 2018-06-12T18:18:30Z

One proposal is to return a distinguishable error when an invalid path is encountered, but still parse out (or write) the full header. Odd use cases can check for that specific semantic error and ignore it, but most users will be protected.
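A rough sketch of what a non-fatal, distinguishable error could look like from the caller's side. Everything here is hypothetical: errInsecurePath, the header type, and parseHeader are stand-ins for illustration, not the archive/zip or archive/tar API. The point is that the header is still returned in full alongside the error.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// errInsecurePath is a hypothetical sentinel error for this sketch.
var errInsecurePath = errors.New("archive: insecure file path")

// header is a stand-in for zip.FileHeader or tar.Header.
type header struct{ Name string }

// parseHeader always returns the fully populated header. When the name is
// empty, absolute, or climbs out via "..", it also returns errInsecurePath.
func parseHeader(rawName string) (*header, error) {
	h := &header{Name: rawName}
	clean := path.Clean(rawName)
	if rawName == "" || path.IsAbs(rawName) || clean == ".." || strings.HasPrefix(clean, "../") {
		return h, errInsecurePath
	}
	return h, nil
}

func main() {
	for _, raw := range []string{"img/cat.jpg", "../../etc/passwd"} {
		h, err := parseHeader(raw)
		switch {
		case errors.Is(err, errInsecurePath):
			// Most callers stop here; unusual callers may choose to continue with h.
			fmt.Printf("skipping %q: %v\n", h.Name, err)
		case err != nil:
			fmt.Println("fatal:", err)
		default:
			fmt.Printf("extracting %q\n", h.Name)
		}
	}
}
```

Callers that treat every non-nil error as fatal are protected by default, while the odd use cases can match the sentinel and carry on.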

FiloSottile · 2018-06-12T18:26:18Z

+1 on sanitizing FileName and adding InsecureFileName. Looks like this applies to archive/tar as well. We'll also have to make absolute paths relative.

However, note that it's not enough to prevent all path traversal attacks. For example, allowing symlinks with arbitrary targets (like ../../) in a tar archive can still result in an escape even if filenames are clean. Sanitizing symlink targets might break more things than filenames.

See https://blog.filippo.io/so-i-lost-the-password-of-my-nas/ section 3 for example.

FiloSottile · 2018-06-12T18:29:49Z

And it was already reported against github.com/mholt/archiver: mholt/archiver#65 (comment)

bradfitz · 2018-06-12T19:25:05Z

> One proposal is to return a distinguishable error when an invalid path is encountered, but still parse out (or write) the full header.

I like that.

It'd mean only adding a new error variable (or type) to the package, and some docs on two functions.

> However, note that it's not enough to prevent all path traversal attacks. For example, allowing symlinks with arbitrary targets (like ../../) in a tar archive can still result in an escape even if filenames are clean. Sanitizing symlink targets might break more things than filenames.

This is more work, but not terrible. We could track them all and still detect when another name would reference a path element with a link. Even without that, fortunately, the number of affected callers is probably small, because you need to go out of your way to support symlinks. If the caller program is, say, a webapp letting users upload a zip of JPEGs, they probably didn't add in the symlink creation support.

Let's start with some more documentation for Go 1.11.

gopherbot · 2018-06-12T19:34:19Z

Change https://golang.org/cl/118335 mentions this issue: archive/zip: warn about FileHeader.Name being unvalidated on read

slrz · 2018-06-12T19:43:55Z

> +1 on sanitizing FileName and adding InsecureFileName. Looks like this applies to archive/tar as well. We'll also have to make absolute paths relative.

I'm not comfortable with doing this for archive/tar. Absolute pathnames are common usage there and writing to /etc/passwd is often intended. Returning an error where there was none before is going to break existing programs.

mholt · 2018-06-12T22:59:13Z

I probably should not have open-sourced my archiver package, since I only use it to compress, and only in non-adversarial environments. Sorry if it caused any trouble. (I don't have much time to maintain it lately, so if anyone wants to do so, I'll make you a collaborator.)

Thanks for thinking of this aspect of security, though, from the standard library perspective. Admittedly it's hard to do path sanitization right, since there are some use cases, as others have mentioned, where "unsafe" sanitization may be desired. And I keep getting path cleanup wrong myself (did at least 3 times in Caddy too).

Whatever you decide, I vote for sane defaults with the option to shoot yourself in the foot if need be, with strongly-worded warnings in the godocs. :)

Updates #25849
Change-Id: I09ee928b462ab538a9d38c4e317eaeb8856919f2
Reviewed-on: https://go-review.googlesource.com/118335
Reviewed-by: Joe Tsai

odeke-em · 2018-06-22T08:04:09Z

/cc @mikesamuel too

mikesamuel · 2018-06-22T12:23:41Z

It seems that there are 2 use cases:

- We have a tarball from a source that we trust with arbitrary current-user file-system access.
- We have a tarball of which we are suspicious but we can interact with carefully as long as it's self-contained.

IIUC, it seems rare that part of a tarball would be trusted more than another part.

If so, we probably know this when we open the tarball.

Instead of adding more file information to individual entries, could we treat this as an optional "is self-contained" check?

An individual entry might be treated as invalid if the tarball is supposed to be self contained and either

- unpacking the tarball into an empty subdirectory would place the entry's content outside the subdirectory, or
- an entry has the symlink bit set and unpacking the entry would cause the referent (whether or not it currently exists) to be outside the directory.
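The two conditions can be approximated lexically. The sketch below assumes slash-separated names, a hypothetical entry type holding only the name and link target, and a symlink target that is resolved relative to the directory containing the link. It deliberately ignores symlinked directories created by earlier entries, which can make a purely lexical check miss an escape, so treat it as a first filter only.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// entry holds the two header fields the check needs (hypothetical type).
type entry struct {
	Name     string // slash-separated path of the entry
	Linkname string // symlink target; empty for non-symlinks
}

// escapes reports whether a slash-separated path is absolute or, once
// cleaned, begins with a ".." element.
func escapes(p string) bool {
	if path.IsAbs(p) {
		return true
	}
	c := path.Clean(p)
	return c == ".." || strings.HasPrefix(c, "../")
}

// violates applies both conditions: the entry itself must stay inside the
// unpack directory, and so must the referent of a symlink entry.
func violates(e entry) bool {
	if escapes(e.Name) {
		return true
	}
	if e.Linkname == "" {
		return false
	}
	if path.IsAbs(e.Linkname) {
		return true
	}
	// A relative target is interpreted from the directory holding the link.
	return escapes(path.Join(path.Dir(e.Name), e.Linkname))
}

func main() {
	entries := []entry{
		{Name: "docs/readme.txt"},
		{Name: "../outside.txt"},
		{Name: "a/a/a/a/a", Linkname: "../../../etc/shadow"},
		{Name: "link", Linkname: "../../etc/shadow"},
	}
	for _, e := range entries {
		fmt.Printf("%+v violates=%v\n", e, violates(e))
	}
}
```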

mikesamuel · 2018-06-25T19:04:08Z

@odeke-em

How hard is the symlink problem to disentangle?

Resolving symlink targets might require isDirectory checks so it might require accessing multiple entries to tell whether a symlink entry reaches outside the directory to which entry `.` refers.

That means that you have to detect and reject symlink cycles.

Are isDirectory checks the only kind of transitive analysis required?

isDirectory becomes easier to reason about if you also assume that symlink targets that don't map to any entry are not directories. Otherwise, you might have to do a worst-case analysis.

I did some experimenting with symlink chains below. I think this means that we can do self-containedness checks even where an attacker can cause two or more tarballs to untar into the same containing directory since an attacker doesn't defeat any analysis by constructing a symlink chain where even elements are from one tarball and odd from another.

A symlink chain that reaches past `.` must have one link that reaches past `.`

Question: Given a tarball with a symlink structure like the below, would untarring into ${UNTAR_DIR} cause cat ${UNTAR_DIR}/c to access ${UNTAR_DIR}/../../etc/shadow?

a/a/a/a/a -> ../../../etc/shadow
b -> a/a/a/a/
c -> b/a

The answer seems to be no.

$ mkdir /tmp/ln-exp; \
  cd /tmp/ln-exp/; \
  mkdir -p a/a/a/a/; \
  mkdir a/etc/; \
  echo 'self-contained' > a/etc/shadow; \
  cd a/a/a/a; \
  ln -s ../../../etc/shadow a; \
  cd -; \
  ln -s a/a/a/a/ b; \
  ln -s b/a c; \
  cat c
self-contained
$ find . \( -type l -o -type f \) -exec ls -l {} \;
-rw-r--r--  1 msamuel  wheel  15 Jun 25 14:40 ./a/etc/shadow
lrwxr-xr-x  1 msamuel  wheel  19 Jun 25 14:40 ./a/a/a/a/a -> ../../../etc/shadow
lrwxr-xr-x  1 msamuel  wheel  3 Jun 25 14:40 ./c -> b/a
lrwxr-xr-x  1 msamuel  wheel  8 Jun 25 14:40 ./b -> a/a/a/a/
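The same experiment can be reproduced from Go. This sketch assumes a Unix-like system where os.Symlink is permitted; chainExperiment is a name made up for this example. It rebuilds the layout in a temporary directory and reads through c.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// chainExperiment recreates the a/b/c symlink layout from the shell session
// in a fresh temporary directory and returns what reading "c" yields.
func chainExperiment() (string, error) {
	root, err := os.MkdirTemp("", "ln-exp")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	for _, dir := range []string{"a/a/a/a", "a/etc"} {
		if err := os.MkdirAll(filepath.Join(root, dir), 0o755); err != nil {
			return "", err
		}
	}
	shadow := filepath.Join(root, "a/etc/shadow")
	if err := os.WriteFile(shadow, []byte("self-contained\n"), 0o644); err != nil {
		return "", err
	}

	// Same targets and link names as the ln -s commands in the session.
	links := []struct{ target, name string }{
		{"../../../etc/shadow", "a/a/a/a/a"},
		{"a/a/a/a/", "b"},
		{"b/a", "c"},
	}
	for _, l := range links {
		if err := os.Symlink(l.target, filepath.Join(root, l.name)); err != nil {
			return "", err
		}
	}

	// Reading c follows c -> b/a -> a/a/a/a/a -> a/etc/shadow.
	data, err := os.ReadFile(filepath.Join(root, "c"))
	return string(data), err
}

func main() {
	out, err := chainExperiment()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```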

rsc · 2018-06-25T20:08:47Z

I'd like to see an answer that covers archive/zip, archive/tar, and anything else in a consistent way. We can take our time. This is a very very very old "problem".

mikesamuel · 2018-06-27T17:34:33Z

How about this:

archive/tar

- fork NewReader into
  - NewReader, which creates a reader with an internal bit set indicating it should be self-contained
  - NewTrustedArchiveReader, which creates a reader with that bit unset.
- Change func (tr *Reader) Read(b []byte) (int, error) so that it reports an error for any entry that violates self-containment as defined in #25849 (comment)
- Make sure that Sys() is nil for any os.FileInfo from FileInfo() for an entry that violates self-containment. Alternatively, produce an error.

archive/zip

- same as archive/tar
- fork other Open methods and/or possibly associate a trustedness bit with File
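To make the shape of this proposal concrete, here is a sketch that wraps the existing tar.Reader instead of changing it. The names checkedReader, newReader, newTrustedArchiveReader, errNotSelfContained, and readEvil are all hypothetical, and the check covers only entry names, not symlink targets. It also assumes the standard library's own insecure-path check is left at its default setting.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
	"path"
	"strings"
)

var errNotSelfContained = errors.New("tar entry is not self-contained") // hypothetical

// checkedReader carries the "should be self-contained" bit from the proposal.
type checkedReader struct {
	tr            *tar.Reader
	selfContained bool
	violation     bool // set by Next for the current entry
}

func newReader(r io.Reader) *checkedReader {
	return &checkedReader{tr: tar.NewReader(r), selfContained: true}
}

func newTrustedArchiveReader(r io.Reader) *checkedReader {
	return &checkedReader{tr: tar.NewReader(r)}
}

// Next still returns the full header; it only records whether the entry
// name is absolute or climbs out of the unpack directory.
func (c *checkedReader) Next() (*tar.Header, error) {
	hdr, err := c.tr.Next()
	c.violation = false
	if err == nil && c.selfContained {
		clean := path.Clean(hdr.Name)
		c.violation = path.IsAbs(hdr.Name) || clean == ".." || strings.HasPrefix(clean, "../")
	}
	return hdr, err
}

// Read reports an error for an entry that violates self-containment.
func (c *checkedReader) Read(b []byte) (int, error) {
	if c.violation {
		return 0, errNotSelfContained
	}
	return c.tr.Read(b)
}

// readEvil builds a one-entry archive named "../evil.txt" in memory and
// tries to read the entry body through either kind of reader.
func readEvil(trusted bool) (string, error) {
	var buf bytes.Buffer
	body := "payload"
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{Typeflag: tar.TypeReg, Name: "../evil.txt", Mode: 0o644, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return "", err
	}
	if _, err := tw.Write([]byte(body)); err != nil {
		return "", err
	}
	if err := tw.Close(); err != nil {
		return "", err
	}

	var cr *checkedReader
	if trusted {
		cr = newTrustedArchiveReader(&buf)
	} else {
		cr = newReader(&buf)
	}
	if _, err := cr.Next(); err != nil {
		return "", err
	}
	data, err := io.ReadAll(cr)
	return string(data), err
}

func main() {
	for _, trusted := range []bool{true, false} {
		body, err := readEvil(trusted)
		fmt.Printf("trusted=%v body=%q err=%v\n", trusted, body, err)
	}
}
```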

rsc · 2018-09-26T18:25:00Z

Separate "NewTrustedArchiveReader" etc seems like overkill. A new InsecureName or RawName, as proposed above, combined with sanitizing the current Name field, seems like enough. But then you need to define what "sanitized" means. Is it just ".." or do we care about other things, like files named COM1 or with backslashes or ... ?
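To illustrate how quickly the definition of "sanitized" grows, here is a sketch that reports each category mentioned above separately. The function name problems and the short reserved-name table are assumptions made for this example. The table is deliberately incomplete, and flagging every ".." element (even one that does not escape, as in "a/../b") is a policy choice, not a requirement.

```go
package main

import (
	"fmt"
	"strings"
)

// reserved lists a few Windows device names for illustration only; the real
// set is longer.
var reserved = map[string]bool{
	"CON": true, "PRN": true, "AUX": true, "NUL": true, "COM1": true, "LPT1": true,
}

// problems returns a description of each concern found in a slash-separated
// archive entry name. An empty result means none of these checks fired.
func problems(name string) []string {
	var out []string
	if name == "" {
		out = append(out, "empty name")
	}
	if strings.HasPrefix(name, "/") {
		out = append(out, "absolute path")
	}
	if strings.Contains(name, `\`) {
		out = append(out, "backslash")
	}
	for _, elem := range strings.Split(name, "/") {
		if elem == ".." {
			out = append(out, `".." element`)
		}
		// Device names are matched without regard to case or extension.
		base := elem
		if i := strings.IndexByte(elem, '.'); i >= 0 {
			base = elem[:i]
		}
		if reserved[strings.ToUpper(base)] {
			out = append(out, "reserved Windows name")
		}
	}
	return out
}

func main() {
	for _, name := range []string{"docs/readme.txt", "../COM1.txt", `dir\file`, "/etc/passwd"} {
		fmt.Printf("%q: %v\n", name, problems(name))
	}
}
```

Each category is a separate decision, which is part of why a single sanitized Name field is hard to specify.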

dsnet · 2018-09-26T19:50:59Z

I'm not a fan of duplicated header fields.

- When writing, it is not clear whether you are required to populate both Name and InsecureName.
- In the case of archive/tar you would need both InsecureName and InsecureLinkname.
- When reading, how do duplicated fields help the security issue? If I'm reading a file entry, and it is not "self-contained", does the reader just put an empty Name in the header and populate the InsecureName instead? Is it my responsibility to check that Name is not set, but that InsecureName is? If the suggestion is a "sanitized" name, how could that work for "../foo.txt"?

The more I think about this, I like my suggestion of a non-fatal distinguishable error (#25849 (comment)).
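The "../foo.txt" question can be made concrete. The sketch below uses one possible rewrite rule, chosen only for illustration: anchor the name at "/" so that path.Clean discards leading ".." elements, then strip the slash. Under that rule two distinct entries end up with the same sanitized name.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// sanitize is a hypothetical rewrite rule: rooting the name at "/" makes
// path.Clean drop any ".." that would climb above the root.
func sanitize(name string) string {
	return strings.TrimPrefix(path.Clean("/"+name), "/")
}

func main() {
	seen := map[string]string{} // sanitized name -> first raw name that produced it
	for _, raw := range []string{"foo.txt", "../foo.txt"} {
		clean := sanitize(raw)
		if prev, ok := seen[clean]; ok {
			fmt.Printf("%q and %q both sanitize to %q\n", prev, raw, clean)
			continue
		}
		seen[clean] = raw
	}
}
```

An extractor that trusted only the sanitized field would silently overwrite one entry with the other, whereas a distinguishable error leaves that decision to the caller.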

rsc · 2018-11-28T18:39:48Z

This seems OK to leave until Go 1.13.

ALTree · 2021-07-16T06:52:07Z

Note: this issue was referenced in https://blog.ryotak.me/post/cdnjs-remote-code-execution-en/, and if my reading of the blog post is correct, the current archive/tar behaviour is what enabled the exploit.

ianlancetaylor · 2022-10-10T22:56:52Z

Anybody interested in this, please see the proposal at #55356. Please comment on that proposal if it would not help. Thanks.

gopherbot · 2022-11-11T21:38:35Z

Change https://go.dev/cl/449937 mentions this issue: archive/tar, archive/zip: return ErrInsecurePath for unsafe paths
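A sketch of how a caller might handle the error named in that change, assuming Go 1.20 or later, where zip.ErrInsecurePath exists. As I understand the Go 1.20 release notes, the check only fires when GODEBUG contains zipinsecurepath=0, and NewReader still returns a usable reader alongside the error. The helpers buildZip and open are names made up for this example.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
)

// buildZip creates an in-memory archive containing empty files with the
// given names. It panics on error because writes to a buffer should not fail.
func buildZip(names ...string) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, n := range names {
		if _, err := zw.Create(n); err != nil {
			panic(err)
		}
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// open returns the reader, whether archive/zip flagged an insecure path,
// and any other error. The insecure-path error is treated as non-fatal so
// the caller can decide what to do with the entries.
func open(data []byte) (*zip.Reader, bool, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if errors.Is(err, zip.ErrInsecurePath) {
		return zr, true, nil
	}
	return zr, false, err
}

func main() {
	zr, flagged, err := open(buildZip("ok.txt", "../evil.txt"))
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	// Whether flagged is true depends on the GODEBUG setting of the process.
	fmt.Println("flagged by archive/zip:", flagged)
	for _, f := range zr.File {
		fmt.Println(f.Name)
	}
}
```

Because the default may differ between Go releases and GODEBUG settings, callers that extract untrusted archives would still want their own per-entry check.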

bradfitz added the Security and NeedsDecision labels Jun 12, 2018

ianlancetaylor added this to the Go1.11 milestone Jun 19, 2018

ianlancetaylor added the release-blocker label Jun 19, 2018

rsc added the early-in-cycle label and removed the release-blocker label Jun 25, 2018

rsc modified the milestones: Go1.11, Go1.12 Jun 25, 2018

rsc removed this from the Go1.12 milestone Nov 28, 2018

rsc added this to the Go1.13 milestone Nov 28, 2018

bradfitz modified the milestones: Go1.13, Go1.14 Apr 29, 2019

shivamdixit mentioned this issue Sep 11, 2019

Fix zip slip vulnerability uber/astro#47

Merged

rsc modified the milestones: Go1.14, Backlog Oct 9, 2019

neild self-assigned this Jul 13, 2022

neild mentioned this issue Sep 22, 2022

archive/tar, archive/zip: add ErrInsecurePath #55356

Open

gopherbot closed this as completed in a2d8157 Nov 16, 2022

dmitshur modified the milestones: Backlog, Go1.20 Nov 21, 2022

dmitshur added the NeedsFix label and removed the NeedsDecision label Nov 21, 2022

neild mentioned this issue Jan 17, 2023

proposal: archive/tar, archive/zip: add NewReaderOptions with directory traversal defenses #57850

Open

golang locked and limited conversation to collaborators Nov 21, 2023

gopherbot added the FrozenDueToAge label Nov 21, 2023
