Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

fix incorrect encoding of filenames with spaces in #2090

Merged
merged 7 commits into from
Mar 11, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/2090.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix a bug where media with spaces in the name would get a corrupted name.
54 changes: 51 additions & 3 deletions synapse/rest/media/v1/_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -100,10 +100,29 @@ def _quote(x):

request.setHeader(b"Content-Type", media_type.encode("UTF-8"))
if upload_name:
if is_ascii(upload_name):
disposition = "inline; filename=%s" % (_quote(upload_name),)
# RFC6266 section 4.1 [1] defines both `filename` and `filename*`.
#
# `filename` is defined to be a `value`, which is defined by RFC2616
# section 3.6 [2] to be a `token` or a `quoted-string`, where a `token`
# is (essentially) a single US-ASCII word, and a `quoted-string` is a
# US-ASCII string surrounded by double-quotes, using backslash as an
# escape charater. Note that %-encoding is *not* permitted.
#
# `filename*` is defined to be an `ext-value`, which is defined in
# RFC5987 section 3.2.1 [3] to be `charset "'" [ language ] "'" value-chars`,
# where `value-chars` is essentially a %-encoded string in the given charset.
#
# [1]: https://tools.ietf.org/html/rfc6266#section-4.1
# [2]: https://tools.ietf.org/html/rfc2616#section-3.6
# [3]: https://tools.ietf.org/html/rfc5987#section-3.2.1

# We avoid the quoted-string version of `filename`, because (a) synapse didn't
# correctly interpret those as of 0.99.2 and (b) they are a bit of a pain and we
# may as well just do the filename* version.
if _can_encode_filename_as_token(upload_name):
disposition = 'inline; filename=%s' % (upload_name, )
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not do filename* everywhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main thinking is that some things won't have good support for filename, and we should cater to simple clients where possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean good support for filename*?

okay, fair enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, yes that's what I mean.

else:
disposition = "inline; filename*=utf-8''%s" % (_quote(upload_name),)
disposition = "inline; filename*=utf-8''%s" % (_quote(upload_name), )

request.setHeader(b"Content-Disposition", disposition.encode('ascii'))

Expand All @@ -116,6 +135,35 @@ def _quote(x):
request.setHeader(b"Content-Length", b"%d" % (file_size,))


# separators as defined in RFC2616. SP and HT are handled separately.
# see _can_encode_filename_as_token.
_FILENAME_SEPARATOR_CHARS = set((
"(", ")", "<", ">", "@", ",", ";", ":", "\\", '"',
"/", "[", "]", "?", "=", "{", "}",
))


def _can_encode_filename_as_token(x):
for c in x:
# from RFC2616:
#
# token = 1*<any CHAR except CTLs or separators>
#
# separators = "(" | ")" | "<" | ">" | "@"
# | "," | ";" | ":" | "\" | <">
# | "/" | "[" | "]" | "?" | "="
# | "{" | "}" | SP | HT
#
# CHAR = <any US-ASCII character (octets 0 - 127)>
#
# CTL = <any US-ASCII control character
# (octets 0 - 31) and DEL (127)>
#
if ord(c) >= 127 or ord(c) <= 32 or c in _FILENAME_SEPARATOR_CHARS:
return False
return True


@defer.inlineCallbacks
def respond_with_responder(request, responder, media_type, file_size, upload_name=None):
"""Responds to the request with given responder. If responder is None then
Expand Down