s3 ls error when keys are non-ASCII in eu-central-1 #1046

Closed
tgal opened this issue Dec 6, 2014 · 1 comment · Fixed by boto/botocore#412

tgal commented Dec 6, 2014

Hello!

My version: aws-cli/1.6.6 Python/2.7.6 Linux/3.13.0-40-generic

To demonstrate this problem, I set up a bucket in the eu-central-1 region and created five keys containing non-ASCII characters (the most common German diacritics):

$ aws s3 ls s3://tgal-test-eu-central --region eu-central-1
2014-12-06 18:50:38          5 non-ascii-key-äöü-00.txt
2014-12-06 18:50:38          5 non-ascii-key-äöü-01.txt
2014-12-06 18:50:38          5 non-ascii-key-äöü-02.txt
2014-12-06 18:50:38          5 non-ascii-key-äöü-03.txt
2014-12-06 18:50:38          5 non-ascii-key-äöü-04.txt

This works fine. But when I use a smaller page size (--page-size 1), so that the "marker" field contains one of the non-ASCII keys, I get this error:

2014-12-06 18:50:38          5 non-ascii-key-äöü-00.txt
/usr/lib/python2.7/urllib.py:1288: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))

u'\xc3'

The same with --debug:

$ aws s3 ls s3://tgal-test-eu-central --region eu-central-1 --page-size 1 --debug
2014-12-06 19:27:27,576 - MainThread - awscli.clidriver - DEBUG - CLI version: aws-cli/1.6.6 Python/2.7.6 Linux/3.13.0-40-generic, botocore version: 0.77.0
2014-12-06 19:27:27,576 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function add_scalar_parsers at 0x7fd7bd097500>
2014-12-06 19:27:27,577 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function inject_assume_role_provider at 0x7fd7bd2e3938>
2014-12-06 19:27:27,578 - MainThread - botocore.hooks - DEBUG - Event building-command-table.s3: calling handler <function add_waiters at 0x7fd7bd2ea488>
2014-12-06 19:27:27,580 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.custom.s3.anonymous: calling handler <function uri_param at 0x7fd7bd40d0c8>
2014-12-06 19:27:27,580 - MainThread - botocore.hooks - DEBUG - Event building-command-table.ls: calling handler <function add_waiters at 0x7fd7bd2ea488>
2014-12-06 19:27:27,582 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.custom.ls.paths: calling handler <function uri_param at 0x7fd7bd40d0c8>
2014-12-06 19:27:27,583 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.custom.ls.anonymous: calling handler <function uri_param at 0x7fd7bd40d0c8>
2014-12-06 19:27:27,583 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.custom.ls.page-size: calling handler <function uri_param at 0x7fd7bd40d0c8>
2014-12-06 19:27:27,583 - MainThread - botocore.hooks - DEBUG - Event process-cli-arg.custom.ls: calling handler <awscli.argprocess.ParamShorthand object at 0x7fd7bd098090>
2014-12-06 19:27:27,583 - MainThread - awscli.argprocess - DEBUG - Detected structure: scalar
2014-12-06 19:27:27,584 - MainThread - botocore.service - DEBUG - Creating service object for: s3
2014-12-06 19:27:27,641 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function register_retries_for_service at 0x7fd7bd7c01b8>
2014-12-06 19:27:27,645 - MainThread - botocore.handlers - DEBUG - Registering retry handlers for service: s3
2014-12-06 19:27:27,646 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function signature_overrides at 0x7fd7bd7c0320>
2014-12-06 19:27:27,646 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function register_retries_for_service at 0x7fd7bd7c01b8>
2014-12-06 19:27:27,646 - MainThread - botocore.handlers - DEBUG - Registering retry handlers for service: s3
2014-12-06 19:27:27,646 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function signature_overrides at 0x7fd7bd7c0320>
2014-12-06 19:27:27,648 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: env
2014-12-06 19:27:27,648 - MainThread - botocore.credentials - INFO - Found credentials in environment variables.
2014-12-06 19:27:27,650 - MainThread - botocore.service - DEBUG - Creating operation objects for: Service(s3)
2014-12-06 19:27:27,658 - MainThread - botocore.operation - DEBUG - Operation:ListObjects called with kwargs: {u'MaxKeys': 1, 'prefix': '', 'bucket': u'tgal-test-eu-central', 'delimiter': '/'}
2014-12-06 19:27:27,660 - MainThread - botocore.hooks - DEBUG - Event before-call.s3.ListObjects: calling handler <function add_expect_header at 0x7fd7bd7c0398>
2014-12-06 19:27:27,660 - MainThread - botocore.endpoint - DEBUG - Making request for <botocore.model.OperationModel object at 0x7fd7bcdbe290> (verify_ssl=True) with params: {'query_string': {u'delimiter': '/', u'max-keys': 1, u'prefix': ''}, 'headers': {}, 'url_path': u'/tgal-test-eu-central', 'body': '', 'method': u'GET'}
2014-12-06 19:27:27,661 - MainThread - botocore.hooks - DEBUG - Event before-auth.s3: calling handler <function fix_s3_host at 0x7fd7bd7c0050>
2014-12-06 19:27:27,662 - MainThread - botocore.auth - DEBUG - Calculating signature using v4 auth.
2014-12-06 19:27:27,662 - MainThread - botocore.auth - DEBUG - CanonicalRequest:
GET
/tgal-test-eu-central
delimiter=%2F&max-keys=1&prefix=
host:s3.eu-central-1.amazonaws.com
user-agent:aws-cli/1.6.6 Python/2.7.6 Linux/3.13.0-40-generic
x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date:20141206T182727Z

host;user-agent;x-amz-content-sha256;x-amz-date
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2014-12-06 19:27:27,663 - MainThread - botocore.auth - DEBUG - StringToSign:
AWS4-HMAC-SHA256
20141206T182727Z
20141206/eu-central-1/s3/aws4_request
b63f9a35e6765a2a729c00db459082aa9afea5178e7a3d20315440330512cb36
2014-12-06 19:27:27,663 - MainThread - botocore.auth - DEBUG - Signature:
29b04742e42f8e5dd579e243413b86140e1a30a3bbd0292048dc439b1b9ead25
2014-12-06 19:27:27,671 - MainThread - botocore.endpoint - DEBUG - Sending http request: <PreparedRequest [GET]>
2014-12-06 19:27:27,672 - MainThread - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (1): s3.eu-central-1.amazonaws.com
2014-12-06 19:27:27,738 - MainThread - botocore.vendored.requests.packages.urllib3.connectionpool - DEBUG - "GET /tgal-test-eu-central?delimiter=%2F&max-keys=1&prefix= HTTP/1.1" 200 None
2014-12-06 19:27:27,741 - MainThread - botocore.parsers - DEBUG - Response headers:
{'content-type': 'application/xml',
 'date': 'Sat, 06 Dec 2014 18:27:28 GMT',
 'server': 'AmazonS3',
 'transfer-encoding': 'chunked',
 'x-amz-id-2': 'fgjM4xWchv7TrOo8HYZj73qXdniTRYLcjRxMlLKNumgNdcl+Y8LKenu6V0uMUkMT4ZHkRKxolvE=',
 'x-amz-request-id': '7F7F9DA0879C807F'}
2014-12-06 19:27:27,742 - MainThread - botocore.parsers - DEBUG - Response body:
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>tgal-test-eu-central</Name><Prefix></Prefix><Marker></Marker><NextMarker>non-ascii-key-äöü-00.txt</NextMarker><MaxKeys>1</MaxKeys><Delimiter>/</Delimiter><IsTruncated>true</IsTruncated><Contents><Key>non-ascii-key-äöü-00.txt</Key><LastModified>2014-12-06T17:50:38.000Z</LastModified><ETag>&quot;2205e48de5f93c784733ffcca841d2b5&quot;</ETag><Size>5</Size><Owner><ID>54ebb9ca2d1870c1e3dfbbd499c0ab442603b06d495b6966fa6a577010bf2156</ID></Owner><StorageClass>STANDARD</StorageClass></Contents></ListBucketResult>
2014-12-06 19:27:27,744 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjects: calling handler <botocore.retryhandler.RetryHandler object at 0x7fd7bce9ca50>
2014-12-06 19:27:27,744 - MainThread - botocore.retryhandler - DEBUG - No retry needed.
2014-12-06 19:27:27,744 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <function enhance_error_msg at 0x7fd7bd2d6578>
2014-12-06 19:27:27,744 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <awscli.errorhandler.ErrorHandler object at 0x7fd7bd0980d0>
2014-12-06 19:27:27,745 - MainThread - awscli.errorhandler - DEBUG - HTTP Response Code: 200
2014-12-06 18:50:38          5 non-ascii-key-äöü-00.txt
2014-12-06 19:27:27,746 - MainThread - botocore.operation - DEBUG - Operation:ListObjects called with kwargs: {u'Marker': u'non-ascii-key-\xe4\xf6\xfc-00.txt', u'MaxKeys': 1, 'delimiter': '/', 'bucket': u'tgal-test-eu-central', 'prefix': ''}
2014-12-06 19:27:27,747 - MainThread - botocore.hooks - DEBUG - Event before-call.s3.ListObjects: calling handler <function add_expect_header at 0x7fd7bd7c0398>
2014-12-06 19:27:27,747 - MainThread - botocore.endpoint - DEBUG - Making request for <botocore.model.OperationModel object at 0x7fd7bcdbe290> (verify_ssl=True) with params: {'query_string': {u'marker': u'non-ascii-key-\xe4\xf6\xfc-00.txt', u'delimiter': '/', u'max-keys': 1, u'prefix': ''}, 'headers': {}, 'url_path': u'/tgal-test-eu-central', 'body': '', 'method': u'GET'}
2014-12-06 19:27:27,748 - MainThread - botocore.hooks - DEBUG - Event before-auth.s3: calling handler <function fix_s3_host at 0x7fd7bd7c0050>
/usr/lib/python2.7/urllib.py:1288: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
2014-12-06 19:27:27,750 - MainThread - awscli.clidriver - DEBUG - Exception caught in main()
Traceback (most recent call last):
  File "/home/aws/lib/lib/python2.7/site-packages/awscli/clidriver.py", line 197, in main
    return command_table[parsed_args.command](remaining, parsed_args)
  File "/home/aws/lib/lib/python2.7/site-packages/awscli/customizations/commands.py", line 184, in __call__
    parsed_globals)
  File "/home/aws/lib/lib/python2.7/site-packages/awscli/customizations/commands.py", line 181, in __call__
    return self._run_main(parsed_args, parsed_globals)
  File "/home/aws/lib/lib/python2.7/site-packages/awscli/customizations/s3/subcommands.py", line 260, in _run_main
    self._list_all_objects(bucket, key, parsed_args.page_size)
  File "/home/aws/lib/lib/python2.7/site-packages/awscli/customizations/s3/subcommands.py", line 281, in _list_all_objects
    for _, response_data in iterator:
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/paginate.py", line 69, in __iter__
    response = self._make_request(current_kwargs)
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/paginate.py", line 386, in _make_request
    return self._operation.call(self._endpoint, **current_kwargs)
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/operation.py", line 90, in call
    response = endpoint.make_request(self.model, request_dict)
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/endpoint.py", line 108, in make_request
    prepared_request = self.create_request(request_dict)
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/endpoint.py", line 134, in create_request
    prepared_request = self.prepare_request(request, signer)
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/endpoint.py", line 163, in prepare_request
    signer.add_auth(request=request)
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/auth.py", line 313, in add_auth
    canonical_request = self.canonical_request(request)
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/auth.py", line 252, in canonical_request
    cr.append(self.canonical_query_string(request))
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/auth.py", line 181, in canonical_query_string
    return self._canonical_query_string_url(urlsplit(request.url))
  File "/home/aws/lib/lib/python2.7/site-packages/botocore/auth.py", line 204, in _canonical_query_string_url
    quote(unquote(q[1]), safe='-_.~')))
  File "/usr/lib/python2.7/urllib.py", line 1288, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xc3'
2014-12-06 19:27:27,752 - MainThread - awscli.clidriver - DEBUG - Exiting with rc 255

u'\xc3'

Is it possible to fix this or work around it? I specifically need the sync functionality, which has the same problem. With region eu-west-1 it works fine, so I suspect it has something to do with the v4 signature.

Thanks,
Tobias

jamesls (Member) commented Dec 8, 2014

More debug info below.

Full traceback:

2014-12-08 14:33:40,654 - MainThread - botocore.hooks - DEBUG - Event before-auth.s3: calling handler <function fix_s3_host at 0x105a02e60>
/usr/local/Cellar/python/2.7.7_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py:1288: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
2014-12-08 14:33:40,656 - MainThread - awscli.customizations.s3.s3handler - DEBUG - Exception caught during task execution: u'\xe2'
Traceback (most recent call last):
  File "aws-cli/awscli/customizations/s3/s3handler.py", line 91, in call
    total_files, total_parts = self._enqueue_tasks(files)
  File "aws-cli/awscli/customizations/s3/s3handler.py", line 172, in _enqueue_tasks
    for filename in files:
  File "aws-cli/awscli/customizations/s3/fileinfobuilder.py", line 32, in call
    for file_base in files:
  File "aws-cli/awscli/customizations/s3/comparator.py", line 85, in call
    dest_file = advance_iterator(dest_files)
  File "aws-cli/awscli/customizations/s3/filegenerator.py", line 140, in call
    for src_path, size, last_update in file_list:
  File "aws-cli/awscli/customizations/s3/filegenerator.py", line 286, in list_objects
    page_size=self.page_size):
  File "aws-cli/awscli/customizations/s3/utils.py", line 401, in list_objects
    for response, page in pages:
  File "botocore/botocore/paginate.py", line 69, in __iter__
    response = self._make_request(current_kwargs)
  File "botocore/botocore/paginate.py", line 386, in _make_request
    return self._operation.call(self._endpoint, **current_kwargs)
  File "botocore/botocore/operation.py", line 90, in call
    response = endpoint.make_request(self.model, request_dict)
  File "botocore/botocore/endpoint.py", line 108, in make_request
    prepared_request = self.create_request(request_dict)
  File "botocore/botocore/endpoint.py", line 134, in create_request
    prepared_request = self.prepare_request(request, signer)
  File "botocore/botocore/endpoint.py", line 163, in prepare_request
    signer.add_auth(request=request)
  File "botocore/botocore/auth.py", line 315, in add_auth
    canonical_request = self.canonical_request(request)
  File "botocore/botocore/auth.py", line 254, in canonical_request
    cr.append(self.canonical_query_string(request))
  File "botocore/botocore/auth.py", line 183, in canonical_query_string
    return self._canonical_query_string_url(urlsplit(request.url))
  File "botocore/botocore/auth.py", line 206, in _canonical_query_string_url
    quote(unquote(q[1]), safe='-_.~')))
  File "/usr/local/Cellar/python/2.7.7_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1288, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xe2'
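
The characters named in the two KeyErrors (u'\xc3' in the first report, u'\xe2' here) match the first UTF-8 byte of "ä" and "✓" respectively. The Python 3 sketch below illustrates one plausible reading of that: the percent-encoded marker is decoded byte by byte (as Latin-1) instead of as UTF-8, which is an assumption about how Python 2.7's urllib.unquote treats unicode input. The helper name first_mis_decoded_char is hypothetical and is not part of botocore or the CLI.

```python
from urllib.parse import quote, unquote


def first_mis_decoded_char(key):
    """Return the first non-ASCII character left after a Latin-1 unquote.

    Hypothetical helper for illustration only. It percent-encodes `key`
    as UTF-8, then decodes the result one byte per character (Latin-1),
    which is assumed to mimic Python 2.7's unquote on unicode input.
    """
    encoded = quote(key, safe="-_.~")                    # UTF-8 percent-encoding
    mis_decoded = unquote(encoded, encoding="latin-1")   # one character per byte
    return next(ch for ch in mis_decoded if ord(ch) > 127)


print(repr(first_mis_decoded_char("non-ascii-key-äöü-00.txt")))  # '\xc3'
print(repr(first_mis_decoded_char("✓")))                         # '\xe2'
```

If that reading is right, the character that quote() fails on is not part of the key itself but a stray byte of its UTF-8 encoding.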

Repro steps:

1. Create files with non-ASCII characters in their names:

$ touch ✓   ✔

2. Initial sync to an eu-central-1 bucket (should work without issues):

aws s3 sync . s3://bucket-in-eu-central-1/ --region eu-central-1

3. Try to sync with a page size of 1 (should get the traceback above):

aws s3 sync . s3://bucket-in-eu-central-1/ --region eu-central-1 --page-size 1
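
For reference, here is a minimal Python 3 sketch of the step that fails in the traceback, building the canonical query string for the v4 signature. It is not the botocore implementation or the eventual fix; it only mirrors the quote(unquote(...), safe='-_.~') call from the traceback and relies on Python 3's urllib.parse treating percent-encoded text as UTF-8 by default. The function name and the example URL are illustrative.

```python
from urllib.parse import quote, unquote, urlsplit


def canonical_query_string(url):
    """Return the query string of `url` as sorted, percent-encoded pairs.

    Illustrative sketch only, not botocore's code.
    """
    pairs = []
    for param in urlsplit(url).query.split("&"):
        if not param:
            continue
        key, _, value = param.partition("=")
        # Decode and re-encode as UTF-8 so non-ASCII markers round-trip.
        pairs.append((quote(unquote(key), safe="-_.~"),
                      quote(unquote(value), safe="-_.~")))
    return "&".join("{}={}".format(k, v) for k, v in sorted(pairs))


url = ("https://s3.eu-central-1.amazonaws.com/tgal-test-eu-central"
       "?marker=non-ascii-key-%C3%A4%C3%B6%C3%BC-00.txt"
       "&delimiter=%2F&max-keys=1&prefix=")
print(canonical_query_string(url))
# delimiter=%2F&marker=non-ascii-key-%C3%A4%C3%B6%C3%BC-00.txt&max-keys=1&prefix=
```

The output has the same shape as the canonical query string logged for the first (marker-less) request, with the marker added in sorted position.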

jamesls added the bug and confirmed labels on Dec 8, 2014