-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added support for generation-based query and delete of blobs. #1435
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g.
|
I signed the CLA as Vital Labs, Inc. |
path = blob.path | ||
if generation is not None: | ||
path = (blob.path + '?generation=' + | ||
quote(str(generation), safe='')) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
@tsinha You can ignore the AppVeyor failure for Updating @jgeewax What should be done about matching his signature of the CLA with his GitHub handle? |
I'll look at the blob level methods and make sure everything works with the bucket level query, or refactor as you suggested if needed. I'll post an update shortly. |
CLAs look good, thanks! |
…I paths. added generation support to blob.delete and blob.exists.
@dhermes I looked over and fixed blob.exists and blob.delete. Still unsure about adding generation param directly into blob.path It doesn't seem as clean as having the api_request create the appropriate paths via blob.path and query_params at the bucket level. Otherwise, I think this PR should be ready to go. |
@tsinha the generation is required to be in the path, since it uniquely defines the object if specified.
|
@dhermes Thanks for the feedback. Can you help me brainstorm how to reconcile the way in which api_requests are made with embedding the generation query parameter at the blob level? This confounds my current WIP, which will enable the renaming of blobs while preserving versions, since the 'copyTo' API end-point requires another query param ('sourceGeneration') to copy the right version. If we move generation into blob.path, I'd end up with something like the following: But I may not be thinking it through completely... any advice you have would be helpful. |
I think you've just thought about it more than I had 😀 However, I'd still rather special case You could also leave |
…l to help with generation based queries of blobs. also added version-based blob copying and renaming. fixed a typo in _helpers.py.
@dhermes I've added a new blob.path_with_params function to alleviate some of the checking at the bucket level: here and here. I've also got a working version of the copy_blob and rename_blob functions with versioning support here and here. I changed the API for copy_blob a little to fit into the existing rename_blob logic... that is, if versioned copy is enabled, copy_blob returns a tuple of two lists that contain the old and new blobs... rename_blobs takes the old_blobs list from this tuple and deletes them, and then returns the new_blobs list. If versioning is not enabled, then copy_blobs's return parameters are unchanged from the existing API. What do you think about this approach? I can add testing around this if it looks reasonable. |
@tsinha Sorry for the review delay. I'm traveling and won't have a chance to look at this until next Monday. (I feel bad, sorry for the negative experience.) |
No worries... I look forward to your feedback next week. Safe travels. |
if self.generation is not None: | ||
params = {'generation': self.generation} | ||
|
||
return (self.path, params) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
@dhermes I have not forgotten about this. Tied up with some other stuff and will get back to this ASAP. |
Thanks for the update |
@tillahoffmann There have been lots of changes to Peeking at the code, at least the |
@dhermes, thanks for the update. Do you have any recommendations on how to access md5 and crc32c hashes given a generation number from python? |
Sure thing. Here is a somewhat icky hack (but it'll work for now). For a file with multiple versions:
If you >>> from google.cloud import storage
>>>
>>> client = storage.Client()
>>> bucket = client.bucket('some-bucket')
>>> blob_normal = bucket.blob('file.txt')
>>> blob_normal.reload()
>>> blob_normal._properties
{u'bucket': u'some-bucket',
u'contentType': u'text/plain',
u'crc32c': u'JcxYOQ==',
u'etag': u'CLH7moCB59QCEAE=',
u'generation': u'1498875135704497',
u'id': u'some-bucket/file.txt/1498875135704497',
u'kind': u'storage#object',
u'md5Hash': u'F/IyVDfyzIWJW12F0xO6+Q==',
u'mediaLink': u'https://www.googleapis.com/download/storage/v1/b/some-bucket/o/file.txt?generation=1498875135704497&alt=media',
u'metageneration': u'1',
u'name': u'file.txt',
u'selfLink': u'https://www.googleapis.com/storage/v1/b/some-bucket/o/file.txt',
u'size': u'1672',
u'storageClass': u'MULTI_REGIONAL',
u'timeCreated': u'2017-07-01T02:12:15.663Z',
u'timeStorageClassUpdated': u'2017-07-01T02:12:15.663Z',
u'updated': u'2017-07-01T02:12:15.663Z'} However, if you manually patch >>> from google.cloud.storage.blob import _quote
>>>
>>> GENERATION = '1498874876972049'
>>>
>>> def reload(self, client=None):
... client = self._require_client(client)
... query_params = {'projection': 'noAcl', 'generation': GENERATION}
... api_response = client._connection.api_request(
... method='GET', path=self.path, query_params=query_params,
... _target_object=self)
... self._set_properties(api_response)
...
>>>
>>> blob_patched = bucket.blob('file.txt')
>>> blob_patched.reload = reload.__get__(blob_patched)
>>> blob_patched._properties
{u'bucket': u'some-bucket',
u'contentType': u'text/plain',
u'crc32c': u'MO9tZQ==',
u'etag': u'CJGY64SA59QCEAE=',
u'generation': u'1498874876972049',
u'id': u'some-bucket/file.txt/1498874876972049',
u'kind': u'storage#object',
u'md5Hash': u'zPc9wCyNcKGiYV3ijiB2iw==',
u'mediaLink': u'https://www.googleapis.com/download/storage/v1/b/some-bucket/o/file.txt?generation=1498874876972049&alt=media',
u'metageneration': u'1',
u'name': u'file.txt',
u'selfLink': u'https://www.googleapis.com/storage/v1/b/some-bucket/o/file.txt',
u'size': u'1635',
u'storageClass': u'MULTI_REGIONAL',
u'timeCreated': u'2017-07-01T02:07:56.965Z',
u'timeDeleted': u'2017-07-01T02:12:15.704Z',
u'timeStorageClassUpdated': u'2017-07-01T02:07:56.965Z',
u'updated': u'2017-07-01T02:07:56.965Z'} |
Hi If there are any outstanding issues remaining, feel free to re-open (or open a new issue). |
…tform/python-docs-samples#1435) * remove face detection samples * update docstring * linter * linter
Small modifications to get_blob and delete_blob to handle generation ids when querying buckets that have versioning enabled.