
added support for generation-based query and delete of blobs. #1435

Closed
wants to merge 14 commits into from

Conversation

tsinha

@tsinha tsinha commented Feb 1, 2016

Small modifications to get_blob and delete_blob to handle generation ids when querying buckets that have versioning enabled.

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please let us know the company's name.

@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Feb 1, 2016
@tsinha
Author

tsinha commented Feb 1, 2016

I signed the CLA as Vital Labs, Inc.

path = blob.path
if generation is not None:
    path = (blob.path + '?generation=' +
            quote(str(generation), safe=''))
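The quoting step in this snippet can be exercised on its own; below is a minimal standalone sketch (the helper name and example path are hypothetical, not part of the library):

```python
from urllib.parse import quote  # urllib.quote on the Python 2 of this era

def path_with_generation(path, generation=None):
    # Append the generation id as a percent-encoded query parameter,
    # mirroring the logic in the snippet above.
    if generation is None:
        return path
    return path + '?generation=' + quote(str(generation), safe='')

print(path_with_generation('/b/my-bucket/o/data.csv', 1498874876972049))
# -> /b/my-bucket/o/data.csv?generation=1498874876972049
```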


@dhermes
Contributor

dhermes commented Feb 1, 2016

@tsinha You can ignore the AppVeyor failure for test_upload_from_file_w_bound_client_multipart, we are tracking it on #1434.

Updating Bucket.delete_blob and Bucket.get_blob is insufficient here; Blob.delete, Blob.exists, and Blob.reload may also need updating (or you may get that for free when updating Blob.path).

@jgeewax What should be done about matching his signature of the CLA with his GitHub handle?

@tsinha
Author

tsinha commented Feb 1, 2016

I'll look at the blob level methods and make sure everything works with the bucket level query, or refactor as you suggested if needed.

I'll post an update shortly.

@googlebot

CLAs look good, thanks!

@googlebot googlebot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Feb 1, 2016
@tsinha
Author

tsinha commented Feb 2, 2016

@dhermes I looked over and fixed blob.exists and blob.delete.

I'm still unsure about adding the generation param directly into blob.path.

It doesn't seem as clean as having the api_request create the appropriate paths via blob.path and query_params at the bucket level.

Otherwise, I think this PR should be ready to go.

@dhermes
Contributor

dhermes commented Feb 3, 2016

@tsinha the generation is required to be in the path, since it uniquely defines the object if specified.

Bucket.delete_blob and Bucket.get_blob are meant to be "dumb" (i.e. they don't know how to do much) and the knowledge of how to make the request should live on Blob.

@tsinha
Author

tsinha commented Feb 3, 2016

@dhermes Thanks for the feedback.

Can you help me brainstorm how to reconcile the way in which api_requests are made with embedding the generation query parameter at the blob level?

This confounds my current WIP, which will enable the renaming of blobs while preserving versions, since the 'copyTo' API end-point requires another query param ('sourceGeneration') to copy the right version.

If we move generation into blob.path, I'd end up with something like the following:

https://www.googleapis.com/storage/v1/b/sourceBucket/o/sourceObject?generation=XXX/copyTo/b/destinationBucket/o/destinationObject?sourceGeneration=XXX
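One way to avoid that malformed URL is to keep the path itself free of query strings and append all parameters in a single query string at the end. A rough sketch (the helper is hypothetical and only illustrates the URL shape, not the client's request machinery):

```python
from urllib.parse import urlencode

API_ROOT = 'https://www.googleapis.com/storage/v1'

def copy_to_url(src_bucket, src_obj, dst_bucket, dst_obj,
                source_generation=None):
    # Build the copyTo path with no embedded query strings ...
    path = '/b/{}/o/{}/copyTo/b/{}/o/{}'.format(
        src_bucket, src_obj, dst_bucket, dst_obj)
    # ... then append every query parameter exactly once at the end.
    params = {}
    if source_generation is not None:
        params['sourceGeneration'] = source_generation
    if params:
        path += '?' + urlencode(params)
    return API_ROOT + path

print(copy_to_url('sourceBucket', 'sourceObject',
                  'destinationBucket', 'destinationObject',
                  source_generation='XXX'))
# -> https://www.googleapis.com/storage/v1/b/sourceBucket/o/sourceObject/copyTo/b/destinationBucket/o/destinationObject?sourceGeneration=XXX
```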

But I may not be thinking it through completely... any advice you have would be helpful.

@dhermes
Contributor

dhermes commented Feb 3, 2016

I think you've just thought about it more than I had 😀

However, I'd still rather special case Bucket.copy_blob than special case every other place a generation might be used.

You could also leave Blob.path as-is and we could have another property that also contains identifying information in query params, like Blob.path_with_params (or a name that is less hideous).
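A minimal sketch of what such a property could look like (the class shape and names here are assumptions for illustration, not the library's actual Blob):

```python
from urllib.parse import urlencode

class Blob(object):
    # Simplified stand-in for google.cloud.storage.blob.Blob.
    def __init__(self, name, bucket_path, generation=None):
        self.name = name
        self._bucket_path = bucket_path
        self.generation = generation

    @property
    def path(self):
        # Identity without query parameters, unchanged from today.
        return self._bucket_path + '/o/' + self.name

    @property
    def path_with_params(self):
        # Same path, plus any identifying query parameters.
        if self.generation is None:
            return self.path
        return self.path + '?' + urlencode({'generation': self.generation})

blob = Blob('file.txt', '/b/some-bucket', generation=1498874876972049)
print(blob.path_with_params)
# -> /b/some-bucket/o/file.txt?generation=1498874876972049
```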

…l to help with generation based queries of blobs. also added version-based blob copying and renaming. fixed a typo in _helpers.py.
@tsinha
Author

tsinha commented Feb 4, 2016

@dhermes I've added a new blob.path_with_params function to alleviate some of the checking at the bucket level: here and here.

I've also got a working version of the copy_blob and rename_blob functions with versioning support here and here.

I changed the API for copy_blob a little to fit into the existing rename_blob logic: if versioned copy is enabled, copy_blob returns a tuple of two lists containing the old and new blobs. rename_blob takes the old_blobs list from this tuple, deletes those blobs, and then returns the new_blobs list.

If versioning is not enabled, copy_blob's return value is unchanged from the existing API.
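The proposed contract can be sketched with the storage calls replaced by an in-memory dict of versions (everything here is illustrative pseudocode of the return shapes, not the real client):

```python
# name -> list of generations, oldest first (in-memory stand-in for a
# versioned bucket).
store = {'old.txt': ['v1-bytes', 'v2-bytes']}

def copy_blob(name, new_name, versioned=False):
    if not versioned:
        # Existing behavior: copy only the latest generation.
        store[new_name] = [store[name][-1]]
        return new_name
    # Versioned copy: return (old_blobs, new_blobs) as proposed.
    old_blobs = [(name, g) for g in range(len(store[name]))]
    store[new_name] = list(store[name])
    new_blobs = [(new_name, g) for g in range(len(store[new_name]))]
    return old_blobs, new_blobs

def rename_blob(name, new_name):
    old_blobs, new_blobs = copy_blob(name, new_name, versioned=True)
    del store[name]  # delete every old generation of the source
    return new_blobs

moved = rename_blob('old.txt', 'new.txt')
print(moved)
# -> [('new.txt', 0), ('new.txt', 1)]
```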

What do you think about this approach? I can add testing around this if it looks reasonable.

@dhermes
Contributor

dhermes commented Feb 5, 2016

@tsinha Sorry for the review delay. I'm traveling and won't have a chance to look at this until next Monday. (I feel bad, sorry for the negative experience.)

@tsinha
Author

tsinha commented Feb 5, 2016

No worries... I look forward to your feedback next week. Safe travels.

params = None
if self.generation is not None:
    params = {'generation': self.generation}

return (self.path, params)


@tsinha
Author

tsinha commented Feb 29, 2016

@dhermes I have not forgotten about this. Tied up with some other stuff and will get back to this ASAP.

@dhermes
Contributor

dhermes commented Feb 29, 2016

Thanks for the update

@tseaver tseaver added the api: storage Issues related to the Cloud Storage API. label Mar 15, 2017
@lukesneeringer lukesneeringer added the priority: p2 Moderately-important priority. Fix may not be included in next release. label Apr 19, 2017
@tillahoffmann
Contributor

tillahoffmann commented Jun 30, 2017

@dhermes, @tsinha, thanks for looking into this. Do you have any updates on when this PR might get merged to address #2463?

@dhermes
Contributor

dhermes commented Jun 30, 2017

@tillahoffmann There have been lots of changes to blob since this PR was originally sent. Some code paths in the current release (google-cloud-storage==1.2.0) will "just work" for this feature.

Peeking at the code, at least the Blob.download_* functions will incorporate the generation (but, as a hack, only if media_link isn't set on the Blob instance). As for delete, it calls out to Blob.path_helper which does not incorporate the generation.
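Until delete learns about generations, the request can be issued through the connection directly. The sketch below fakes the connection so the call shape is visible without network access (client._connection.api_request is an internal interface, and the helper name is made up):

```python
class FakeConnection(object):
    # Stand-in for client._connection that records calls instead of
    # hitting the network.
    def __init__(self):
        self.requests = []

    def api_request(self, **kwargs):
        self.requests.append(kwargs)

def delete_generation(connection, blob_path, generation):
    # Like Blob.delete, but forwarding the generation as a query
    # parameter (which the released Blob.delete does not do).
    connection.api_request(
        method='DELETE', path=blob_path,
        query_params={'generation': generation})

conn = FakeConnection()
delete_generation(conn, '/b/some-bucket/o/file.txt', 1498874876972049)
print(conn.requests[0]['query_params'])
# -> {'generation': 1498874876972049}
```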

@tillahoffmann
Contributor

@dhermes, thanks for the update. Do you have any recommendations on how to access md5 and crc32c hashes given a generation number from python?

@dhermes
Contributor

dhermes commented Jul 1, 2017

Sure thing. Here is a somewhat icky hack (but it'll work for now).

For a file with multiple versions:

$ gsutil ls -a gs://some-bucket
gs://some-bucket/file.txt#1498874876972049
gs://some-bucket/file.txt#1498875135704497

If you reload() it like normal, you'll see the latest version:

>>> from google.cloud import storage
>>>
>>> client = storage.Client()
>>> bucket = client.bucket('some-bucket')
>>> blob_normal = bucket.blob('file.txt')
>>> blob_normal.reload()
>>> blob_normal._properties
{u'bucket': u'some-bucket',
 u'contentType': u'text/plain',
 u'crc32c': u'JcxYOQ==',
 u'etag': u'CLH7moCB59QCEAE=',
 u'generation': u'1498875135704497',
 u'id': u'some-bucket/file.txt/1498875135704497',
 u'kind': u'storage#object',
 u'md5Hash': u'F/IyVDfyzIWJW12F0xO6+Q==',
 u'mediaLink': u'https://www.googleapis.com/download/storage/v1/b/some-bucket/o/file.txt?generation=1498875135704497&alt=media',
 u'metageneration': u'1',
 u'name': u'file.txt',
 u'selfLink': u'https://www.googleapis.com/storage/v1/b/some-bucket/o/file.txt',
 u'size': u'1672',
 u'storageClass': u'MULTI_REGIONAL',
 u'timeCreated': u'2017-07-01T02:12:15.663Z',
 u'timeStorageClassUpdated': u'2017-07-01T02:12:15.663Z',
 u'updated': u'2017-07-01T02:12:15.663Z'}

However, if you manually patch reload on the object, you can get the data you want:

>>> from google.cloud.storage.blob import _quote
>>>
>>> GENERATION = '1498874876972049'
>>>
>>> def reload(self, client=None):
...     client = self._require_client(client)
...     query_params = {'projection': 'noAcl', 'generation': GENERATION}
...     api_response = client._connection.api_request(
...         method='GET', path=self.path, query_params=query_params,
...         _target_object=self)
...     self._set_properties(api_response)
...
>>>
>>> blob_patched = bucket.blob('file.txt')
>>> blob_patched.reload = reload.__get__(blob_patched)
>>> blob_patched.reload()
>>> blob_patched._properties
{u'bucket': u'some-bucket',
 u'contentType': u'text/plain',
 u'crc32c': u'MO9tZQ==',
 u'etag': u'CJGY64SA59QCEAE=',
 u'generation': u'1498874876972049',
 u'id': u'some-bucket/file.txt/1498874876972049',
 u'kind': u'storage#object',
 u'md5Hash': u'zPc9wCyNcKGiYV3ijiB2iw==',
 u'mediaLink': u'https://www.googleapis.com/download/storage/v1/b/some-bucket/o/file.txt?generation=1498874876972049&alt=media',
 u'metageneration': u'1',
 u'name': u'file.txt',
 u'selfLink': u'https://www.googleapis.com/storage/v1/b/some-bucket/o/file.txt',
 u'size': u'1635',
 u'storageClass': u'MULTI_REGIONAL',
 u'timeCreated': u'2017-07-01T02:07:56.965Z',
 u'timeDeleted': u'2017-07-01T02:12:15.704Z',
 u'timeStorageClassUpdated': u'2017-07-01T02:07:56.965Z',
 u'updated': u'2017-07-01T02:07:56.965Z'}

@lukesneeringer
Contributor

lukesneeringer commented Aug 7, 2017

Hi @tillahoffmann @tsinha,
Since the code has changed so much since your original PR, and it sounds like there is an existing path to do what you need now, I am going to close this. I am sorry that we sat on it for so long.

If there are any outstanding issues remaining, feel free to re-open (or open a new issue).

parthea pushed a commit that referenced this pull request Sep 22, 2023
…tform/python-docs-samples#1435)

* remove face detection samples

* update docstring

* linter

* linter