Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support key version with KMS names to correctly recognize when to skip re-encryption #1705

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

KDB223
Copy link

@KDB223 KDB223 commented May 19, 2023

The metadata received from GetObjectMetadata() at https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rewrite.py#LL399C1-L403C5 is used to determine if the object's current encryption is CSEK or SMEK, and compared with dest_encryption_kms_key:

    src_metadata = gsutil_api.GetObjectMetadata(
        transform_url.bucket_name,
        transform_url.object_name,
        generation=transform_url.generation,
        provider=transform_url.scheme)

    ...

    # Store metadata about src encryption to make logic below easier to read.
    src_encryption_kms_key = (src_metadata.kmsKeyName
                              if src_metadata.kmsKeyName else None)
                              
    encryption_unchanged = (src_encryption_sha256 == dest_encryption_sha256 and
                            src_encryption_kms_key == dest_encryption_kms_key)                         

The issue here is that GetObjectMetadata() returns the fully-qualified KMS key name, including the version number. Here is sample debugging output from one of my local runs:

src_encryption_kms_key: projects/my-project/locations/us/keyRings/gcs-cmek/cryptoKeys/gcs-kms-1/cryptoKeyVersions/1
dest_encryption_kms_key: projects/my-projec/locations/us/keyRings/gcs-cmek/cryptoKeys/gcs-kms-1
encryption_unchanged: False

encryption_unchanged will thus never be True for objects with CMEK, and the rewrite -k command will always re-encrypt the object despite having the same KMS key.

To reproduce:

  • Copy an object to GCS: $ gsutil cp ./mysecretfile.txt gs://my-bucket
  • Use rewriteto encrypt it with CMEK. storage.objects.rewrite gets called as expected:
$ KMS_KEY=projects/my-project/locations/us/keyRings/gcs-cmek/cryptoKeys/gcs-kms-1

$ gsutil -D -o "GSUtil:encryption_key=$KMS_KEY" rewrite -k gs://my-bucket/mysecretfile.txt 2>&1 | grep 'Calling method\|Skipping'

INFO 0519 20:03:57.433407 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
INFO 0519 20:03:58.112828 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
INFO 0519 20:03:58.438109 base_api.py] Calling method storage.objects.rewrite with StorageObjectsRewriteRequest: <StorageObjectsRewriteRequest
  • Running the same command again, you can see storage.objects.rewrite called again:
$ gsutil -D -o "GSUtil:encryption_key=$KMS_KEY" rewrite -k gs://my-bucket/mysecretfile.txt 2>&1 | grep 'Calling method\|Skipping'

INFO 0519 20:04:04.822613 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
INFO 0519 20:04:05.544981 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
INFO 0519 20:04:05.962979 base_api.py] Calling method storage.objects.rewrite with StorageObjectsRewriteRequest: <StorageObjectsRewriteRequest

Contrast this with a user-generated CSEK instead, where it skips the rewrite second time round:

$ KEY=qxG<REDACTED>I=

$ gsutil git:(master) ✗ ./gsutil -D -o "GSUtil:encryption_key=$KEY" rewrite -k gs://gcs-kms/mysecretfile.txt 2>&1 | grep 'Calling method\|Skipping'

INFO 0519 20:55:45.531044 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
INFO 0519 20:55:46.942206 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
INFO 0519 20:55:47.361670 base_api.py] Calling method storage.objects.rewrite with StorageObjectsRewriteRequest: <StorageObjectsRewriteRequest

$ gsutil git:(master) ./gsutil -D -o "GSUtil:encryption_key=$KEY" rewrite -k gs://gcs-kms/mysecretfile.txt 2>&1 | grep 'Calling method\|Skipping'

INFO 0519 20:55:49.934391 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
INFO 0519 20:55:50.571057 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
Skipping gs://gcs-kms/mysecretfile.txt, all transformations were redundant: ['encryption key']

Modifying ValidateCMEK() to accept version numbers as part of the KMS name fixes this. Public docs already mention using the "fully-qualified" KMS key name, which includes the version number anyway.

@@ -34,7 +34,8 @@
lambda: re.compile('projects/([^/]+)/'
'locations/([a-zA-Z0-9_-]{1,63})/'
'keyRings/([a-zA-Z0-9_-]{1,63})/'
'cryptoKeys/([a-zA-Z0-9_-]{1,63})$'))
'cryptoKeys/([a-zA-Z0-9_-]{1,63})/'
'cryptoKeyVersions/([0-9]{1,63})$'))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the ValidateCMEK function seems to be getting used at multiple places and not every time we pass a key with CryptoKeyVersions (e.g. if you are reading it from boto config)?. I have a feeling that this might break other workflows.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it will break a lot of things that rely on the CryptoKeyWrapper class.

@dilipped
Copy link
Collaborator

dilipped commented May 21, 2023

Thanks for the PR! Can you please verify some of the other use cases that rely on this function?

@KDB223
Copy link
Author

KDB223 commented May 21, 2023

Since this bug has to do specifically with rewrite -k, I removed these changes and added some logic to strip the version number instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants