-
Notifications
You must be signed in to change notification settings - Fork 555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Storage] Fix public bucket source check in SkyPilot Storage #1087
Conversation
# These buckets are known to be private, but may need to be updated if | ||
# they are removed by their owners. | ||
with pytest.raises(sky.exceptions.StorageBucketGetError, | ||
match='the bucket name is taken'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match=backend_utils._BUCKET_FAIL_TO_CONNECT_MESSAGE.format(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh private buckets trigger this message on gcp, so the common message is the bucket name is taken
:
Lines 1129 to 1133 in 6196552
ex = exceptions.StorageBucketGetError( | |
f'Attempted to access a private external bucket {self.name}' | |
'\nCheck if the 1) the bucket name is taken and/or ' | |
'2) the bucket permissions are not setup correctly. ' | |
f'Consider using `gsutil ls gs://{self.name}` to debug.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, can we move this message as an variable such as PRIVATE_BUCKET_ERROR
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @romilbhardwaj! LGTM. @michaelzhiluo?
@@ -820,7 +820,7 @@ def test_new_bucket_creation_and_deletion(self, tmp_local_storage_obj, | |||
|
|||
@pytest.mark.parametrize( | |||
'tmp_public_storage_obj, store_type', | |||
[('s3://tcga-2-open', storage_lib.StoreType.S3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep this bucket in the test? For coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point - added!
tests/test_smoke.py
Outdated
else: | ||
retry_count += 1 | ||
if retry_count > 3: | ||
raise RuntimeError('Could find a nonexistent bucket to use.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could not*
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! fixed
@michaelzhiluo ready for another look! |
This PR looks good to me, as long as above comment is resolved! |
@michaelzhiluo - regarding refactoring error strings, I'll create another PR to fix that since that's tangential to this bugfix - would that be ok? |
Sounds good, in that case, feel free to merge! |
Closes #977.
GetPublicAccessBlock
is no longer a reliable method to check if a s3 bucket is public. Buckets can be publicly readable yet not allow access toGetPublicAccessBlock
. I'm not sure what caused it (and I can't find a clear reason).Some buckets still allow it (e.g.,
s3://tcga-2-open
, which is used in our smoke tests and thus was passing), but others have blocked it (e.g.,s3://digitalcorpora
,s3://imagenet-bucket
).Using
head_bucket
API now, as recommended by boto3 docs.Also added storage tests for non-existent and private buckets.
Tested:
pytest test_smoke.py::TestStorageWithCredentials