GCSHook's functions for list and download do not work with Requester Pays buckets. #31137
Closed
2 tasks done
Labels
area:providers
good first issue
kind:bug
This is clearly a bug
provider:google
Google (including GCP) related issues
Apache Airflow version
2.6.0
What happened
When accessing data in a "Requester Pays" bucket, the user's project must be supplied either when the storage client creates the bucket object or on the ACL. The "list" and "download" functions of the GCSHook have no parameter for a user project ID, so calling them on such a bucket fails with the following error: Bucket is a requester pays bucket but no user project provided.
The GCP documentation states this requirement explicitly.
What you think should happen instead
In the "insert_bucket_acl" function in the GCSHook, a user_project is optionally supplied for Requester Pays buckets. This code looks like this:
I believe this code should be added to the list and download functions as well. This should also fix any operators from GCP to GCP/S3/Azure that want to transfer data from a "Requester Pays" bucket.
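As a rough illustration of the proposal, the sketch below shows how list and download helpers could accept an optional user_project and forward it when building the bucket object. It is not the GCSHook source: the function names `list_object_names` and `download_object` are hypothetical, and `client` is assumed to behave like `google.cloud.storage.Client`, whose `bucket()` method accepts a `user_project` argument.

```python
from typing import Any, List, Optional


def list_object_names(
    client: Any,
    bucket_name: str,
    prefix: Optional[str] = None,
    user_project: Optional[str] = None,
) -> List[str]:
    """Return object names in a bucket (hypothetical helper, not GCSHook.list).

    `client` is assumed to behave like google.cloud.storage.Client.
    """
    # Passing user_project here bills requests made through this bucket
    # object to that project, which Requester Pays buckets require.
    bucket = client.bucket(bucket_name, user_project=user_project)
    return [blob.name for blob in bucket.list_blobs(prefix=prefix)]


def download_object(
    client: Any,
    bucket_name: str,
    object_name: str,
    filename: str,
    user_project: Optional[str] = None,
) -> str:
    """Download one object to a local file (hypothetical helper)."""
    # Blobs created from the bucket inherit its user_project.
    bucket = client.bucket(bucket_name, user_project=user_project)
    bucket.blob(object_name).download_to_filename(filename)
    return filename
```

With a real client this might be called as `list_object_names(storage.Client(), "my-bucket", user_project="my-billing-project")`, where both the bucket and the project name are placeholders. Leaving `user_project` as None keeps the current behaviour for ordinary buckets.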
How to reproduce
Call hook.list() on any GCS bucket that has Requester Pays enabled.
Operating System
Debian 11
Versions of Apache Airflow Providers
No response
Deployment
Docker-Compose
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct