Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add deferrable GCSObjectExistenceSensorAsync #28763

Merged
merged 9 commits into from
Jan 10, 2023
Merged

Conversation

rajaths010494
Copy link
Contributor

@rajaths010494 rajaths010494 commented Jan 6, 2023

This PR donates the following developed GCSObjectExistenceSensorAsync` in astronomer-providers repo to apache airflow.

GCSObjectExistenceSensorAsync

@rajaths010494 rajaths010494 changed the title Add deferrable GCSObjectExistenceSensorAsync Add deferrable GCSObjectExistenceSensorAsync Jan 9, 2023
@rajaths010494 rajaths010494 requested review from uranusjr and removed request for turbaszek and mik-laj January 9, 2023 07:28
@uranusjr uranusjr removed their request for review January 9, 2023 07:30
@rajaths010494 rajaths010494 requested a review from uranusjr January 9, 2023 08:56
@rajaths010494
Copy link
Contributor Author

rajaths010494 commented Jan 9, 2023

When i re-requested for review. It automatically removed reviewers and now i am not able to add them back.
@mik-laj @turbaszek

@kaxil kaxil merged commit 284cd52 into apache:main Jan 10, 2023
@kaxil kaxil deleted the astro_gcs branch January 10, 2023 14:31
@vchiapaikeo
Copy link
Contributor

vchiapaikeo commented Jan 10, 2023

Hi @rajaths010494 @kaxil - I wanted to point out a possible bug that is introduced here - it seems like using this approach:

    async def get_storage_client(self, session: ClientSession) -> Storage:
        """Returns a Google Cloud Storage service object."""
        with await self.service_file_as_context() as file:
            return Storage(service_file=file, session=cast(Session, session))

does not honor the impersonation_chain when application default credentials (ADC) are being used. We've tried a similar approach from astronomer-providers and encountered issues on our end. I believe this is because in the case that await self.service_file_as_context() as file yields None (the fall through here), the async storage client gets instantiated with this:

return Storage(service_file=None, session=cast(Session, session))

TalkIQ's async io storage client has no idea of the impersonation chain on instantiation here --> https://github.com/talkiq/gcloud-aio/blob/4e0006240913ecadde68722b8986f184a9b1adea/storage/gcloud/aio/storage/storage.py#L149-L168

And as a result, the token gets instantiated with ADC without regard for impersonation chain.

Edit: On second look, it seems like provide_gcp_credential_file_as_context does not honor the impersonation chain at all? Compare this with get_credentials_and_project_id which handles the impersonation chain properly

@rajaths010494
Copy link
Contributor Author

rajaths010494 commented Jan 11, 2023

Thanks @vchiapaikeo i will look at this and check get_credentials_and_project_id which handles the impersonation chain properly.
and add support for impersonation chain.
Edit - As you mentioned currently gclou-aio-storage doesn't yet have support for it.
Once there is a support for it will add support for it. Until then it will be a limitation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants