Append region info to S3ToRedshiftOperator if present #32328
Conversation
It's possible to copy from S3 into Redshift across different regions; however, the S3ToRedshiftOperator currently does not support this. This PR makes it possible by checking whether the AWS connection passed in has a region set in the extras part of the connection config. If it does, that region is appended to the COPY statement, in line with the syntax defined [here](https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html).
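The idea above can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the function name, parameters, and query layout are assumptions; only the optional `REGION 'aws-region'` clause follows the Redshift COPY syntax linked above.

```python
def build_copy_query(copy_destination, s3_bucket, s3_key, credentials_block, region=""):
    """Sketch of a Redshift COPY builder: the REGION clause is appended
    only when a region is supplied (i.e. a cross-region copy)."""
    region_info = f"region '{region}'" if region else ""
    return f"""
        COPY {copy_destination}
        FROM 's3://{s3_bucket}/{s3_key}'
        credentials
        '{credentials_block}'
        {region_info}
    """
```

With a region set in the connection extras the generated statement gains a `region 'eu-west-1'` line; without one, the clause is simply omitted.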
Following on from discussion in this PR: the way assertions were previously done was redundant, since static variables were being compared against other static variables. Instead, the expected SQL is now compared against what the `_build_copy_query` function generates; this change is reflected in all applicable test cases in this file.
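A comparison like the one described presumably normalizes whitespace, so that indentation inside triple-quoted query templates doesn't cause spurious failures. A minimal sketch of such a helper (the name and exact behavior are illustrative, not necessarily Airflow's test utility):

```python
import re

def assert_equal_ignore_spaces(actual: str, expected: str) -> None:
    """Compare two SQL strings while collapsing runs of whitespace,
    so formatting differences in generated queries don't matter."""
    def norm(s: str) -> str:
        return re.sub(r"\s+", " ", s).strip()
    assert norm(actual) == norm(expected), f"{norm(actual)!r} != {norm(expected)!r}"
```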
Things look fine now.
I've removed the comments you highlighted @Adaverse, hope things look okay now
Some static checks are failing. I recommend installing pre-commit and running it locally.
Except for the static checks, the rest LGTM.
You can look at the instructions here to run the static checks.
All checks are passing now
    column_names = "(" + ", ".join(self.column_list) + ")" if self.column_list else ""
    return f"""
                COPY {copy_destination} {column_names}
                FROM 's3://{self.s3_bucket}/{self.s3_key}'
                credentials
                '{credentials_block}'
                {region_info}
Will that work when region is None? I seriously doubt it.
Nice catch. It can only be None when the user passes null in the extra config. That case should be handled explicitly.
Good spot. I've updated the parameter so it no longer defaults to None, which prevents this from happening.
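The fix discussed above could look roughly like this: default the region to an empty string rather than None, so the rendered COPY statement can never contain the literal text "None". The function name and extras layout are illustrative assumptions, not the PR's exact code.

```python
def get_region_info(extra: dict) -> str:
    """Sketch: resolve the REGION clause from the connection extras.
    `or ""` also guards against an explicit null in the config."""
    region = extra.get("region") or ""
    return f"region '{region}'" if region else ""
```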
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in newsfragments.