-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(bigquery): allow insert to target dataset/table in another project #28097
Conversation
@@ -2810,7 +2810,7 @@ def insert_async table_id, skip_invalid: nil, ignore_unknown: nil, max_bytes: 10 | |||
ensure_service! | |||
|
|||
# Get table, don't use Dataset#table which handles NotFoundError | |||
gapi = service.get_table dataset_id, table_id, metadata_view: view | |||
gapi = service.get_project_table project_id, dataset_id, table_id, metadata_view: view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how this changes things. Under what circumstance would the dataset's project ID be different from the project ID of the service backing the dataset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, it's now possible for the client (whose project is tied to the billing account) to get and construct a Dataset object from a different project. So this is actually a bug fix rather than a feature: in such a case, we're making sure the correct project gets set when doing an insert rows operation. Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dazuma This is a very common scenario among BigQuery users, where operations are run in project A to modify data stored in project B. So you need to target tables in a dataset that lives in a separated project, other than the project usage on the main client and which for example is used for billing.
Another example: a central project for an org holds all the long term stored data, but each department isolates their operational costs from one another by using individual projects to run queries and load jobs, etc.
This is the same need that was mentioned on #27681 and #27368
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, it's now possible for the client (whose project is tied to the billing account) to get and construct a Dataset object from a different project. So this is actually a bug fix rather than a feature: in such a case, we're making sure the correct project gets set when doing an insert rows operation. Is that correct?
yeah, it might make sense to change it to a fix instead of feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is indeed a bug fix (and I think it should be: I would expect a new feature to include an addition to the public interface, and this PR doesn't make any public interface changes at all), then I'd like to see new unit tests that show the fixed case. You can mock out the backend service, as many of the unit tests do, but just test that the correct project ID gets passed down to the backend service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks much!
Allow
insertAll
API to target tables in a dataset that lives in a separated project, other than the project usage on the main client and which is used for billing.I haven't added integration tests, since we would not have write access to two different projects in our CI pipelines. But I tested locally with this given code:
Follow up on #27681 and #27368