feat(bigquery): load jobs with dest tables in another project #27681

Merged Dec 2, 2024 (9 commits)


alvarowolfx
Contributor

This PR favors dataset.project_id over @service.project, so that load jobs can target tables in a dataset that lives in a separate project, different from the project configured on the main client, which is used for billing.

A follow-up PR needs to address more scenarios where we should favor dataset.project_id over @service.project, since it is a common use case to run operations in project A that modify data stored in project B.
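To make the idea concrete, here is a minimal sketch of the project resolution described above. It is not the actual diff: `Service`, `Dataset`, and `destination_table_ref` are hypothetical stand-ins rather than the library's classes, and the fallback to the client project when the dataset carries no project of its own is an assumption.

```ruby
# Hypothetical stand-ins; these are NOT the google-cloud-bigquery classes.
Service = Struct.new(:project)
Dataset = Struct.new(:project_id, :dataset_id)

# Builds the destination table reference for a load job.
# Prefers the dataset's own project and falls back to the client's
# (billing) project only when the dataset does not carry one (assumed behavior).
def destination_table_ref service, dataset, table_id
  {
    project_id: dataset.project_id || service.project,
    dataset_id: dataset.dataset_id,
    table_id:   table_id
  }
end

service = Service.new "projectA"               # project used by the client for billing
dataset = Dataset.new "projectB", "my_dataset" # dataset stored in another project

ref = destination_table_ref service, dataset, "us_states"
puts ref[:project_id]
```

With this shape the job is still created and billed in project A, while the destination table reference points at project B.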

I haven't added integration tests, since we don't have write access to two different projects in our CI pipelines. But I tested locally with the following code:

# $ export GOOGLE_CLOUD_PROJECT=projectA
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
project_id = "projectB"
dataset_id = "my_dataset" # placeholder: ID of an existing dataset in projectB

dataset = bigquery.dataset dataset_id, project_id: project_id # <- this is the important change
gcs_uri  = "gs://cloud-samples-data/bigquery/us-states/us-states.csv"
table_id = "us_states"

load_job = dataset.load_job table_id, gcs_uri, skip_leading: 1 do |schema|
  schema.string "name"
  schema.string "post_abbr"
end
puts "Starting job #{load_job.job_id}"

load_job.wait_until_done! # Waits for table load to complete.
puts "Job finished."

table = dataset.table table_id
puts "Loaded #{table.rows_count} rows to table #{table.id}"

Fixes #27368

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the BigQuery API. label Nov 27, 2024
@alvarowolfx alvarowolfx requested a review from dazuma November 27, 2024 20:38
@alvarowolfx alvarowolfx added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 29, 2024
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 29, 2024
@alvarowolfx
Contributor Author

blocked by #27684

Member

@dazuma dazuma left a comment

This looks good. Thanks much!

@dazuma dazuma merged commit a23b15c into main Dec 2, 2024
13 checks passed
@dazuma dazuma deleted the bq-feat-load-job-project branch December 2, 2024 19:57
@github-actions github-actions bot added the release-please:force-run To run release-please label Dec 3, 2024
@release-please release-please bot removed the release-please:force-run To run release-please label Dec 3, 2024
shubhangi-google pushed a commit to shubhangi-google/google-cloud-ruby that referenced this pull request Dec 11, 2024
Linked issue: Run Load Job in a project that loads data in a table in a different project