Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow up fix for merging errors from batch BigQuery inserts (#1348). #1350

Merged
merged 2 commits into from
Jan 17, 2020

Conversation

Dor1s
Copy link
Contributor

@Dor1s Dor1s commented Jan 17, 2020

Follow up! Looks like either documentation or my reading of it is not right :)

@googlebot googlebot added the cla: yes CLA signed. label Jan 17, 2020
@Dor1s
Copy link
Contributor Author

Dor1s commented Jan 17, 2020

/gcbrun

if result.get('insertErrors'):
result['insertErrors'].extend(new_errors)
else:
result = response
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we overwrite the result here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great question, linter complained when I tried to assign only that field:

| E:394,10: 'result' does not support item assignment (unsupported-assignment-operation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me give a try to upgrading it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't seem like we can use the version that has it fixed (2.3.0: pylint-dev/pylint#2767), but I've used {} instead of None and now it's happy

@@ -379,8 +379,18 @@ def insert(self, inserts):
time.sleep(1)

if not result:
# Use result from the first batch, appending errors from the rest.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't capture result (non-errors) from other parts of query response. This seems weird. @oliverchang for review as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't anything else useful as per the documentation: https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that there is nothing much else in the returned values.

@Dor1s
Copy link
Contributor Author

Dor1s commented Jan 17, 2020

/gcbrun

Copy link
Contributor Author

@Dor1s Dor1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Landing this for now to ((hopefully) unblock the cron job. If @oliverchang knows a better solution, I'll be happy to revert these two fixes next week.

@Dor1s Dor1s merged commit 4b5e825 into master Jan 17, 2020
Copy link
Collaborator

@oliverchang oliverchang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

continue

# Apparently result may not have errors, be careful.
if result.get('insertErrors'):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can do this much more concisely like so:

result.setdefault('insertErrors', []).extend(new_errors)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beautiful, done!

result = response
else:
result['insertErrors'].extend(response['insertErrors'])
# If there are new errors from the current batch, append to the result.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the else makes the flow of the logic a bit confusing here. We can probably just continue after line 383 and unindent the rest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, done!

@@ -379,8 +379,18 @@ def insert(self, inserts):
time.sleep(1)

if not result:
# Use result from the first batch, appending errors from the rest.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that there is nothing much else in the returned values.

Copy link
Contributor Author

@Dor1s Dor1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uploaded as #1355

continue

# Apparently result may not have errors, be careful.
if result.get('insertErrors'):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beautiful, done!

result = response
else:
result['insertErrors'].extend(response['insertErrors'])
# If there are new errors from the current batch, append to the result.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, done!

@oliverchang oliverchang deleted the insert_errors branch January 30, 2020 03:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes CLA signed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants