fix: Make transaction rollback best effort. #1967
'Best effort to rollback failed with error:',
reason
);
});
I guess the only question is whether there are rollback failures that should be retried. Maybe MISSING_TRANSACTION_HANDLE should be the only one that's best-effort?
From the documentation: "Note that a rollback itself might fail, so the rollback should be a best-effort attempt only."
https://cloud.google.com/datastore/docs/best-practices#api_calls
However, I see your point.
This could be improved such that retryable errors from rollback cause rollback to be retried, except INTERNAL, which is what happens on MISSING_TRANSACTION_HANDLE.
BUT, even if rollback experiences a non-retryable error, the transaction should still retry. That was not the case when I looked at Java, which seems completely wrong even if it was baked into a test case.
The phrasing "Best effort" leaves some nuance to be interpreted.
Ideally MISSING_TRANSACTION_HANDLE should just be a no-op on Firestore backend, and NOT throw an INTERNAL error. This code might simply be working around what should be a change to the backend.
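The policy proposed in this comment (retry a rollback on retryable errors, but never on INTERNAL) could be sketched roughly as follows. This is only an illustration: `shouldRetryRollback` and `RETRYABLE_CODES` are hypothetical names, and the set of retryable codes is an assumption, not the client's actual configuration. Only the numeric gRPC status values are standard.

```typescript
// Standard gRPC status code values (subset).
const Status = {
  DEADLINE_EXCEEDED: 4,
  ABORTED: 10,
  INTERNAL: 13,
  UNAVAILABLE: 14,
} as const;

// Assumption: an illustrative set of codes treated as retryable.
// The real set would come from the service's retry configuration.
const RETRYABLE_CODES: ReadonlySet<number> = new Set<number>([
  Status.DEADLINE_EXCEEDED,
  Status.INTERNAL,
  Status.UNAVAILABLE,
]);

// Hypothetical helper: decides whether a failed rollback should be retried.
function shouldRetryRollback(code: number): boolean {
  // INTERNAL is what MISSING_TRANSACTION_HANDLE surfaces as, so retrying
  // cannot succeed; treat it as best effort and give up immediately.
  if (code === Status.INTERNAL) {
    return false;
  }
  return RETRYABLE_CODES.has(code);
}
```

Whatever this helper returns, the transaction itself would still be retried; the helper only governs the rollback call.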
> Ideally MISSING_TRANSACTION_HANDLE should just be a no-op on Firestore backend, and NOT throw an INTERNAL error. This code might simply be working around what should be a change to the backend.
+1
> This could be improved such that retryable errors from rollback cause rollback to be retried, except INTERNAL
Let's do this then
I looked into the code, and the underlying GAPIC client does retry according to its retry settings. Effectively, that means there is currently appropriate backoff, and 5 attempts to roll back before failure.
This second level of retry at a higher level was a complete duplication and would simply multiply the number of attempts, with additional backoff.
In addition, after some investigation, the GAPIC-level retry respects the error codes from the service config. So it actually does a better job of determining when to retry than the higher transaction-level retry.
In other words, we should be fine.
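To make the multiplication concrete, here is a small sketch of two nested retry layers around a rollback call that always fails. `withRetries` and the failing RPC are hypothetical stand-ins, not the GAPIC or Firestore implementation; the inner count of 5 comes from the comment above, the outer count of 5 is an assumption, and backoff is omitted.

```typescript
// Hypothetical helper: calls `op` up to `maxAttempts` times and rethrows
// the last error if every attempt fails. Backoff is left out for brevity.
function withRetries<T>(maxAttempts: number, op: () => T): T {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return op();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

let rpcCalls = 0;

// Stand-in for a rollback RPC that can never succeed, e.g. because the
// transaction no longer exists on the backend.
const alwaysFailingRollbackRpc = (): void => {
  rpcCalls++;
  throw new Error('rollback failed');
};

const INNER_ATTEMPTS = 5; // GAPIC-level attempts, per the comment above
const OUTER_ATTEMPTS = 5; // assumption: attempts at the higher level

try {
  // Outer layer retries a call that already retries internally.
  withRetries(OUTER_ATTEMPTS, () =>
    withRetries(INNER_ATTEMPTS, alwaysFailingRollbackRpc)
  );
} catch {
  // Expected: every attempt fails.
}

console.log(rpcCalls); // 25
```

With a single layer the same failure costs 5 calls, which is why dropping the higher-level rollback retry loses nothing.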
SGTM. Thanks
Internal tracking: b/316023452
A rollback is meant to be best effort. If the transaction has already expired, the rollback can fail because the transaction no longer exists in Firestore. The retry logic then spends its attempts on the rollback, and when the transaction no longer exists, every attempt is exhausted trying to roll back the transaction.
This PR makes the rollback best effort, simply logging any rollback error and continuing.
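A minimal sketch of that behaviour, assuming a simplified retry loop: `rollbackBestEffort`, `runWithRetries`, and the `Logger` type are hypothetical names used for illustration and are not the library's actual API. Unlike a real client, this loop retries on any error rather than only on retryable ones.

```typescript
// Hypothetical logger signature, for illustration only.
type Logger = (message: string, reason: unknown) => void;

// Attempts the rollback once; a failure is logged and swallowed so that
// it never blocks or consumes the transaction's own retries.
async function rollbackBestEffort(
  rollback: () => Promise<void>,
  log: Logger
): Promise<void> {
  try {
    await rollback();
  } catch (reason) {
    log('Best effort to rollback failed with error:', reason);
  }
}

// Simplified transaction loop: on failure, roll back best effort, then
// try again until `maxAttempts` is reached.
async function runWithRetries<T>(
  maxAttempts: number,
  attempt: () => Promise<T>,
  rollback: () => Promise<void>,
  log: Logger
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      await rollbackBestEffort(rollback, log);
    }
  }
  throw lastError;
}
```

The point of the design is that a failed rollback only produces a log line; the next transaction attempt proceeds regardless.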