[RFC] Try to reduce DB deadlocks #408
dd41ca8 to e5ecf95
See also #399, which I'd like to test separately and might also help. It's likely impossible to avoid all deadlocks as the locks held are dynamic and unpredictable. Clients can expect to have to do occasional retries.
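To illustrate the kind of client-side retry mentioned above, here is a minimal sketch. It assumes a hypothetical sentinel error `errDeadlock` standing in for the driver-specific deadlock error (MySQL reports deadlocks as error 1213, ER_LOCK_DEADLOCK), and the helper name `retryOnDeadlock` is made up for this example; it is not part of this project's API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errDeadlock is a hypothetical sentinel standing in for the driver-specific
// deadlock error (for MySQL, error 1213, ER_LOCK_DEADLOCK).
var errDeadlock = errors.New("deadlock detected")

// retryOnDeadlock runs op up to maxAttempts times. It retries only when op
// fails with errDeadlock; any other error (or success) is returned at once.
func retryOnDeadlock(maxAttempts int, backoff time.Duration, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil || !errors.Is(err, errDeadlock) {
			return err
		}
		if attempt < maxAttempts {
			// Wait a little longer after each failed attempt.
			time.Sleep(time.Duration(attempt) * backoff)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}
```

A caller would wrap the whole transaction (begin, writes, commit) in `op`, since a deadlock victim's transaction is rolled back and has to be redone from the start.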
Not sure why this would work as sequencing can still be trying to access the same area of key space (depending on what the guard interval is set to).
Worth trying though.
storage/mysql/log_storage.go
Outdated
@@ -180,6 +183,7 @@ func (m *mySQLLogStorage) hasher(treeID int64) (merkle.TreeHasher, error) {
func (m *mySQLLogStorage) beginInternal(ctx context.Context, treeID int64) (storage.LogTreeTX, error) {
// TODO(codingllama): Validate treeType
var duplicatePolicy string
glog.V(4).Infof("QueryRow: %v with %v", flatten(getTreePropertiesSQL), treeID)
Not sure it's worth logging these as you can get them from the mysql log if you turn on general query logging.
Yeah, I thought there must be a better method. I'll just drop the logging commit.
storage/mysql/log_storage.go
Outdated
// Delete in order of the hash values in the leaves.
orderedLeaves := make([]*trillian.LogLeaf, len(leaves))
copy(orderedLeaves, leaves)
Do we really need this copy?
This one doesn't affect a public entrypoint so Done.
for i, leaf := range leaves {
// Insert in order of the hash values in the leaves.
orderedLeaves := make([]*trillian.LogLeaf, len(leaves))
copy(orderedLeaves, leaves)
Can we avoid copying?
Not copying will modify the input parameters, which I prefer to avoid (and we're just copying pointers, not objects, so it shouldn't be too expensive).
I guess we could add comments to indicate that this entrypoint may modify the passed-in slice if you're super keen to avoid the copy.
No we can leave it like this if there's a reason to do it.
I miss const sometimes...
So far my tests with bucketing are not encouraging. Needs more investigation but it's not doing what I hoped it would.
Re: "Not sure why this would work as sequencing can still be trying to access the same area of key space (depending on what the guard interval is set to)." -- I tried to put my best attempt at an explanation into the commit message for 860c493.
Yes, I see how that could change the lock pattern; in that case the dequeue for sequencing is also scanning the table. I did have an ORDER BY on the leaf hash in that query, which I might have removed because it looked like unnecessary work. If so, it could be worth putting it back.
10a43e9 to 383ec15
Now just the leaf-ordering change -- worth merging?
I think we should try it and observe the results.
If distinct multi-row write operations on the Unsequenced table (i.e. SequenceBatch, QueueLeaves) use arbitrary ordering, the chance of database deadlock increases. AIUI each write will lock a range of index values to preserve primary key uniqueness, roughly [prev-existing-hash, new-hash]. So one multi-row update might lock (say) BCD, then LMN, then WXYZ. Meanwhile, a different transaction might try to lock UVW, then DEF. This gives a chance of deadlock:
- transaction 1 gets BCD and LMN
- transaction 2 gets UVW
- transaction 1 tries to get WXYZ and needs to wait for transaction 2
- transaction 2 tries to get DEF and needs to wait for transaction 1
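The ordering idea in this commit message can be sketched as follows. This is an illustration only: `logLeaf` and its `Hash` field are hypothetical stand-ins for `trillian.LogLeaf` and whichever hash the table is keyed on, and `orderedByHash` is a made-up helper name. It sorts a copy of the slice, so the caller's input is left untouched, as discussed in the review comments above.

```go
package main

import (
	"bytes"
	"sort"
)

// logLeaf is a hypothetical stand-in for trillian.LogLeaf, reduced to the
// hash that the writes are keyed on.
type logLeaf struct {
	Hash []byte
}

// orderedByHash returns a copy of leaves sorted by ascending hash. Only the
// pointers are copied, so the cost is small and the caller's slice keeps its
// original order.
func orderedByHash(leaves []*logLeaf) []*logLeaf {
	ordered := make([]*logLeaf, len(leaves))
	copy(ordered, leaves)
	sort.Slice(ordered, func(i, j int) bool {
		return bytes.Compare(ordered[i].Hash, ordered[j].Hash) < 0
	})
	return ordered
}
```

If every multi-row writer walks its rows in this same ascending order, two transactions that touch overlapping key ranges should request their locks in a consistent order. In the scenario above, the second transaction would ask for DEF before UVW, so one transaction would simply wait for the other rather than each holding a lock the other needs. Whether that holds for the actual range locks taken by the database is the open question in this thread.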
4babbe6 to 4a0adff
…m 589b12611..55c1d0c85

8cc3a55af Add custom options to allow more control of swagger/openapi output (google#145)
b0be3cdef runtime: fix chunk encoding
82b83c781 protoc-gen-swagger optional SourceCodeInfo
1fd8ba6a5 Fix logic handling primitive wrapper in URL params
b2423da79 runtime: use r.Context() (google#473)
c323909dd Add Handler method to pass in a client (google#454)
ac41185c3 Fallback to JSON name when matching URL parameter. (google#450)
8bec008bd fix 2 typos in Registry.SetPrefix's comment
de5a00fcc Reference Gulp by a more complete path
185dda2d4 Fix build.
824b9a716 Test with Go 1.9.
f2862b476 Memoise calls to fullyQualifiedNameToSwaggerName to speed it up for large registries (google#421)
1a03ca3ba Update DO NOT EDIT template. (google#434)
a5c7982c0 Update Swagger Codegen from 2.1.6 to 2.2.2 (google#415)
d64f5319e ISSUE#405: customize the error return (google#409)
c6f7a5ac6 improve {incoming,outgoing}HeaderMatcher logic (google#408)
2a40dd795 Return if runtime.AnnotateContext gave error (google#403)
47a11d786 jsonpb: update tests to reflect new jsonpb behavior (google#401)
f6f92fcd9 Reference import grpc Status to suppress unused errors. (google#387)
979be44d9 fixes package name override doesn't work (google#277)
b1e4aed16 Skip unreferenced messages in definitions. (google#371)
ca4c8d6af ci: regen with current protoc-gen-go (google#385)
7195ea445 Use status package for error and introduce WithProtoErrorHandler option (google#378)
4539fc575 Add response headers from grpc server (google#374)
597c8c358 support allow_delete_body for protoc-gen-grpc-gateway (google#318)
55d0969c0 Use canonical header form in default header matcher. (google#369)
893772d22 Extend ServeMux to allow user configurable header forwarding.

git-subtree-dir: vendor/github.com/grpc-ecosystem/grpc-gateway
git-subtree-split: 55c1d0c857e5c6cadb0ee292f6cc36621cd5ea8c
Speculative change; I'd obviously drop the last commit before merging (but it's helpful to get more test coverage on Travis), and I'm not convinced that the V(4) commit is worthwhile (or even the right way to do it). That leaves the sort-leaves commit, where I've made up a plausible explanation but I don't know if it's really relevant or not.
For issue #405.