[RFC] Try to reduce DB deadlocks #408

Merged: 1 commit, Feb 28, 2017

@daviddrysdale daviddrysdale commented Feb 27, 2017

Speculative change; I'd obviously drop the last commit before merging (but it's helpful to get more test coverage on Travis), and I'm not convinced that the V(4) commit is worthwhile (or even the right way to do it). That leaves the sort-leaves commit, where I've made up a plausible explanation but I don't know if it's really relevant or not.

For issue #405.

@Martin2112
Contributor

See also #399, which I'd like to test separately and might also help.

It's likely impossible to avoid all deadlocks as the locks held are dynamic and unpredictable. Clients can expect to have to do occasional retries.

Contributor

@Martin2112 Martin2112 left a comment

Not sure why this would work as sequencing can still be trying to access the same area of key space (depending on what the guard interval is set to).

Worth trying though.

@@ -180,6 +183,7 @@ func (m *mySQLLogStorage) hasher(treeID int64) (merkle.TreeHasher, error) {
func (m *mySQLLogStorage) beginInternal(ctx context.Context, treeID int64) (storage.LogTreeTX, error) {
// TODO(codingllama): Validate treeType
var duplicatePolicy string
glog.V(4).Infof("QueryRow: %v with %v", flatten(getTreePropertiesSQL), treeID)
Contributor

Not sure it's worth logging these as you can get them from the mysql log if you turn on general query logging.

Contributor Author

Yeah, I thought there must be a better method. I'll just drop the logging commit.


// Delete in order of the hash values in the leaves.
orderedLeaves := make([]*trillian.LogLeaf, len(leaves))
copy(orderedLeaves, leaves)
Contributor

Do we really need this copy?

Contributor Author

This one doesn't affect a public entrypoint so Done.

for i, leaf := range leaves {
// Insert in order of the hash values in the leaves.
orderedLeaves := make([]*trillian.LogLeaf, len(leaves))
copy(orderedLeaves, leaves)
Contributor

Can we avoid copying?

Contributor Author

Not copying will modify the input parameters, which I prefer to avoid (and we're just copying pointers not objects so it shouldn't be too expensive).

I guess we could add comments to indicate that this entrypoint may modify the passed-in slice if you're super keen to avoid the copy.
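
To illustrate the trade-off being discussed, here is a small sketch of sorting a copy of the slice rather than the caller's slice. The `leaf` type and `sortedByHash` function are hypothetical stand-ins for illustration only; they are not the types or code in this change.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// leaf is a hypothetical stand-in for *trillian.LogLeaf; only the hash
// matters for this illustration.
type leaf struct {
	hash []byte
}

// sortedByHash returns a new slice holding the same leaf pointers ordered by
// hash. Only the pointers are copied, so the cost is small and the caller's
// slice keeps its original order.
func sortedByHash(leaves []*leaf) []*leaf {
	ordered := make([]*leaf, len(leaves))
	copy(ordered, leaves)
	sort.Slice(ordered, func(i, j int) bool {
		return bytes.Compare(ordered[i].hash, ordered[j].hash) < 0
	})
	return ordered
}

func main() {
	in := []*leaf{{hash: []byte("W")}, {hash: []byte("B")}, {hash: []byte("L")}}
	out := sortedByHash(in)
	fmt.Printf("sorted: %s %s %s\n", out[0].hash, out[1].hash, out[2].hash)
	fmt.Printf("input still starts with: %s\n", in[0].hash)
}
```

Sorting `leaves` in place would save one allocation per call, at the price of reordering a slice the caller still holds.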

Contributor

No, we can leave it like this if there's a reason to do it.

Contributor Author

I miss const sometimes...

@Martin2112
Contributor

So far my tests with bucketing are not encouraging. Needs more investigation but it's not doing what I hoped it would.

@daviddrysdale
Contributor Author

Re: "Not sure why this would work as sequencing can still be trying to access the same area of key space (depending on what the guard interval is set to)." -- I tried to put my best attempt at an explanation into the commit message for 860c493.

@Martin2112
Contributor

Yes, I see how that could change the lock pattern: in that case the sequencing dequeue is also scanning the table. I did have an ORDER BY on the leaf hash in that query, which I might have removed because it looked like unnecessary work. If so, it could be worth putting it back.

@daviddrysdale
Contributor Author

Now just the leaf-ordering change -- worth merging?

@Martin2112
Contributor

I think we should try it and observe the results.

If distinct multi-row write operations for the Unsequenced table (i.e.
SequenceBatch, QueueLeaves) use arbitrary ordering, the chances of
database deadlock are increased.

AIUI each write will lock a range of index values to preserve primary
key uniqueness, roughly [prev-existing-hash, new-hash].

So one multiple row update might lock (say) BCD then LMN then WXYZ.
Meanwhile, a different transaction might try to lock UVW then DEF.

This gives a chance of deadlock (the numbers identify the two transactions):
 1) gets BCD and LMN
 2) gets UVW
 1) tries to get WXYZ and needs to wait for 2)
 2) tries to get DEF and needs to wait for 1).
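
The general principle behind the fix (acquire locks in one consistent order) can be sketched with in-memory mutexes. This is only an analogy under stated assumptions: real database range locking is more involved than one mutex per key, and `lockTable`, `runTx` and `runBoth` are hypothetical names, not code from this change.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// lockTable maps a key to a mutex; a simplified stand-in for database locks.
type lockTable map[string]*sync.Mutex

// runTx takes the locks for keys in sorted order, runs work, then releases
// them in reverse order. keys must not contain duplicates. Because every
// caller uses the same order, no cycle of waiters can form.
func runTx(locks lockTable, keys []string, work func()) {
	ordered := make([]string, len(keys))
	copy(ordered, keys)
	sort.Strings(ordered)
	for _, k := range ordered {
		locks[k].Lock()
	}
	work()
	for i := len(ordered) - 1; i >= 0; i-- {
		locks[ordered[i]].Unlock()
	}
}

// runBoth runs two transactions concurrently and returns how many completed.
func runBoth(tx1, tx2 []string) int {
	locks := lockTable{}
	for _, k := range append(append([]string{}, tx1...), tx2...) {
		if locks[k] == nil {
			locks[k] = &sync.Mutex{}
		}
	}
	var wg sync.WaitGroup
	var mu sync.Mutex
	done := 0
	for _, keys := range [][]string{tx1, tx2} {
		wg.Add(1)
		go func(keys []string) {
			defer wg.Done()
			runTx(locks, keys, func() {
				mu.Lock()
				done++
				mu.Unlock()
			})
		}(keys)
	}
	wg.Wait()
	return done
}

func main() {
	// Transaction 1 wants D, M, W; transaction 2 wants W then D.
	fmt.Println(runBoth([]string{"D", "M", "W"}, []string{"W", "D"}))
}
```

If each goroutine instead locked in the order given, transaction 1 could hold D while waiting for W and transaction 2 could hold W while waiting for D, which is the cycle described in the steps above.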
@daviddrysdale daviddrysdale merged commit e104d60 into google:master Feb 28, 2017
@daviddrysdale daviddrysdale deleted the deadlock branch February 28, 2017 18:52
gdbelvin added a commit to gdbelvin/trillian that referenced this pull request Dec 5, 2017