Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: use a "distinct" batch in Replica.handleRaftReady() #7015

Merged

Conversation

petermattis
Copy link
Collaborator

@petermattis petermattis commented Jun 2, 2016

A minor optimization that affects small transactions.

Fixes #6949.

name                       old time/op    new time/op    delta
Insert1_Cockroach-8           212µs ± 2%     204µs ± 1%  -3.55%        (p=0.000 n=10+10)
Update1_Cockroach-8           356µs ± 2%     343µs ± 1%  -3.49%         (p=0.000 n=10+9)
Delete1_Cockroach-8           390µs ± 2%     389µs ± 2%    ~           (p=0.529 n=10+10)
TrackChoices1_Cockroach-8     280µs ± 1%     269µs ± 1%  -3.61%        (p=0.000 n=10+10)

This change is Reviewable

@BramGruneir
Copy link
Member

LGTM

var err error
if lastIndex, err = r.applySnapshot(batch, rd.Snapshot); err != nil {
return err
}
// TODO(bdarnell): update coalesced heartbeat mapping with snapshot info.
}
// We know that all of the writes from here forward will be to distinct keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the right thing to require here that none of the keys written have previously been mutated in the batch? Reads get passed through to the underlying engine (not the batch). We definitely mutate a lot of keys when we apply a snapshot, conceivably the hard state, last index, ... (maybe we don't currently, but putting this Distinct() business in for minor gains seems like a good recipe for a future two-day debugging session).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious about this. Would it be more palatable to only enable this usage of a "distinct" batch if a snapshot isn't being applied?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so (though of course ideally we also want to use Distinct() in applySnapshot). I'm still somewhat uncomfortable with the semantics of Distinct() - they essentially require you to know about all prior writes to the batch. Remind me, why didn't we make Distinct() read from the underlying batch? It was because that opened up new questions about what would happen if the batch was written to while a Distinct batch was also active?
I'm thinking maybe the best way to go about it would be making the batch inactive as long as it has an open Distinct() batch (i.e. a write to the underlying batch would fail), and making distinctBatch.Commit() do nothing but flush to the underlying batch (which would then have to be committed)? The upshot would be that Distinct() wouldn't have to worry about what happened to the batch earlier, and that use of the original batch is disallowed until we're done with Distinct() mode. I think that's congruent with our usage and avoids the weird global perspective that's currently required for reasoning about when Distinct() applies.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My recollection is that distinctBatch has its current semantics so that you didn't have to recall when you created it to know what writes it would see. That is simple at one level, but as you've noted it adds cognitive complexity at another. Let me take a look at making the change you suggest. I don't think it will be difficult.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this was mostly straightforward except for one large gotcha: the current usage of Distinct() within Replica.{Put,ConditionalPut} would now cause a flush on every request which completely destroys the distinct optimization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, got it figured out. I've added another commit which changes the semantics of Distinct(). PTAL.

@petermattis petermattis force-pushed the pmattis/handle-raft-ready branch from 28c333f to 07c227d Compare June 2, 2016 19:55
@tbg
Copy link
Member

tbg commented Jun 3, 2016

:lgtm:. Do you see any chances to consolidate some of the interplay between Blind and DistinctSpans? They seem similar, but not quite the same.

Previously, BramGruneir (Bram Gruneir) wrote…

LGTM


Reviewed 1 of 1 files at r1, 4 of 4 files at r2.
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


storage/replica.go, line 1424 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Doesn't seem worth the effort to optimize this handling of snapshots as they should be much rarer than normal raft processing. Should I remove the TODO?

Using a new batch for snapshot doesn't seem like a bad idea conceptually, though there's no immediate upshot if we don't use `Distinct()`. For the snapshot case, the go-side caching of mutations is a penalty because we write everything to Go and then force a flush down to C++, doubling the work, correct? I don't think the TODO is worth it at this point, but maybe it could be converted into a comment that explains why there's not much to be gained.

Snapshots are not frequent (normally) but they are pretty large. Maybe optimizing them is still useful at some point.


storage/engine/engine.go, line 92 [r2] (raw file):

// Reader is the read interface to an engine's data.
type Reader interface {
  // Close closes the reader, freeing up any outstanding resources.

The comment here is somewhat misleading. For a normal batch, Close() aborts the batch. For a Distinct() batch, it frees the snapshot but keeps the writes in the original batch.


Comments from Reviewable

@bdarnell
Copy link
Contributor

bdarnell commented Jun 3, 2016

:lgtm:

Previously, tschottdorf (Tobias Schottdorf) wrote…

:lgtm:. Do you see any chances to consolidate some of the interplay between Blind and DistinctSpans? They seem similar, but not quite the same.


Reviewed 4 of 4 files at r2.
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


storage/replica.go, line 1424 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Using a new batch for snapshot doesn't seem like a bad idea conceptually, though there's no immediate upshot if we don't use Distinct().
For the snapshot case, the go-side caching of mutations is a penalty because we write everything to Go and then force a flush down to C++, doubling the work, correct?
I don't think the TODO is worth it at this point, but maybe it could be converted into a comment that explains why there's not much to be gained.

Snapshots are not frequent (normally) but they are pretty large. Maybe optimizing them is still useful at some point.

Yeah, snapshots are large and (mostly) distinct, so there's more to gain by using a distinct batch with them, but they're rare enough that it's probably not worth the trouble and I'd remove the TODO.

Comments from Reviewable

A minor optimization that affects small transactions.

Fixes cockroachdb#6949.

name                       old time/op    new time/op    delta
Insert1_Cockroach-8           212µs ± 2%     204µs ± 1%  -3.55%        (p=0.000 n=10+10)
Update1_Cockroach-8           356µs ± 2%     343µs ± 1%  -3.49%         (p=0.000 n=10+9)
Delete1_Cockroach-8           390µs ± 2%     389µs ± 2%    ~           (p=0.529 n=10+10)
TrackChoices1_Cockroach-8     280µs ± 1%     269µs ± 1%  -3.61%        (p=0.000 n=10+10)
A distinct batch will now read previous writes made to the parent batch,
but won't read its own writes. Also make it illegal to use the parent
batch while the distinct batch is open.
@petermattis petermattis force-pushed the pmattis/handle-raft-ready branch from 07c227d to 4246eba Compare June 4, 2016 01:15
@petermattis
Copy link
Collaborator Author

Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


storage/replica.go, line 1424 [r1] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Yeah, snapshots are large and (mostly) distinct, so there's more to gain by using a distinct batch with them, but they're rare enough that it's probably not worth the trouble and I'd remove the TODO.

Removed the TODO.

Writing to Go and then flushing to C++ is not double the work of writing directly to C++ because the Go writes are much faster. I imagine we'll optimize applying snapshots at some point, though I suspect the initial optimization work will be on somehow streaming them instead of sending them in one request.


storage/engine/engine.go, line 92 [r2] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

The comment here is somewhat misleading. For a normal batch, Close() aborts the batch. For a Distinct() batch, it frees the snapshot but keeps the writes in the original batch.

Well, `Close()` does free up outstanding resources for the `Distinct()` batch. Add some additional commentary.

Comments from Reviewable

@petermattis
Copy link
Collaborator Author

As you say, Blind and DistinctSpans are similar but not quite the same. Blind indicates that the key has never been written before (and thus there is nothing to read from RocksDB). DistinctSpans indicates that all of the keys in a batch or distinct and thus operations in the batch do not need to read their own writes. I don't see any way to combine the two.

Previously, bdarnell (Ben Darnell) wrote…

:lgtm:


Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


Comments from Reviewable

@petermattis petermattis merged commit ce51673 into cockroachdb:master Jun 4, 2016
@petermattis petermattis deleted the pmattis/handle-raft-ready branch June 4, 2016 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants