Use upsert behavior for datapoints written to the mutable series buffer #876

robskillington · 2018-09-02T17:03:14Z

No description provided.

codecov · 2018-09-02T22:18:33Z

Codecov Report

Merging #876 into master will increase coverage by 0.04%.
The diff coverage is 91.9%.

@@            Coverage Diff             @@
##           master     #876      +/-   ##
==========================================
+ Coverage   78.67%   78.71%   +0.04%     
==========================================
  Files         396      397       +1     
  Lines       33533    33678     +145     
==========================================
+ Hits        26382    26511     +129     
- Misses       5354     5359       +5     
- Partials     1797     1808      +11

Flag	Coverage Δ
#dbnode	`81.45% <91.03%> (+0.01%)`	⬆️
#m3ninx	`71.93% <ø> (ø)`	⬆️
#query	`69.64% <100%> (+0.14%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2883443...11dfeb6.

richardartoul · 2018-09-04T15:10:03Z

Maybe it's too much of a pain, but how do you feel about making the conflict resolution strategy configurable? The main reasons being:

The existing first-write-wins strategy is useful in certain situations, e.g. for keeping track of whether or not something has occurred in a given period of time by truncating to a given number of hours and then always writing 1.0, à la Utilization Monitor. You could still do this with this change, but it would be a lot more resource intensive.
A deterministic strategy (e.g. the biggest value always wins) would be really useful when we're trying to do shadow comparisons of one m3db cluster against another, since even though they receive the same writes, they might receive them in slightly different orders.

robskillington · 2018-09-10T03:12:33Z

@richardartoul all feedback addressed. If you write the same value to the same timestamp multiple times, it will be a no-op and no new buffer is created, and you can control which datapoint is selected when different values for the same timestamp are being iterated over.
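To make the behavior described above concrete, here is a minimal sketch of the write path under simplifying assumptions. The `datapoint`, `encoder`, and `bucket` types and the `write` and `lastPushed` helpers are hypothetical stand-ins, not the actual `dbBufferBucket` code: an encoder only accepts in-order appends, a repeated value for the same timestamp is a no-op, and a different value for an existing timestamp lands in a new encoder.

```go
package main

// datapoint is a simplified, hypothetical stand-in for ts.Datapoint.
type datapoint struct {
	timestamp int64 // e.g. Unix nanoseconds
	value     float64
}

// encoder is a hypothetical append-only encoder that only accepts
// datapoints in increasing timestamp order.
type encoder struct {
	points []datapoint
}

// last returns the most recently encoded datapoint, if any.
func (e *encoder) last() (datapoint, bool) {
	if len(e.points) == 0 {
		return datapoint{}, false
	}
	return e.points[len(e.points)-1], true
}

// bucket is a hypothetical stand-in for one mutable buffer bucket.
type bucket struct {
	encoders []*encoder
}

// write applies upsert semantics: the same value at the same timestamp is a
// no-op, while a different value (or an out-of-order write) goes to the next
// encoder that can take it, creating a new encoder if none can.
func (b *bucket) write(dp datapoint) {
	for _, enc := range b.encoders {
		last, ok := enc.last()
		if !ok || last.timestamp < dp.timestamp {
			enc.points = append(enc.points, dp)
			return
		}
		if last.timestamp == dp.timestamp && last.value == dp.value {
			return // No-op since it matches the current value.
		}
	}
	b.encoders = append(b.encoders, &encoder{points: []datapoint{dp}})
}

// lastPushed returns the value for a timestamp, preferring encoders that
// were created later (a last-pushed selection across encoders).
func (b *bucket) lastPushed(timestamp int64) (float64, bool) {
	var (
		value float64
		found bool
	)
	for _, enc := range b.encoders {
		for _, dp := range enc.points {
			if dp.timestamp == timestamp {
				value, found = dp.value, true
			}
		}
	}
	return value, found
}
```

Writing `(10, 1)` twice leaves a single encoder in this sketch; writing `(10, 2)` afterwards adds a second encoder, and `lastPushed(10)` then reports `2`.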

richardartoul · 2018-09-10T19:54:17Z

src/dbnode/encoding/iterators.go

+
+	switch i.equalTimesStrategy {
+	case IterateHighestValue:
+		sort.Slice(i.earliest, func(a, b int) bool {


Is this gonna alloc?

Thankfully no, output of go build -gcflags "-m" . in this package:

./iterators.go:63:26: (*iterators).current func literal does not escape
./iterators.go:70:26: (*iterators).current func literal does not escape
./iterators.go:87:26: (*iterators).current func literal does not escape

richardartoul · 2018-09-10T20:29:59Z

src/dbnode/encoding/iterators_test.go

@@ -0,0 +1,156 @@
+// Copyright (c) 2017 Uber Technologies, Inc.


Ta, will fix.

richardartoul · 2018-09-10T20:30:32Z

src/dbnode/encoding/iterators_test.go

+
+package encoding
+
+import (


Did this file really not have any tests before?

Haha, it was covered by multi reader iterator tests before, since this file was split as a subtype of multi reader iterator when series iterator needed the same type of logic.

richardartoul · 2018-09-10T20:31:41Z

src/dbnode/encoding/iterators_types.go

+
+const (
+	// IterateLastPushed is useful for within a single replica, using the last
+	// immutable buffer that was created to decide which value to choose.


The iterators have to be given in the correct order though right? Maybe clarify?

Sure thing.

richardartoul · 2018-09-10T20:57:19Z

src/dbnode/encoding/iterators_types.go

+
+	// DefaultIterateEqualTimestampStrategy is the default iterate
+	// equal timestamp strategy.
+	DefaultIterateEqualTimestampStrategy = IterateEqualTimestampStrategy(0)


Can you do DefaultIterateEqualTimestampStrategy = IterateLastPushed

Sure thing.
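A small sketch of the suggestion above. The type and constant names are the ones quoted in this thread, but the underlying type and the order of the constants are assumptions made for illustration. Aliasing the default to a named constant keeps it correct even if the constants are later reordered, which a literal `0` would not.

```go
package main

// IterateEqualTimestampStrategy selects which datapoint wins when several
// iterators yield the same timestamp. The underlying type is an assumption.
type IterateEqualTimestampStrategy uint8

const (
	// IterateLastPushed picks the value from the iterator pushed last.
	IterateLastPushed IterateEqualTimestampStrategy = iota
	// IterateHighestValue picks the highest value.
	IterateHighestValue
	// IterateHighestFrequencyValue picks the most common value.
	IterateHighestFrequencyValue

	// DefaultIterateEqualTimestampStrategy names the default explicitly
	// instead of relying on the zero value.
	DefaultIterateEqualTimestampStrategy = IterateLastPushed
)
```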

richardartoul · 2018-09-10T21:00:00Z

src/dbnode/encoding/iterators_types.go

+	// Return a copy here so callers cannot mutate the known list.
+	result := make([]IterateEqualTimestampStrategy, 0,
+		len(validIterateEqualTimestampStrategies))
+	copy(result, validIterateEqualTimestampStrategies)


Is this broken? My understanding is that copy will copy min(len(dst), len(src)) elements, and len(dst) here is zero.

Ah yes, good call. I'll add a test for this too.
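The issue raised above can be reproduced in isolation. In this sketch the `strategy` type, the `validStrategies` slice, and the two helper functions are hypothetical; only the `make`/`copy` pattern comes from the quoted diff. The built-in `copy` transfers `min(len(dst), len(src))` elements, so a destination with capacity but zero length receives nothing.

```go
package main

// strategy is a hypothetical stand-in for IterateEqualTimestampStrategy.
type strategy uint8

// validStrategies is a hypothetical package-level list of known values.
var validStrategies = []strategy{0, 1, 2}

// brokenCopy mirrors the pattern from the diff: length 0, capacity 3.
// copy sees len(result) == 0 and therefore copies no elements.
func brokenCopy() []strategy {
	result := make([]strategy, 0, len(validStrategies))
	copy(result, validStrategies)
	return result
}

// fixedCopy allocates the destination with the full length so that copy
// transfers every element, while still returning an independent slice.
func fixedCopy() []strategy {
	result := make([]strategy, len(validStrategies))
	copy(result, validStrategies)
	return result
}
```

An equivalent fix keeps the zero-length allocation and uses `append(result, validStrategies...)` instead of `copy`.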

richardartoul · 2018-09-10T21:01:23Z

src/dbnode/encoding/iterators_types_test.go

+	yaml "gopkg.in/yaml.v2"
+)
+
+func TestValidIterateEqualTimestampStrategies(t *testing.T) {


If my previous comment is right, you might want to add a length test here

I added a values check, in addition to the ptr mismatch test.

richardartoul · 2018-09-10T21:06:39Z

src/dbnode/encoding/series_iterator.go

@@ -101,10 +97,17 @@ func (it *seriesIterator) Close() {
 		return
 	}
 	it.closed = true
-	it.id.Finalize()
-	it.nsID.Finalize()
+	if it.id != nil {


What changed that we suddenly needed these nil checks?

No new change, I'm just making sure we do the same nil check for all fields that are nil-able here.

richardartoul · 2018-09-10T21:07:04Z

src/dbnode/encoding/series_iterator.go

 	it.iters.reset()
-	it.iters.setFilter(startInclusive, endExclusive)
+	if !it.start.IsZero() && !it.end.IsZero() {
+		it.iters.setFilter(it.start, it.end)


Does this do the right thing if one of them is zero?

Yes, the only invalid case is where start is not zero but end is zero.

richardartoul · 2018-09-10T21:07:43Z

src/dbnode/encoding/types.go

+	NumEncoded() int
+
+	// LastEncoded returns the last encoded datapoint, useful for
+	// de-duplicating encoded values. If there no values encoded previously


If there are no previously encoded values...

Sure thing, ta.

richardartoul · 2018-09-10T21:08:40Z

src/dbnode/persist/fs/commitlog/commit_log_test.go

@@ -437,7 +437,7 @@ func TestReadCommitLogMissingMetadata(t *testing.T) {

 func TestCommitLogReaderIsNotReusable(t *testing.T) {
 	// Make sure we're not leaking goroutines
-	defer leaktest.CheckTimeout(t, time.Second)()
+	defer leaktest.CheckTimeout(t, 10*time.Second)()


Not sure if bumping this is necessary, you might just need to rebase master. Prateek upgraded to the latest version of leaktest which fixes this

Hm, ok sure thing.

richardartoul · 2018-09-10T21:09:55Z

src/dbnode/storage/bootstrap/bootstrapper/commitlog/source.go

@@ -859,6 +859,8 @@ func (s *commitLogSource) startM3TSZEncodingWorker(
 			wroteExisting  = false
 		)
 		for i := range unmergedBlock {
+			// TODO(r): Write unit test to ensure that different values that arrive


This is a good point, did you handle this anywhere? I see a TODO for a unit test, but I don't see the logic at all.

There is no test here, I think I'll open an issue to follow up on this. This seems out of scope of this change, it would need to be a followup I believe.

I added a link on the comment in the code here to the issue I created:
#898

richardartoul · 2018-09-10T21:18:54Z

src/dbnode/storage/series/buffer.go

-		b.encoders = append(b.encoders, next)
-		idx = len(b.encoders) - 1
+
+	// Upsert/last-write-wins semantics.


I was skimming through the encoder code and it's not clear to me why we need to create a new encoder for this scenario. Does it simplify the iteration logic? It seemed like the encoder could encode multiple different values for the same timestamp.

It could, it's just way harder and I don't really want to require encoders to deal with this case right now. We can revisit this later.

richardartoul · 2018-09-10T21:22:06Z

src/dbnode/storage/series/buffer.go

+
+	b.encoders[idx].lastWriteAt = datapoint.Timestamp
+
+	if b.empty {


Any reason to add this if statement? I assume it's just as fast, if not faster, to just unconditionally update it.

Yeah, it's kind of buggy and nasty to keep this tracking everywhere.

richardartoul

Left a few comments, but this is starting to look good. The only thing I have low confidence in is the iteration logic, because I'm not very familiar with it.

richardartoul · 2018-09-11T18:31:55Z

src/dbnode/storage/series/buffer.go

@@ -545,8 +543,22 @@ func (b *dbBufferBucket) finalize() {
 	b.resetBootstrapped()
 }

+func (b *dbBufferBucket) empty() bool {


You think this will be OK perf-wise? I assume that's why we had the bool to begin with.

Yeah this should be fine, we don't actually call it from any high frequency code paths. We used to have a high frequency call site for it, although it has since been refactored out.
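As a rough illustration of the trade-off discussed above, the following sketch derives emptiness on demand instead of maintaining a separate bool on every write. Everything here is hypothetical apart from the `NumEncoded` method name, which appears elsewhere in this review; the real `dbBufferBucket` fields are not shown in the thread.

```go
package main

// encoder is a hypothetical interface exposing only the NumEncoded method
// mentioned in this review.
type encoder interface {
	NumEncoded() int
}

// countingEncoder is a hypothetical encoder that tracks how many datapoints
// it has encoded.
type countingEncoder struct {
	n int
}

// NumEncoded returns the number of encoded datapoints.
func (e *countingEncoder) NumEncoded() int { return e.n }

// bucket is a hypothetical stand-in for the buffer bucket.
type bucket struct {
	encoders     []encoder
	bootstrapped int // hypothetical count of bootstrapped blocks
}

// empty computes emptiness from the current state. It costs a loop over the
// encoders, which is acceptable when it is not called on a hot path.
func (b *bucket) empty() bool {
	for _, enc := range b.encoders {
		if enc != nil && enc.NumEncoded() > 0 {
			return false
		}
	}
	return b.bootstrapped == 0
}
```

Because the result is derived from the encoders themselves, there is no flag that can drift out of sync with the data.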

richardartoul

LGTM if you're confident about the iterator changes.

richardartoul

LGTM

prateek · 2018-09-11T19:41:38Z

src/dbnode/encoding/series_iterator.go

-	tags ident.TagIterator,
-	startInclusive, endExclusive time.Time,
-	replicas []MultiReaderIterator,
+	opts SeriesIteratorOptions,


+1 for this

prateek · 2018-09-11T19:51:49Z

src/dbnode/encoding/null.go

@@ -45,6 +45,10 @@ func (e *nullEncoder) Encode(dp ts.Datapoint, timeUnit xtime.Unit, annotation ts
 func (e *nullEncoder) Stream() xio.SegmentReader {
 	return xio.NewSegmentReader(ts.Segment{})
 }
+func (e *nullEncoder) NumEncoded() int { return 0 }
+func (e *nullEncoder) LastEncoded() (ts.Datapoint, error) {
+	return ts.Datapoint{}, fmt.Errorf("not implemented")


nit: use a const errors.New() instead of the fmt.Errorf here
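A minimal sketch of what the nit asks for. Go has no constant error values, so the usual form is a package-level variable created once with `errors.New`. The `errNotImplemented` name and the simplified `datapoint` type are assumptions; `nullEncoder`, `NumEncoded`, and `LastEncoded` are taken from the quoted diff.

```go
package main

import "errors"

// errNotImplemented is allocated once and can be compared by identity.
// The name is hypothetical.
var errNotImplemented = errors.New("not implemented")

// datapoint is a simplified, hypothetical stand-in for ts.Datapoint.
type datapoint struct {
	timestamp int64
	value     float64
}

// nullEncoder is a no-op encoder.
type nullEncoder struct{}

// NumEncoded always reports zero for the null encoder.
func (e *nullEncoder) NumEncoded() int { return 0 }

// LastEncoded returns the shared sentinel error instead of building a new
// error value with fmt.Errorf on every call.
func (e *nullEncoder) LastEncoded() (datapoint, error) {
	return datapoint{}, errNotImplemented
}
```

Callers inside the package can then check for the error with `err == errNotImplemented` or `errors.Is(err, errNotImplemented)`.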

prateek · 2018-09-11T20:42:23Z

src/dbnode/storage/series/buffer.go

@@ -752,6 +802,8 @@ func (b *dbBufferBucket) merge() (mergeResult, error) {
 		}
 	}()

+	// Rank bootstrapped blocks as data that has appeared before data that


hm can this be violated during a topology change?

This is mainly a best effort change to be honest, as we know without keeping a timestamp next to each value of when it was written we can't guarantee selecting the last upserted value when reading values returned from multiple replicas.

It should be a good approximation however, which is what we need right now. When we are ready to begin solving it by storing metadata next to the values we can offer configuration for either strict or best effort upserts.

prateek · 2018-09-11T20:43:55Z

src/dbnode/storage/series/buffer.go

+				return err
+			}
+			if last.Value == value {
+				// No-op since matches the current value


prateek · 2018-09-11T20:45:45Z

src/dbnode/encoding/iterators.go

 }

 func (i *iterators) len() int {
 	return len(i.values)
 }

 func (i *iterators) current() (ts.Datapoint, xtime.Unit, ts.Annotation) {
-	return i.earliest.Current()
+	numIters := len(i.earliest)


could this function be implemented using a heap instead? wondering if we'd avoid sorting for each call to current() that way

We can look at this in a followup change. Currently none of the non-default strategies are likely to be called very often (only for subsets of queries/requests, etc., hence using the setter at a per series iterator granularity).

When we have a more complete implementation that can actually return you the last written value, say with metadata stored alongside the value, we should then look at optimizing this. This is mainly an ergonomic change useful for edge cases rather than something meant to be definitive and optimized.

prateek · 2018-09-11T20:46:27Z

src/dbnode/encoding/iterators.go

+			return currA.Value > currB.Value
+		})
+
+	case IterateHighestFrequencyValue:


lol this feels so magic-y. Do you have a use in mind for it?

As per the docs, it's basically for when you are using consistency: all and don't mind the read unavailability when not all replicas are available to be read from (say, exact match mode for testing correctness with shadow traffic, etc.):

// IterateHighestFrequencyValue is useful across replicas when you want to
// choose the most common appearing value, however you can only use this
// reliably if you wait to successfully fetch values from all replicas, i.e.
// you cannot use this reliably with quorum/majority replicas consistency,
// and only all consistency.
IterateHighestFrequencyValue
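To illustrate the idea behind this strategy, here is a hedged sketch that picks the most common value among the values returned for one timestamp. The `highestFrequencyValue` function is hypothetical and is not the iterator code from this change, and the tie-break by higher value is an assumption added so the result is deterministic.

```go
package main

// highestFrequencyValue returns the value that appears most often in values,
// which holds one value per replica for a single timestamp. Ties are broken
// by choosing the higher value (an assumption made for this sketch). The
// second result is false when values is empty.
func highestFrequencyValue(values []float64) (float64, bool) {
	if len(values) == 0 {
		return 0, false
	}
	var (
		best      float64
		bestCount int
	)
	// The number of replicas is small, so a quadratic scan avoids
	// allocating a map for the counts.
	for i, v := range values {
		count := 0
		for _, w := range values {
			if w == v {
				count++
			}
		}
		if i == 0 || count > bestCount || (count == bestCount && v > best) {
			best, bestCount = v, count
		}
	}
	return best, true
}
```

With all three replicas present, `[3, 1, 3]` clearly selects `3`. If only two replicas respond with `[1, 2]`, each value appears once and the outcome depends entirely on the tie-break, which is why the doc comment restricts reliable use to reads at all consistency.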

prateek · 2018-09-11T20:54:31Z

src/dbnode/encoding/iterators.go

-	filterStart time.Time
-	filterEnd   time.Time
+	values             []Iterator
+	earliest           []Iterator


Can I convince you to write a little blurb about the intent behind how earliest and values are used in the code? I think I follow based on the implementation of moveToValidNext() but it's a little convoluted the first time someone looks at it.

Sure thing, can do.

Use upsert behavior for datapoints written to the mutable series buffer

305aeef

Add series buffer respects upserts

852af64

Rob Skillington added 8 commits September 9, 2018 19:01

Add ability to set definitive value selection for duplicate timestamp…

8b384c6

…s, ensure writing same value doesn't cause new encoder to be created

Fix build errors

76a0613

Fix writing to an empty series buffer

aa2dbc4

Fix build errors and out of date series mock

d5b8e9b

Fix remaining build errors

71ead80

Fix integration test

d76b1b0

Remove unused timeZero

c3e8b89

Increase leaktest timeout for CI

4450f8e

richardartoul reviewed Sep 10, 2018

Rob Skillington added 2 commits September 11, 2018 10:44

Address feedback

8859cd0

Merge branch 'master' into r/upsert-datapoints

f0c73f8

Rob Skillington added 2 commits September 11, 2018 10:48

Revert increase of leak check timeout

3d156c7

Merge branch 'master' into r/upsert-datapoints

9bb9151

richardartoul force-pushed the master branch from 2883443 to 2cf04be Compare September 11, 2018 18:16

richardartoul reviewed Sep 11, 2018

richardartoul previously approved these changes Sep 11, 2018

robskillington dismissed richardartoul’s stale review via 11dfeb6 September 11, 2018 19:12

m3db deleted a comment from richardartoul Sep 11, 2018

Add lastWriteAt when appending encoders

11dfeb6

richardartoul approved these changes Sep 11, 2018

prateek reviewed Sep 11, 2018

src/dbnode/storage/series/buffer.go

				return err
			}
			if last.Value == value {
				// No-op since matches the current value

prateek · Sep 11, 2018

+1

prateek reviewed Sep 11, 2018

robskillington merged commit 0061da1 into master Sep 12, 2018

robskillington deleted the r/upsert-datapoints branch September 12, 2018 02:51
