This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

Merge chunks together on compaction #397

Closed

Conversation

csmarchbanks
Contributor

@csmarchbanks commented Sep 26, 2018

fixes: #236

This is a first pass at merging small chunks together into larger ones. Initial query benchmark results look promising as blocks become larger. The code currently takes the 120-samples-per-2-hours target used in the head and makes the chunk size proportionally larger as the block size increases.
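
For illustration, here is a rough sketch (not the exact code in this PR) of how a per-chunk sample budget could scale with the block's time range; the function name and constants are hypothetical:

// maxSamplesPerChunk scales the head's target of 120 samples per 2h block
// proportionally with the block's time range (both in milliseconds).
// Illustrative only; the PR's actual calculation may differ.
func maxSamplesPerChunk(blockRange int64) int {
	const (
		headBlockRange = int64(2 * 60 * 60 * 1000) // 2h in ms
		headMaxSamples = 120
	)
	if blockRange <= headBlockRange {
		return headMaxSamples
	}
	return int(int64(headMaxSamples) * blockRange / headBlockRange)
}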

benchcmp results between first and last commits:

benchmark                                                          old ns/op     new ns/op     delta
BenchmarkPersistedQueries/series=10,samplesPerSeries=1000-8        103098        106463        +3.26%
BenchmarkPersistedQueries/series=10,samplesPerSeries=10000-8       468642        259983        -44.52%
BenchmarkPersistedQueries/series=10,samplesPerSeries=100000-8      3987684       209855        -94.74%
BenchmarkPersistedQueries/series=100,samplesPerSeries=1000-8       863718        852374        -1.31%
BenchmarkPersistedQueries/series=100,samplesPerSeries=10000-8      4582881       2465040       -46.21%
BenchmarkPersistedQueries/series=100,samplesPerSeries=100000-8     47530956      1918334       -95.96%

benchmark                                                          old allocs     new allocs     delta
BenchmarkPersistedQueries/series=10,samplesPerSeries=1000-8        427            427            +0.00%
BenchmarkPersistedQueries/series=10,samplesPerSeries=10000-8       1977           1107           -44.01%
BenchmarkPersistedQueries/series=10,samplesPerSeries=100000-8      17267          877            -94.92%
BenchmarkPersistedQueries/series=100,samplesPerSeries=1000-8       3738           3738           +0.00%
BenchmarkPersistedQueries/series=100,samplesPerSeries=10000-8      19238          10538          -45.22%
BenchmarkPersistedQueries/series=100,samplesPerSeries=100000-8     172145         8238           -95.21%

benchmark                                                          old bytes     new bytes     delta
BenchmarkPersistedQueries/series=10,samplesPerSeries=1000-8        70499         70500         +0.00%
BenchmarkPersistedQueries/series=10,samplesPerSeries=10000-8       194376        123395        -36.52%
BenchmarkPersistedQueries/series=10,samplesPerSeries=100000-8      1317825       92106         -93.01%
BenchmarkPersistedQueries/series=100,samplesPerSeries=1000-8       341780        341780        +0.00%
BenchmarkPersistedQueries/series=100,samplesPerSeries=10000-8      1580558       870738        -44.91%
BenchmarkPersistedQueries/series=100,samplesPerSeries=100000-8     12815166      557849        -95.65%

Contributor

@simonpasquier left a comment

Thanks for working on this. I'm not that familiar with the tsdb code, but it looks OK to me.

compact.go Outdated
func mergeChunks(chks []chunks.Meta) []chunks.Meta {
	newChks := make([]chunks.Meta, 0, len(chks))
	for i := 0; i < len(chks); i++ {
		if i < len(chks)-1 && chks[i].Chunk.NumSamples()+chks[i+1].Chunk.NumSamples() <= 480 {
Contributor

This would be easier to read like this:

for i := 0; i < len(chks); i++ {
  if i >= len(chks) - 1 || chks[i].Chunk.NumSamples()+chks[i+1].Chunk.NumSamples() > 480 {
    newChks = append(newChks, chks[i])
    continue
  }
  newChunk := chunkenc.NewXORChunk()
  ...
}

compact.go Outdated
newChunk := chunkenc.NewXORChunk()
app, err := newChunk.Appender()
if err != nil {
	return chks
Contributor

Should we also log the error? The same applies to the other places where we return chks because of an error.
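
A possible shape for that, assuming mergeChunks can reach the compactor's logger (for example by becoming a method on LeveledCompactor) and using the go-kit level helpers already used elsewhere in this repo:

newChunk := chunkenc.NewXORChunk()
app, err := newChunk.Appender()
if err != nil {
	// Fall back to the original, unmerged chunks, but surface the error
	// instead of dropping it silently.
	level.Error(c.logger).Log("msg", "create appender for merged chunk", "err", err)
	return chks
}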

@krasi-georgiev
Contributor

Once we clear some earlier PRs I will focus on this one. Sorry for the delay.

@krasi-georgiev
Contributor

Please remove the [WIP] prefix when this is ready for review.

@csmarchbanks changed the title from "[WIP] Merge chunks together on compaction" to "Merge chunks together on compaction" on Oct 13, 2018
@csmarchbanks
Contributor Author

@krasi-georgiev This is ready for a review now.

Sorry it took a while; the last two weeks have been a bit hectic for me.

@csmarchbanks
Contributor Author

Also, it seems worthwhile to run prombench for this. Should I make a PR updating tsdb in Prometheus to do that, or is there a better way to run prombench against changes in tsdb?

@krasi-georgiev
Contributor

Yeah, for now we need a PR against Prometheus, but let me review it first and I will run the tests after.
It will probably be a week or two before I get to this since we have a few other PRs to finish before this one.

logger           log.Logger
ranges           []int64
chunkPool        chunkenc.Pool
mergeSmallChunks bool
Contributor

Why do you think we should make it configurable?
What reason would we have to have it disabled?

Contributor Author

Mostly because the issue suggested it should be an option. Also, I could imagine some queries getting slower when they look at small time ranges from long ago, since seeking within larger chunks takes longer. I am planning to do some additional benchmarking in that area soon.

Contributor

OK, ping me when it is ready for another review. Thanks.

Signed-off-by: Chris Marchbanks <[email protected]>
compact_test.go Outdated
{
	{
		lset:   map[string]string{"a": "b"},
		chunks: [][]sample{{{t: 0}, {t: 10}}, {{t: 11}, {t: 20}}},
Contributor

I am also thinking that we should test that the size of the merged chunks is as expected: a merged chunk does not exceed the maxSamples boundary and is not smaller than expected. This is to prevent bugs that create chunks that are too big, or chunks smaller than the original ones.
It is probably better to add this in a separate test, and maybe also move the merging logic tested here into that new dedicated test.

We should also check that the new chunks have the expected min/max times; see the sketch below.
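
A rough sketch of such a test, assuming the two-argument mergeChunks(chks, maxSamples) variant proposed in the next comment and the repo's testutil helpers; newTestChunk is a hypothetical helper that encodes the given samples into a chunks.Meta:

func TestMergeChunksBounds(t *testing.T) {
	const maxSamples = 480

	// Build a few small input chunks from test samples.
	chks := []chunks.Meta{
		newTestChunk([]sample{{t: 0}, {t: 10}}),
		newTestChunk([]sample{{t: 11}, {t: 20}}),
	}

	merged, err := mergeChunks(chks, maxSamples)
	testutil.Ok(t, err)

	// No merged chunk may exceed the sample budget.
	for _, c := range merged {
		testutil.Assert(t, c.Chunk.NumSamples() <= maxSamples, "merged chunk is too large")
	}
	// The overall time range must be preserved.
	testutil.Equals(t, chks[0].MinTime, merged[0].MinTime)
	testutil.Equals(t, chks[len(chks)-1].MaxTime, merged[len(merged)-1].MaxTime)
}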

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Oct 20, 2018

Four hours later 😅 and after trying a few different things, this is what I came up with:

// mergeChunks merges adjacent chunks so that each resulting chunk holds at
// most maxSamples samples. Chunks are only combined with their immediate
// successors, so the time ordering is preserved.
func mergeChunks(chks []chunks.Meta, maxSamples int) ([]chunks.Meta, error) {
	mergedChks := make([]chunks.Meta, 0, len(chks))

	// The range over the original slice only bounds the number of iterations;
	// chks itself is re-sliced below as chunks are consumed.
	for range chks {

		chkNew, totalAppended, err := appendChunks(chks, maxSamples)
		if err != nil {
			return nil, err
		}

		mergedChks = append(mergedChks, chkNew)

		// All existing chunks were merged so no need to continue.
		if len(chks) == totalAppended {
			break
		}
		// Cut the slice so that the next loop run includes
		// only chunks after the last merged chunk.
		chks = chks[totalAppended:]
	}

	return mergedChks, nil
}

// appendChunks re-encodes the leading chunks of chks into a single chunk,
// stopping before the result would exceed maxSamples. It returns the merged
// chunk and the number of input chunks that were consumed.
func appendChunks(chks []chunks.Meta, maxSamples int) (chunks.Meta, int, error) {
	if len(chks) == 1 {
		// A single remaining chunk is carried over unchanged; it still counts
		// as consumed so the caller makes progress.
		return chks[0], 1, nil
	}

	chunkNew := chunkenc.NewXORChunk()
	appNew, err := chunkNew.Appender()
	if err != nil {
		return chunks.Meta{}, 0, err
	}

	var totalAppended int
	for _, chkExisting := range chks {
		// Keep the total samples at or below maxSamples, but always consume
		// at least the first chunk.
		if totalAppended > 0 && chunkNew.NumSamples()+chkExisting.Chunk.NumSamples() > maxSamples {
			break
		}
		it := chkExisting.Chunk.Iterator()
		for it.Next() {
			appNew.Append(it.At())
		}
		if err := it.Err(); err != nil {
			return chunks.Meta{}, 0, err
		}
		totalAppended++
	}

	mergedChk := chunks.Meta{
		MinTime: chks[0].MinTime,
		MaxTime: chks[totalAppended-1].MaxTime,
		Chunk:   chunkNew,
	}
	return mergedChk, totalAppended, nil
}

Signed-off-by: Chris Marchbanks <[email protected]>
@csmarchbanks
Contributor Author

csmarchbanks commented Oct 21, 2018

I added a unit test for mergeChunks and decided to leave the compaction tests in place to supplement the dedicated unit test.

I am going to think a bit more about how to simplify mergeChunks; I am not entirely happy with either approach yet...

EDIT: I ended up doing something similar to your idea, but with a couple of shortcuts still in mergeChunks. Let me know what you think.

Signed-off-by: Chris Marchbanks <[email protected]>
@fabxc
Contributor

fabxc commented Oct 22, 2018

makes the chunk size proportionally larger as the block size increases.

That sounds wrong. The compression ratio nearly reaches its optimal average at about 60 samples, and there is virtually no further gain beyond 120 samples (Figure 6).
All we save is the indexing of those chunks, but index size has not been a major concern IIRC.

Do we have any data that would back up that bigger chunks would become more efficient?
If you keep making chunks bigger and bigger, that would effectively mean spending a ton of CPU on each compaction to re-compress virtually every series, no?
Queries will have to decompress day-long chunks just to access a small window of data within them. Thus a query over 1h and one over 4 weeks could take the same amount of time in unlucky cases (for example, at a 15s scrape interval a 1h window is only ~240 samples, yet it would still require decompressing the entire chunk that contains it).

Please correct me if I misunderstood something.
This seems like an expensive and invasive operation, and we would need some hard data on what improvement it brings. The added benchmark is fairly micro and does not reflect the big picture, I think. There are some impressive improvements for sure; it would be good to verify whether they come from samples actually being compressed in one long sequence or from a simple allocation problem that could just be optimized away.

@krasi-georgiev
Contributor

@csmarchbanks we can probably test all of this roughly with a prombench test. If you open a PR against Prometheus with these changes, I can start a test, leave it running, and look at query timings, CPU usage, etc.

@csmarchbanks
Contributor Author

@fabxc Your understanding is correct: compaction definitely uses more CPU, and I don't know whether it would make a real-world difference. I wanted to get this into a good enough state that it makes sense to test with prombench or a custom build of Prometheus in a more realistic scenario and verify whether there are any noticeable improvements. If not, I will close this PR, because the complexity and extra CPU on compaction are definitely not worth it.

I will look into some allocation optimizations as well. That sounds like an interesting path to pursue.

@krasi-georgiev I will let you know when I make a prometheus PR. Should be sometime today.

@csmarchbanks
Contributor Author

Looking at the benchmarks in prometheus/prometheus#4768, I don't think making bigger chunks is worthwhile. Query performance does not change much, and some compactions take much, much longer (40 minutes vs. 10 minutes).

An alternative could be to merge tiny chunks (say, fewer than 40 samples) into a neighboring chunk to improve compression a bit. However, in my Prometheus instance no more than 1% of chunks would be merged, so I doubt the extra complexity is worthwhile.

@csmarchbanks
Contributor Author

Not worth it
