Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compaction): Support Lmax to Lmax compaction #1615

Merged
merged 21 commits into from
Jan 4, 2021

Conversation

jarifibrahim
Copy link
Contributor

@jarifibrahim jarifibrahim commented Dec 8, 2020

This PR adds support for Lmax to Lmax compaction. These compactions are useful
when the last level of the LSM Tree contains stale data that might not get cleaned up
in the normal compaction process.
The Lmax Compactions can be enabled by setting opt.LmaxCompaction=true.

This change is Reviewable

Ibrahim Jarif added 2 commits December 16, 2020 19:46
Conflicts:
	db.go
	go.sum
	levels.go
	table/table.go
levels.go Show resolved Hide resolved
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: approved.

Reviewed 10 of 15 files at r1, 3 of 4 files at r3.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @jarifibrahim)


compaction.go, line 227 at r3 (raw file):

	// table. In case of the max level, we might rewrite a single table to
	// remove stale data.
	if cd.thisLevel != cd.nextLevel && !cd.nextRange.isEmpty() {

How do we deal with L0->L0?


db.go, line 975 at r3 (raw file):

		switch {
		case firstKeyHasDiscardSet && y.SameKey(lastKey, iter.Key()):
			b.AddStaleKey(iter.Key(), iter.Value(), vp.Len)

Add a note that we don't want to drop any of these keys. Just jot them down.

Maybe just remove this considering how frequently L0 is compacted.


levels.go, line 506 at r3 (raw file):

			}
			// Each ticker is 50ms so 50*200=10seconds.
			if id == 2 && count == 200 {

count >= 200


levels.go, line 781 at r3 (raw file):

			}
			switch {
			case firstKeyHasDiscardSet && y.SameKey(it.Key(), lastKey):

Do we still need y.SameKey? Adds extra weight to the critical path.


levels.go, line 1243 at r3 (raw file):

	sortedTables := make([]*table.Table, len(tables))
	copy(sortedTables, tables)
	s.sortByStaleDataSize(sortedTables, cd)

add a space after.


levels.go, line 1254 at r3 (raw file):

		for ; j < len(tables); j++ {
			// Move forward until we find the correct table.
			if tables[j].ID() == t.ID() {

Do binary search, use the left key for searching. And you can double check if the ID matches. It should.


levels.go, line 1279 at r3 (raw file):

			continue
		}
		if now.Sub(t.CreatedAt) < 10*time.Second {

I think this could be set to an hour.


stream_writer.go, line 380 at r3 (raw file):

	switch {
	case sameKey && w.firstKeyHasDiscard:
		w.builder.AddStaleKey(key, vs, vp.Len)

In Stream Writer, you can just avoid writing these to begin with. Don't write those stale keys.


util.go, line 57 at r3 (raw file):

			return errors.Errorf(
				"Inter: Biggest(j-1)[%d] \n%s\n vs Smallest(j)[%d]: \n%s\n: level=%d j=%d numTables=%d",
				s.tables[j-1].ID(), hex.Dump(s.tables[j-1].Biggest()), s.tables[j].ID(), hex.Dump(s.tables[j].Smallest()),

100 chars.

Copy link
Contributor Author

@jarifibrahim jarifibrahim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 7 of 13 files reviewed, 9 unresolved discussions (waiting on @manishrjain)


compaction.go, line 227 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

How do we deal with L0->L0?

The nextRange is empty in case of L0->L0


db.go, line 975 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Add a note that we don't want to drop any of these keys. Just jot them down.

Maybe just remove this considering how frequently L0 is compacted.

Done.


levels.go, line 506 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

count >= 200

Done.


levels.go, line 781 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Do we still need y.SameKey? Adds extra weight to the critical path.

We can skip it.


levels.go, line 1243 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

add a space after.

Done.


levels.go, line 1254 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Do binary search, use the left key for searching. And you can double check if the ID matches. It should.

Done.


levels.go, line 1279 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I think this could be set to an hour.

Done.


stream_writer.go, line 380 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

In Stream Writer, you can just avoid writing these to begin with. Don't write those stale keys.

Done.


util.go, line 57 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

100 chars.

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants