Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: optimize point lookups on column families #30744

Merged
merged 1 commit into from
Oct 9, 2018

Conversation

solongordon
Copy link
Contributor

For tables with multiple column families, point lookups will now only
scan column families which contain the needed columns. Previously we
would scan the entire row. This optimization allows for faster lookups
and, perhaps more importantly, reduces contention between operations on
the same row but disjoint column families.

Fixes #18168

Release note: None

@solongordon solongordon requested a review from a team as a code owner September 27, 2018 21:18
@solongordon solongordon requested a review from a team September 27, 2018 21:18
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@solongordon
Copy link
Contributor Author

I'm going to add unit tests and additional logic tests, but the implementation is ready for a look if you're curious.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks great - very clean. The logic test failures are kind of weird. Why would this one have ended up looking at more spans than before?

testdata/select:578: SELECT message FROM [SHOW KV TRACE FOR SESSION]
	 WHERE message LIKE 'fetched:%' OR message LIKE 'output row%'
	expected:
	    fetched: /dt/primary/'2015-08-25 04:45:45.53453+00:00'/b -> '2015-08-25'
	    fetched: /dt/primary/'2015-08-25 04:45:45.53453+00:00'/c -> '2h45m2s234ms'
	    output row: ['2015-08-25 04:45:45.53453+00:00' '2015-08-25' '2h45m2s234ms']
	    
	but found (query options: "") :
	    fetched: /dt/primary/'2015-08-25 04:45:45.53453+00:00' -> NULL
	    fetched: /dt/primary/'2015-08-25 04:45:45.53453+00:00'/b -> '2015-08-25'
	    fetched: /dt/primary/'2015-08-25 04:45:45.53453+00:00'/c -> '2h45m2s234ms'
	    output row: ['2015-08-25 04:45:45.53453+00:00' '2015-08-25' '2h45m2s234ms']

Reviewed 7 of 7 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt_index_selection.go, line 687 at r1 (raw file):

	// least one non-nullable column in the needed column families, we can
	// potentially omit the primary family, since the primary keys are encoded
	// in all families.

Cool idea!

Copy link
Contributor Author

@solongordon solongordon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I actually removed that first line from the expected output because that's the result I was seeing locally (and it seemed reasonable). But looks like Teamcity is still seeing the old result. I'll check it out. Also will look into why zone config tests are unhappy.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@solongordon
Copy link
Contributor Author

Looks like this breaks the delete fast path, since that only deletes the spans from the delete node's underlying scan node. I'm looking into what the proper fix for that should be but open to suggestions.

[email protected]:52406/defaultdb> create table t (x int primary key, y int, z int, family (y), family (z));
CREATE TABLE

[email protected]:52406/defaultdb> insert into t values (1, 2, 3);
INSERT 1

[email protected]:52406/defaultdb> delete from t where x = 1;
DELETE 1

[email protected]:52406/defaultdb> select * from t;
  x |  y   | z
+---+------+---+
  1 | NULL | 3
(1 row)

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt_index_selection.go, line 603 at r1 (raw file):

// non-adjacent column families should be scanned.
func spansFromConstraintSpan(
	tableDesc *sqlbase.TableDescriptor,

It's better if this function takes an input Spans and appends to it, or we end up allocating a temporary Spans for each logical spans


pkg/sql/opt_index_selection.go, line 687 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Cool idea!

Except for composite datums :)

Copy link
Contributor Author

@solongordon solongordon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt_index_selection.go, line 603 at r1 (raw file):

Previously, RaduBerinde wrote…

It's better if this function takes an input Spans and appends to it, or we end up allocating a temporary Spans for each logical spans

Done


pkg/sql/opt_index_selection.go, line 687 at r1 (raw file):

Previously, RaduBerinde wrote…

Except for composite datums :)

Aha, good point. Adding that to the comment since I will surely forget about that exception.

Copy link
Contributor Author

@solongordon solongordon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think I fixed the deletion issues. I don't love the solution, but it's the least hacky approach I could find. I'm disabling this optimization if the spans might be used for the delete fast path. The annoying part was that this requires the scanNode to be aware that it is the source for a delete. Best way I could find to do this was to set a flag during plan expansion. I'm certainly open to other suggestions.

I still intend to add more testing.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor Author

@solongordon solongordon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt_index_selection.go, line 687 at r1 (raw file):

Previously, solongordon (Solon) wrote…

Aha, good point. Adding that to the comment since I will surely forget about that exception.

Done.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/expand_plan.go, line 126 at r2 (raw file):

		// If the source of the delete is a scan node (optionally with a render on
		// top), mark it as such. Note that this parallels the logic in
		// canDeleteFast.

To minimize this hacky-ness, I would make sure to add a corresponding comment to canDeleteFast. If that implementation changes and this one doesn't, what happens?

Also, I'm not sure how feasible this is, but can you have the deleteNode declare all columns as required? Or does this mess other things up?

Copy link
Contributor Author

@solongordon solongordon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/expand_plan.go, line 126 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

To minimize this hacky-ness, I would make sure to add a corresponding comment to canDeleteFast. If that implementation changes and this one doesn't, what happens?

Also, I'm not sure how feasible this is, but can you have the deleteNode declare all columns as required? Or does this mess other things up?

Yes, will do.

I fooled around with the required columns approach but didn't have much luck. From the perspective of the delete node it's already marking all columns as needed:

case *deleteNode:
setNeededColumns(n.source, allColumns(n.source))

but in many cases this is only a subset of the columns because there is a projection between it and the scan node. It's probably possible to make this work but it felt like I was heading down a hackier path than the scanNode flag approach.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Can you add a couple of EXPLAIN logic tests to ensure we don't regress?

@solongordon solongordon requested a review from a team October 9, 2018 14:21
PRIMARY KEY (a, b),
FAMILY (a, b),
FAMILY (c),
FAMILY (d)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tests. The main one I was thinking of that I don't see here is a test that two adjacent non-primary families get coalesced into a single span. I'm sure it works today, since similar logic triggers for the primary-adjacent families, but this seems like a case that might regress as code changes.

@@ -1336,3 +1336,105 @@ render · · (w) ·
│ spans /1-/10 · ·
└── scan · · (v, w) ·
· table t3@primary · ·

# ------------------------------------------------------------------------------
# These tests are for the point lookup optimization, which applies to SELECTs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well throw a quick explanation of the optimization in here.

pkg/sql/opt_index_selection.go Show resolved Hide resolved
Copy link
Contributor Author

@solongordon solongordon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt/exec/execbuilder/testdata/select_index, line 1341 at r4 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Might as well throw a quick explanation of the optimization in here.

Done.


pkg/sql/opt/exec/execbuilder/testdata/select_index, line 1353 at r4 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Nice tests. The main one I was thinking of that I don't see here is a test that two adjacent non-primary families get coalesced into a single span. I'm sure it works today, since similar logic triggers for the primary-adjacent families, but this seems like a case that might regress as code changes.

Good idea, done.

For tables with multiple column families, point lookups will now only
scan column families which contain the needed columns. Previously we
would scan the entire row. This optimization allows for faster lookups
and, perhaps more importantly, reduces contention between operations on
the same row but disjoint column families.

Fixes cockroachdb#18168

Release note: None
Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@solongordon
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Oct 9, 2018
30744: sql: optimize point lookups on column families r=solongordon a=solongordon

For tables with multiple column families, point lookups will now only
scan column families which contain the needed columns. Previously we
would scan the entire row. This optimization allows for faster lookups
and, perhaps more importantly, reduces contention between operations on
the same row but disjoint column families.

Fixes #18168

Release note: None

Co-authored-by: Solon Gordon <[email protected]>
@craig
Copy link
Contributor

craig bot commented Oct 9, 2018

Build succeeded

@craig craig bot merged commit ae97046 into cockroachdb:master Oct 9, 2018
@solongordon solongordon deleted the column-family-opt branch October 10, 2018 12:00
@knz
Copy link
Contributor

knz commented Oct 12, 2018

Not sure this may want a backport?

@solongordon
Copy link
Contributor Author

@nvanbenschoten and I discussed a backport and decided it was too risky for this late in the stability period.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants