[DNM] rowexec, rowcontainer: add an invertedJoiner, for #48019
Conversation
This is exciting to see!
I didn't look through the inverted_joiner.go code in detail, since I think @jordanlewis or someone else from the execution team would have a better sense of whether it's doing the right thing.
The other code looks right to me, but I think it needs more comments.
We should definitely discuss how we can unify the inverted index constraints we're creating for JSON and Array types with the RPInvertedIndexExpr you've created here. @mgartner has been working on improving inverted index constraint generation in the optimizer, so we should probably all meet to discuss.
it is not clear to me if we currently use inverted indexes for JSON and array
We don't yet create plans for inverted index lookup joins in the optimizer (other than the geospatial lookup join that I just added). Currently we only create inverted index scans and zig zag joins. Let me know if you want any pointers to that code.
Reviewed 18 of 18 files at r1, 5 of 5 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, and @sumeerbhola)
pkg/sql/execinfrapb/processors_sql.proto, line 461 at r2 (raw file):
// The inverted index. The first column is the inverted column, and the
// remaining columns are the primary key.
optional uint32 index_idx = 2 [(gogoproto.nullable) = false];
Is this the ID of the index? If so, that's not really clear from the comment...
pkg/sql/execinfrapb/processors_sql.proto, line 494 at r2 (raw file):
// that produce false positives for user expressions, like geospatial
// indexes, only INNER and LEFT_OUTER are actually useful -- LEFT_SEMI will
// be mapped to INNER by the optimizer, and LEFT_ANTI to LEFT_OUTER, to
Is this a rule that we should create in the optimizer? Is it always more efficient in this case to map Semi to Inner and Anti to Left Outer?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 68 at r2 (raw file):
// "b":2}'::json or x ? 'g') and (x @> '{"a":1, "c":3}'::json or x ? 'h') can
// be refactored after the conversion to spans since union, intersection is
// precisely computable over spans.
What would this expression look like after conversion to spans and refactoring?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 75 at r2 (raw file):
type invertedExprEvaluator struct {
	expr RPInvertedIndexExpr
	// If expr[i] is an operand, the corresponding set elements are in
By set elements do you mean the output after performing the scan?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 236 at r2 (raw file):
// - selecting from an inverted index using an expression involving literals
//   and the indexed column. This will be a batch with a single element in
//   exprs.
Will this replace the way inverted index scans are currently executed? How will that work?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 268 at r2 (raw file):
}

func (b *batchedInvertedExprEvaluator) fragmentPendingSpans(fragmentUntil encInvertedVal) {
It would help to have a comment explaining what this function is doing (and why)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 314 at r2 (raw file):
}

func (b *batchedInvertedExprEvaluator) getSpans() []encInvertedValSpan {
Ditto -- need a comment
pkg/sql/rowexec/inverted_joiner.go, line 315 at r2 (raw file):
func (ij *invertedJoiner) generateSpan(enc encInvertedVal) (roachpb.Span, error) {
	// TODO: this may not work for JSON since it seems to use its own special
Take a look at func (c *indexConstraintCtx) makeInvertedIndexSpansForJSONExpr(...), and this call in cockroach/pkg/sql/span/span_builder.go (line 304 at 4b119a9):
keys, err := sqlbase.EncodeInvertedIndexTableKeys(val, key)
pkg/sql/rowexec/inverted_joiner.go, line 454 at r2 (raw file):
// nCols := len(ij.lookupCols)
// isJoinTypePartialJoin := ij.joinType == sqlbase.LeftSemiJoin || ij.joinType == sqlbase.LeftAntiJoin
Some dead code here...
❌ The GitHub CI (Cockroach) build has failed on a1e7fcb6.
TFTR! I've addressed a few of the comments and I agree it could use more code comments.
I've been reading some code to figure out how this would get integrated and how one could implement the RowToInvertedIndexExpr interface. One of the things I'd like to figure out in the meeting today with you and @mgartner is the following:
Consider an expression that only involves columns in the inverted index. Say
((x @> '{"a":1, "b":2}'::jsonb OR x <@ '{"e":5}'::jsonb) AND y < 12) OR (x <@ '{"c":3}'::jsonb AND y > 30) OR (x @> '{"d":4, "b":2}'::jsonb AND z = 'foo')
where x is the inverted column and y, z happen to also be in the inverted index since they are PK columns.
We'll get to the contexts in which this expression comes about later. We need to do the following for this expression:

1. Compute the spans to read from the inverted index. IIUC, the code in constraint.Span and constraint.Constraint and friends doesn't quite do the same thing, since with inverted indexes we don't want to intersect the spans corresponding to the inverted index columns in order to figure out what spans to read. We can potentially extract common sub-expressions for efficiency of evaluation (though the general problem is hard), but the spans for the geospatial columns need to be read by unioning. One could increase efficiency in some cases by extending the spans using y, z (analogous to Constraint.Combine).

2. Evaluate the expression. To do this, one needs to track the spans needed by each sub-expression so that the row read can be fed to all the sub-expressions that need it. That is, one cannot evaluate this row by row using the inverted index.
The code in this PR covers 1 and 2, but is slightly more specialized. It uses batchedInvertedExprEvaluator and invertedExprEvaluator for 1 under the assumption that the expression only involves x. And for 2, the expression is represented using RPInvertedIndexExpr and then evaluated using the previously mentioned evaluators. One could additionally handle an expression involving y and z by following the same approach as the invertedJoiner (the on_expr in the InvertedJoinerSpec) if it is a conjunction with an expression involving x. One could slightly generalize this code, but as far as I can tell not much -- the example expression above has sub-expressions involving x that are anded with sub-expressions involving y and z, and this conjunction structure is a generalization of the on_expr, which is anded with the whole expression. But I don't think we can handle something like x <@ '{"c":3}'::jsonb OR y > 30 using the inverted index.
There are two contexts in which the above expression can originate:

1. A single table expression with a WHERE clause: That expression could also involve columns not in the inverted index. I am assuming that the optimizer has changed the expression by removing the columns not in the inverted index when making the determination to use the inverted index -- the removed parts must correspond to conjunctions with the parts that can be evaluated using the inverted index, to ensure that what is returned by the inverted index does not omit some rows that belong in the final result.
   - Is there existing code in the optimizer that does such a transformation? I am assuming the same need must arise when using normal indexes.
   - The optimizer can use this transformed expression to compute the spans needed. This PR assumes the optimizer would represent this expression using RPInvertedIndexExpr and use the batchedInvertedExprEvaluator.
   - The RPInvertedIndexExpr is also the expression communicated to the execution machinery, which will instantiate another batchedInvertedExprEvaluator to do the evaluation.

2. A join between two tables using the inverted index of one table: Again, I am assuming that the join expression has been transformed by the optimizer to exclude columns that cannot be evaluated using the inverted index. In this case we have an expression involving the input columns and the inverted columns. Also say that the optimizer has separated the parts involving the inverted column and the remaining columns, as is represented using the InvertedJoinerSpec.{inverted_expr, on_expr}. Now the execution machinery, for each row in the input (I am ignoring the batching in the implementation since it is unimportant for this discussion):
   - Uses the read input row and the inverted_expr to compute an RPInvertedIndexExpr. This is a partial evaluation and I am not clear on what abstractions we currently have to do this. I see opt.Expr in the optimizer being used for generating constraints, and there is tree.Expr and tree.TypedExpr which can be evaluated to give a datum, but neither seems to do quite what we want. I can write something from scratch, but I'd like to reuse as much existing machinery as possible.
   - Once we have the RPInvertedIndexExpr, we have what we need to evaluate -- compute the spans using batchedInvertedExprEvaluator, read them, and evaluate.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/execinfrapb/processors_sql.proto, line 461 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Is this the ID of the index? If so, that's not really clear from the comment...
Yes. Rephrased.
pkg/sql/execinfrapb/processors_sql.proto, line 494 at r2 (raw file):
Is this a rule that we should create in the optimizer?
Yes, I was assuming this mapping would be done in the optimizer. The execution code just follows the spec.
Is it always more efficient in this case to map Semi to Inner and Anti to Left Outer?
Hmm, maybe you mean that the optimizer could choose not to use the inverted index instead of doing this mapping. If so, yes that is possible.
The code comment was meant to be just about correctness -- we need the key information to be able to join back with the primary index.
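To make the correctness point concrete, here is a toy sketch (hypothetical data and predicate, not the PR's code) of why an index that can produce false positives forces semi-join candidates to be joined back and re-checked, i.e. iterated inner-join style:

```go
package main

import "fmt"

// candidates stands in for the inverted index: it returns a superset of the
// true matches for an input row (an over-approximation). All data here is
// hypothetical.
func candidates(input int) []int {
	return map[int][]int{1: {10, 11}, 2: {12}}[input]
}

// check stands in for the exact predicate evaluated after the join-back.
func check(input, cand int) bool {
	return input == 1 && cand == 11 // 10 and 12 are false positives
}

// semiJoin emits each input row at most once, but only after an exact match
// is confirmed, which requires inner-join-style iteration over candidates.
func semiJoin(inputs []int) []int {
	var out []int
	for _, in := range inputs {
		for _, c := range candidates(in) {
			if check(in, c) {
				out = append(out, in)
				break
			}
		}
	}
	return out
}

func main() {
	fmt.Println(semiJoin([]int{1, 2})) // [1]: input 2 only had a false positive
}
```

Stopping at the first candidate match would wrongly emit input 2, whose only candidate is a false positive.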
pkg/sql/rowexec/inverted_expr_evaluator.go, line 68 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
What would this expression look like after conversion to spans and refactoring?
This comment is incorrect -- my thinking was foggy when I wrote it.
I don't have the concrete output but abstractly if one labels the inverted value spans as
A for "a":1
B for "b":2
C for "c":3
G for ? 'g'
H for ? 'h'
where all the above spans are non-overlapping (normally there could be overlap in the spans) this would become
((A \intersection B) \union G) \intersection ((A \intersection C) \union H)
we can't really factor out A. The geospatial case had special structure which easily permitted factoring.
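For intuition, once each labeled span has been scanned into a set of primary keys, the set expression above can be evaluated mechanically. A minimal sketch, with hypothetical scan results:

```go
package main

import "fmt"

// set holds the primary keys produced by scanning one labeled inverted value
// span: A, B, C for the @> conjuncts, G, H for the ? operators. The contents
// below are made up for illustration.
type set map[int]bool

func intersect(a, b set) set {
	out := set{}
	for k := range a {
		if b[k] {
			out[k] = true
		}
	}
	return out
}

func union(a, b set) set {
	out := set{}
	for k := range a {
		out[k] = true
	}
	for k := range b {
		out[k] = true
	}
	return out
}

// evalExample computes ((A ∩ B) ∪ G) ∩ ((A ∩ C) ∪ H).
func evalExample() set {
	A := set{1: true, 2: true, 3: true}
	B := set{1: true, 3: true}
	C := set{2: true, 3: true}
	G := set{4: true}
	H := set{3: true, 4: true}
	left := union(intersect(A, B), G)  // {1, 3, 4}
	right := union(intersect(A, C), H) // {2, 3, 4}
	return intersect(left, right)      // {3, 4}
}

func main() {
	fmt.Println(evalExample()) // map[3:true 4:true]
}
```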
pkg/sql/rowexec/inverted_expr_evaluator.go, line 75 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
By set elements do you mean the output after performing the scan?
Correct. Added a comment
pkg/sql/rowexec/inverted_expr_evaluator.go, line 236 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Will this replace the way inverted index scans are currently executed? How will that work?
AFAIK, we only use inverted indexes for zig-zag joins, which can only do a very limited form of intersection between two sets where each set corresponds to one value in the inverted column. This is based on incomplete code reading, and running EXPLAIN on a bunch of queries, so I could have overlooked something.
batchedInvertedExprEvaluator is not limited in this manner. However, the generality comes at a cost -- it cannot skip scanning certain index entries at runtime based on what other parts of the scan are returning, unlike a zig-zag join. If we were able to identify some sub-expressions as highly selective, one could add some skipping behavior (later), but it may get complicated.
pkg/sql/rowexec/inverted_joiner.go, line 454 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Some dead code here...
Done
I am curious why we do joins with the primary index for the following three queries -- the id is in the inverted index:
CREATE TABLE test.t3 (a JSON, id SERIAL NOT NULL, PRIMARY KEY (id), INVERTED INDEX idx (a));
root@:26257/defaultdb> EXPLAIN SELECT id FROM test.t3 WHERE a @> '{"a":1}'::jsonb AND a @> '{"b":2}'::json;
tree | field | description
------------------------+-----------------------+--------------------------------------------
| distributed | true
| vectorized | false
render | |
└── lookup-join | |
│ | table | t3@primary
│ | type | inner
│ | equality | (id) = (id)
│ | equality cols are key |
│ | parallel |
│ | pred | (@2 @> '{"a": 1}') AND (@2 @> '{"b": 2}')
└── zigzag-join | |
│ | type | inner
├── scan | |
│ | table | t3@idx
│ | fixedvals | 1 column
└── scan | |
| table | t3@idx
| fixedvals | 1 column
root@:26257/defaultdb> EXPLAIN SELECT id FROM test.t3 WHERE a @> '{"a":1}'::jsonb AND a @> '{"b":2}'::json AND id=25;
tree | field | description
------------------------+-----------------------+--------------------------------------------
| distributed | true
| vectorized | false
render | |
└── lookup-join | |
│ | table | t3@primary
│ | type | inner
│ | equality | (id) = (id)
│ | equality cols are key |
│ | parallel |
│ | pred | (@2 @> '{"a": 1}') AND (@2 @> '{"b": 2}')
└── zigzag-join | |
│ | type | inner
│ | pred | @1 = 25
├── scan | |
│ | table | t3@idx
│ | fixedvals | 1 column
└── scan | |
| table | t3@idx
| fixedvals | 1 column
root@:26257/defaultdb> EXPLAIN SELECT id FROM test.t3 WHERE a @> '{"a":1}'::jsonb;
tree | field | description
------------------+-------------+--------------------------
| distributed | false
| vectorized | false
render | |
└── index-join | |
│ | table | t3@primary
│ | key columns | id
└── scan | |
| table | t3@idx
| spans | /"a"/1-/"a"/1/PrefixEnd
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, @rytaft, and @sumeerbhola)
It's a known, long-time issue: #46765. Hopefully we can fix this release, though there are complications. The basic problem is that the index-join is added by an exploration rule, we don't do column pruning during exploration, and this means we can't tell that the index join is not needed. A generalized fix requires quite a bit of work. A specific fix for just this particular case would be easier.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, @rytaft, and @sumeerbhola)
I think we discussed most of the issues listed above in the meeting we just had, but just to summarize some of the things we discussed:
- For the complex filter expression you printed above, we agreed that even though y and z are primary key columns contained in the inverted index, we'd start by only generating spans based on the sub-expressions with x. The remaining filters would be applied after, with a Select expression wrapping the scan (or as ON conditions of a join).
- @mgartner will look into how feasible it is to extract an RPInvertedIndexExpr representing an arbitrary JSON expression, while also making the constraints usable by the optimizer for constructing a zig zag join.
- The inverted index constraints library should be callable from the optimizer to support constrained scans (using this new batchedInvertedExprEvaluator) as well as from the execution engine in order to support inverted index lookup joins.
Reviewed 4 of 4 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/execinfrapb/processors_sql.proto, line 494 at r2 (raw file):
Previously, sumeerbhola wrote…
Is this a rule that we should create in the optimizer?
Yes, I was assuming this mapping would be done in the optimizer. The execution code just follows the spec.
Is it always more efficient in this case to map Semi to Inner and Anti to Left Outer?
Hmm, maybe you mean that the optimizer could choose not to use the inverted index instead of doing this mapping. If so, yes that is possible.
The code comment was meant to be just about correctness -- we need the key information to be able to join back with the primary index.
I mean the optimizer could plan a semi join with the inverted index instead of an inner join. This could be more efficient because it only has to return the input rows that matched something in the inverted index. As long as there are no intersection operations (e.g., for coveredBy), we wouldn't need to continue scanning the inverted index as soon as a single match was found.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 236 at r2 (raw file):
AFAIK, we only use inverted indexes for zig-zag joins
We also use them for constrained scans. As we discussed today, this operator should basically replace all inverted index scans.
pkg/sql/execinfrapb/processors_sql.proto, line 494 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Just to follow up -- I realize this probably wouldn't ever be relevant for geospatial functions since we'll always have false positives, but it could be relevant for JSON or ARRAY inverted indexes.
The inverted index constraints library should be callable from the optimizer
My understanding is that the inverted index constraints would include the full expansion of Union/Intersection ops needed to satisfy a geo-spatial index seek (i.e. a single CoveredBy operator can require many geo-spatial index seeks). Are we saying the optimizer would perform that full expansion? Or are we saying we'd wait until execbuilder time to do that expansion?
Also, to see if I'm understanding how this would work, say we had the following JSON expression:
x @> '{"a": 1}' AND x @> '{"a": 10}'
Would the RPInvertedIndexExpr represent that using a RPSetOperator UNION expression with 2 input encInvertedValSpan expressions?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 22 at r3 (raw file):
type RowIndex = rowcontainer.RowIndex

// RPExprElement is an element in the Reverse Polish notation expression.
I'm confused as to why Reverse Polish notation is mentioned here. Are you encoding an expression tree in a binary buffer using Reverse Polish notation? If so, why mention that in your top-level interfaces? That'd just be a lower-level implementation detail. If you changed your lower-level encoding to use Polish notation instead, you wouldn't need to change the interfaces, right? I'd expect this comment to lay out a higher-level understanding of what this expression tree looks like:
- It contains inverted value expressions.
- It contains Union and Intersection operators that each take 2 operands.
- The expressions can be composed together arbitrarily.
Something of that sort, perhaps with an example, would give me a quick mental model of the higher-level abstraction you're envisioning, rather than directing focus to a lower-level implementation detail that's presumably just about maximizing efficiency. You might mention Reverse polish notation in the comment, but only as a post-script, to explain how the current implementation works.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 66 at r3 (raw file):
// overlapping spans, though the evaluator does not require it. Note that even
// seemingly cumbersome to refactor expressions like (x @> '{"a":1,
// "b":2}'::json or x ? 'g') and (x @> '{"a":1, "c":3}'::json or x ? 'h') can
What's the "?" operator here? Is that meant to indicate "any comparison" op like Eq, Gt, etc? If so, maybe use instead.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @otan, @rytaft, and @sumeerbhola)
pkg/geo/geoindex/config.proto, line 22 at r3 (raw file):
// At the moment, only one major indexing strategy is implemented (S2 cells).
message Config {
  option (gogoproto.equal) = true;
What I meant in the meeting - I think all of these changes should be kept in a separate PR.
pkg/sql/execinfrapb/processors_sql.proto, line 455 at r3 (raw file):
// InvertedJoinerSpec is the specification for a join between two columns
// where one has an inverted index.
You should include the "internal schema" of this join, the way the other processor specs have. The purpose of this schema is to show, for each processor, which columns are outputted from the processor, and in which order. It's very useful to be able to refer to this later.
I also think this would benefit from an example. This spec documentation is usually where I think to go to understand the purpose of a processor.
pkg/sql/rowcontainer/inverted_index_row_container.go, line 28 at r3 (raw file):
// InvertedIndexRowContainer is a container that dedups the added rows and
// assigns an index to each deduped row. NULLs are considered equal.
type InvertedIndexRowContainer interface {
I think you should make this row container work into a separate PR and assign to Yahor or Alfonso for review. Overall, I'm not entirely thrilled that we had to make another row container for this work as we've already got several implementations, but it looks like this isn't too complex so maybe it's fine.
Also, isn't this container an extension of the normal container interface? You should embed that interface into this one:
type NewType interface {
    EmbeddedType // this gives all of EmbeddedType's methods to NewType
    Foo()
    // ...
}
pkg/sql/rowexec/inverted_expr_evaluator.go, line 66 at r3 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
What's the "?" operator here? Is that meant to indicate "any comparison" op like Eq, Gt, etc? If so, maybe use instead.
? in Postgres means "contains key", maybe Sumeer meant that?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 256 at r3 (raw file):
func immediateSuccessor(a []byte) []byte { return append(a, '\x00') }
You can use roachpb.BytesNext for this, and I think it's more correct - don't you want to increment the last byte if possible?
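For context on the two notions of "next key", here is a self-contained sketch (not the actual roachpb code): appending a zero byte gives the immediate lexicographic successor of a key, while a PrefixEnd-style increment of the last byte gives the first key past every key sharing the prefix.

```go
package main

import (
	"bytes"
	"fmt"
)

// immediateSuccessor returns the key immediately after a in lexicographic
// order: appending a zero byte yields the smallest key greater than a.
func immediateSuccessor(a []byte) []byte {
	return append(append([]byte(nil), a...), '\x00')
}

// prefixEnd returns the first key that does not have a as a prefix, by
// incrementing the last non-0xff byte (a PrefixEnd-style operation). It
// returns nil if a is all 0xff bytes, since no such key exists then.
func prefixEnd(a []byte) []byte {
	out := append([]byte(nil), a...)
	for i := len(out) - 1; i >= 0; i-- {
		if out[i] != 0xff {
			out[i]++
			return out[:i+1]
		}
	}
	return nil
}

func main() {
	a := []byte("ab")
	fmt.Printf("%q\n", immediateSuccessor(a)) // "ab\x00": sorts right after "ab"
	fmt.Printf("%q\n", prefixEnd(a))          // "ac": past every key prefixed by "ab"
	// "ab\x00" still has "ab" as a prefix; "ac" does not.
	fmt.Println(bytes.HasPrefix(immediateSuccessor(a), a)) // true
	fmt.Println(bytes.HasPrefix(prefixEnd(a), a))          // false
}
```

Which of the two is wanted depends on whether the span end should include or exclude longer keys sharing the prefix.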
pkg/sql/rowexec/inverted_joiner.go, line 59 at r3 (raw file):
// constructed using the IndexDescriptor.
type RowToInvertedIndexExpr interface {
	Convert(sqlbase.EncDatumRow) (RPInvertedIndexExpr, error)
This needs a comment to pass the linter.
Convert converts rows into ...
pkg/sql/rowexec/inverted_joiner.go, line 62 at r3 (raw file):
}

type invertedJoiner struct {
I assume you're planning to add tests for this, but this processor would like to be tested in isolation.
pkg/sql/rowexec/inverted_joiner.go, line 353 at r3 (raw file):
//   a batchedInvertedExprEvaluator. Use that evaluator to generate spans
//   to read in the index.
// - Retrieve the index lookup results and place these rows in the
Could we implement this stage using the existing lookup join code somehow? I feel that it's not optimal that we're introducing another user of the low level fetcher logic.
pkg/sql/rowexec/inverted_joiner.go, line 517 at r3 (raw file):
if !seenMatch {
	switch ij.joinType {
So we have something called joinerBase that does a lot of this work - is it not usable here?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @otan, @rytaft, and @sumeerbhola)
pkg/sql/rowexec/inverted_joiner.go, line 77 at r3 (raw file):
// TODO(sumeer): onExpr could benefit from vectorized execution -- the rest
// of the code here doesn't seem amenable to column-by-column processing
// since it needs to process set operators on rows.
Have you already talked with Yahor/Alfonso on the SQL Execution team about this? In general, I think we're trying to move away from row-by-row. They might have concerns with adding even more row-wise execution operators that would have to be converted.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 66 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
? in Postgres means "contains key", maybe Sumeer meant that?
Another corner of Postgres that I didn't know...
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 32 at r3 (raw file):
// RPSetOperator is a set operator in the Reverse Polish notation expression.
type RPSetOperator int
The Intersection/Union/Value expressions feel similar to Constraint/Span/Datum expressions in the constraints library. How are these different? Is it the fact that these operations can be arbitrarily composed that you need? Or is it the efficiency aspect of encoding them in a byte buffer that you need?
Are there practical limits to how much composition we would want to allow? If I just slap together a bunch of Union/Intersect/Value expressions, I'd think perf would quickly degrade. My assumption is that there are a limited number of patterns that would actually provide real performance benefits. For standard indexes the pattern we care about is some conjunction (ConstraintSet) of span disjunctions (Constraint containing Spans). Is arbitrary composition of UNION/INTERSECTION really necessary to support for inverted indexes?
TFTR!
@mgartner will look into how feasible it is to extract a RPInvertedIndexExpr representing an arbitrary JSON expression, while also making the constraints usable by the optimizer for constructing a zig zag join.
We need two entry points to this library:
- given an expression involving one variable column, generate the RPInvertedIndexExpr.
- given an expression involving two variable columns, construct an expression generator. The expression generator will be fed values for one of the columns and will return an RPInvertedIndexExpr for each.
Having an interface for such a library would be a good first step. The implementation for the geospatial case will be much simpler since we expect all expressions will be single function calls.
x @> '{"a": 1}' AND x @> '{"a": 10}'
Would the RPInvertedIndexExpr represent that using a RPSetOperator UNION expression with 2 input encInvertedValSpan expressions?
Correct
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/geo/geoindex/config.proto, line 22 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
What I meant in the meeting - I think all of these changes should be kept in a separate PR.
Ah, didn't realize you were talking about the first commit -- it is from a different PR and to be ignored, as mentioned in this PR description. I have rebased now.
pkg/sql/execinfrapb/processors_sql.proto, line 494 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Just to follow up -- I realize this probably wouldn't ever be relevant for geospatial functions since we'll always have false positives, but it could be relevant for JSON or ARRAY inverted indexes.
It sounds like we are in agreement. And it seems to be captured in the code comment. Let me know if you think it could be rephrased to make it clearer.
pkg/sql/execinfrapb/processors_sql.proto, line 455 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
You should include the "internal schema" of this join, the way the other processor specs have. The purpose of this schema is to show, for each processor, which columns are outputted from the processor, and in which order. It's very useful to be able to refer to this later.
I also think this would benefit from an example. This spec documentation is usually where I think to go to understand the purpose of a processor.
Are the "internal columns" the output schema of the processor? I couldn't find the definition anywhere.
Under that assumption, I've added a comment and an example.
pkg/sql/rowcontainer/inverted_index_row_container.go, line 28 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
I think you should make this row container work into a separate PR and assign to Yahor or Alfonso for review. Overall, I'm not entirely thrilled that we had to make another row container for this work as we've already got several implementations, but it looks like this isn't too complex so maybe it's fine.
Also, isn't this container an extension of the normal container interface? You should embed that interface into this one:
type NewType interface {
    EmbeddedType // this gives all of EmbeddedType's methods to NewType
    Foo()
    ...
Should they look at it now? My plan was to separate out into 3 PRs after this passes initial sanity check -- it would be hard to explain why we need this container without the context.
Regarding embedding, I did notice that we are embedding for the other row container implementations, but the invertedJoiner does not call the wide swathe of embedded methods and I didn't want to unnecessarily expose them. I didn't see any other place that would call into them, e.g. mon.BytesMonitor seems to be called by the containers and not the other way around. Did I miss something?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 68 at r2 (raw file):
Previously, sumeerbhola wrote…
This comment is incorrect -- my thinking was foggy when I wrote it.
I don't have the concrete output but abstractly if one labels the inverted value spans as
A for "a":1
B for "b":2
C for "c":3
G for ? 'g'
H for ? 'h'
where all the above spans are non-overlapping (normally there could be overlap in the spans) this would become
((A \intersection B) \union G) \intersection ((A \intersection C) \union H)
we can't really factor out A. The geospatial case had special structure which easily permitted factoring.
I've updated the comment to drop the incorrect parts and moved the example earlier in the file.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 236 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
AFAIK, we only use inverted indexes for zig-zag joins
We also use them for constrained scans. As we discussed today, this operator should basically replace all inverted index scans.
Ack
pkg/sql/rowexec/inverted_expr_evaluator.go, line 268 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
It would help to have a comment explaining what this function is doing (and why)
I've added a comment with an example.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 314 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Ditto -- need a comment
Added a comment.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 22 at r3 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
I'm confused as to why Reverse Polish notation is mentioned here. Are you encoding an expression tree in a binary buffer using Reverse Polish notation? If so, why mention that in your top-level interfaces? That'd just be a lower-level implementation detail. If you changed your lower-level encoding to use Polish notation instead, you wouldn't need to change the interfaces, right? I'd expect this comment to lay out a higher-level understanding of what this expression tree looks like:
- It contains inverted value expressions.
- It contains Union and Intersection operators that each take 2 operands.
- The expressions can be composed together arbitrarily.
Something of that sort, perhaps with an example, would give me a quick mental model of the higher-level abstraction you're envisioning, rather than directing focus to a lower-level implementation detail that's presumably just about maximizing efficiency. You might mention Reverse polish notation in the comment, but only as a post-script, to explain how the current implementation works.
These files did not have much commentary regarding context, especially this one since it started out as supporting code for `invertedJoiner`. I've added more context and moved an example up from below. Hopefully that gives a higher-level picture to start with.
When the `RPExprElement` is introduced (after the high-level picture) I can't avoid saying reverse-polish since the only reason for the existence of this interface is to be able to represent the expression as `[]RPExprElement`. btw, I only used reverse-polish since it minimized the code for representation and evaluation.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 32 at r3 (raw file):
The Intersection/Union/Value expressions feel similar to Constraint/Span/Datum expressions in the constraints library. How are these different? Is it the fact that these operations can be arbitrarily composed that you need? Or is it the efficiency aspect of encoding them in a byte buffer that you need?
I think we are semantically doing a different thing here. My read of the constraint package (and I could have missed something) is that the abstractions there are for adjusting the (multi-column) spans one needs to fetch. Making up a very simple single column example, something like
([a, c) \union [b, f)) \intersection [e, g) = [a, f) \intersection [e, g) = [e, f)
The spans in the code in this file are spans of the inverted index -- they don't tell us what primary keys we will find when we read those entries. So one evaluates after the sets are populated -- the spans just represent what spans need to be read to populate the sets. What needs to be read from the index is the union of the spans.
There is some factoring of these expressions one can do in special cases -- the geospatial expression generation code can completely factor the expression to eliminate all overlapping spans in a single expression (though that does not save what needs to be scanned which is still the union of the spans, but it can make the post-scan evaluation faster).
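To contrast, the single-column span arithmetic in the example above can be sketched as follows (a toy over half-open string spans; not the actual constraint package API):

```go
package main

import "fmt"

// span is a half-open interval [start, end) over string keys.
type span struct{ start, end string }

// unionSpan merges two overlapping spans into one.
func unionSpan(a, b span) span {
	s, e := a.start, a.end
	if b.start < s {
		s = b.start
	}
	if b.end > e {
		e = b.end
	}
	return span{s, e}
}

// intersectSpan returns the overlap of two spans; ok is false if disjoint.
func intersectSpan(a, b span) (_ span, ok bool) {
	s, e := a.start, a.end
	if b.start > s {
		s = b.start
	}
	if b.end < e {
		e = b.end
	}
	if s >= e {
		return span{}, false
	}
	return span{s, e}, true
}

func main() {
	// ([a, c) \union [b, f)) \intersection [e, g) = [a, f) \intersection [e, g) = [e, f)
	u := unionSpan(span{"a", "c"}, span{"b", "f"})
	r, _ := intersectSpan(u, span{"e", "g"})
	fmt.Printf("[%s, %s)\n", r.start, r.end) // prints "[e, f)"
}
```

The key difference called out above: this arithmetic operates on the spans themselves, whereas the inverted-index expressions only use spans to decide what to read, and do the set operations on the retrieved primary keys afterwards.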
Are there practical limits to how much composition we would want to allow? If I just slap together a bunch of Union/Intersect/Value expressions, I'd think perf would quickly degrade. My assumption is that there are a limited number of patterns that would actually provide real performance benefits. For standard indexes the pattern we care about is some conjunction (ConstraintSet) of span disjunctions (Constraint containing Spans). Is arbitrary composition of UNION/INTERSECTION really necessary to support for inverted indexes?
I am not sure what aspect of performance "would quickly degrade" -- is there anything particular you are worried about?
For geospatial we do need a combination of union and intersection that looks like an arbitrary composition after factoring, as described in this example https://github.com/cockroachdb/cockroach/blob/master/pkg/geo/geoindex/index.go#L44-L58
The evaluation below is somewhat optimized by using integers for the set members (we've interned the PK rows in the row container).
But stepping back to the big picture in CockroachDB -- inverted indexes are not efficiently represented and one can't use efficient bit set representations (like roaring bitmaps) for computing set operations. But that isn't a fundamental limitation -- efficient posting list representation and operations are a well accepted way to efficiently search.
Also, as I mentioned in response to a different comment from Becca, there will be cases where a zig-zag join is better since it can dynamically seek one side based on what is discovered in the other side -- we'll rely on the optimizer to make a choice.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 66 at r3 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Another corner of Postgres that I didn't know...
Yes, it is the "Does the string exist as a top-level key within the JSON value?" operator mentioned here https://www.postgresql.org/docs/9.5/functions-json.html
I noticed this is also mentioned in #23905
pkg/sql/rowexec/inverted_expr_evaluator.go, line 256 at r3 (raw file):
don't you want to increment the last byte if possible?

Do you mean `PrefixSuccessor()` (https://github.com/google/re2/blob/master/util/strutil.cc#L78)? That is not an immediate successor so wouldn't work here -- we are trying to convert a span representing 1 key k into a well-formed tight span, so need [k, ImmediateSuccessor(k)).

I've switched to `roachpb.BytesNext` -- it is nice that it tries to avoid a slice allocation.
pkg/sql/rowexec/inverted_joiner.go, line 59 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
This needs a comment to pass the linter.
Convert converts rows into ...
Added. I've also added a TODO that this should only take the EncDatum as parameter (more about that in response to Andy's question about columnar/vectorized execution).
pkg/sql/rowexec/inverted_joiner.go, line 62 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
I assume you're planning to add tests for this, but this processor would like to be tested in isolation.
Ack.
pkg/sql/rowexec/inverted_joiner.go, line 77 at r3 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Have you already talked with Yahor/Alfonso on the SQL Execution team about this? In general, I think we're trying to move away from row-by-row. They might have concerns with adding even more row-wise execution operators that would have to be converted.
Good question -- I have not yet. And though in a past life I have worked with the internals of a vectorized execution engine over data stored in columnar format, I am not familiar with the details of how things are implemented in CockroachDB. My current understanding is:
- we have vectorized equivalents for many operators. Do we need both or is it ok to just have a vectorized implementation?
- it is easy to change the code here such that the input (left side) is a `coldata.Batch`. The relevant `columnVec` can be fed to `RowToInvertedIndexExpr.Convert`.
- our storage layer is not columnarized and we need to do set operations on the primary key rows retrieved from the inverted index (which is convenient with a row format). Unlike the row container I wrote here, I could use/write a column container that will dedup the retrieved keys before decoding, and then decode the keys and store them in a columnar format with a particular row index (assigned when deduping). The in-memory implementation would be straightforward. Looks like our on-disk implementation is using Apache Arrow (I am not familiar with the details of Arrow) -- we would need the ability to retrieve certain columns for a particular row index, for the OnExpr evaluation, and all the columns for all the indexes that do succeed in the join, for constructing the output batch.
pkg/sql/rowexec/inverted_joiner.go, line 353 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Could we implement this stage using the existing lookup join code somehow? I feel that it's not optimal that we're introducing another user of the low level fetcher logic.
Are you suggesting using parts of `joinReader`? I think it would significantly obscure the logic and not save more than a handful of lines of code -- it would also impose an unnecessary limitation on how `joinReader` could evolve because there would be yet another use of it. `joinReader` as currently written (and for good reason) assumes that it is doing equality relationships -- an example of the code there that is not relevant here is the routing code (in `joinReader.{keyToInputRowIndices, inputRowIdxToLookedUpRowIdx}`).
pkg/sql/rowexec/inverted_joiner.go, line 517 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
So we have something called `joinerBase` that does a lot of this work - is it not usable here?
I did try hard to use that first. But it assumes `eqCols` -- I found it really hard to be sure that it was not going to do something incorrect for the invertedJoiner case. This switch statement and the `render*()` logic below is straightforward.
❌ The GitHub CI (Cockroach) build has failed on 6eea2793. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 22 at r3 (raw file):
Previously, sumeerbhola wrote…
These files did not have much commentary regarding context, especially this one since it started out as supporting code for `invertedJoiner`. I've added more context and moved an example up from below. Hopefully that gives a higher-level picture to start with. When the `RPExprElement` is introduced (after the high-level picture) I can't avoid saying reverse-polish since the only reason for the existence of this interface is to be able to represent the expression as `[]RPExprElement`. btw, I only used reverse-polish since it minimized the code for representation and evaluation.
Hm. It's just that this representation is different from anything we've done, anywhere else in our codebase. I'm not seeing the advantage over just using a regular tree representation, which I think would be more understandable to others. I'd think the code for evaluating a tree would be very comparable as well, wouldn't it? I was thinking something like this:
```go
type InvertedConstraint interface {
	// I actually prefer a `Type` method that returns an enumeration, rather than a marker method.
	// Sometimes it's easier to switch on an enum, sometimes easier to do type switch.
	invertedConstraintMarker()
}

type InvertedConstraintUnion struct {
	Left  InvertedConstraint
	Right InvertedConstraint
}

type InvertedConstraintIntersection struct {
	Left  InvertedConstraint
	Right InvertedConstraint
}

type InvertedConstraintSpan struct {
	Start InvertedConstraintValue
	End   InvertedConstraintValue
}
```
Is your worry that this representation would use 32 bytes for the set ops rather than 8, as with your representation? I would think this difference in memory usage means nothing, as these constraints will be a fixed overhead cost of no more than a few thousand bytes in usual cases, since this is only allocated once per plan.
If we want to use this representation in the optimizer, we'd want to pull it into its own package, with no accompanying evaluation code, since that creates unnecessary coupling with a particular evaluation engine. We'd also need to discuss the Value representation. The optimizer exclusively uses `Datum`, so we need to reconcile that with the raw encoded bytes that you're using here.
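For what it's worth, the reverse-polish side of this trade-off is also small: a stack evaluator over a `[]RPExprElement`-style slice is only a few lines. A sketch with made-up element types (not the actual interface):

```go
package main

import "fmt"

// rpElem is one postfix element: a leaf set, or a binary set operator.
type rpElem struct {
	op   byte         // 0 for a leaf, 'u' for union, 'i' for intersection
	leaf map[int]bool // set of interned row indices (leaves only)
}

// evalRP evaluates a postfix expression using an explicit operand stack:
// leaves are pushed, operators pop two sets and push the result.
func evalRP(expr []rpElem) map[int]bool {
	var stack []map[int]bool
	for _, e := range expr {
		if e.op == 0 {
			stack = append(stack, e.leaf)
			continue
		}
		r, l := stack[len(stack)-1], stack[len(stack)-2]
		stack = stack[:len(stack)-2]
		out := map[int]bool{}
		for k := range l {
			if e.op == 'u' || r[k] {
				out[k] = true
			}
		}
		if e.op == 'u' {
			for k := range r {
				out[k] = true
			}
		}
		stack = append(stack, out)
	}
	return stack[0]
}

func main() {
	// (A \intersection B) \union C in postfix: A B i C u
	A := map[int]bool{1: true, 2: true}
	B := map[int]bool{2: true, 3: true}
	C := map[int]bool{4: true}
	res := evalRP([]rpElem{{leaf: A}, {leaf: B}, {op: 'i'}, {leaf: C}, {op: 'u'}})
	fmt.Println(len(res), res[2], res[4]) // prints "2 true true"
}
```

A tree walker would be comparably short, so the choice really does come down to which representation the rest of the code (optimizer vs execution) finds more natural.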
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 32 at r3 (raw file):
Previously, sumeerbhola wrote…
The Intersection/Union/Value expressions feel similar to Constraint/Span/Datum expressions in the constraints library. How are these different? Is it the fact that these operations can be arbitrarily composed that you need? Or is it the efficiency aspect of encoding them in a byte buffer that you need?
I think we are semantically doing a different thing here. My read of the constraint package (and I could have missed something) is that the abstractions there are for adjusting the (multi-column) spans one needs to fetch. Making up a very simple single column example, something like
([a, c) \union [b, f)) \intersection [e, g) = [a, f) \intersection [e, g) = [e, f)
The spans in the code in this file are spans of the inverted index -- they don't tell us what primary keys we will find when we read those entries. So one evaluates after the sets are populated -- the spans just represent what spans need to be read to populate the sets. What needs to be read from the index is the union of the spans.
There is some factoring of these expressions one can do in special cases -- the geospatial expression generation code can completely factor the expression to eliminate all overlapping spans in a single expression (though that does not save what needs to be scanned which is still the union of the spans, but it can make the post-scan evaluation faster).
Are there practical limits to how much composition we would want to allow? If I just slap together a bunch of Union/Intersect/Value expressions, I'd think perf would quickly degrade. My assumption is that there are a limited number of patterns that would actually provide real performance benefits. For standard indexes the pattern we care about is some conjunction (ConstraintSet) of span disjunctions (Constraint containing Spans). Is arbitrary composition of UNION/INTERSECTION really necessary to support for inverted indexes?
I am not sure what aspect of performance "would quickly degrade" -- is there anything particular you are worried about?
For geospatial we do need a combination of union and intersection that looks like an arbitrary composition after factoring, as described in this example https://github.com/cockroachdb/cockroach/blob/master/pkg/geo/geoindex/index.go#L44-L58
The evaluation below is somewhat optimized by using integers for the set members (we've interned the PK rows in the row container). But stepping back to the big picture in CockroachDB -- inverted indexes are not efficiently represented and one can't use efficient bit set representations (like roaring bitmaps) for computing set operations. But that isn't a fundamental limitation -- efficient posting list representation and operations are a well accepted way to efficiently search.
Also, as I mentioned in response to a different comment from Becca, there will be cases where a zig-zag join is better since it can dynamically seek one side based on what is discovered in the other side -- we'll rely on the optimizer to make a choice.
I think our existing Constraint library could (theoretically) be generalized to support arbitrary composition of UNION/INTERSECTION operations. For example:
CREATE TABLE abcd (a INT PRIMARY KEY, b INT, c INT, d INT);
SELECT a FROM abcd WHERE b = 1 OR (c = 2 AND d = 3)
Today, that can't be represented in the Constraint library, because it can't represent a disjunction of conjunctions. However, using generalized composition like you're building, this could be represented in a constraint that would be the equivalent of:
(SELECT a FROM abcd WHERE b=1) UNION ((SELECT a FROM abcd WHERE c=2) INTERSECTION (SELECT a FROM abcd WHERE d=3))
Anyway, though, I'm just trying to understand why we need to generalize. I think the answer you're giving is that we need to have arbitrary composition in order to merge together the results of multiple seeks at different quad-tree levels. And if we have that, we might as well also have the optimizer encode AND/OR operations into this generalized constraint representation (at least for inverted index operations).
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, @rytaft, and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 22 at r3 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Hm. It's just that this representation is different from anything we've done, anywhere else in our codebase. I'm not seeing the advantage over just using a regular tree representation, which I think would be more understandable to others. I'd think the code for evaluating a tree would be very comparable as well, wouldn't it? I was thinking something like this:
```go
type InvertedConstraint interface {
	// I actually prefer a `Type` method that returns an enumeration, rather than a marker method.
	// Sometimes it's easier to switch on an enum, sometimes easier to do type switch.
	invertedConstraintMarker()
}

type InvertedConstraintUnion struct {
	Left  InvertedConstraint
	Right InvertedConstraint
}

type InvertedConstraintIntersection struct {
	Left  InvertedConstraint
	Right InvertedConstraint
}

type InvertedConstraintSpan struct {
	Start InvertedConstraintValue
	End   InvertedConstraintValue
}
```
Is your worry that this representation would use 16 bytes for the set ops rather than 8, as with your representation? I would think this difference in memory usage means nothing, as these constraints will be a fixed overhead cost of no more than a few hundred bytes in usual cases.
If we want to use this representation in the optimizer, we'd want to pull it into its own package, with no accompanying evaluation code, since that creates unnecessary coupling with a particular evaluation engine. We'd also need to discuss the Value representation. The optimizer exclusively uses `Datum`, so we need to reconcile that with the raw encoded bytes that you're using here.
Correction: should be 32 bytes for the set ops, since their children are interfaces, not pointers.
Are we saying the optimizer would perform that full expansion? Or are we saying we'd wait until execbuilder time to do that expansion?
For geospatial, I think the optimizer would perform the expansion into a `RPInvertedIndexExpr` (or something equivalent) in the case of a constrained scan, which corresponds to a function such as `ST_Covers` with one variable input and one constant input. The optimizer can perform expansion on the constant input for the purpose of stats estimation and costing (so this would happen before execbuilder time). But for joins, the optimizer won't do any expansion -- it can't without having access to the values in each row.
The comment I made above was specifically related to the work @mgartner is doing, though. If we want to use this operator for JSON inverted index scans and lookup joins, the inverted constraints library would need to continue to support index selection in the optimizer, as well as be callable (or provide some generator function like Sumeer described) to construct a set expression given the value of an input row during execution.
For example, if we want to support a join on an expression like `a @> b`, we can't construct a constraint without having the value of b. Once the value of b is available, we could determine an appropriate constraint on the inverted index of a.
x @> '{"a": 1}' AND x @> '{"a": 10}'
Would the RPInvertedIndexExpr represent that using a RPSetOperator UNION expression with 2 input encInvertedValSpan expressions?
Shouldn't this be INTERSECTION? Also, shouldn't we be able to tell that this is a contradiction since AFAIK JSON only supports one instance of each key?
Reviewed 15 of 15 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, and @sumeerbhola)
pkg/sql/execinfrapb/processors_sql.proto, line 494 at r2 (raw file):
Previously, sumeerbhola wrote…
It sounds like we are in agreement. And it seems to be captured in the code comment. Let me know if you think it could be rephrased to make it clearer.
Yep, sorry. Just took me some time to understand but now it makes sense. This is something we'll need to make sure to add tests for in the optimizer. I'll open an issue.
pkg/sql/execinfrapb/processors_sql.proto, line 476 at r4 (raw file):
```
// where e' is derived from e. For instance if e is an array, e' will
// correspond to elements of the array.
// The OnExpr can use columns a, c, since they are the other columns that
```
a, b and c, right?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 37 at r4 (raw file):
```
// (x @> '{"a":1, "b":2}'::json or x ? 'g') and (x @> '{"a":1, "c":3}'::json or x ? 'h')
// where x is a JSON column that has been indexed, @> is the containment operator and
// the ? operator checks existence as a top-level key in the JSON value.
```
[nit] existence as -> existence of
pkg/sql/rowexec/inverted_expr_evaluator.go, line 110 at r4 (raw file):
```
}

// Spans are not in sorted order and can be overlapping.
```
This comment should give more info about what the function is actually doing
pkg/sql/rowexec/inverted_expr_evaluator.go, line 121 at r4 (raw file):
```go
	}
}

ev.operandSets = make([]setContainer, len(ev.expr))
```
This seems like a side effect that should be more clearly called out in the function comment.
pkg/sql/rowexec/inverted_expr_evaluator.go, line 238 at r4 (raw file):
```go
type exprAndOperand struct {
	exprNum      int
	operandIndex int
```
Would help to give a bit more context about what each of these data members are -- what are they indexing into?
pkg/sql/rowexec/inverted_expr_evaluator.go, line 274 at r4 (raw file):
```
// - Joins across two tables using the inverted index of one table.
//
// The batched evaluator can be reused by calling Reset(). In the build phase,
```
Reset -> reset
pkg/sql/rowexec/inverted_expr_evaluator.go, line 312 at r4 (raw file):
```
//
// Example 1:
// pendingSpans contains
```
Nice examples!
pkg/sql/rowexec/inverted_expr_evaluator.go, line 323 at r4 (raw file):
// c-e // // For the c-e span, all the exprAndOperand slices for these spans are unioned
I'm having a hard time understanding this paragraph. Maybe this just needs some additional explanation about how all these different data structures fit together. (Or maybe that explanation should go in the comment for `batchedInvertedExprEvaluator`.)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 362 at r4 (raw file):
```go
	}
} else {
	// We can remove all spans whose ened key is the same as span[0].
```
ened -> end
pkg/sql/rowexec/inverted_joiner.go, line 275 at r4 (raw file):
```go
// Initialize memory monitors and row container for looked up rows.
ctx := flowCtx.EvalCtx.Ctx()
if false {
```
Maybe add a TODO here explaining why it's currently false?
Shouldn't this be INTERSECTION?
my mistake in previous response. Yes, it is intersection
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, and @sumeerbhola)
Right, yes, INTERSECTION.
Regarding that other expression, I experimented and you're right. I thought it would return true when `"a": [1,10]`. But that doesn't seem to be the case. It means we have a bug with the NormalizeJSONFieldAccess rule, ouch:
```
root@:26257/defaultdb> SELECT j->'a' @> '"foo"' AND j->'a' @> '"baz"' FROM (VALUES ('{"a": ["foo", "bar", "baz"]}'::jsonb)) v(j);
  ?column?
------------
    true
(1 row)

Time: 2.157ms

root@:26257/defaultdb> SELECT j @> '{"a": "foo"}' AND j @> '{"a": "baz"}' FROM (VALUES ('{"a": ["foo", "bar", "baz"]}'::jsonb)) v(j);
  ?column?
------------
   false
```
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, and @sumeerbhola)
Actually, scratch that last bit. I misread the rule. The input expression has to use `=`, not `@>`. Phew.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @asubiotto, @jordanlewis, @otan, and @sumeerbhola)
I'll remove the parts about costing for the join case and leave that as a possibility in code comments. You are right that not having even the parameterized `InvertedExpression` for the geospatial case breaks the suggestion I had made.
Regarding this infix notation versus the reverse-polish/postfix notation in the geospatial code:
- when parsing a user's infix expression this builder interface that builds a tree structure seems a more natural fit.
- I am assuming that the InvertedJoinerSpec.InvertedExpr string will use infix. Some machinery needs to parse that string and then do partial application. Would reverse polish be more convenient there?
- I can easily change the geospatial code to return infix.
Thoughts?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @jordanlewis, @otan, and @sumeerbhola)
Sounds good.
We already have parsing machinery to convert expression strings to tree expressions represented as go structs, so that would probably be easiest. Pretty much all our existing code in the optimizer is built to work with trees, and if we want to re-use any of the code in the optimizer or in ... So trees would be my preference, but if you think that performance would suffer I'm sure we could make reverse polish work...
over an inverted index and the corresponding expression builder functions This is for use by the optimizer to build expressions to be evaluated for two cases (a) using the inverted index for a single table query, (b) using the inverted index of one table when joining two tables. Extracted from cockroachdb#48019 Release note: None
48453: execinfrapb: add InvertedJoinerSpec for joins between r=sumeerbhola a=sumeerbhola two tables using an inverted index on one table This is extracted from #48019 which has more context, including the implementation. Release note: None Co-authored-by: sumeerbhola <[email protected]>
This is for evaluating invertedexpr.SpanExpressions, both for join queries and single table filtering, using an inverted index. This PR depends on cockroachdb#48466 which is the first commit. More usage context can be found in cockroachdb#48019 which contains an invertedJoiner that uses an earlier version of this evaluator. I am looking for a review of the structure before adding tests. Release note: None
48466: invertedexpr: definition of an expression for evaluation r=sumeerbhola a=sumeerbhola over an inverted index and the corresponding expression builder functions This is for use by the optimizer to build expressions to be evaluated for two cases (a) using the inverted index for a single table query, (b) using the inverted index of one table when joining two tables. Extracted from #48019 Release note: None 49085: geo/geomfn: implement DWithin for geometry types r=sumeerbhola a=otan Moved the core of the logic for distance calculation operations to the `geodist` package, with required methods interface-d into the package for geometry/geography calculations. The code is mostly the same as it was in geogfn/distance.go, just the spheroid/sphere calculations moved out. The geometry based operations are contained in geomfn/distance.go. All math is annotated with links as well. Changed MinDistance to use the self-made implementation. The tests still call GEOS to make sure the return values are similar. Needed to bring some vendored packages in from go-geom for this to work. Resolves #48923. Release note (sql change): Implemented DWithin for Geometry types. Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: Oliver Tan <[email protected]>
This is for evaluating invertedexpr.SpanExpressionProto, both for join queries and single table filtering, using an inverted index. This PR depends on cockroachdb#49217 which is the first commit. More usage context can be found in cockroachdb#48019 which contains an invertedJoiner that uses an earlier version of this evaluator. Release note: None
This is for evaluating invertedexpr.SpanExpressionProto, both for join queries and single table filtering, using an inverted index. More usage context can be found in cockroachdb#48019 which contains an invertedJoiner that uses an earlier version of this evaluator. Release note: None
48720: rowexec: add batchedInvertedExprEvaluator for inverted index execution r=sumeerbhola a=sumeerbhola This is for evaluating invertedexpr.SpanExpressions, both for join queries and single table filtering, using an inverted index. This PR depends on #48466 which is the first commit. More usage context can be found in #48019 which contains an invertedJoiner that uses an earlier version of this evaluator. I am looking for a review of the structure before adding tests. Release note: None 49247: geo/geo*fn: handle EMPTY cases in GEOMETRY/GEOGRAPHY r=sumeerbhola a=otan There are many cases of EMPTY that we need to handle for each method we support. This includes `POINT EMPTY`, `LINESTRING EMPTY` ... `GEOMETRYCOLLECTION EMPTY` (i.e. one for each shape). Furthermore, there's an even more confusing GEOMETRYCOLLECTION with EMPTY shapes, e.g. `GEOMETRYCOLLECTION ( LINESTRING EMPTY ) ` Both GEOS and PostGIS have various bugs in handling these, which have been commented in code and imitated where appropriate. Also bump twpayne/go-geom to pickup the fixed Empty() checks. Resolves #49182, Release note: None 49558: sql: add the max/min aggregations on ENUMs r=rohany a=rohany Fixes #48370. Release note (sql change): Add the max/min aggregation on ENUM types. Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Rohan Yadav <[email protected]>
joining two tables where one has an inverted index
Ignore the first commit, which is from #47424
There are currently no tests for this code. There are also some
questions listed as bare todos ("TODO:") in the code, relating to
descriptors and using the encoded inverted column to construct
spans to retrieve from the index. I'd like to get a sanity check
on the approach, and get answers to those questions before
proceeding to add tests.
The join specification consists of two expressions:
- a lookup expression over a pair of columns: the inverted column,
  and the column in the input that will be used for lookup. For
  geospatial these would be either both geometry columns or both
  geography columns.
- an optional filtering expression involving columns from the
  two sides.

The join is a conjunction of both expressions.

Each input row is converted to produce a reverse polish set
expression involving spans of the inverted column. For geospatial,
this will be implemented by the functionality in GeographyIndex
and GeometryIndex. The invertedJoiner is configured with this
conversion for the join it is executing, so it can be abstracted
from the details on how each input row is converted into an
expression.
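The reverse polish form can be evaluated with a simple stack machine. Below is a minimal Go sketch of that idea, assuming operands are already-materialized sets of row numbers; the real evaluator works over spans of the encoded inverted column, and all type names here (`RPNToken`, `SetOp`, `EvalRPN`) are illustrative, not CockroachDB's API:

```go
package main

import "fmt"

// SetOp identifies a token in a reverse polish set expression.
type SetOp int

const (
	OpSpan SetOp = iota // operand: the set of rows matched by one span
	OpUnion
	OpIntersection
)

// RPNToken is either an operand (a row set) or a binary set operator.
type RPNToken struct {
	Op   SetOp
	Rows map[int]struct{} // populated only for OpSpan operands
}

// EvalRPN evaluates a reverse polish token sequence with a stack:
// operands are pushed, operators pop two sets and push the result.
func EvalRPN(tokens []RPNToken) map[int]struct{} {
	var stack []map[int]struct{}
	for _, t := range tokens {
		if t.Op == OpSpan {
			stack = append(stack, t.Rows)
			continue
		}
		b := stack[len(stack)-1]
		a := stack[len(stack)-2]
		stack = stack[:len(stack)-2]
		out := map[int]struct{}{}
		if t.Op == OpUnion {
			for k := range a {
				out[k] = struct{}{}
			}
			for k := range b {
				out[k] = struct{}{}
			}
		} else { // OpIntersection
			for k := range a {
				if _, ok := b[k]; ok {
					out[k] = struct{}{}
				}
			}
		}
		stack = append(stack, out)
	}
	return stack[0]
}

func main() {
	set := func(xs ...int) map[int]struct{} {
		m := map[int]struct{}{}
		for _, x := range xs {
			m[x] = struct{}{}
		}
		return m
	}
	// (a ∪ b) ∩ c in reverse polish order: a b ∪ c ∩
	result := EvalRPN([]RPNToken{
		{Op: OpSpan, Rows: set(1, 2)},
		{Op: OpSpan, Rows: set(2, 3)},
		{Op: OpUnion},
		{Op: OpSpan, Rows: set(2, 4)},
		{Op: OpIntersection},
	})
	fmt.Println(len(result)) // prints 1: only row 2 survives
}
```

Representing operands as sets of small integers is what makes the de-duplicating row numbering described later in this message pay off.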
The invertedJoiner operates analogously to a "lookup join" -- it
consumes a batch of input rows, computes the expression for
each row, unions the spans needed by this batch of expressions,
and fetches from the inverted index to evaluate the expressions.
invertedJoiner will be used for geospatial joins and could be used
for JSON and array joins (it is not clear to me if we currently
use inverted indexes for JSON and array).
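The span-union step in that batching flow can be sketched as follows. This is a hedged illustration, not the PR's code: it assumes integer spans over the inverted column, and the `Span` type and `unionSpans` helper are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// Span is a half-open interval [Start, End) over the inverted column,
// simplified to ints for illustration.
type Span struct{ Start, End int }

// unionSpans sorts and merges overlapping spans so that the inverted
// index is read once for the whole batch of input rows.
func unionSpans(spans []Span) []Span {
	if len(spans) == 0 {
		return nil
	}
	sort.Slice(spans, func(i, j int) bool { return spans[i].Start < spans[j].Start })
	out := []Span{spans[0]}
	for _, s := range spans[1:] {
		last := &out[len(out)-1]
		if s.Start <= last.End {
			// Overlapping or adjacent: extend the previous span.
			if s.End > last.End {
				last.End = s.End
			}
		} else {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	// Spans needed by the expressions of one batch of input rows.
	batchSpans := []Span{{10, 20}, {15, 25}, {40, 50}}
	fmt.Println(unionSpans(batchSpans)) // prints [{10 25} {40 50}]
}
```

After the single fetch over the merged spans, each row's expression is evaluated against the retrieved entries, mirroring how a lookup join amortizes index reads across a batch.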
A batchedInvertedExprEvaluator is used to evaluate the join on a
batch of input rows. It is also to be used for the non-join case
where one is selecting from a table using an expression that
involves literals and the inverted column.
A row container de-duplicates and assigns numbers to the rows
retrieved from the inverted index (minus the inverted column).
This allows the expr evaluators to work with integers as set
members. There is only a memory-backed implementation for now,
but adding a disk-backed implementation will be straightforward.
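A minimal memory-backed sketch of such a de-duplicating, numbering container is below. The name `rowNumberer` and the use of strings as encoded rows are illustrative assumptions, not the types in the PR:

```go
package main

import "fmt"

// rowNumberer de-duplicates encoded rows and assigns each distinct row
// a small integer, so set expressions can operate on ints instead of
// full rows.
type rowNumberer struct {
	nums map[string]int // encoded row -> assigned number
	rows []string       // assigned number -> encoded row, for output
}

func newRowNumberer() *rowNumberer {
	return &rowNumberer{nums: map[string]int{}}
}

// addRow returns the number for an encoded row, assigning a fresh
// number if the row has not been seen before.
func (r *rowNumberer) addRow(encoded string) int {
	if n, ok := r.nums[encoded]; ok {
		return n
	}
	n := len(r.rows)
	r.nums[encoded] = n
	r.rows = append(r.rows, encoded)
	return n
}

// getRow returns the encoded row for a previously assigned number,
// used when emitting joined output rows.
func (r *rowNumberer) getRow(n int) string { return r.rows[n] }

func main() {
	rc := newRowNumberer()
	fmt.Println(rc.addRow("pk=7")) // prints 0
	fmt.Println(rc.addRow("pk=9")) // prints 1
	fmt.Println(rc.addRow("pk=7")) // prints 0: duplicate row, same number
	fmt.Println(rc.getRow(1))      // prints pk=9
}
```

A disk-backed variant would keep the same number-assignment interface while spilling the row storage, which is why the commit message expects that extension to be straightforward.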
Release note: None