Commit
[DNM] rowexec, rowcontainer: add an invertedJoiner, for joining two tables where one has an inverted index

There are currently no tests for this code. There are also some questions listed as bare todos ("TODO:") in the code, relating to descriptors and to using the encoded inverted column to construct spans to retrieve from the index. I'd like to get a sanity check on the approach, and get answers to those questions, before proceeding to add tests.

- InvertedJoinerSpec is the spec for the invertedJoiner and consists of two expressions:
  - an expression involving the inverted column and the corresponding column in the input that will be used for lookup. For geospatial these would be either both geometry columns or both geography columns.
  - an ON expression that can involve the other columns on the two sides.
  The join is a conjunction of both expressions.
- RowToInvertedIndexExpr is an interface that uses an input row to produce a reverse polish set expression involving spans of the inverted column. For geospatial, this will be implemented by the functionality in GeographyIndex and GeometryIndex.
- invertedJoiner is given an implementation of RowToInvertedIndexExpr for the join it is executing, so it can be abstracted from the details of how each input row is converted into an expression. invertedJoiner operates analogously to a "lookup join": it consumes a batch of input rows, computes the expression for each row, unions the spans needed by this batch of expressions, and fetches from the inverted index to evaluate the expressions. invertedJoiner will be used for geospatial joins and could be used for JSON and array joins (it is not clear to me if we currently use inverted indexes for JSON and array).
- batchedInvertedExprEvaluator is used by invertedJoiner to evaluate the join on a batch of input rows. It is also to be used for the non-join case, where one is selecting from a table using an expression that involves literals and the inverted column.
- InvertedIndexRowContainer is used by invertedJoiner to dedup the rows retrieved from the inverted index (minus the inverted column). This allows the expr evaluators to work with integers as set members. There is only a memory-backed implementation for now, but adding a disk-backed implementation will be straightforward.

Release note: None
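
To make the description above concrete, here is a minimal, self-contained sketch of two of the ideas: deduplicating fetched rows into integer indexes, and evaluating a reverse polish set expression over those integers. It is not the code in this commit. Every name here (dedupContainer, token, rowSet, evalRPN) is hypothetical, and plain strings stand in for encoded rows and inverted-column spans.

```go
package main

import (
	"fmt"
	"sort"
)

// opKind labels a token in a reverse polish set expression.
type opKind int

const (
	opSpan      opKind = iota // push the rows fetched for a span
	opUnion                   // pop two sets, push their union
	opIntersect               // pop two sets, push their intersection
)

// token is one element of the expression; span is only read for opSpan.
// A string is a hypothetical stand-in for an encoded inverted-column span.
type token struct {
	kind opKind
	span string
}

// rowSet is a set of deduplicated row indexes.
type rowSet map[int]struct{}

// dedupContainer assigns one integer per distinct row (hypothetical,
// memory-backed only), so the evaluator can work with ints as set members.
type dedupContainer struct {
	index map[string]int
	rows  []string
}

func newDedupContainer() *dedupContainer {
	return &dedupContainer{index: make(map[string]int)}
}

// add returns the index of row, inserting it if it has not been seen before.
func (c *dedupContainer) add(row string) int {
	if i, ok := c.index[row]; ok {
		return i
	}
	i := len(c.rows)
	c.index[row] = i
	c.rows = append(c.rows, row)
	return i
}

// evalRPN evaluates expr with a stack. fetched maps each span to the row
// indexes retrieved for it; a span with no entry is treated as empty.
// The result is the sorted list of matching row indexes.
func evalRPN(expr []token, fetched map[string]rowSet) ([]int, error) {
	var stack []rowSet
	for _, t := range expr {
		switch t.kind {
		case opSpan:
			s := rowSet{}
			for r := range fetched[t.span] {
				s[r] = struct{}{}
			}
			stack = append(stack, s)
		case opUnion, opIntersect:
			if len(stack) < 2 {
				return nil, fmt.Errorf("malformed expression: operator needs two operands")
			}
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = stack[:len(stack)-2]
			out := rowSet{}
			for r := range a {
				_, inB := b[r]
				if t.kind == opUnion || inB {
					out[r] = struct{}{}
				}
			}
			if t.kind == opUnion {
				for r := range b {
					out[r] = struct{}{}
				}
			}
			stack = append(stack, out)
		default:
			return nil, fmt.Errorf("unknown token kind %d", t.kind)
		}
	}
	if len(stack) != 1 {
		return nil, fmt.Errorf("malformed expression: %d values left on stack", len(stack))
	}
	res := make([]int, 0, len(stack[0]))
	for r := range stack[0] {
		res = append(res, r)
	}
	sort.Ints(res)
	return res, nil
}
```

In this sketch, a batch would first add every fetched row to the container and record the resulting indexes per span. Each input row's expression is then evaluated against that shared map, so a row retrieved under several spans is stored once.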