Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: inverted indexes always require index join, even when no reads from primary index required #46765

Closed
jordanlewis opened this issue Mar 31, 2020 · 6 comments
Assignees
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.

Comments

@jordanlewis
Copy link
Member

Currently, inverted index searches always seem to require an index join, even if there are no requested values from the table. This is unoptimal, and unless I'm forgetting a special property of inverted indexes, seems unnecessary.

Can we remove the index join if there are no projections involving columns? This improves the case of checking for existence within a collection on a table.

[email protected]:50516/defaultdb> create table a (a int primary key, j json, inverted index (j));
CREATE TABLE

Time: 3.084ms

[email protected]:50516/defaultdb> explain (opt,verbose) select true from a where j @> '{"foo": "bar"}';
                                    text
----------------------------------------------------------------------------
  project
   ├── columns: bool:3
   ├── stats: [rows=111.111111]
   ├── cost: 572.252222
   ├── fd: ()-->(3)
   ├── prune: (3)
   ├── index-join a
   │    ├── columns: j:2
   │    ├── stats: [rows=111.111111]
   │    ├── cost: 570.02
   │    └── scan a@a_j_idx
   │         ├── columns: a:1
   │         ├── constraint: /2/1: [/'{"foo": "bar"}' - /'{"foo": "bar"}']
   │         ├── stats: [rows=111.111111]
   │         ├── cost: 117.787778
   │         └── key: (1)
   └── projections
        └── true [as=bool:3]
(18 rows)

Time: 767µs

Prompted by a forum question: https://forum.cockroachlabs.com/t/checking-value-existence-in-index/3578

@jordanlewis jordanlewis added the C-performance Perf of queries or internals. Solution not expected to change functional behavior. label Mar 31, 2020
@RaduBerinde
Copy link
Member

CC @mgartner

@mgartner
Copy link
Collaborator

@RaduBerinde I'll take a look.

@rytaft
Copy link
Collaborator

rytaft commented Sep 18, 2020

This appears to be fixed. I think @mgartner fixed it with #53586. Closing.

@rytaft rytaft closed this as completed Sep 18, 2020
@mgartner
Copy link
Collaborator

Note that #53586 only eliminates unnecessary IndexJoins when they are inside Projects. I think this covers all cases of unnecessary IndexJoins after inverted index scans—I couldn't think of a case where a non-Project would wrap an unnecesary IndexJoin.

But there are still cases of unnecessary IndexJoins after partial index scans. I've created a ticket here to track: #54588

@sumeerbhola
Copy link
Collaborator

What about index joins when the only columns needed from the primary index are the primary key columns, which are already in the inverted index. Do we eliminate those?

@rytaft
Copy link
Collaborator

rytaft commented Sep 18, 2020

Yep, that case is fixed (as is the case described in this issue):

[email protected]:63335/movr> create table a (a int primary key, j json, inverted index (j));
CREATE TABLE

Server Execution Time: 1.835ms
Network Latency: 747µs

[email protected]:63335/movr> explain (opt,verbose) select true from a where j @> '{"foo": "bar"}';
                                 text
-----------------------------------------------------------------------
  project
   ├── columns: bool:6
   ├── immutable
   ├── stats: [rows=111.111111]
   ├── cost: 122.842222
   ├── fd: ()-->(6)
   ├── prune: (6)
   ├── scan a@a_j_idx
   │    ├── columns: a:1
   │    ├── constraint: /2/1: [/'{"foo": "bar"}' - /'{"foo": "bar"}']
   │    ├── stats: [rows=110]
   │    ├── cost: 120.61
   │    └── key: (1)
   └── projections
        └── true [as=bool:6]
(15 rows)

Server Execution Time: 52µs
Network Latency: 225µs

[email protected]:63335/movr> explain (opt,verbose) select a from a where j @> '{"foo": "bar"}';
                                 text
-----------------------------------------------------------------------
  project
   ├── columns: a:1
   ├── immutable
   ├── stats: [rows=0.111111111]
   ├── cost: 4.13888889
   ├── key: (1)
   ├── prune: (1)
   └── scan a@a_j_idx
        ├── columns: a:1
        ├── constraint: /2/1: [/'{"foo": "bar"}' - /'{"foo": "bar"}']
        ├── stats: [rows=0.111111111]
        ├── cost: 4.12777778
        └── key: (1)
(13 rows)

Server Execution Time: 69µs
Network Latency: 1.231ms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
None yet
Development

No branches or pull requests

5 participants