Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: foreign key checks should only scan the primary column family of a row #30852

Closed
nvanbenschoten opened this issue Oct 1, 2018 · 11 comments · Fixed by #42572
Closed

sql: foreign key checks should only scan the primary column family of a row #30852

nvanbenschoten opened this issue Oct 1, 2018 · 11 comments · Fixed by #42572
Assignees
Labels
A-sql-execution Relating to SQL execution. A-sql-fks A-sql-mutations Mutation statements: UPDATE/INSERT/UPSERT/DELETE. C-performance Perf of queries or internals. Solution not expected to change functional behavior.

Comments

@nvanbenschoten
Copy link
Member

We currently scan all column families in the destination row when performing a foreign key check. We should limit this to the primary column family, which we know will never be missing as long as the row exists.

This has implications on transaction contention. For instance, see #30624.

cc. @knz for triage.

@nvanbenschoten nvanbenschoten added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-sql-execution Relating to SQL execution. labels Oct 1, 2018
@nvanbenschoten nvanbenschoten added this to the 2.2 milestone Oct 1, 2018
@knz knz added the A-sql-mutations Mutation statements: UPDATE/INSERT/UPSERT/DELETE. label Oct 2, 2018
@petermattis petermattis removed this from the 2.2 milestone Oct 5, 2018
@knz
Copy link
Contributor

knz commented Jan 3, 2019

@nvanbenschoten can you double check the issue description? Is this optimization also valid if the FK references a non-primary (unique) index?

@nvanbenschoten
Copy link
Member Author

If we supported multiple column families in secondary indexes then it would be. I forget if we do or if secondary indexes are encoded in such a way that we could in the future. The only requirement is that we need to verify that the row exists, and that can be performed looking only at the primary column family of whatever index we need to look at. Scanning any other column families in the destination row is a waste.

@knz knz removed their assignment May 6, 2019
@andy-kimball
Copy link
Contributor

Ping @RaduBerinde. Would be good to keep this in mind when doing FK work.

@RaduBerinde
Copy link
Member

@jordanlewis this will come down to optimizing the scans performed by lookup semi/anti joins.

@jordanlewis
Copy link
Member

Yep - we'll make sure this gets done. Thanks!

@jordanlewis
Copy link
Member

In conjunction with planning foreign key checks via the optimizer, this work was materially finished by @rohany in #38301. Assigning to @RaduBerinde to close once opt-driven fks are in.

@nvanbenschoten
Copy link
Member Author

I just confirmed that this is still a blocker for #30624. This is true with and without sql.defaults.experimental_optimizer_foreign_keys.enabled enabled at the moment.

With a tpcc dataset initialized from that PR with 1 warehouse:

# In terminal 1
root@localhost:26257/tpcc> show create table district;
  table_name |                                         create_statement
+------------+--------------------------------------------------------------------------------------------------+
  district   | CREATE TABLE district (
             |     d_id INT8 NOT NULL,
             |     d_w_id INT8 NOT NULL,
             |     d_name VARCHAR(10) NULL,
             |     d_street_1 VARCHAR(20) NULL,
             |     d_street_2 VARCHAR(20) NULL,
             |     d_city VARCHAR(20) NULL,
             |     d_state CHAR(2) NULL,
             |     d_zip CHAR(9) NULL,
             |     d_tax DECIMAL(4,4) NULL,
             |     d_ytd DECIMAL(12,2) NULL,
             |     d_next_o_id INT8 NULL,
             |     CONSTRAINT "primary" PRIMARY KEY (d_w_id ASC, d_id ASC),
             |     CONSTRAINT fk_d_w_id_ref_warehouse FOREIGN KEY (d_w_id) REFERENCES warehouse(w_id),
             |     FAMILY static (d_id, d_w_id, d_name, d_street_1, d_street_2, d_city, d_state, d_zip, d_tax),
             |     FAMILY dynamic_1 (d_ytd),
             |     FAMILY dynamic_2 (d_next_o_id)
             | )

root@localhost:26257/tpcc> show create table history;
  table_name |                                                      create_statement
+------------+-----------------------------------------------------------------------------------------------------------------------------+
  history    | CREATE TABLE history (
             |     rowid UUID NOT NULL DEFAULT gen_random_uuid(),
             |     h_c_id INT8 NOT NULL,
             |     h_c_d_id INT8 NOT NULL,
             |     h_c_w_id INT8 NOT NULL,
             |     h_d_id INT8 NOT NULL,
             |     h_w_id INT8 NOT NULL,
             |     h_date TIMESTAMP NULL,
             |     h_amount DECIMAL(6,2) NULL,
             |     h_data VARCHAR(24) NULL,
             |     CONSTRAINT "primary" PRIMARY KEY (h_w_id ASC, rowid ASC),
             |     CONSTRAINT fk_h_c_w_id_ref_customer FOREIGN KEY (h_c_w_id, h_c_d_id, h_c_id) REFERENCES customer(c_w_id, c_d_id, c_id),
             |     CONSTRAINT fk_h_w_id_ref_district FOREIGN KEY (h_w_id, h_d_id) REFERENCES district(d_w_id, d_id),
             |     INDEX history_customer_fk_idx (h_c_w_id ASC, h_c_d_id ASC, h_c_id ASC),
             |     INDEX history_district_fk_idx (h_w_id ASC, h_d_id ASC),
             |     FAMILY "primary" (rowid, h_c_id, h_c_d_id, h_c_w_id, h_d_id, h_w_id, h_date, h_amount, h_data)
             | )

root@localhost:26257/tpcc> BEGIN;
BEGIN

root@localhost:26257/tpcc  OPEN> UPDATE district
SET d_next_o_id = d_next_o_id + 1
WHERE d_w_id = 0 AND d_id = 1
RETURNING d_tax, d_next_o_id;
  d_tax  | d_next_o_id
+--------+-------------+
  0.1692 |        3003

# Don't commit or rollback
# In terminal 2
root@localhost:26257/tpcc> insert into history values (DEFAULT, 1, 1, 0, 1, 0, '2006-01-02 15:04:05+00:00', 10, '9cdLXe0YhgLRrw');
... hangs ...

@RaduBerinde
Copy link
Member

I tried to reproduce locally with two simple tables. A few things I found:

  • make sure you restart your session(s) after changing the cluster setting
  • once automatic stats run on the (almost) empty tables, the best plan for the FK check is a hash join. If I disable automatic stats I get the lookup join and the second session doesn't hang.

Can you get the EXPLAIN (VERBOSE) for the insert query?

@nvanbenschoten
Copy link
Member Author

Even with those two changes, it still hangs. Here's the EXPLAIN (VERBOSE). It doesn't mention the FK lookup:

root@localhost:26257/tpcc> EXPLAIN (VERBOSE) insert into history values (DEFAULT, 1, 1, 0, 1, 0, '2006-01-02 15:04:05+00:00', 10, '9cdLXe0YhgLRrw');
          tree          |     field     |                                     description                                      |                                      columns                                       | ordering
+-----------------------+---------------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------+
                        | distributed   | false                                                                                |                                                                                    |
                        | vectorized    | false                                                                                |                                                                                    |
  count                 |               |                                                                                      | ()                                                                                 |
   └── insert           |               |                                                                                      | ()                                                                                 |
        │               | into          | history(rowid, h_c_id, h_c_d_id, h_c_w_id, h_d_id, h_w_id, h_date, h_amount, h_data) |                                                                                    |
        │               | strategy      | inserter                                                                             |                                                                                    |
        └── render      |               |                                                                                      | (column1, column2, column3, column4, column5, column6, column7, h_amount, column9) |
             │          | render 0      | column1                                                                              |                                                                                    |
             │          | render 1      | column2                                                                              |                                                                                    |
             │          | render 2      | column3                                                                              |                                                                                    |
             │          | render 3      | column4                                                                              |                                                                                    |
             │          | render 4      | column5                                                                              |                                                                                    |
             │          | render 5      | column6                                                                              |                                                                                    |
             │          | render 6      | column7                                                                              |                                                                                    |
             │          | render 7      | h_amount                                                                             |                                                                                    |
             │          | render 8      | column9                                                                              |                                                                                    |
             └── values |               |                                                                                      | (column1, column2, column3, column4, column5, column6, column7, column9, h_amount) |
                        | size          | 9 columns, 1 row                                                                     |                                                                                    |
                        | row 0, expr 0 | gen_random_uuid()                                                                    |                                                                                    |
                        | row 0, expr 1 | 1                                                                                    |                                                                                    |
                        | row 0, expr 2 | 1                                                                                    |                                                                                    |
                        | row 0, expr 3 | 0                                                                                    |                                                                                    |
                        | row 0, expr 4 | 1                                                                                    |                                                                                    |
                        | row 0, expr 5 | 0                                                                                    |                                                                                    |
                        | row 0, expr 6 | '2006-01-02 15:04:05+00:00'                                                          |                                                                                    |
                        | row 0, expr 7 | '9cdLXe0YhgLRrw'                                                                     |                                                                                    |
                        | row 0, expr 8 | 10                                                                                   |                                                                                    |

If I drop the FK, things do work, so that's where the contention is coming from.

@RaduBerinde
Copy link
Member

If that is your explain, the new FK path is not on. Make sure you restarted the session if you changed the cluster setting, or use the session variable (set experimental_optimizer_foreign_keys = true;)

@nvanbenschoten
Copy link
Member Author

Oh yes, you're right. The FK check does not conflict with the UPDATE when using the new FK path.

root@localhost:26257/tpcc> explain (verbose) insert into history values (DEFAULT, 1, 1, 0, 1, 0, '2006-01-02 15:04:05+00:00', 10, '9cdLXe0YhgLRrw');
                    tree                    |         field         |                                     description                                      |                                      columns                                       | ordering
+-------------------------------------------+-----------------------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------+
                                            | distributed           | false                                                                                |                                                                                    |
                                            | vectorized            | false                                                                                |                                                                                    |
  root                                      |                       |                                                                                      | ()                                                                                 |
   ├── count                                |                       |                                                                                      | ()                                                                                 |
   │    └── insert                          |                       |                                                                                      | ()                                                                                 |
   │         │                              | into                  | history(rowid, h_c_id, h_c_d_id, h_c_w_id, h_d_id, h_w_id, h_date, h_amount, h_data) |                                                                                    |
   │         │                              | strategy              | inserter                                                                             |                                                                                    |
   │         └── buffer node                |                       |                                                                                      | (column1, column2, column3, column4, column5, column6, column7, h_amount, column9) |
   │              │                         | label                 | buffer 1                                                                             |                                                                                    |
   │              └── render                |                       |                                                                                      | (column1, column2, column3, column4, column5, column6, column7, h_amount, column9) |
   │                   │                    | render 0              | column1                                                                              |                                                                                    |
   │                   │                    | render 1              | column2                                                                              |                                                                                    |
   │                   │                    | render 2              | column3                                                                              |                                                                                    |
   │                   │                    | render 3              | column4                                                                              |                                                                                    |
   │                   │                    | render 4              | column5                                                                              |                                                                                    |
   │                   │                    | render 5              | column6                                                                              |                                                                                    |
   │                   │                    | render 6              | column7                                                                              |                                                                                    |
   │                   │                    | render 7              | h_amount                                                                             |                                                                                    |
   │                   │                    | render 8              | column9                                                                              |                                                                                    |
   │                   └── values           |                       |                                                                                      | (column1, column2, column3, column4, column5, column6, column7, column9, h_amount) |
   │                                        | size                  | 9 columns, 1 row                                                                     |                                                                                    |
   │                                        | row 0, expr 0         | gen_random_uuid()                                                                    |                                                                                    |
   │                                        | row 0, expr 1         | 1                                                                                    |                                                                                    |
   │                                        | row 0, expr 2         | 1                                                                                    |                                                                                    |
   │                                        | row 0, expr 3         | 0                                                                                    |                                                                                    |
   │                                        | row 0, expr 4         | 1                                                                                    |                                                                                    |
   │                                        | row 0, expr 5         | 0                                                                                    |                                                                                    |
   │                                        | row 0, expr 6         | '2006-01-02 15:04:05+00:00'                                                          |                                                                                    |
   │                                        | row 0, expr 7         | '9cdLXe0YhgLRrw'                                                                     |                                                                                    |
   │                                        | row 0, expr 8         | 10                                                                                   |                                                                                    |
   ├── postquery                            |                       |                                                                                      | ()                                                                                 |
   │    └── error if rows                   |                       |                                                                                      | ()                                                                                 |
   │         └── lookup-join                |                       |                                                                                      | (column4, column3, column2)                                                        |
   │              │                         | table                 | customer@primary                                                                     |                                                                                    |
   │              │                         | type                  | anti                                                                                 |                                                                                    |
   │              │                         | equality              | (column4, column3, column2) = (c_w_id, c_d_id, c_id)                                 |                                                                                    |
   │              │                         | equality cols are key |                                                                                      |                                                                                    |
   │              │                         | parallel              |                                                                                      |                                                                                    |
   │              └── render                |                       |                                                                                      | (column4, column3, column2)                                                        |
   │                   │                    | render 0              | column4                                                                              |                                                                                    |
   │                   │                    | render 1              | column3                                                                              |                                                                                    |
   │                   │                    | render 2              | column2                                                                              |                                                                                    |
   │                   └── scan buffer node |                       |                                                                                      | (column1, column2, column3, column4, column5, column6, column7, h_amount, column9) |
   │                                        | label                 | buffer 1                                                                             |                                                                                    |
   └── postquery                            |                       |                                                                                      | ()                                                                                 |
        └── error if rows                   |                       |                                                                                      | ()                                                                                 |
             └── lookup-join                |                       |                                                                                      | (column6, column5)                                                                 |
                  │                         | table                 | district@primary                                                                     |                                                                                    |
                  │                         | type                  | anti                                                                                 |                                                                                    |
                  │                         | equality              | (column6, column5) = (d_w_id, d_id)                                                  |                                                                                    |
                  │                         | equality cols are key |                                                                                      |                                                                                    |
                  │                         | parallel              |                                                                                      |                                                                                    |
                  └── render                |                       |                                                                                      | (column6, column5)                                                                 |
                       │                    | render 0              | column6                                                                              |                                                                                    |
                       │                    | render 1              | column5                                                                              |                                                                                    |
                       └── scan buffer node |                       |                                                                                      | (column1, column2, column3, column4, column5, column6, column7, h_amount, column9) |
                                            | label                 | buffer 1                                                                             |                                                                                    |

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Nov 19, 2019
This change improves the legacy FK path to only scan the primary
column family. The new path already does this (as long as the best
plan is a lookup join). Adding a test for both paths.

Fixes cockroachdb#30852.

Release note: None
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Nov 19, 2019
This change improves the legacy FK path to only scan the primary
column family. The new path already does this (as long as the best
plan is a lookup join). Adding a test for both paths.

Fixes cockroachdb#30852.

Release note: None
craig bot pushed a commit that referenced this issue Nov 20, 2019
42572: sql: scan only primary column for FK checks r=RaduBerinde a=RaduBerinde

This change improves the legacy FK path to only scan the primary
column family. The new path already does this (as long as the best plan is a
lookup join). Adding a test for both paths.

Fixes #30852.

Release note: None

42596: sql: remove local ordinality r=jordanlewis a=jordanlewis

Now that it's got a distributed version, we can delete the local
implementation.

Release note: None

Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Jordan Lewis <[email protected]>
@craig craig bot closed this as completed in f9b9485 Nov 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-execution Relating to SQL execution. A-sql-fks A-sql-mutations Mutation statements: UPDATE/INSERT/UPSERT/DELETE. C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants