sql: defer FK checks to end of statement #33475

knz · 2019-01-03T18:41:27Z

Discussed with @BramGruneir and team. The current semantics in CRDB 2.1 are incorrect -- the FK work must be deferred to no earlier than the end of the current statement.

In the first iteration we reuse the existing FK code and simply delay its processing until the end of the statement's execution by means of callbacks.

Also the set of modified rows must be collected using a "disk row writer" that spills to disk to limit RAM usage.
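To make the difference concrete, here is a minimal Python sketch of per-row checking versus callback-based checking at the end of the statement. It is a toy model, not CockroachDB code: the single self-referencing table (a dict mapping `id` to `parent`), the function names, and the in-memory `deferred` list (standing in for the disk-backed row container mentioned above) are all assumptions made for illustration.

```python
class ForeignKeyViolation(Exception):
    """Raised when a row references a parent id that does not exist."""


def check_fk(rows, row_id):
    # Toy FK: rows[row_id] holds the parent id, which must exist in the same table.
    parent = rows[row_id]
    if parent is not None and parent not in rows:
        raise ForeignKeyViolation(f"row {row_id} references missing parent {parent}")


def insert_checking_per_row(rows, new_rows):
    """Check each row right after it is written (hypothetical per-row behavior)."""
    staged = dict(rows)  # work on a copy so a failed statement changes nothing
    for row_id, parent in new_rows:
        staged[row_id] = parent
        check_fk(staged, row_id)
    return staged


def insert_checking_at_end(rows, new_rows):
    """Write all rows first, then run the queued FK checks as callbacks."""
    staged = dict(rows)
    deferred = []  # in-memory stand-in for a container that could spill to disk
    for row_id, parent in new_rows:
        staged[row_id] = parent
        deferred.append(lambda row_id=row_id: check_fk(staged, row_id))
    for check in deferred:
        check()
    return staged


# Row 1 references row 2, which is inserted later in the same statement.
batch = [(1, 2), (2, None)]

try:
    insert_checking_per_row({}, batch)
    per_row_failed = False
except ForeignKeyViolation:
    per_row_failed = True

print(per_row_failed)                     # True
print(insert_checking_at_end({}, batch))  # {1: 2, 2: None}
```

A batch that is invalid as a whole, such as `[(1, 3)]` on an empty table, is still rejected by the end-of-statement variant; only the timing of the check changes.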

vivekmenezes · 2019-01-07T17:29:17Z

I think deferring FK checks to the end of a transaction could be more valuable to a user, in terms of prioritizing work.

BramGruneir · 2019-01-07T19:34:15Z

I agree that being able to delay the FK checks to the end of the transaction would be really helpful. For now, though, even moving them to the end of the statement, which should be the default, would be a big performance win over the per-row checks we do right now. I expect cascading actions would be sped up here as well.

Once we're at that point, we can consider adding in the deferred keyword.

vivekmenezes · 2019-01-07T20:48:12Z

We discussed this. I had forgotten that there is value in executing checks in parallel while statements are executing, so as to (1) find violations earlier and (2) avoid deferring all check computations to the very end, which would slow down the transaction.

jordanlewis · 2019-07-31T03:33:33Z

This will also get done as part of the recent opt FK work.

RaduBerinde · 2019-12-21T20:01:03Z

Opt-driven FK checks are enabled on master.

Previously, all individual KV reads performed by a SQL statement were able to observe the most recent KV writes that it performed itself.

This is in violation of PostgreSQL's dialect semantics, which mandate that statements can only observe data as per a read snapshot taken at the instant a statement begins execution.

Moreover, this invalid behavior causes a real observable bug: a statement that reads and writes to the same table may never complete, as the read part may become able to consume the rows that it itself writes. Or worse, it could cause logical operations to be performed multiple times: https://en.wikipedia.org/wiki/Halloween_Problem

This patch fixes it (partially) by exploiting the new KV `Step()` API which decouples the read and write sequence numbers.

The fix is not complete however; additional sequence points must also be introduced prior to FK existence checks and cascading actions. See [cockroachdb#42864](cockroachdb#42864) and [cockroachdb#33475](cockroachdb#33475) for details.

For now, this patch excludes any mutation that 1) involves a foreign key and 2) does not use the new CBO-driven FK logic, from the new (fixed) semantics. When a mutation involves a FK without CBO involvement, the previous (broken) semantics still apply.

Release note (bug fix): SQL mutation statements that target tables with no foreign key relationships now correctly read data as per the state of the database when the statement started execution. This is required for compatibility with PostgreSQL and to ensure deterministic behavior when certain operations are parallelized. Prior to this fix, a statement [could incorrectly operate multiple times](https://en.wikipedia.org/wiki/Halloween_Problem) on data that itself was writing, and potentially never terminate. This fix is limited to tables without FK relationships, and for certain operations on tables with FK relationships; in other cases, the fix is not active and the bug is still present. A full fix will be provided in a later release.
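The effect of decoupling read and write sequence numbers can be sketched with a toy model. The code below is illustrative Python, not the real KV `Step()` API: the `ToyTxn` class, its `stepping` flag, and the `copy_rows_shifted` helper are hypothetical names, and the `max_rows` cap exists only so the broken variant terminates in the demo.

```python
class ToyTxn:
    """Toy transaction in which every write gets an increasing sequence number."""

    def __init__(self, initial_keys, stepping):
        self.stepping = stepping
        self.write_seq = 0
        self.read_seq = 0
        self.key_seq = {}  # key -> sequence number of the write that created it
        for key in initial_keys:
            self.put(key)
        self.step()  # the initial data is visible to the statement that follows

    def put(self, key):
        self.write_seq += 1
        self.key_seq[key] = self.write_seq
        if not self.stepping:
            # Old behavior: reads always observe the latest write.
            self.read_seq = self.write_seq

    def step(self):
        """Sequence point: make all writes so far visible to later reads."""
        self.read_seq = self.write_seq

    def next_key(self, after):
        """Smallest visible key greater than `after`, or None (a lazy scan step)."""
        visible = [k for k, seq in self.key_seq.items()
                   if seq <= self.read_seq and k > after]
        return min(visible, default=None)


def copy_rows_shifted(txn, shift, max_rows):
    """Toy `INSERT INTO t SELECT k + shift FROM t`; returns rows processed."""
    processed = 0
    cursor = float("-inf")
    while processed < max_rows:
        key = txn.next_key(cursor)
        if key is None:
            break
        txn.put(key + shift)
        cursor = key
        processed += 1
    return processed


old = ToyTxn([1, 2, 3], stepping=False)
print(copy_rows_shifted(old, 10, max_rows=50))  # 50: only the cap stops the scan

new = ToyTxn([1, 2, 3], stepping=True)
print(copy_rows_shifted(new, 10, max_rows=50))  # 3: the scan sees the snapshot only
new.step()  # sequence point at the end of the statement
print(sorted(new.key_seq))  # [1, 2, 3, 11, 12, 13]
```

In this model, an FK existence check that must see the statement's own writes would need a `step()` call before it runs, which corresponds to the additional sequence points the commit message says are still missing.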

42862: sql: make SQL statements operate on a read snapshot r=andreimatei a=knz

Fixes #33473. Fixes #28842. Informs #41569 and #42864.

Previously, all individual KV reads performed by a SQL statement were able to observe the most recent KV writes that it performed itself.

This is in violation of PostgreSQL's dialect semantics, which mandate that statements can only observe data as per a read snapshot taken at the instant a statement begins execution.

Moreover, this invalid behavior causes a real observable bug: a statement that reads and writes to the same table may never complete, as the read part may become able to consume the rows that it itself writes. Or worse, it could cause logical operations to be performed multiple times: https://en.wikipedia.org/wiki/Halloween_Problem

This patch fixes it by exploiting the new KV `Step()` API which decouples the read and write sequence numbers.

The fix is not complete however; additional sequence points must also be introduced prior to FK existence checks and cascading actions. See #42864 and #33475 for details.

For now, this patch excludes any mutation that involves a foreign key from the new (fixed) semantics. When a mutation involves a FK, the previous (broken) semantics still apply.

Release note (bug fix): SQL mutation statements that target tables with no foreign key relationships now correctly read data as per the state of the database when the statement started execution. This is required for compatibility with PostgreSQL and to ensure deterministic behavior when certain operations are parallelized. Prior to this fix, a statement [could incorrectly operate multiple times](https://en.wikipedia.org/wiki/Halloween_Problem) on data that itself was writing, and potentially never terminate. This fix is limited to tables without FK relationships, and for certain operations on tables with FK relationships; in other cases, the fix is not active and the bug is still present. A full fix will be provided in a later release.

Co-authored-by: Raphael 'kena' Poss <[email protected]>

knz added the C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception), A-sql-execution (Relating to SQL execution), and A-sql-mutations (Mutation statements: UPDATE/INSERT/UPSERT/DELETE) labels Jan 3, 2019

knz assigned BramGruneir Jan 3, 2019

jordanlewis added the A-sql-fks label Apr 24, 2019

jordanlewis assigned RaduBerinde and unassigned BramGruneir Jul 31, 2019

knz mentioned this issue Nov 29, 2019

sql: make SQL statements operate on a read snapshot #42862

Merged

This was referenced Dec 4, 2019

sql: INSERT with self-referencing FK check issues #20041

Closed

sql: self-referencing constraint fails #40399

Closed

sql: Foreign Key Cannot Reference Its Own Row #27871

Closed

RaduBerinde closed this as completed Dec 21, 2019
