Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
45520: opt: FK checks for Upsert r=RaduBerinde a=RaduBerinde

These commit add support for opt-driven FK checks for UPSERT and INSERT ON CONFLICT DO UPDATE. This is a simplified and cleaned up reimplementation of cockroachdb#45095.

There aren't a lot of logictests around FKs and upsert. I plan to try to build a randomized testing facility for FK checks, where we generate randomized mutations and cross-check FK violations or lack there of against a separate instance of the database without FK relationships.

#### Revert "opt: add first half of upsert FK checks"

This reverts commit 96447c8.

We are reconsidering the direction of the upsert FK checks: instead of filtering
inserted vs updated rows, we can use the return columns to get the "new" values.
This change also made some things harder to understand: we used to build the FK
check input using just the columns corresponding to the FK instead of all
columns.

Release note: None

#### opt: refactor optbuilder FK check code

The optbuilder FK check code has become unnecessarily complicated:
 - there are two ways of creating a WithScan (`makeFKInputScan` and
   `projectOrdinals`) which are similar but take different kinds of information
   as input.
 - there are various slices of column ordinals or column IDs that are threaded
   through long parts of code, making it hard to follow.

This change cleans this up by creating a fkCheckHelper which contains the
metadata related to FK table ordinals and is capable of creating a WithScan with
either the "new" or the "fetched" values.

The new code should generate the same effective plans; the differences are:
 - we now always create the WithScan before the other table Scan so some column IDs in the plan changed;
 - we no longer have an unnecessary projection for Update (it was used to
   renumber the columns coming out of WithScan).

Release note: None

#### opt: add outbound FK checks for upsert

This is a re-implementation of Justin's change 3d1dd0f, except that we now
perform the insertion check for all new rows instead of just the inserted rows.
We do this by utilizing the same column that would be used for a RETURNING
clause, which in some cases is of the form
`CASE WHEN canary IS NULL col1 ELSE col2`.

Release note: None

#### opt: add inbound FK checks for upsert

This change adds inbound FK checks for upsert and switches execution over to the
new style FK checks for upsert.

Similar to UPDATE, the inbound FK checks run on the set difference between "old"
values for the FK columns and "new" values.

Release note (performance improvement): improved execution plans of foreign key
checks for UPSERT and INSERT ON CONFLICT in some cases (in particular
multi-region).


Co-authored-by: Radu Berinde <[email protected]>
  • Loading branch information
craig[bot] and RaduBerinde committed Mar 4, 2020
2 parents 7b64b0b + 3daeced commit 2e621c7
Show file tree
Hide file tree
Showing 14 changed files with 3,512 additions and 1,552 deletions.
88 changes: 88 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/fk_opt
Original file line number Diff line number Diff line change
Expand Up @@ -196,6 +196,94 @@ DROP TABLE child
statement ok
DROP TABLE parent

# Pseudo-deletions

statement ok
CREATE TABLE parent (a INT PRIMARY KEY, b INT, UNIQUE (b))

statement ok
CREATE TABLE child (a INT PRIMARY KEY, b INT REFERENCES parent (b))

statement ok
INSERT INTO parent VALUES (1, 2)

statement ok
INSERT INTO child VALUES (10, 2)

statement error pq: upsert on table "parent" violates foreign key constraint "fk_b_ref_parent" on table "child"\nDETAIL: Key \(b\)=\(2\) is still referenced from table "child"\.
UPSERT INTO parent VALUES (1, 3)

statement ok
INSERT INTO parent VALUES (1, 3), (2, 2) ON CONFLICT (a) DO UPDATE SET b = 3

query II
SELECT * FROM child
----
10 2

query II rowsort
SELECT * FROM parent
----
1 3
2 2

# child references the second '2' column in parent. This mutation removes that
# row via an update, and is disallowed.
statement error pq: insert on table "parent" violates foreign key constraint "fk_b_ref_parent" on table "child"\nDETAIL: Key \(b\)=\(2\) is still referenced from table "child"\.
INSERT INTO parent VALUES (2, 2) ON CONFLICT (a) DO UPDATE SET b = parent.b - 1

# This mutation *also* removes that row, but also via an update, introduces a
# new one, making it acceptable.
statement ok
INSERT INTO parent VALUES (2, 2), (1, 3) ON CONFLICT (a) DO UPDATE SET b = parent.b - 1

query II rowsort
SELECT * FROM parent
----
1 2
2 1

statement ok
DROP TABLE child

statement ok
DROP TABLE parent

# Self-reference.

statement ok
CREATE TABLE self (k int primary key, a int unique, b int references self(a))

statement error pq: upsert on table "self" violates foreign key constraint "fk_b_ref_self"\nDETAIL: Key \(b\)=\(2\) is not present in table "self"
UPSERT INTO self VALUES (1, 1, 2)

statement ok
UPSERT INTO self VALUES (1, 1, 1)

statement ok
UPSERT INTO self VALUES (1, 1, 1)

statement error pq: upsert on table "self" violates foreign key constraint "fk_b_ref_self"\nDETAIL: Key \(b\)=\(2\) is not present in table "self"
UPSERT INTO self VALUES (1, 1, 2)

statement ok
UPSERT INTO self VALUES (1, 2, 2)

statement error pq: upsert on table "self" violates foreign key constraint "fk_b_ref_self"\nDETAIL: Key \(b\)=\(2\) is not present in table "self"
UPSERT INTO self(k,a) VALUES (1, 1)

statement ok
UPSERT INTO self VALUES (1, 1, 2), (2, 2, 1)

statement ok
INSERT INTO self VALUES (2, 2, 2) ON CONFLICT (k) DO UPDATE SET b = self.b + 1

statement error pq: insert on table "self" violates foreign key constraint "fk_b_ref_self"\nDETAIL: Key \(b\)=\(3\) is not present in table "self"
INSERT INTO self VALUES (2, 2, 2) ON CONFLICT (k) DO UPDATE SET b = self.b + 1

statement ok
DROP TABLE self

# --- Tests that follow are copied from the fk tests and adjusted as needed.

statement ok
Expand Down
443 changes: 218 additions & 225 deletions pkg/sql/opt/exec/execbuilder/testdata/fk_opt

Large diffs are not rendered by default.

4 changes: 3 additions & 1 deletion pkg/sql/opt/norm/custom_funcs.go
Original file line number Diff line number Diff line change
Expand Up @@ -2084,9 +2084,11 @@ func (c *CustomFuncs) WithUses(r opt.Expr) props.WithUsesMap {
switch e := r.(type) {
case memo.RelExpr:
relProps := e.Relational()

// Lazily calculate and store the WithUses value.
if !relProps.IsAvailable(props.WithUses) {
relProps.SetAvailable(props.WithUses)
relProps.Shared.Rule.WithUses = c.deriveWithUses(r)
relProps.SetAvailable(props.WithUses)
}
return relProps.Shared.Rule.WithUses

Expand Down
78 changes: 39 additions & 39 deletions pkg/sql/opt/norm/testdata/rules/with
Original file line number Diff line number Diff line change
Expand Up @@ -571,12 +571,12 @@ norm format=show-all
WITH cte AS (INSERT INTO child VALUES (1, 1) RETURNING c) SELECT c FROM cte UNION SELECT c+1 FROM cte
----
with &2 (cte)
├── columns: c:11(int!null)
├── columns: c:10(int!null)
├── cardinality: [1 - 2]
├── side-effects, mutations
├── stats: [rows=2, distinct(11)=2, null(11)=0]
├── stats: [rows=2, distinct(10)=2, null(10)=0]
├── cost: 1037.7025
├── key: (11)
├── key: (10)
├── insert t.public.child
│ ├── columns: t.public.child.c:1(int!null)
│ ├── insert-mapping:
Expand All @@ -603,74 +603,74 @@ with &2 (cte)
│ └── f-k-checks
│ └── f-k-checks-item: child(p) -> parent(p)
│ └── anti-join (hash)
│ ├── columns: column2:6(int!null)
│ ├── columns: column2:5(int!null)
│ ├── cardinality: [0 - 1]
│ ├── stats: [rows=1e-10]
│ ├── cost: 1037.5625
│ ├── key: ()
│ ├── fd: ()-->(6)
│ ├── fd: ()-->(5)
│ ├── cte-uses
│ │ └── &1: count=1 used-columns=(4)
│ ├── with-scan &1
│ │ ├── columns: column2:6(int!null)
│ │ ├── columns: column2:5(int!null)
│ │ ├── mapping:
│ │ │ └── column2:4(int) => column2:6(int)
│ │ │ └── column2:4(int) => column2:5(int)
│ │ ├── cardinality: [1 - 1]
│ │ ├── stats: [rows=1, distinct(6)=1, null(6)=0]
│ │ ├── stats: [rows=1, distinct(5)=1, null(5)=0]
│ │ ├── cost: 0.01
│ │ ├── key: ()
│ │ ├── fd: ()-->(6)
│ │ ├── prune: (6)
│ │ ├── fd: ()-->(5)
│ │ ├── prune: (5)
│ │ └── cte-uses
│ │ └── &1: count=1 used-columns=(4)
│ ├── scan t.public.parent
│ │ ├── columns: t.public.parent.p:7(int!null)
│ │ ├── stats: [rows=1000, distinct(7)=1000, null(7)=0]
│ │ ├── columns: t.public.parent.p:6(int!null)
│ │ ├── stats: [rows=1000, distinct(6)=1000, null(6)=0]
│ │ ├── cost: 1020.02
│ │ ├── key: (7)
│ │ ├── prune: (7)
│ │ └── interesting orderings: (+7)
│ │ ├── key: (6)
│ │ ├── prune: (6)
│ │ └── interesting orderings: (+6)
│ └── filters
│ └── eq [type=bool, outer=(6,7), constraints=(/6: (/NULL - ]; /7: (/NULL - ]), fd=(6)==(7), (7)==(6)]
│ ├── variable: column2:6 [type=int]
│ └── variable: t.public.parent.p:7 [type=int]
│ └── eq [type=bool, outer=(5,6), constraints=(/5: (/NULL - ]; /6: (/NULL - ]), fd=(5)==(6), (6)==(5)]
│ ├── variable: column2:5 [type=int]
│ └── variable: t.public.parent.p:6 [type=int]
└── union
├── columns: c:11(int!null)
├── left columns: c:8(int)
├── right columns: "?column?":10(int)
├── columns: c:10(int!null)
├── left columns: c:7(int)
├── right columns: "?column?":9(int)
├── cardinality: [1 - 2]
├── stats: [rows=2, distinct(11)=2, null(11)=0]
├── stats: [rows=2, distinct(10)=2, null(10)=0]
├── cost: 0.1
├── key: (11)
├── key: (10)
├── with-scan &2 (cte)
│ ├── columns: c:8(int!null)
│ ├── columns: c:7(int!null)
│ ├── mapping:
│ │ └── t.public.child.c:1(int) => c:8(int)
│ │ └── t.public.child.c:1(int) => c:7(int)
│ ├── cardinality: [1 - 1]
│ ├── stats: [rows=1, distinct(8)=1, null(8)=0]
│ ├── stats: [rows=1, distinct(7)=1, null(7)=0]
│ ├── cost: 0.01
│ ├── key: ()
│ ├── fd: ()-->(8)
│ └── prune: (8)
│ ├── fd: ()-->(7)
│ └── prune: (7)
└── project
├── columns: "?column?":10(int!null)
├── columns: "?column?":9(int!null)
├── cardinality: [1 - 1]
├── stats: [rows=1, distinct(10)=1, null(10)=0]
├── stats: [rows=1, distinct(9)=1, null(9)=0]
├── cost: 0.04
├── key: ()
├── fd: ()-->(10)
├── prune: (10)
├── fd: ()-->(9)
├── prune: (9)
├── with-scan &2 (cte)
│ ├── columns: c:9(int!null)
│ ├── columns: c:8(int!null)
│ ├── mapping:
│ │ └── t.public.child.c:1(int) => c:9(int)
│ │ └── t.public.child.c:1(int) => c:8(int)
│ ├── cardinality: [1 - 1]
│ ├── stats: [rows=1, distinct(9)=1, null(9)=0]
│ ├── stats: [rows=1, distinct(8)=1, null(8)=0]
│ ├── cost: 0.01
│ ├── key: ()
│ ├── fd: ()-->(9)
│ └── prune: (9)
│ ├── fd: ()-->(8)
│ └── prune: (8)
└── projections
└── plus [as="?column?":10, type=int, outer=(9)]
├── variable: c:9 [type=int]
└── plus [as="?column?":9, type=int, outer=(8)]
├── variable: c:8 [type=int]
└── const: 1 [type=int]
3 changes: 3 additions & 0 deletions pkg/sql/opt/optbuilder/insert.go
Original file line number Diff line number Diff line change
Expand Up @@ -353,6 +353,9 @@ func (b *Builder) buildInsert(ins *tree.Insert, inScope *scope) (outScope *scope
// values specified for them.
// 3. Each update value is the same as the corresponding insert value.
//
// TODO(radu): once FKs no longer require indexes, this function will have to
// take FKs into account explicitly.
//
// TODO(andyk): The fast path is currently only enabled when the UPSERT alias
// is explicitly selected by the user. It's possible to fast path some queries
// of the form INSERT ... ON CONFLICT, but the utility is low and there are lots
Expand Down
Loading

0 comments on commit 2e621c7

Please sign in to comment.