-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: Project merged Upsert columns #33681
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r1, 8 of 10 files at r2, 11 of 11 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/opt/memo/expr.go, line 290 at r3 (raw file):
// (which holds the existing value of the column). And for the Upsert case, use // either the UpdateCols/FetchCols or the InsertCols value, as long as it is // non-zero. If both are non-zero, then they must be the same, since they are
[nit] maybe them being the same could be asserted inside the function? (unless it's performance concern)
pkg/sql/opt/optbuilder/insert.go, line 883 at r3 (raw file):
// column. This will be used by the Upsert operator in place of the // original columns. if mb.insertColList[i] != 0 {
[nit] Consider doing this inside each case
above (it should simplify a lot in each case), I think it would be easier to follow what's going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/opt/memo/expr.go, line 290 at r3 (raw file):
Previously, RaduBerinde wrote…
[nit] maybe them being the same could be asserted inside the function? (unless it's performance concern)
Just messier code, but I went ahead and made that change.
pkg/sql/opt/optbuilder/insert.go, line 883 at r3 (raw file):
Previously, RaduBerinde wrote…
[nit] Consider doing this inside each
case
above (it should simplify a lot in each case), I think it would be easier to follow what's going on.
I tried it, but I still have to do the if
statement in case of updateColList
, so I think it ends up being messier. I did update the comment a bit in hopes of making it clearer.
pkg/sql/opt/optbuilder/insert.go, line 883 at r3 (raw file): Previously, andy-kimball (Andy Kimball) wrote…
I don't think you pushed your changes. |
It is unnecessary to pass a table descriptor by value to TablesNeededForFKs. This causes the entire struct to be copied by value onto the stack, and also triggers a heap allocation when assigning the address of the stack struct when setting the TableLookup.Table field. Release note: None
When the type check phase returns a NULL value, wrap it with a cast to the desired type (if it's not types.Any). Without this cast, the expression's type is types.Unknown, which can cause typing issues elsewhere in the code. In particular, it was causing a panic after modifying the Upsert operator to use CASE expressions: CREATE TABLE ab (a INT, b INT[] AS (ARRAY[a]) STORED INSERT INTO ab (a) VALUES (NULL) ON CONFLICT (a) DO UPDATE SET b=ARRAY[1] A projection like this is created and causes a panic: SELECT CASE WHEN canary IS NULL THEN ARRAY[NULL] ELSE ARRAY[1] END ... The CASE operator is typed as UNKNOWN[], and DistSQL panics at runtime when it encounters the INT[1] array. The extra type cast ensures that we assign NULL values a static type that will be propagated throughout the expression and prevent cases like this. Release note: None
Currently, the Upsert operator's insert and update columns are separately created and maintained. For example: CREATE TABLE abc (a INT PRIMARY KEY, b INT) INSERT INTO abc VALUES (1) ON CONFLICT (a) DO UPDATE SET b=5 Here are the projected columns: SELECT a AS fetch_a, b AS fetch_b, 1 AS ins_a, NULL AS ins_b, 5 AS upd_b FROM ... This commit constructs CASE statements that merge the insert and update columns, like this: SELECT a AS fetch_a, b AS fetch_b, CASE WHEN fetch_a IS NULL THEN 1 ELSE fetch_a END AS ups_a, CASE WHEN fetch_a IS NULL THEN NULL::INT ELSE 5 END AS ups_b FROM ... The advantages to making this change are: 1. This sets us up to implement check constraints, since they will now be able to operate on the combined columns. 2. This often avoids evaluating the update expression when an insert needs to occur for a given row. 3. This simplifies the property derivation, making Upsert act just like Update, Insert, and Delete. It's now possible to cleanly map each mutation output column to a single input column. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/sql/opt/optbuilder/insert.go, line 883 at r3 (raw file):
Previously, RaduBerinde wrote…
I don't think you pushed your changes.
I ended up finding a bug once I added the assert, and had to change a few things. Should be correct now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 24 files at r4, 8 of 10 files at r5, 13 of 13 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/opt/optbuilder/insert.go, line 883 at r3 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
I ended up finding a bug once I added the assert, and had to change a few things. Should be correct now.
Nice!
bors r+ |
Build failed (retrying...) |
Build failed |
bors r+ |
Build failed (retrying...) |
Build failed |
bors r+ |
33681: opt: Project merged Upsert columns r=andy-kimball a=andy-kimball Currently, the Upsert operator's insert and update columns are separately created and maintained. For example: CREATE TABLE abc (a INT PRIMARY KEY, b INT) INSERT INTO abc VALUES (1) ON CONFLICT (a) DO UPDATE SET b=5 Here are the projected columns: SELECT a AS fetch_a, b AS fetch_b, 1 AS ins_a, NULL AS ins_b, 5 AS upd_b FROM ... This commit constructs CASE statements that merge the insert and update columns, like this: SELECT a AS fetch_a, b AS fetch_b, CASE WHEN fetch_a IS NULL THEN 1 ELSE fetch_a END AS ups_a, CASE WHEN fetch_a IS NULL THEN NULL::INT ELSE 5 END AS ups_b FROM ... The advantages to making this change are: 1. This sets us up to implement check constraints, since they will now be able to operate on the combined columns. 2. This often avoids evaluating the update expression when an insert needs to occur for a given row. 3. This simplifies the property derivation, making Upsert act just like Update, Insert, and Delete. It's now possible to cleanly map each mutation output column to a single input column. Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
Build succeeded |
Currently, the Upsert operator's insert and update columns are
separately created and maintained. For example:
CREATE TABLE abc (a INT PRIMARY KEY, b INT)
INSERT INTO abc VALUES (1) ON CONFLICT (a) DO UPDATE SET b=5
Here are the projected columns:
SELECT
a AS fetch_a, b AS fetch_b,
1 AS ins_a, NULL AS ins_b,
5 AS upd_b
FROM ...
This commit constructs CASE statements that merge the insert and
update columns, like this:
SELECT
a AS fetch_a, b AS fetch_b,
CASE WHEN fetch_a IS NULL THEN 1 ELSE fetch_a END AS ups_a,
CASE WHEN fetch_a IS NULL THEN NULL::INT ELSE 5 END AS ups_b
FROM ...
The advantages to making this change are:
be able to operate on the combined columns.
needs to occur for a given row.
Update, Insert, and Delete. It's now possible to cleanly map each
mutation output column to a single input column.