-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: fetch minimal set of columns on returning mutations #38594
opt: fetch minimal set of columns on returning mutations #38594
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't thought about his deeply enough, but I was expecting this work to happen in optimization rules and/or execution engine code. The optbuilder
mutation code is very tricky as-is, and I'd like to avoid adding any optimizations directly to that code. From what I can see, this PR is folding the "fetch minimal set of columns" optimization directly into the optbuilder
code, making it even more difficult to reason about. Is there some reason why you went with this design? Do other alternatives not work well?
Reviewable status: complete! 0 of 0 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should've marked this as a WIP. My initial approach was to fix the creation of the returning columns correctly in the optbuilder instead of fixing it later as a norm rule. I wasn't certain how we'd extract the return cols and thought it'd be easier to add that in the optbuilder. But now that I'm extracting them from the projection, you're right that its better to have this elsewhere. I'll push up a change to add a norm rule to do this.
Reviewable status: complete! 0 of 0 LGTMs obtained
6fd7498
to
93d6b63
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a norm rule. The output columns don't look right though (and also the not null columns I suspect). I'll take a closer look in the morning.
Reviewable status: complete! 0 of 0 LGTMs obtained
93d6b63
to
6872528
Compare
37436c8
to
2fac7f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've made some good progress on this.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @RaduBerinde, and @ridwanmsharif)
pkg/sql/opt/norm/prune_cols.go, line 535 at r4 (raw file):
// Find all the columns referenced by the projections. returningOrdSet := exec.ColumnOrdinalSet{}
We shouldn't take a dependency on the exec
package from norm
. Coupled with the comment below, I'd create a NeededMutationReturnCols
function that returns an opt.ColSet
that contains the column ids for the columns needed by the projection + primary key columns. That way, it fits better with the patterns already established in this file. In PruneMutationReturnCols
, you can then iterate over the ReturnCols
and only include them in the new MutationPrivate
if they're part of that needed set.
pkg/sql/opt/norm/prune_cols.go, line 548 at r4 (raw file):
}) // The columns of the primary index are always returned regardless of
NIT: primary index
=> primary key
pkg/sql/opt/norm/prune_cols.go, line 567 at r4 (raw file):
newPrivate.ReturnCols = newReturnCols newPrivate.ReturnPruned = true
We should avoid using ReturnPruned
, because it means that we won't be able to repeatedly prune more columns as more of the expression is optimized. Like imagine something like:
SELECT a
FROM
(
SELECT a, b
FROM [UPDATE abcde SET c=5 WHERE d=100 RETURNING *]
)
We'd first prune according to the inner projection, which might prune away column 'e'. But then if we set ReturnPruned
, we won't be able to prune column b
. This is why we typically follow the pattern CanPruneXXX
and PruneXXX
. We don't want to assume that one pruning attempt is enough.
I'd recommend that you follow the pattern we established for other pruning rules:
// Not sure what
(CanPruneMutationReturnCols $input $needed:(NeededMutationReturnCols $mutationPrivate))
pkg/sql/opt/norm/rules/prune_cols.opt, line 472 at r4 (raw file):
[PruneMutationReturnCols, Normalize] (Project $input: (Insert | Update | Upsert | Delete
NIT: Remove blank space after $input:
.
pkg/sql/opt/norm/testdata/rules/prune_cols, line 1941 at r4 (raw file):
# Do not prune columns that must be returned. # TODO(justin): in order to prune e here we need a PruneMutationReturnCols rule.
Can update the comment now.
pkg/sql/rowcontainer/datum_row_container.go, line 216 at r4 (raw file):
func (c *RowContainer) AddRow(ctx context.Context, row tree.Datums) (tree.Datums, error) { if len(row) != c.numCols { //panic(fmt.Sprintf("invalid row length %d, expected %d", len(row), c.numCols))
Need to uncomment this.
6da5461
to
cc8fa38
Compare
00b30ad
to
b919eab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @justinj, @RaduBerinde, and @ridwanmsharif)
pkg/sql/delete.go, line 337 at r12 (raw file):
// public columns. resultValues := make(tree.Datums, d.run.rows.NumCols()) for i := 0; i < d.run.rows.NumCols(); i++ {
can use copy
for this, but this is doing an additional copy now, is that needed? why can we no longer just pass in the existing slice?
pkg/sql/opt_exec_factory.go, line 1290 at r12 (raw file):
var returnCols sqlbase.ResultColumns if rowsNeeded { // Update always returns the minimal set of non-mutation columns required,
super nit: it's not really, conceptually, the minimal set here, it's just the set that is specified in the operator. no need to change this but it's a little bit abstraction-leaky
pkg/sql/opt_exec_factory.go, line 1292 at r12 (raw file):
// Update always returns the minimal set of non-mutation columns required, // in the same order they are defined in the table. var returnColDescs []sqlbase.ColumnDescriptor
no need to var
this, i think, just use :=
below
pkg/sql/opt/memo/expr.go, line 385 at r12 (raw file):
// OutputOrd returns true if the i-th ordinal column should be part of the // mutations output columns.
nit: mutation's
, and maybe give this a more descriptive name, like IsColumnOutput
.
pkg/sql/opt/memo/expr.go, line 395 at r12 (raw file):
// InsertOrd returns true if the i-th ordinal column is being inserted by // the mutation. func (m *MutationPrivate) InsertOrd(i int) bool {
ditto
pkg/sql/opt/memo/logical_props_builder.go, line 1091 at r12 (raw file):
// Only non-mutation columns are output columns. for i, n := 0, tab.ColumnCount(); i < n; i++ { if private.OutputOrd(i) || private.InsertOrd(i) {
are these only referenced here? if so can they be combined into a single condition somehow?
pkg/sql/opt/norm/prune_cols.go, line 515 at r5 (raw file):
// CanPruneMutationReturnCols checks whether the mutations return columns can // be pruned. This is the base case for the PruneMutationReturnCols rule.
nit: s/base case/precondition/?
pkg/sql/opt/norm/prune_cols.go, line 527 at r5 (raw file):
// keep ReturnCols that are not referenced by the RETURNING clause or are not // part of the primary key. The caller must have already done the analysis to // prove that these columns are not needed, by calling CanPruneMutationReturnCols.
possibly: s/that these columns are not needed/that such columns exist/? but nothing bad will happen if there are no such columns, right? it just won't be productive for the caller
pkg/sql/opt/norm/prune_cols.go, line 404 at r12 (raw file):
// to be pruned, then there must be logic in the PruneCols method to actually // prune those columns when requested. func DerivePruneCols(e memo.RelExpr) opt.ColSet {
I think there should be a corresponding change here in DerivePruneCols
so that we can push down Projections to allow pruning to occur
pkg/sql/opt/norm/prune_cols.go, line 523 at r12 (raw file):
tabID := c.mem.Metadata().TableMeta(private.Table).MetaID for i := 0; i < len(private.ReturnCols); i++ {
for i := range private.ReturnCols {
pkg/sql/opt/norm/prune_cols.go, line 543 at r12 (raw file):
tabID := c.mem.Metadata().TableMeta(private.Table).MetaID for i := 0; i < len(private.ReturnCols); i++ {
nit: for i := range private.ReturnCols {
pkg/sql/opt/norm/prune_cols.go, line 547 at r12 (raw file):
newReturnCols[i] = private.ReturnCols[i] } else { newReturnCols[i] = 0
nit: this is unnecessary since it will default to 0
pkg/sql/opt/norm/prune_cols.go, line 559 at r12 (raw file):
// the mutation. func (c *CustomFuncs) NeededMutationReturnCols( private *memo.MutationPrivate, projections memo.ProjectionsExpr, passthrough opt.ColSet,
as mentioned in prune_cols.opt
, I think this can be simplified
pkg/sql/opt/norm/rules/prune_cols.opt, line 475 at r12 (raw file):
$innerInput:* $checks:* $mutationPrivate:*)
nit: )
on new line
pkg/sql/opt/norm/rules/prune_cols.opt, line 479 at r12 (raw file):
$passthrough:* & (CanPruneMutationReturnCols $mutationPrivate $needed: (NeededMutationReturnCols $mutationPrivate $projections $passthrough))
do you need to pass the projections and passthrough directly to this function? a common pattern we use is:
$needed:(UnionCols3
(ProjectionOuterCols $projections)
$passthrough
(NeededMutationReturnCols $mutationPrivate)
)
which I think keeps things more contained
pkg/sql/opt/norm/rules/prune_cols.opt, line 486 at r12 (raw file):
$innerInput $checks (PruneMutationReturnCols $mutationPrivate $needed))
nit: )
on new line
pkg/sql/opt/norm/testdata/rules/prune_cols, line 2241 at r12 (raw file):
└── CASE WHEN a IS NULL THEN column2 ELSE 10 END [type=int, outer=(7,10)]
nit: 2 extra blank lines
pkg/sql/rowcontainer/datum_row_container.go, line 198 at r12 (raw file):
rsz := c.fixedColsSize for _, i := range c.varSizedColumns { if i < len(row) && row[i] != nil {
this smells to me like we've violated an existing contract. why was this change necessary? in general I think making nils valid where they were not before is a risky proposition, is there a way we can avoid having to do so?
de69d15
to
ff6ce77
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(other than the outstanding comments)
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @RaduBerinde, and @ridwanmsharif)
pkg/sql/opt/norm/prune_cols.go, line 562 at r22 (raw file):
) opt.ColSet { // Find all the columns referenced by the projections. returningColSet := opt.ColSet{}
var returningColSet opt.ColSet
pkg/sql/opt/norm/prune_cols.go, line 566 at r22 (raw file):
for _, projection := range projections { outCols := projection.ScalarProps(c.mem).OuterCols outCols.ForEach(func(colID opt.ColumnID) {
returningColSet.UnionWith(outCols)
(same below)
pkg/sql/opt/norm/prune_cols.go, line 572 at r22 (raw file):
passthrough.ForEach(func(colID opt.ColumnID) { if colID > 0 {
Why would passthrough
contain a 0 ColumnID?
pkg/sql/opt/norm/rules/prune_cols.opt, line 479 at r22 (raw file):
$passthrough:* & (CanPruneMutationReturnCols $mutationPrivate $needed: (NeededMutationReturnCols $mutationPrivate $projections $passthrough))
[nit] no space after :
, and )
on new line
pkg/sql/opt/norm/testdata/rules/prune_cols, line 2270 at r22 (raw file):
# Fetch all the columns for the RETURN expression. opt
any reason we're using opt
instead of norm
? (I see it's already used a lot in this file, not sure why)
pkg/sql/opt/norm/testdata/rules/prune_cols, line 2323 at r22 (raw file):
└── a + d [type=int, outer=(9,12)] # Fetch only whats being updated (not the (d, e, f, g)) family.
[nit] should "family" be inside the outer )
pkg/sql/opt/xform/testdata/rules/join, line 2127 at r21 (raw file):
│ │ ├── side-effects, mutations │ │ ├── insert abc │ │ │ ├── columns: abc.a:5(int) abc.b:6(int) abc.c:7(int) abc.rowid:8(int!null)
how come we lost the FDs and !null info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I don't have much to add beyond what Radu/Justin/Andy already said.
Don't forget to squash the commits...
Reviewed 1 of 8 files at r5, 1 of 7 files at r15, 1 of 5 files at r16, 1 of 4 files at r18, 2 of 5 files at r19, 3 of 6 files at r20, 2 of 2 files at r21, 1 of 1 files at r22.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @RaduBerinde, and @ridwanmsharif)
pkg/sql/opt/norm/prune_cols.go, line 513 at r22 (raw file):
} // CanPruneMutationReturnCols checks whether the mutations return columns can
[nit] mutations -> mutation's
pkg/sql/opt/norm/rules/prune_cols.opt, line 479 at r22 (raw file):
Previously, RaduBerinde wrote…
[nit] no space after
:
, and)
on new line
also put $mutationPrivate and $needed on different lines
There will me some more involved changes in the upcoming commits in sql. This PR doesn't correctly set the output columns in upsert and insert mutations and right now. Even though the mutations fetch the minimal set of columns it needs to, it populates the result row with all the fetch columns. That is why we have weird exec plans where the output columns include more columns than the returning clause specifies. I'll address and fix all the comments once I've made the appropriate changes. |
00c9100
to
1580d09
Compare
cc @jordanlewis to get some eyes on the SQL changes. |
ee94de8
to
c4a93b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @jordanlewis, @justinj, @RaduBerinde, and @rytaft)
pkg/sql/delete.go, line 337 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
can use
copy
for this, but this is doing an additional copy now, is that needed? why can we no longer just pass in the existing slice?
Good catch! This remained from one of my debugging sessions. A copy shouldn't be necessary, just appropriate indexing.
pkg/sql/opt_exec_factory.go, line 1290 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
super nit: it's not really, conceptually, the minimal set here, it's just the set that is specified in the operator. no need to change this but it's a little bit abstraction-leaky
Done.
pkg/sql/opt_exec_factory.go, line 1292 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
no need to
var
this, i think, just use:=
below
Will need this now (returnColumns is empty when no RETURNING clause is specified)
pkg/sql/opt/memo/expr.go, line 395 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
ditto
Done.
pkg/sql/opt/memo/logical_props_builder.go, line 1091 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
are these only referenced here? if so can they be combined into a single condition somehow?
Done.
pkg/sql/opt/norm/prune_cols.go, line 527 at r5 (raw file):
Previously, justinj (Justin Jaffray) wrote…
possibly: s/that these columns are not needed/that such columns exist/? but nothing bad will happen if there are no such columns, right? it just won't be productive for the caller
Done.
pkg/sql/opt/norm/prune_cols.go, line 404 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
I think there should be a corresponding change here in
DerivePruneCols
so that we can push down Projections to allow pruning to occur
I added a second commit to make the appropriate changes to DerivePruneCols
to allow this to happen.
pkg/sql/opt/norm/prune_cols.go, line 523 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
for i := range private.ReturnCols {
Done.
pkg/sql/opt/norm/prune_cols.go, line 559 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
as mentioned in
prune_cols.opt
, I think this can be simplified
Done.
pkg/sql/opt/norm/prune_cols.go, line 513 at r22 (raw file):
Previously, rytaft wrote…
[nit] mutations -> mutation's
Done.
pkg/sql/opt/norm/rules/prune_cols.opt, line 479 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
do you need to pass the projections and passthrough directly to this function? a common pattern we use is:
$needed:(UnionCols3 (ProjectionOuterCols $projections) $passthrough (NeededMutationReturnCols $mutationPrivate) )
which I think keeps things more contained
Done.
pkg/sql/opt/xform/testdata/rules/join, line 2127 at r21 (raw file):
Previously, RaduBerinde wrote…
how come we lost the FDs and !null info?
It was a problem with how the OutputCols
of mutations were collected. Fixed now.
pkg/sql/rowcontainer/datum_row_container.go, line 198 at r12 (raw file):
Previously, justinj (Justin Jaffray) wrote…
this smells to me like we've violated an existing contract. why was this change necessary? in general I think making nils valid where they were not before is a risky proposition, is there a way we can avoid having to do so?
We avoid needing to do this now.
f65ddde
to
8910ee1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r31, 8 of 18 files at r33, 4 of 5 files at r34, 9 of 9 files at r35, 3 of 3 files at r36.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @jordanlewis, @justinj, @RaduBerinde, and @ridwanmsharif)
pkg/sql/delete.go, line 219 at r35 (raw file):
// to the columns in the resultRowBuffer. A value of -1 is used to indicate // that the column at that index is not part of the resultRowBuffer // of the mutation. Otherwise, the value an the i-th index refers to the
an -> at
pkg/sql/insert.go, line 389 at r35 (raw file):
// columns in the resultRowBuffer. A value of -1 is used to indicate // that the table column at that index is not part of the resultRowBuffer // of the mutation. Otherwise, the value an the i-th index refers to the
an -> at
pkg/sql/insert.go, line 392 at r35 (raw file):
// index of the resultRowBuffer where the i-th column of the table is // to be returned. tabIdxToRetIdx []int
[nit] for consistency, either call this tabColIdxToRetIdx
or change the above name to rowIdxToTabIdx
. Or maybe just get rid of tab
and use colIdx
for both.
pkg/sql/opt_exec_factory.go, line 1166 at r35 (raw file):
// Initialize the rowIdxToTabColIdx array. rowIdxToRetIdx := make([]int, len(origColDescs)) for i := range origColDescs {
[nit] I'd do range rowIdxToRetIdx
to make it obvious that there's no array out of bounds issue here
pkg/sql/opt_exec_factory.go, line 1348 at r35 (raw file):
// Update the rowIdxToReturnColDescs for the mutation. Update returns // the of non-mutation columns specified, in the same order they are
should rowIdxToReturnColDescs
be rowIdxToRetIdx
?
the of -> the
pkg/sql/opt_exec_factory.go, line 1353 at r35 (raw file):
// The Updater derives/stores the fetch columns of the mutation and // since the return columns are always a subset of the fetch columns, // we can use use the fetch columns generate the mapping for the
columns generate -> columns to generate
pkg/sql/opt_exec_factory.go, line 1354 at r35 (raw file):
// since the return columns are always a subset of the fetch columns, // we can use use the fetch columns generate the mapping for the // returning rows.
returning -> returned (?)
pkg/sql/opt_exec_factory.go, line 1490 at r35 (raw file):
returnColumns = sqlbase.ResultColumnsFromColDescs(returnColDescs) // Update the tabIdxToReturnColDescs for the mutation. Upsert returns
rowIdxToReturnColDescs
-> rowIdxToRetIdx
pkg/sql/opt_exec_factory.go, line 1555 at r35 (raw file):
// Find all the columns that are part of the rows returned by the delete. deleteDescs := make([]sqlbase.ColumnDescriptor, 0, len(tableColDescs)) colMap := make(map[sqlbase.ColumnID]struct{}, len(tableColDescs))
Probably better to use FastIntSet
for this
pkg/sql/opt_exec_factory.go, line 1638 at r35 (raw file):
// Determine the relational type of the generated delete node. // If rows are not needed, no columns are returned. var returnColumns sqlbase.ResultColumns
this name is very similar to returnCols. I'd try to distinguish them a bit more. Perhaps call the input returnColSet
and this variable returnCols
?
pkg/sql/opt_exec_factory.go, line 1652 at r35 (raw file):
} // Update the tabIdxToReturnColDescs for the mutation. Delete returns
fix name for this variable
pkg/sql/tablewriter_upsert_opt.go, line 77 at r35 (raw file):
// columns in the resultRowBuffer. A value of -1 is used to indicate // that the table column at that index is not part of the resultRowBuffer // of the mutation. Otherwise, the value an the i-th index refers to the
an -> at
pkg/sql/update.go, line 494 at r35 (raw file):
// columns in the resultRowBuffer. A value of -1 is used to indicate // that the column at that index is not part of the resultRowBuffer // of the mutation. Otherwise, the value an the i-th index refers to the
an -> at
pkg/sql/opt/norm/custom_funcs.go, line 364 at r35 (raw file):
// The columns of the primary key are always returned regardless of // whether they are referenced.
This comment seems out of place
pkg/sql/opt/norm/custom_funcs.go, line 366 at r35 (raw file):
// whether they are referenced. tab := c.mem.Metadata().Table(table) primaryIndex := tab.Index(0)
use cat.PrimaryIndex
pkg/sql/opt/norm/custom_funcs.go, line 369 at r35 (raw file):
for i, n := 0, primaryIndex.KeyColumnCount(); i < n; i++ { primaryKeyCols.Add(tabID.ColumnID(primaryIndex.Column(i).Ordinal)) }
Can't you use the TableMeta.IndexKeyColumns
function here?
pkg/sql/opt/norm/prune_cols.go, line 515 at r36 (raw file):
// Any of the RETURNING columns are candidates for pruning if they are // not required for the mutation. relProps.Rule.PruneCols = relProps.OutputCols
I think you need to do relProps.OutputCols.Copy()
. Alternatively, you could change the last line to relProps.Rule.PruneCols = relProps.OutputCols.Difference(neededCols)
(seems more efficient)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @justinj, @RaduBerinde, and @ridwanmsharif)
pkg/sql/delete.go, line 348 at r35 (raw file):
// public columns. resultValues := make(tree.Datums, d.run.rows.NumCols()) for i := range d.run.rowIdxToRetIdx {
nit: it's idiomatic to do for i, retIdx := range
to save a line of code
pkg/sql/opt_exec_factory.go, line 1226 at r35 (raw file):
// In such cases, the newly added columns shouldn't be returned. // See regression logic tests for #29494. if len(tabDesc.Columns) < len(returnColDescs) {
Doesn't seem straightforward to me that returnColDescs would contain mutation columns, but I guess that's probably just the way things are?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @justinj, @RaduBerinde, and @ridwanmsharif)
pkg/sql/insert.go, line 424 at r36 (raw file):
// // Also we need to re-order the values in the source, ordered by // insertCols, when writing them to resultRowBuffer, ordered by
"ordered by n.columns" doesn't quite make sense anymore no? Maybe just replace it with "according to rowIdxToTabColIdx"
pkg/sql/opt_exec_factory.go, line 1156 at r36 (raw file):
} // mutationRowIdxToReturnIdx returns the mapping from the origColDescs to the
This description is hard to understand without more context. Should start by saying that returnColDescs
is a subset of the columns in origColDescs
.
pkg/sql/opt_exec_factory.go, line 1157 at r36 (raw file):
// mutationRowIdxToReturnIdx returns the mapping from the origColDescs to the // returningColDescs. -1 is used for columns not part of the returnColDescs.
returnColDescs
pkg/sql/opt_exec_factory.go, line 1548 at r36 (raw file):
// of a row on the table. This will include the returnColDescs columns that // are referenced in the RETURNING clause of the delete mutation. This // is different from the fetch columns of the delete mutation as it doesn't
"it" naturally sounds like it refers to "this" but I think it refers to the fetch columns? How can fetch columns not include index keys and returning columns?
pkg/sql/opt_exec_factory.go, line 1657 at r36 (raw file):
returnColumns = sqlbase.ResultColumnsFromColDescs(returnColDescs) // Find all the columns that the Deleter returns. The returning columns
The first sentence seems wrong, isn't this "all the columns that the deleteNode receives" ?
It's a little strange that we have to calculate this set in here, surprised we can't just use the fetchCols
.
pkg/sql/opt_exec_factory.go, line 1658 at r36 (raw file):
// Find all the columns that the Deleter returns. The returning columns // of the mutation are a subset of this column set and will use this
Unclear what "will use this for the mapping" means, I'd just remove it altogether.
pkg/sql/opt/exec/execbuilder/testdata/orderby, line 487 at r36 (raw file):
│ render 0 b · · └── run · · (a, b) · └── insert · · (a, b) ·
How come insert
doesn't produce only b
?
pkg/sql/opt/norm/testdata/rules/prune_cols, line 2569 at r36 (raw file):
├── fd: (1)-->(2,4) └── delete returning_test ├── columns: a:1(int!null) b:2(int) d:4(int) rowid:8(int!null)
How come rowid
is still part of the columns produced by delete/upsert?
8910ee1
to
40031b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball, @jordanlewis, @justinj, @RaduBerinde, and @rytaft)
pkg/sql/delete.go, line 219 at r35 (raw file):
Previously, rytaft wrote…
an -> at
Done.
pkg/sql/insert.go, line 389 at r35 (raw file):
Previously, rytaft wrote…
an -> at
Done.
pkg/sql/insert.go, line 392 at r35 (raw file):
Previously, rytaft wrote…
[nit] for consistency, either call this
tabColIdxToRetIdx
or change the above name torowIdxToTabIdx
. Or maybe just get rid oftab
and usecolIdx
for both.
Done.
pkg/sql/insert.go, line 424 at r36 (raw file):
Previously, RaduBerinde wrote…
"ordered by n.columns" doesn't quite make sense anymore no? Maybe just replace it with "according to rowIdxToTabColIdx"
Done.
pkg/sql/opt_exec_factory.go, line 1166 at r35 (raw file):
Previously, rytaft wrote…
[nit] I'd do
range rowIdxToRetIdx
to make it obvious that there's no array out of bounds issue here
Done.
pkg/sql/opt_exec_factory.go, line 1226 at r35 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Doesn't seem straightforward to me that returnColDescs would contain mutation columns, but I guess that's probably just the way things are?
Yep.
pkg/sql/opt_exec_factory.go, line 1348 at r35 (raw file):
Previously, rytaft wrote…
should
rowIdxToReturnColDescs
berowIdxToRetIdx
?the of -> the
Done.
pkg/sql/opt_exec_factory.go, line 1353 at r35 (raw file):
Previously, rytaft wrote…
columns generate -> columns to generate
Done.
pkg/sql/opt_exec_factory.go, line 1354 at r35 (raw file):
Previously, rytaft wrote…
returning -> returned (?)
Done.
pkg/sql/opt_exec_factory.go, line 1490 at r35 (raw file):
Previously, rytaft wrote…
rowIdxToReturnColDescs
->rowIdxToRetIdx
Done.
pkg/sql/opt_exec_factory.go, line 1555 at r35 (raw file):
Previously, rytaft wrote…
Probably better to use
FastIntSet
for this
Good Call. Done.
pkg/sql/opt_exec_factory.go, line 1638 at r35 (raw file):
Previously, rytaft wrote…
this name is very similar to returnCols. I'd try to distinguish them a bit more. Perhaps call the input
returnColSet
and this variablereturnCols
?
Done.
pkg/sql/opt_exec_factory.go, line 1652 at r35 (raw file):
Previously, rytaft wrote…
fix name for this variable
Done.
pkg/sql/opt_exec_factory.go, line 1156 at r36 (raw file):
Previously, RaduBerinde wrote…
This description is hard to understand without more context. Should start by saying that
returnColDescs
is a subset of the columns inorigColDescs
.
Done.
pkg/sql/opt_exec_factory.go, line 1548 at r36 (raw file):
Previously, RaduBerinde wrote…
"it" naturally sounds like it refers to "this" but I think it refers to the fetch columns? How can fetch columns not include index keys and returning columns?
Yeah I'll word it better and elaborate. The fetch columns do include the columns this function returns yes, but the fetch columns also have columns (for example the ones referenced in the WHERE clause).
pkg/sql/opt_exec_factory.go, line 1657 at r36 (raw file):
Previously, RaduBerinde wrote…
The first sentence seems wrong, isn't this "all the columns that the deleteNode receives" ?
It's a little strange that we have to calculate this set in here, surprised we can't just use the
fetchCols
.
Done. Also, as discussed earlier, the fetchColumns
include more columns than we need and that's why we calculate it here.
pkg/sql/opt_exec_factory.go, line 1658 at r36 (raw file):
Previously, RaduBerinde wrote…
Unclear what "will use this for the mapping" means, I'd just remove it altogether.
Done.
pkg/sql/tablewriter_upsert_opt.go, line 77 at r35 (raw file):
Previously, rytaft wrote…
an -> at
Done.
pkg/sql/update.go, line 494 at r35 (raw file):
Previously, rytaft wrote…
an -> at
Done.
pkg/sql/opt/exec/execbuilder/testdata/orderby, line 487 at r36 (raw file):
Previously, RaduBerinde wrote…
How come
insert
doesn't produce onlyb
?
As discussed earlier, it's because in addition to the RETURNING clause columns, the mutation (currently) also returns the primary key columns.
pkg/sql/opt/norm/custom_funcs.go, line 364 at r35 (raw file):
Previously, rytaft wrote…
This comment seems out of place
Done.
pkg/sql/opt/norm/custom_funcs.go, line 366 at r35 (raw file):
Previously, rytaft wrote…
use
cat.PrimaryIndex
Done.
pkg/sql/opt/norm/custom_funcs.go, line 369 at r35 (raw file):
Previously, rytaft wrote…
Can't you use the
TableMeta.IndexKeyColumns
function here?
Done.
pkg/sql/opt/norm/prune_cols.go, line 515 at r36 (raw file):
Previously, rytaft wrote…
I think you need to do
relProps.OutputCols.Copy()
. Alternatively, you could change the last line torelProps.Rule.PruneCols = relProps.OutputCols.Difference(neededCols)
(seems more efficient)
Done.
pkg/sql/opt/norm/testdata/rules/prune_cols, line 2569 at r36 (raw file):
Previously, RaduBerinde wrote…
How come
rowid
is still part of the columns produced by delete/upsert?
As part of the needed
set in the rules, all the primary key columns are returned as well.
40031b8
to
01059a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @jordanlewis, @justinj, @RaduBerinde, @ridwanmsharif, and @rytaft)
pkg/sql/opt_exec_factory.go, line 1558 at r37 (raw file):
// Find all the columns that are part of the rows returned by the delete. deleteDescs := make([]sqlbase.ColumnDescriptor, 0, len(tableColDescs)) deleteCols := util.FastIntSet{}
[nit] var deleteCols util.FastIntSet
pkg/sql/opt/norm/rules/prune_cols.opt, line 471 at r37 (raw file):
# conservative with the fetch columns. # TODO(ridwanmsharif): Mutations shouldn't need to return the primary key # columns. Make appropriate changes to SQL to accommodate this.
[nit] extra space before columns
. Also, say "SQL execution" or it can read as something else
Previously, we used to fetch all columns when a mutation contained a `RETURNING` clause. This is an issue because it forces us to retrieve unnecessary data and creates extra contention. This change adds logic to compute the minimal set of required columns and fetches only those. This change passes down the minimal required return cols to SQL so it knows to only return columns that's requested. This fixes the output column calculation done by opt and makes sure the execPlan's output columns are the same as output cols of the opt plan. Fixes cockroachdb#30618. Unblocks cockroachdb#30624. Release note: None
01059a7
to
5a44334
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball, @jordanlewis, @justinj, @RaduBerinde, and @rytaft)
38594: opt: fetch minimal set of columns on returning mutations r=ridwanmsharif a=ridwanmsharif Previously, we used to fetch all columns when a mutation contained a `RETURNING` clause. This is an issue because it forces us to retrieve unnecessary data and creates extra contention. This change adds logic to compute the minimal set of required columns and fetches only those. Fixes #30618. Unblocks #30624. Release note: None Co-authored-by: Ridwan Sharif <[email protected]>
Build succeeded |
Previously, we used to fetch all columns when a mutation contained
a
RETURNING
clause. This is an issue because it forces us toretrieve unnecessary data and creates extra contention.
This change adds logic to compute the minimal set of required columns
and fetches only those.
Fixes #30618.
Unblocks #30624.
Release note: None