Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: query decorrelation error under recursive CTE in asyncpg #71908

Closed
gordthompson opened this issue Oct 24, 2021 · 10 comments · Fixed by #95234
Closed

sql: query decorrelation error under recursive CTE in asyncpg #71908

gordthompson opened this issue Oct 24, 2021 · 10 comments · Fixed by #95234
Assignees
Labels
A-tools-asyncpg C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-sql-queries SQL Queries Team X-blathers-triaged blathers was able to find an owner

Comments

@gordthompson
Copy link

gordthompson commented Oct 24, 2021

tl;dr: #71908 (comment)


Describe the problem

I've seen a number of issues regarding "could not decorrelate subquery" (e.g., #68086 , #67951 ) but this one might be a little different in that it seems to be related more specifically to asyncpg.

Reflection for PostgreSQL in SQLAlchemy uses queries that include ANY(). For example, to retrieve primary key columns it emits a query similar to this:

SELECT
cons.conname as name,
cons.conkey as key,
a.attnum as col_num,
a.attname as col_name
FROM
pg_catalog.pg_constraint cons
join pg_attribute a
  on cons.conrelid = a.attrelid AND
    a.attnum = ANY(cons.conkey)
WHERE
cons.contype = 'p'

If I create a table in an empty CockroachDB database

CREATE TABLE tbl1 (id int primary key, txt varchar(50))

then this code (using psycopg2) works fine:

import psycopg2

reflect_query = """\
SELECT
cons.conname as name,
cons.conkey as key,
a.attnum as col_num,
a.attname as col_name
FROM
pg_catalog.pg_constraint cons
join pg_attribute a
  on cons.conrelid = a.attrelid AND
    a.attnum = ANY(cons.conkey)
WHERE
cons.contype = 'p'
"""

# connect to CockroachDB
conn = psycopg2.connect(host="localhost", port=26257, user="root", dbname="defaultdb")
cur = conn.cursor()
cur.execute(reflect_query)
print(cur.fetchall())
# [('primary', [1], 1, 'id')]

However, if I connect to CockroachDB using asyncpg I get the "could not decorrelate subquery" error

import asyncio
import asyncpg

reflect_query = """\
SELECT
cons.conname as name,
cons.conkey as key,
a.attnum as col_num,
a.attname as col_name
FROM
pg_catalog.pg_constraint cons
join pg_attribute a
  on cons.conrelid = a.attrelid AND
    a.attnum = ANY(cons.conkey)
WHERE
cons.contype = 'p'
"""

async def run():
    cr = {"host": "localhost", "port": 26257, "user": "root", "database": "defaultdb"}
    conn = await asyncpg.connect(**cr)
    values = await conn.fetch(reflect_query)  # Error: "could not decorrelate subquery"
    print(values)
    await conn.close()


loop = asyncio.get_event_loop()
loop.run_until_complete(run())

However, if I use asyncpg to connect to PostgreSQL then the same query does not fail:

async def run():
    pg = {"host": "192.168.0.199", "user": "scott", "password": "tiger", "database": "test"}
    conn = await asyncpg.connect(**pg)
    values = await conn.fetch(reflect_query)
    print(values)
    # [<Record name='tbl1_pkey' key=[1] col_num=1 col_name='id'>]
    await conn.close()

It also seems to be directly related to ANY() because if I remove the AND a.attnum = ANY(cons.conkey) then the error does not occur for cockroachdb+asyncpg.

Environment:

  • CockroachDB CCL v21.2.0-rc.1 (x86_64-unknown-linux-gnu, built 2021/10/18 14:45:39, go1.16.6)
  • Server OS: Xubuntu 20.04
  • asyncpg 0.24.0 (compared with psycopg2 2.9.1)

Additional context

One possible workaround might be to have the sqlalchemy-cockroachdb dialect override SQLAlchemy's code for PostgreSQL reflection with code that does not emit queries that use ANY().

Epic CRDB-18886
Jira issue: CRDB-10828

@gordthompson gordthompson added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Oct 24, 2021
@blathers-crl
Copy link

blathers-crl bot commented Oct 24, 2021

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added A-disaster-recovery O-community Originated from the community X-blathers-triaged blathers was able to find an owner T-disaster-recovery labels Oct 24, 2021
@blathers-crl

This comment was marked as off-topic.

@gordthompson gordthompson changed the title "could not decorrelate subquery" error under asyncpg with a query that that uses ANY() "could not decorrelate subquery" error under asyncpg with a query that uses ANY() Oct 24, 2021
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Oct 25, 2021
@otan
Copy link
Contributor

otan commented Oct 25, 2021

the "real" issue is that asyncpg is trying to execute


WITH RECURSIVE
	typeinfo_tree
		(
			oid,
			ns,
			name,
			kind,
			basetype,
			elemtype,
			elemdelim,
			range_subtype,
			attrtypoids,
			attrnames,
			depth
		)
		AS (
			SELECT
				ti.oid,
				ti.ns,
				ti.name,
				ti.kind,
				ti.basetype,
				ti.elemtype,
				ti.elemdelim,
				ti.range_subtype,
				ti.attrtypoids,
				ti.attrnames,
				0
			FROM
				(
					SELECT
						t.oid AS oid,
						ns.nspname AS ns,
						t.typname AS name,
						t.typtype AS kind,
						(
							CASE
							WHEN t.typtype = 'd'
							THEN (
								WITH RECURSIVE
									typebases (oid, depth)
										AS (
											SELECT
												t2.typbasetype
													AS oid,
												0 AS depth
											FROM
												pg_type
													AS t2
											WHERE
												t2.oid
												= t.oid
											UNION ALL
												SELECT
													t2.typbasetype
														AS oid,
													tb.depth
													+ 1
														AS depth
												FROM
													pg_type
														AS t2,
													typebases
														AS tb
												WHERE
													(
														tb.oid
														= t2.oid
													)
													AND (
															t2.typbasetype
															!= 0
														)
										)
								SELECT
									oid
								FROM
									typebases
								ORDER BY
									depth DESC
								LIMIT
									1
							)
							ELSE NULL
							END
						)
							AS basetype,
						t.typelem AS elemtype,
						elem_t.typdelim AS elemdelim,
						range_t.rngsubtype AS range_subtype,
						(
							CASE
							WHEN t.typtype = 'c'
							THEN (
								SELECT
									array_agg(
										ia.atttypid ORDER BY
											ia.attnum
									)
								FROM
									pg_attribute AS ia
									INNER JOIN pg_class
											AS c ON
											(
												ia.attrelid
												= c.oid
											)
								WHERE
									(
										(ia.attnum > 0)
										AND (
												NOT
													ia.attisdropped
											)
									)
									AND (c.reltype = t.oid)
							)
							ELSE NULL
							END
						)
							AS attrtypoids,
						(
							CASE
							WHEN t.typtype = 'c'
							THEN (
								SELECT
									array_agg(
										ia.attname::STRING ORDER BY
											ia.attnum
									)
								FROM
									pg_attribute AS ia
									INNER JOIN pg_class
											AS c ON
											(
												ia.attrelid
												= c.oid
											)
								WHERE
									(
										(ia.attnum > 0)
										AND (
												NOT
													ia.attisdropped
											)
									)
									AND (c.reltype = t.oid)
							)
							ELSE NULL
							END
						)
							AS attrnames
					FROM
						pg_catalog.pg_type AS t
						INNER JOIN pg_catalog.pg_namespace
								AS ns ON
								(ns.oid = t.typnamespace)
						LEFT JOIN pg_type AS elem_t ON
								(
									(
										(t.typlen = -1)
										AND (t.typelem != 0)
									)
									AND (
											t.typelem
											= elem_t.oid
										)
								)
						LEFT JOIN pg_range AS range_t ON
								(t.oid = range_t.rngtypid)
				)
					AS ti
			WHERE
				ti.oid = ANY ($1::OID[])
			UNION ALL
				SELECT
					ti.oid,
					ti.ns,
					ti.name,
					ti.kind,
					ti.basetype,
					ti.elemtype,
					ti.elemdelim,
					ti.range_subtype,
					ti.attrtypoids,
					ti.attrnames,
					tt.depth + 1
				FROM
					(
						SELECT
							t.oid AS oid,
							ns.nspname AS ns,
							t.typname AS name,
							t.typtype AS kind,
							(
								CASE
								WHEN t.typtype = 'd'
								THEN (
									WITH RECURSIVE
										typebases
											(oid, depth)
											AS (
												SELECT
													t2.typbasetype
														AS oid,
													0
														AS depth
												FROM
													pg_type
														AS t2
												WHERE
													t2.oid
													= t.oid
												UNION ALL
													SELECT
														t2.typbasetype
															AS oid,
														tb.depth
														+ 1
															AS depth
													FROM
														pg_type
															AS t2,
														typebases
															AS tb
													WHERE
														(
															tb.oid
															= t2.oid
														)
														AND (
																t2.typbasetype
																!= 0
															)
											)
									SELECT
										oid
									FROM
										typebases
									ORDER BY
										depth DESC
									LIMIT
										1
								)
								ELSE NULL
								END
							)
								AS basetype,
							t.typelem AS elemtype,
							elem_t.typdelim AS elemdelim,
							range_t.rngsubtype
								AS range_subtype,
							(
								CASE
								WHEN t.typtype = 'c'
								THEN (
									SELECT
										array_agg(
											ia.atttypid ORDER BY
												ia.attnum
										)
									FROM
										pg_attribute AS ia
										INNER JOIN pg_class
												AS c ON
												(
													ia.attrelid
													= c.oid
												)
									WHERE
										(
											(ia.attnum > 0)
											AND (
													NOT
														ia.attisdropped
												)
										)
										AND (
												c.reltype
												= t.oid
											)
								)
								ELSE NULL
								END
							)
								AS attrtypoids,
							(
								CASE
								WHEN t.typtype = 'c'
								THEN (
									SELECT
										array_agg(
											ia.attname::STRING ORDER BY
												ia.attnum
										)
									FROM
										pg_attribute AS ia
										INNER JOIN pg_class
												AS c ON
												(
													ia.attrelid
													= c.oid
												)
									WHERE
										(
											(ia.attnum > 0)
											AND (
													NOT
														ia.attisdropped
												)
										)
										AND (
												c.reltype
												= t.oid
											)
								)
								ELSE NULL
								END
							)
								AS attrnames
						FROM
							pg_catalog.pg_type AS t
							INNER JOIN pg_catalog.pg_namespace
									AS ns ON
									(
										ns.oid
										= t.typnamespace
									)
							LEFT JOIN pg_type AS elem_t ON
									(
										(
											(t.typlen = -1)
											AND (
													t.typelem
													!= 0
												)
										)
										AND (
												t.typelem
												= elem_t.oid
											)
									)
							LEFT JOIN pg_range AS range_t ON
									(
										t.oid
										= range_t.rngtypid
									)
					)
						AS ti,
					typeinfo_tree AS tt
				WHERE
					(
						(
							(tt.elemtype IS NOT NULL)
							AND (ti.oid = tt.elemtype)
						)
						OR (
								(tt.attrtypoids IS NOT NULL)
								AND (
										ti.oid
										= ANY (
												tt.attrtypoids
											)
									)
							)
					)
					OR (
							(tt.range_subtype IS NOT NULL)
							AND (ti.oid = tt.range_subtype)
						)
		)
SELECT
	DISTINCT
	*,
	basetype::REGTYPE::STRING AS basetype_name,
	elemtype::REGTYPE::STRING AS elemtype_name,
	range_subtype::REGTYPE::STRING AS range_subtype_name
FROM
	typeinfo_tree
ORDER BY
	depth DESC;

Where $1 can be {23}.

Source: https://github.com/MagicStack/asyncpg/blob/383c711eb68bc6a042c121e1fddfde0cdefb8068/asyncpg/introspection.py#L8

@otan
Copy link
Contributor

otan commented Oct 25, 2021

It also seems to be directly related to ANY() because if I remove the AND a.attnum = ANY(cons.conkey) then the error does not occur for cockroachdb+asyncpg.

I still get it without that line.

@otan
Copy link
Contributor

otan commented Oct 25, 2021

cc @ZhouXing19 i'm surprised this wasn't caught in the tests here:

fmt.Sprintf("cd /mnt/data1/asyncpg && PGPORT=%d python3 setup.py test", cockroachPort),

are we sure we're hitting cockroach with those queries?

we should also fix

to a certain release instead of master

@otan otan changed the title "could not decorrelate subquery" error under asyncpg with a query that uses ANY() "could not decorrelate subquery" error under asyncpg Oct 25, 2021
@rytaft rytaft self-assigned this Nov 2, 2021
@michae2 michae2 assigned msirek and unassigned rytaft Apr 26, 2022
msirek pushed a commit to msirek/cockroach that referenced this issue May 24, 2022
Fixes cockroachdb#80169
Fixes cockroachdb#71908

Previously, some queries which follow the pattern:
`SELECT CASE WHEN <cond> THEN <subquery> END FROM <table>` would error
out with a `could not decorrelate subquery` error. The reason is that
the subquery should only run in cases where `<cond>` is true or when it
is leak-proof (when running the subquery when `<cond>` is false couldn't
have side-effects, e.g. erroring out due to overflow during a CAST).
When the subquery is not leak-proof, it cannot be pulled above the
CASE expression into an apply join (which is a necessary first step
in executing a subquery expression).

To address this, this patch introduces a new normalization rule which
attempts to push the WHEN clause condition into the THEN clause subquery
and remove the CASE expression entirely (replace the CASE with the
subquery). The preconditions for this normalization are:
  1. There is a single WHEN clause.
  2. There is an ELSE NULL clause (either explicitly specified or
     implicit).
  3. The WHEN clause condition is not volatile (for example, the result
     is the same no matter how many times it is evaluated).
  4. The WHEN clause condition does not cause any side-effects, like
     writing rows to a table.
  5. The relational expressions in the THEN clause are only of the
     following types: (Select, Project, Limit, Offset, RecursiveCTE,
     InnerJoin, Scan, WithScan, ScalarGroupBy, Window).
  6. There are no aggregate functions which produce a non-null value
     when the input is empty, such as COUNT.
  7. There are no projected expressions above an aggregation in the
     subquery operation tree.

If these conditions are met, a new Select is placed above each Scan or
WithScan operation using the WHEN condition as a filter and the CASE
expression is replaced by the subquery.

Release note (sql change): This patch is adding support for some queries
which asyncpg generates internally, which previously would error out
with the message, "could not decorrelate subquery".
@jlinder jlinder removed the sync-me-3 label May 27, 2022
@rafiss rafiss mentioned this issue Jun 22, 2022
3 tasks
@gnat
Copy link

gnat commented Sep 8, 2022

Any updates on this? @knz Currently a blocker for ARRAY support in Piccolo ORM.

Example query where we get asyncpg.exceptions._base.UnknownPostgresError: could not decorrelate subquery:

SELECT "band"."name", 
        ARRAY(
            SELECT
                "inner_genre"."name"
            FROM 
"genre_to_band"
JOIN "band" "inner_band" ON (
    "genre_to_band"."band" = "inner_band"."id"
)
JOIN "genre" "inner_genre" ON (
    "genre_to_band"."genre" = "inner_genre"."id"
)
WHERE "genre_to_band"."band" = "band"."id"

        ) AS "genres"
     FROM band

gnat added a commit to gnat/piccolo that referenced this issue Sep 8, 2022
@knz
Copy link
Contributor

knz commented Sep 8, 2022

I think the title of this issue (71908) should be modified to refer to the specific kind of correlation mentioned by the OP.

@gnat your specific query is using a different structure, and should be filed as a different issue. Thanks

@knz knz changed the title "could not decorrelate subquery" error under asyncpg sql: query decorrelation error under recursive CTE in SQLAlchemy Sep 8, 2022
@rafiss rafiss changed the title sql: query decorrelation error under recursive CTE in SQLAlchemy sql: query decorrelation error under recursive CTE in asyncpg Sep 8, 2022
@rafiss
Copy link
Collaborator

rafiss commented Sep 8, 2022

We'll be working on subquery decorrelation (including this issue specifically) in the coming months, but it likely won't be in any release until v23.1 or later

@gnat
Copy link

gnat commented Sep 8, 2022

That's great. We'll move forward with a Cockroach release for Piccolo ORM relatively soon- if we cannot figure something out ourselves, we'll just flag the affected features as in-progress and keep an eye out here.

dantownsend added a commit to piccolo-orm/piccolo that referenced this issue Oct 11, 2022
* Cockroach config for tests.

* Added feedback from #607

* New CockroachDB engine.

* No more extension=[] !

* Running tests as postgres.

* Initial build for CockroachDB

* Running tests.

* Run workflow.

* Naming consistency. Making roughly equivalent to Postgres.

* Timestamp bug.

* DDL for Cockroach. for_engines decorator.

* Progress.

* Added database prep for cluster settings.

* Tests passing.

* Passing migration type checks.

* Cockroach ALTER COLUMN TYPE moved outside of transaction until #49351 is solved.

* New test helpers.

* Fixtures.

* Cockroach specific stuff.

* Array tests.

* Column tests.

* JSON tests.

* M2M tables. as_list is effected by cockroachdb/cockroach#71908

* Skip test for numeric.

* Pool test.

* Transaction tests.

* Related readable tests.

* Save tests.

* To dict tests.

* JOIN tests.

* Output tests.

* Raw tests.

* Repr tests.

* Select tests.

* "As string" tests.

* Cockroach having trouble with one specific operator.

* Skipping until CRDB issue is resolved. Should work though!

* Minor updates.

* Test dump load. Other CRDB things.

* Migration manager tests.

* ALTER tests.

* SQL Generator tests.

* Serialization tests.

* Test helpers.

* Cockroach Specific stuff.

* Numerics all the same for CRDB.

* Cockroach specific stuff.

* Cleanup.

* Skipping timestamp tests.

* DDL resolution stuff for Cockroach.

* WIP.

* Cohesion and special test tables for Cockroach.

* WIP

* Special table for Cockroach.

* Removed hack.

* Should pass now.

* Removed some debug stuff.

* LGTM stuff.

* Feedback.

* Feedback.

* Feedback.

* More readable "postgres" or "cockroach.

* Added Cockroach overrides for COLUMN_TYPE_MAP, COLUMN_DEFAULT_PARSER

* Restore SmallInt functionality for Cockroach.

* Restored type() comparison for reflection / generators.

* Updated to latest Cockroach release.

* Cleanup.

* Cleanup.

* Build stuff.

* Cockroach Serial now uses unique_rowid() as default value always instead of DEFAULT.

* Removed first_id() test helper.

* Time Travel queries using AS OF SYSTEM TIME

* Added documentation.

* Small docs fix.

* Test update.

* Refactored out unnecessary code because we use unique_rowid() as default for Cockroach Serial and BigSerial.

* BIGINT instead of BIGSERIAL for Cockroach.

* LGTM stuff

* LGTM...

* Made compatible with older Python versions.

* Get Cockroach passing without Python 3.9 stuff.

* Removed unused variable.

* Fixed test for SQLite.

* Re-add Python 3.7 for CRDB + new CRDB version.

* Consolidated cockroach_skip() into engines_skip() everywhere.

* Re-add SQLite DDL.

* Moved back to original engine DDL selector.

* Reverted mistake in test suite.

* Remove migration that should not be available.

* Moving back to postgres_only for specific test. Probably cannot detect engine type at this stage.

* postgres_only() to engine_only()

* Set sane default for JSONB because we override column_type in JSON for some engines.

* Ran isort.

* Black being obnoxious, part 1.

* Black being obnoxious Part 2.

* Flake8

* Black

* Flake8

* Flake8

* Flake8

* isort

* Flake8

* Flake8

* Added alternate test names for duplicates to remove F811

* Added alternative name.

* mypy

* mypy

* mypy

* mypy

* mypy

* mypy

* mypy

* mypy

* mypy

* Cockroach: Now testing latest stable and upcoming release.

* Cockroach: Testing only v22.2

* Cockroach version documentation.

* Engine detection now based on type because CockroachEngine inherits PostgresEngine.

* postgres_only is now engines_only("postgres", "cockroach") because CockroachEngine was always tested as a child class.

* Updated to Cockroach 22.2 beta.

* isort.

* assertAlmostEqual for Decimal test. Docs update.

* Linter

* Cockroach 22.2 Beta 2.

* Added notes about special Column types for Cockroach

* version pin xpresso, so integration tests pass

* add small section about running cockroach in contributing docs

* fix typo

Co-authored-by: Daniel Townsend <[email protected]>
mgartner added a commit to mgartner/cockroach that referenced this issue Jan 13, 2023
Previously, the optimizer would error in rare cases when it was unable
to hoist correlated subqueries into apply-joins. Now, scalar, correlated
subqueries that aren't hoisted are executed successfully. There is
remaining work to apply the same method in this commit to `EXISTS` and
`<op> ANY` subqueries.

Hoisting correlated subqueries is not possible when a conditional
expression, like a `CASE`, wraps a subquery that is not leak-proof. One
of the effects of hoisting a subquery is that the subquery will be
unconditionally evaluated. For leak-proof subqueries, the worst case is
that unnecessary computation is performed. For non-leak-proof
subqueries, errors could originate from the subquery when it should have
never been evaluated because the corresponding conditional expression
was never true. So, in order to support these cases, we must be able to
execute a correlated subquery.

A correlated subquery can be thought of as a relational expression with
parameters that need to be filled in with constant value arguments for
each invocation. It is essentially a user-defined function with a single
statement in the function body. So, the `tree.RoutineExpr` machinery
that powers UDFs is easily repurposed to facilitate evaluation of
correlated subqueries.

Fixes cockroachdb#71908
Fixes cockroachdb#73573
Fixes cockroachdb#80169

Release note (sql change): Some queries which previously resulted in the
error "could not decorrelate subquery" now succeed.
mgartner added a commit to mgartner/cockroach that referenced this issue Jan 13, 2023
Previously, the optimizer would error in rare cases when it was unable
to hoist correlated subqueries into apply-joins. Now, scalar, correlated
subqueries that aren't hoisted are executed successfully. There is
remaining work to apply the same method in this commit to `EXISTS` and
`<op> ANY` subqueries.

Hoisting correlated subqueries is not possible when a conditional
expression, like a `CASE`, wraps a subquery that is not leak-proof. One
of the effects of hoisting a subquery is that the subquery will be
unconditionally evaluated. For leak-proof subqueries, the worst case is
that unnecessary computation is performed. For non-leak-proof
subqueries, errors could originate from the subquery when it should have
never been evaluated because the corresponding conditional expression
was never true. So, in order to support these cases, we must be able to
execute a correlated subquery.

A correlated subquery can be thought of as a relational expression with
parameters that need to be filled in with constant value arguments for
each invocation. It is essentially a user-defined function with a single
statement in the function body. So, the `tree.RoutineExpr` machinery
that powers UDFs is easily repurposed to facilitate evaluation of
correlated subqueries.

Fixes cockroachdb#71908
Fixes cockroachdb#73573
Fixes cockroachdb#80169

Release note (sql change): Some queries which previously resulted in the
error "could not decorrelate subquery" now succeed.
craig bot pushed a commit that referenced this issue Jan 18, 2023
94670: metrics: add tenant name to _status/vars output r=knz a=dhartunian

This commit adds a `tenant` prometheus label to each metrics output via the
`_status/vars` HTTP endpoint. The label is populated with either "system" or
the name of the tenant generating the metric. When queried from the system
tenant's HTTP port or handler, the result will now also contain metrics from
all in-process tenants on the same node.

When initializing a tenant, an optional "parent" metrics recorder is passed
down allowing the tenant's recorder to be registered with the parent. When we
query the parent it iterates over all child recorders (much like we already do
for stores) and outputs all of their metrics as well.

Example:
```
sql_txn_commit_count{tenant="system"} 0
sql_txn_commit_count{tenant="demo-tenant"} 0
```

Resolves: #94663
Epic: CRDB-18798

Release note (ops change): Metrics output via `_status/vars` now contain
`tenant` labels allowing the user to distinguish between metrics emitted by
the system tenant vs other app tenants identified by their IDs.

Co-authored-by: Alex Barganier <[email protected]>
Co-authored-by: Aaditya Sondhi <[email protected]>

95234: sql: evaluate correlated subqueries as routines r=mgartner a=mgartner

#### opt: create tree.Routine planning closure in helper function

The logic for creating a planning closure for a `tree.Routine` has been
moved to a helper function so that it can be reused in future commits.

Release note: None

#### sql: evaluate correlated subqueries as routines

Previously, the optimizer would error in rare cases when it was unable
to hoist correlated subqueries into apply-joins. Now, scalar, correlated
subqueries that aren't hoisted are executed successfully. There is
remaining work to apply the same method in this commit to `EXISTS` and
`<op> ANY` subqueries.

Hoisting correlated subqueries is not possible when a conditional
expression, like a `CASE`, wraps a subquery that is not leak-proof. One
of the effects of hoisting a subquery is that the subquery will be
unconditionally evaluated. For leak-proof subqueries, the worst case is
that unnecessary computation is performed. For non-leak-proof
subqueries, errors could originate from the subquery when it should have
never been evaluated because the corresponding conditional expression
was never true. So, in order to support these cases, we must be able to
execute a correlated subquery.

A correlated subquery can be thought of as a relational expression with
parameters that need to be filled in with constant value arguments for
each invocation. It is essentially a user-defined function with a single
statement in the function body. So, the `tree.RoutineExpr` machinery
that powers UDFs is easily repurposed to facilitate evaluation of
correlated subqueries.

Fixes #71908
Fixes #73573
Fixes #80169

Release note (sql change): Some queries which previously resulted in the
error "could not decorrelate subquery" now succeed.


95432: kvserver,kvstorage: move InitEngine r=pavelkalinnikov a=tbg

I'm working on a datadriven test for `kvstorage` and I need to be able
to initialize the engine there.

There's more that should move, but I'm taking it one step at a time.
As we build up more and more sophisticated datadriven tests in
`kvstorage` we can move whatever we need when we need it.

PS: I looked at the unit test being modified by the move and it can't
yet be moved to kvstorage - it also bootstraps the initial ranges, etc,
requiring a lot more code movement until we can move it.

100% mechanical movement via Goland.

Touches #93310.

Epic: CRDB-220
Release note: None


Co-authored-by: David Hartunian <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
@craig craig bot closed this as completed in 61233f0 Jan 18, 2023
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tools-asyncpg C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-sql-queries SQL Queries Team X-blathers-triaged blathers was able to find an owner
Projects
Archived in project
10 participants