-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: "could not decorrelate subquery" on insert to enum with asyncpg #80169
Comments
Hello, I am Blathers. I am here to help you get the issue triaged. Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here. I have CC'd a few people who may be able to assist you:
If we have not gotten back to your issue within a few business days, you can try the following:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
cc @cockroachdb/bulk-io |
Thanks for filing! I ran this and saw that the INSERT query itself runs fine, but then right after
|
Could also be similar to #73573. |
Tagging @vy-ton |
@kernfeld-cockroach fyi there's a group of |
Looking at this query, there are at least 4 different things preventing decorrelation:
|
Well, the problem is that we'd likely need to use a I suspect that the best solution will be to do group-by decorrelation for most common cases, and then to fall back on generalized decorrelation for other cases that we can't handle well otherwise. But we'd need to do some performance testing to be sure. Also, note that generalized decorrelation does not solve the other problem with this query, which is that we're not able to create an |
I'm not sure I understand this point. Are you worried that rows will be inserted into the source relation of Edit: Oh, maybe if the input to |
Right, the input to |
Leakproof, short definition:
|
Fixes cockroachdb#80169 Fixes cockroachdb#71908 Previously, some queries which follow the pattern: `SELECT CASE WHEN <cond> THEN <subquery> END FROM <table>` would error out with a `could not decorrelate subquery` error. The reason is that the subquery should only run in cases where `<cond>` is true or when it is leak-proof (when running the subquery when `<cond>` is false couldn't have side-effects, e.g. erroring out due to overflow during a CAST). When the subquery is not leak-proof, it cannot be pulled above the CASE expression into an apply join (which is a necessary first step in executing a subquery expression). To address this, this patch introduces a new normalization rule which attempts to push the WHEN clause condition into the THEN clause subquery and remove the CASE expression entirely (replace the CASE with the subquery). The preconditions for this normalization are: 1. There is a single WHEN clause. 2. There is an ELSE NULL clause (either explicitly specified or implicit). 3. The WHEN clause condition is not volatile (for example, the result is the same no matter how many times it is evaluated). 4. The WHEN clause condition does not cause any side-effects, like writing rows to a table. 5. The relational expressions in the THEN clause are only of the following types: (Select, Project, Limit, Offset, RecursiveCTE, InnerJoin, Scan, WithScan, ScalarGroupBy, Window). 6. There are no aggregate functions which produce a non-null value when the input is empty, such as COUNT. 7. There are no projected expressions above an aggregation in the subquery operation tree. If these conditions are met, a new Select is placed above each Scan or WithScan operation using the WHEN condition as a filter and the CASE expression is replaced by the subquery. Release note (sql change): This patch is adding support for some queries which asyncpg generates internally, which previously would error out with the message, "could not decorrelate subquery".
Avoids 'cannot decorrelate query', until at least: * cockroachdb/cockroach#81706 * cockroachdb/cockroach#80169 are solved. Is good enough to introspect into enums at least.
Previously, the optimizer would error in rare cases when it was unable to hoist correlated subqueries into apply-joins. Now, scalar, correlated subqueries that aren't hoisted are executed successfully. There is remaining work to apply the same method in this commit to `EXISTS` and `<op> ANY` subqueries. Hoisting correlated subqueries is not possible when a conditional expression, like a `CASE`, wraps a subquery that is not leak-proof. One of the effects of hoisting a subquery is that the subquery will be unconditionally evaluated. For leak-proof subqueries, the worst case is that unnecessary computation is performed. For non-leak-proof subqueries, errors could originate from the subquery when it should have never been evaluated because the corresponding conditional expression was never true. So, in order to support these cases, we must be able to execute a correlated subquery. A correlated subquery can be thought of as a relational expression with parameters that need to be filled in with constant value arguments for each invocation. It is essentially a user-defined function with a single statement in the function body. So, the `tree.RoutineExpr` machinery that powers UDFs is easily repurposed to facilitate evaluation of correlated subqueries. Fixes cockroachdb#71908 Fixes cockroachdb#73573 Fixes cockroachdb#80169 Release note (sql change): Some queries which previously resulted in the error "could not decorrelate subquery" now succeed.
Previously, the optimizer would error in rare cases when it was unable to hoist correlated subqueries into apply-joins. Now, scalar, correlated subqueries that aren't hoisted are executed successfully. There is remaining work to apply the same method in this commit to `EXISTS` and `<op> ANY` subqueries. Hoisting correlated subqueries is not possible when a conditional expression, like a `CASE`, wraps a subquery that is not leak-proof. One of the effects of hoisting a subquery is that the subquery will be unconditionally evaluated. For leak-proof subqueries, the worst case is that unnecessary computation is performed. For non-leak-proof subqueries, errors could originate from the subquery when it should have never been evaluated because the corresponding conditional expression was never true. So, in order to support these cases, we must be able to execute a correlated subquery. A correlated subquery can be thought of as a relational expression with parameters that need to be filled in with constant value arguments for each invocation. It is essentially a user-defined function with a single statement in the function body. So, the `tree.RoutineExpr` machinery that powers UDFs is easily repurposed to facilitate evaluation of correlated subqueries. Fixes cockroachdb#71908 Fixes cockroachdb#73573 Fixes cockroachdb#80169 Release note (sql change): Some queries which previously resulted in the error "could not decorrelate subquery" now succeed.
94670: metrics: add tenant name to _status/vars output r=knz a=dhartunian This commit adds a `tenant` prometheus label to each metrics output via the `_status/vars` HTTP endpoint. The label is populated with either "system" or the name of the tenant generating the metric. When queried from the system tenant's HTTP port or handler, the result will now also contain metrics from all in-process tenants on the same node. When initializing a tenant, an optional "parent" metrics recorder is passed down allowing the tenant's recorder to be registered with the parent. When we query the parent it iterates over all child recorders (much like we already do for stores) and outputs all of their metrics as well. Example: ``` sql_txn_commit_count{tenant="system"} 0 sql_txn_commit_count{tenant="demo-tenant"} 0 ``` Resolves: #94663 Epic: CRDB-18798 Release note (ops change): Metrics output via `_status/vars` now contain `tenant` labels allowing the user to distinguish between metrics emitted by the system tenant vs other app tenants identified by their IDs. Co-authored-by: Alex Barganier <[email protected]> Co-authored-by: Aaditya Sondhi <[email protected]> 95234: sql: evaluate correlated subqueries as routines r=mgartner a=mgartner #### opt: create tree.Routine planning closure in helper function The logic for creating a planning closure for a `tree.Routine` has been moved to a helper function so that it can be reused in future commits. Release note: None #### sql: evaluate correlated subqueries as routines Previously, the optimizer would error in rare cases when it was unable to hoist correlated subqueries into apply-joins. Now, scalar, correlated subqueries that aren't hoisted are executed successfully. There is remaining work to apply the same method in this commit to `EXISTS` and `<op> ANY` subqueries. Hoisting correlated subqueries is not possible when a conditional expression, like a `CASE`, wraps a subquery that is not leak-proof. One of the effects of hoisting a subquery is that the subquery will be unconditionally evaluated. For leak-proof subqueries, the worst case is that unnecessary computation is performed. For non-leak-proof subqueries, errors could originate from the subquery when it should have never been evaluated because the corresponding conditional expression was never true. So, in order to support these cases, we must be able to execute a correlated subquery. A correlated subquery can be thought of as a relational expression with parameters that need to be filled in with constant value arguments for each invocation. It is essentially a user-defined function with a single statement in the function body. So, the `tree.RoutineExpr` machinery that powers UDFs is easily repurposed to facilitate evaluation of correlated subqueries. Fixes #71908 Fixes #73573 Fixes #80169 Release note (sql change): Some queries which previously resulted in the error "could not decorrelate subquery" now succeed. 95432: kvserver,kvstorage: move InitEngine r=pavelkalinnikov a=tbg I'm working on a datadriven test for `kvstorage` and I need to be able to initialize the engine there. There's more that should move, but I'm taking it one step at a time. As we build up more and more sophisticated datadriven tests in `kvstorage` we can move whatever we need when we need it. PS: I looked at the unit test being modified by the move and it can't yet be moved to kvstorage - it also bootstraps the initial ranges, etc, requiring a lot more code movement until we can move it. 100% mechanical movement via Goland. Touches #93310. Epic: CRDB-220 Release note: None Co-authored-by: David Hartunian <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Tobias Grieger <[email protected]>
Hi folks, I just wanted to get a random enum value to insert it into dummy data and this all started... At first, I had an issue with not random random() in subquery. I solved it with some magic by pushing an unnecessary parameter into a subquery so that it is not optimized and remains random. Prepared a simplified example:
|
What version of CockroachDB are you running? The example works for me on 24.1.0:
|
Yes, you are right, I had version v22.2.1, I updated to v24.1 and there is no error. |
Describe the problem
As originally reported in
sqlalchemy/sqlalchemy#7945
attempting to insert into a table with an enum column results in "could not decorrelate subquery".
To Reproduce
psycopg2 (2.9.3) - No issue
This code runs fine for both PostgreSQL (pg) and CockroachDB (cr)
asyncpg with PostgreSQL - No issue
asyncpg with CockroachDB - "could not decorrelate subquery"
Environment:
Epic CRDB-18886
Jira issue: CRDB-15824
The text was updated successfully, but these errors were encountered: