-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: incorrect query plans for trigram similarity filters when pg_trgm.similarity_threshold=0
#122443
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-3-erroneous-edge-case
Database produces or stores erroneous data without visible error/warning, in rare edge cases.
T-sql-queries
SQL Queries Team
Comments
mgartner
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-3-erroneous-edge-case
Database produces or stores erroneous data without visible error/warning, in rare edge cases.
labels
Apr 16, 2024
This was discovered by @michae2 here: #121973 (review) |
This affects versions v22.2.0+. It has existed since trigram inverted indexes were introduced. |
normanchenn
added a commit
to normanchenn/cockroach
that referenced
this issue
Jan 16, 2025
Previously, the optimizer would produce incorrect query plans for queries with trigram similarity filters when `pg_trgm.similarity_threshold == 0`, producing incorrect results. To address this, this patch adds a check to return early if `pg_trgm.similarity_threshold == 0` in trigram similarity queries on inverted indices. Fixes: cockroachdb#122443 Release note (bug fix): The optimizer would produce incorrect query plans for some queries with trigram similarity filters when `pg_trgm.similarity_threshold` is set to 0. This has now been fixed.
normanchenn
added a commit
to normanchenn/cockroach
that referenced
this issue
Jan 17, 2025
Previously, the optimizer would produce incorrect query plans for queries with trigram similarity filters when `pg_trgm.similarity_threshold == 0`, producing incorrect results. To address this, this patch adds a check to return early if `pg_trgm.similarity_threshold == 0` in trigram similarity queries on inverted indices. Fixes: cockroachdb#122443 Release note (bug fix): The optimizer could produce incorrect query plans for queries using trigram similarity filters when `pg_trgm.similarity_threshold` is set to 0. This bug was introduced in v22.2.0. It does not affect v24.2.0+ releases, provided the `optimizer_use_trigram_similarity_optimization` session variable is set to its default value `true`.
craig bot
pushed a commit
that referenced
this issue
Jan 28, 2025
139265: opt: remove incorrect query plans for trigram similarity filters r=normanchenn a=normanchenn Previously, the optimizer would produce incorrect query plans for queries with trigram similarity filters when `pg_trgm.similarity_threshold == 0`, producing incorrect results. To address this, this patch adds a check to return early if `pg_trgm.similarity_threshold == 0` in trigram similarity queries on inverted indices. Fixes: #122443 Release note (bug fix): The optimizer could produce incorrect query plans for queries using trigram similarity filters (e.g. `col % 'val'`) when `pg_trgm.similarity_threshold` was set to 0. This bug was introduced in v22.2.0 and is now fixed. Note that this issue does not affect v24.2.0+ releases when the `optimizer_use_trigram_similarity_optimization` session variable (introduced in v24.2.0) is set to its default value `true`, as it would skip this behaviour. 139914: cdctest: use legacy schema changer when sql smith is enabled r=aerfrei a=wenyihu6 This patch disables the declarative schema changer when TestChangefeedNemeses uses SQLSmith. Enabling it causes a decoder error, so it is temporarily disabled as a workaround. This ensures nemesis testing can run in CI with SQLSmith without being skipped. Informs: #137125 Release note: None Co-authored-by: Norman Chen <[email protected]> Co-authored-by: Wenyi Hu <[email protected]>
craig bot
pushed a commit
that referenced
this issue
Jan 28, 2025
139265: opt: remove incorrect query plans for trigram similarity filters r=normanchenn a=normanchenn Previously, the optimizer would produce incorrect query plans for queries with trigram similarity filters when `pg_trgm.similarity_threshold == 0`, producing incorrect results. To address this, this patch adds a check to return early if `pg_trgm.similarity_threshold == 0` in trigram similarity queries on inverted indices. Fixes: #122443 Release note (bug fix): The optimizer could produce incorrect query plans for queries using trigram similarity filters (e.g. `col % 'val'`) when `pg_trgm.similarity_threshold` was set to 0. This bug was introduced in v22.2.0 and is now fixed. Note that this issue does not affect v24.2.0+ releases when the `optimizer_use_trigram_similarity_optimization` session variable (introduced in v24.2.0) is set to its default value `true`, as it would skip this behaviour. Co-authored-by: Norman Chen <[email protected]>
Hi @normanchenn, please add branch-* labels to identify which branch(es) this C-bug affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-3-erroneous-edge-case
Database produces or stores erroneous data without visible error/warning, in rare edge cases.
T-sql-queries
SQL Queries Team
The optimizer produces incorrect query plans for queries with trigram similarity filters, e.g.,
col % 'foo'
, whenpg_trgm.similarity_threshold
is set to0
. These incorrect query plans produce incorrect results. For example:When
pg_trgm.similarity_threshold
is set to0
, all strings are similar:Therefore, there is no subset of trigrams in the inverted index we can scan in order to find all rows matching the filter. We'd need to scan the entire inverted index. But that would be less efficient than scanning the entire primary index, so we simply shouldn't plan an inverted index scan at all.
Jira issue: CRDB-37893
The text was updated successfully, but these errors were encountered: