Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: remove incorrect query plans for trigram similarity filters #139265

Conversation

normanchenn
Copy link
Contributor

@normanchenn normanchenn commented Jan 16, 2025

Previously, the optimizer would produce incorrect query plans for queries with trigram similarity filters when pg_trgm.similarity_threshold == 0, producing incorrect results.

To address this, this patch adds a check to return early if pg_trgm.similarity_threshold == 0 in trigram similarity queries on inverted indices.

Fixes: #122443

Release note (bug fix): The optimizer could produce incorrect query plans for queries using trigram similarity filters (e.g. col % 'val') when pg_trgm.similarity_threshold was set to 0. This bug was introduced in v22.2.0 and is now fixed. Note that this issue does not affect v24.2.0+ releases when the optimizer_use_trigram_similarity_optimization session variable (introduced in v24.2.0) is set to its default value true, as it would skip this behaviour.

@normanchenn normanchenn requested a review from a team as a code owner January 16, 2025 21:18
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I have a couple of nits to improve things a bit. I also will defer to Marcus on whether we should add an optimizer test for this or not (it doesn't seem necessary to me).

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @normanchenn)


-- commits line 14 at r1:
nit: I would extend the release note a bit to mention a couple things:

  • that the bug was introduced in 22.2 version, and
  • that 24.2+ releases are unaffected if optimizer_use_trigram_similarity_optimization session variable has the default value true.

pkg/sql/opt/invertedidx/trigram.go line 76 at r1 (raw file):

		}

		// Do not plan inverted index scans when the trigram similarity threshold is 0

nit: missing period. Also I'd extend the comment a bit saying something like "because in this case all strings match the filter, yet we wouldn't perform a full scan of the inverted index".

@normanchenn normanchenn force-pushed the norman/trigram-similarity-inverted-indexes branch from 2a77492 to 3d207c4 Compare January 17, 2025 16:27
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: modulo Yahor's comments. I don't think we need an optimizer test—the logic test should be sufficient.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)

Previously, the optimizer would produce incorrect query plans for
queries with trigram similarity filters when
`pg_trgm.similarity_threshold == 0`, producing incorrect results.

To address this, this patch adds a check to return early if
`pg_trgm.similarity_threshold == 0` in trigram similarity queries on
inverted indices.

Fixes: cockroachdb#122443

Release note (bug fix): The optimizer could produce incorrect query
plans for queries using trigram similarity filters (e.g. `col % 'val'`) when
`pg_trgm.similarity_threshold` was set to 0. This bug was introduced in
v22.2.0 and is now fixed. Note that this issue does not affect v24.2.0+
releases when the `optimizer_use_trigram_similarity_optimization`
session variable (introduced in v24.2.0) is set to its default value
`true`, as it would skip this behaviour.
@normanchenn normanchenn force-pushed the norman/trigram-similarity-inverted-indexes branch from 3d207c4 to 496f35f Compare January 28, 2025 17:48
@normanchenn
Copy link
Contributor Author

TFTR!

bors r+

craig bot pushed a commit that referenced this pull request Jan 28, 2025
139265: opt: remove incorrect query plans for trigram similarity filters r=normanchenn a=normanchenn

Previously, the optimizer would produce incorrect query plans for queries with trigram similarity filters when `pg_trgm.similarity_threshold == 0`, producing incorrect results.

To address this, this patch adds a check to return early if `pg_trgm.similarity_threshold == 0` in trigram similarity queries on inverted indices.

Fixes: #122443

Release note (bug fix): The optimizer could produce incorrect query plans for queries using trigram similarity filters (e.g. `col % 'val'`) when `pg_trgm.similarity_threshold` was set to 0. This bug was introduced in v22.2.0 and is now fixed. Note that this issue does not affect v24.2.0+ releases when the `optimizer_use_trigram_similarity_optimization` session variable (introduced in v24.2.0) is set to its default value `true`, as it would skip this behaviour.

139914: cdctest: use legacy schema changer when sql smith is enabled r=aerfrei a=wenyihu6

This patch disables the declarative schema changer when TestChangefeedNemeses
uses SQLSmith. Enabling it causes a decoder error, so it is temporarily disabled
as a workaround. This ensures nemesis testing can run in CI with SQLSmith
without being skipped.

Informs: #137125
Release note: None

Co-authored-by: Norman Chen <[email protected]>
Co-authored-by: Wenyi Hu <[email protected]>
@craig
Copy link
Contributor

craig bot commented Jan 28, 2025

Build failed (retrying...):

@craig craig bot merged commit aba435d into cockroachdb:master Jan 28, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: incorrect query plans for trigram similarity filters when pg_trgm.similarity_threshold=0
4 participants