Redact sensitive information in catalog queries #23104

piotrrzysko · 2024-08-22T07:04:12Z

Description

This is a follow-up to #23103 that introduces redaction of security-sensitive information in statements containing connector properties, specifically:

CREATE CATALOG
EXPLAIN CREATE CATALOG
EXPLAIN ANALYZE CREATE CATALOG

The current approach is as follows:

For syntactically valid statements, only properties containing sensitive information are masked.
If a valid query references a nonexistent connector, all properties are masked.
For queries that fail before or during parsing and contain the word 'catalog', we attempt to find and mask property assignments using a regular expression. This is a best-effort approach.
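
To illustrate the best-effort fallback for queries that cannot be parsed, here is a minimal sketch. It is not the code from this PR: the class name `BestEffortRedactor`, the regular expression, and the `'***'` mask token are all assumptions made for the example. It masks every single-quoted value that follows a `name = ` assignment, but only when the text mentions "catalog".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of the Trino codebase.
final class BestEffortRedactor
{
    // Assumed pattern: a double-quoted or bare property name, '=', then a
    // single-quoted SQL string literal ('' is an escaped quote inside it).
    private static final Pattern PROPERTY_ASSIGNMENT = Pattern.compile(
            "(\"[^\"]+\"|[A-Za-z_][\\w.-]*)(\\s*=\\s*)'(?:[^']|'')*'");

    private static final Pattern CATALOG_KEYWORD =
            Pattern.compile("catalog", Pattern.CASE_INSENSITIVE);

    private BestEffortRedactor() {}

    static String redact(String rawQuery)
    {
        // Queries that never mention "catalog" are returned unchanged.
        if (!CATALOG_KEYWORD.matcher(rawQuery).find()) {
            return rawQuery;
        }
        // Keep the property name and the '=' sign, replace only the value.
        Matcher matcher = PROPERTY_ASSIGNMENT.matcher(rawQuery);
        return matcher.replaceAll("$1$2'***'");
    }
}
```

Because nothing is known about the connector when parsing fails, the sketch masks all assignments rather than trying to guess which ones are sensitive. A pattern like this can both over-match and under-match, which is why the approach is described as best-effort.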

Redacted queries are returned through the REST API, the system.runtime.queries table, and query events (QueryCreatedEvent and QueryCompletedEvent).

Note that this PR currently includes two commits from #23103.

Additional context and related issues

Follow-up to Add connector SPI for returning security-sensitive properties #23103

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Section
* Redact sensitive information in statements containing connector properties. ({issue}`issuenumber`)

The SPI will be used by the engine to mask security-sensitive information in statements that manage catalogs. It has been added at the connector factory level, rather than the connector level, to allow more flexibility in retrieving properties. In some cases, we want to perform masking before a connector is initiated, for example when we create a new catalog by issuing the CREATE CATALOG statement.
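
A rough sketch of how a factory-level lookup could drive masking, under stated assumptions: `SensitivePropertySource` is a hypothetical stand-in for the connector factory, not the actual SPI signature from #23103, and `CatalogPropertyMasker` and the `***` mask are illustrative names. When no factory is found for the connector, every property is masked, matching the rule for nonexistent connectors.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for a connector factory; not the real Trino SPI.
interface SensitivePropertySource
{
    Set<String> securitySensitivePropertyNames(Map<String, String> catalogProperties);
}

// Hypothetical engine-side helper, for illustration only.
final class CatalogPropertyMasker
{
    static final String MASK = "***";

    private CatalogPropertyMasker() {}

    static Map<String, String> mask(
            Optional<SensitivePropertySource> factory,
            Map<String, String> properties)
    {
        // Unknown connector: treat every property as sensitive.
        Set<String> sensitive = factory
                .map(source -> source.securitySensitivePropertyNames(properties))
                .orElse(properties.keySet());

        // Preserve the original property order in the redacted output.
        Map<String, String> masked = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            String value = sensitive.contains(entry.getKey()) ? MASK : entry.getValue();
            masked.put(entry.getKey(), value);
        }
        return masked;
    }
}
```

Asking the factory rather than a live connector means the lookup works before any connector instance exists, which is the situation during CREATE CATALOG.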

The purpose of the included test is to identify security-sensitive properties that may be used by the connector. It uses the output generated by the maven-dependency-plugin, configured in the connector's pom.xml file. This output contains the connector's runtime classpath, which is then scanned to identify all property names annotated with @ConfigSecuritySensitive. Scanning the classpath ensures that all configuration classes are included, even those used conditionally.
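
The annotation scan at the heart of such a test might look roughly like the sketch below. It is a simplification under several assumptions: the `Config` and `ConfigSecuritySensitive` annotations here are local stand-ins declared for the example, `ExampleConfig` and `SensitivePropertyScanner` are hypothetical, and the classes are passed in directly instead of being discovered from the classpath listing produced by the maven-dependency-plugin.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Local stand-ins for the configuration annotations, declared for this example.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Config
{
    String value();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ConfigSecuritySensitive {}

// Hypothetical configuration class with one sensitive property.
final class ExampleConfig
{
    @Config("example.url")
    public void setUrl(String url) {}

    @Config("example.password")
    @ConfigSecuritySensitive
    public void setPassword(String password) {}
}

final class SensitivePropertyScanner
{
    private SensitivePropertyScanner() {}

    // Collects the property names of all public methods that carry both annotations.
    static Set<String> sensitivePropertyNames(Class<?>... configClasses)
    {
        Set<String> names = new TreeSet<>();
        for (Class<?> configClass : configClasses) {
            for (Method method : configClass.getMethods()) {
                Config config = method.getAnnotation(Config.class);
                if (config != null && method.isAnnotationPresent(ConfigSecuritySensitive.class)) {
                    names.add(config.value());
                }
            }
        }
        return names;
    }
}
```

A test could then compare the scanned names with the set reported by the connector factory, so that a newly added sensitive property that the factory does not report causes a failure.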

This commit introduces redacting of security-sensitive information in statements containing connector properties, specifically:

* CREATE CATALOG
* EXPLAIN CREATE CATALOG
* EXPLAIN ANALYZE CREATE CATALOG

The current approach is as follows:

* For syntactically valid statements, only properties containing sensitive information are masked.
* If a valid query references a nonexistent connector, all properties are masked.
* For queries that fail before or during parsing and contain the word 'catalog,' we attempt to find and mask property assignments using a regular expression. This is a best-effort approach.

The redacted form is created in DispatchManager and is propagated to all places that create QueryInfo and BasicQueryInfo. Before this change, QueryInfo/BasicQueryInfo stored the raw query text received from the end user. From now on, the text will be altered for the cases listed above.

@JsonConstructor for TrimmedBasicQueryInfo was introduced to facilitate the deserialization of server responses in tests.

piotrrzysko · 2024-12-23T11:40:33Z

Closing in favour of #24563

cla-bot bot added the cla-signed label Aug 22, 2024

piotrrzysko mentioned this pull request Aug 22, 2024

Add connector SPI for returning security-sensitive properties #23103

Closed

piotrrzysko force-pushed the piotrrzysko/redact-catalog-queries branch from 1f8a715 to 83b91cd Compare August 22, 2024 07:10

piotrrzysko added 3 commits August 22, 2024 09:31

Ensure queries in system.runtime.queries are redacted

e915378

Ensure queries returned via REST API are redacted

c444c3e

@JsonConstructor for TrimmedBasicQueryInfo was introduced to facilitate the deserialization of server responses in tests.

piotrrzysko force-pushed the piotrrzysko/redact-catalog-queries branch from 83b91cd to c444c3e Compare August 22, 2024 07:31

hashhar mentioned this pull request Aug 22, 2024

Redact properties from CREATE CATALOG in query info, so they are not present in any outputs #23106

Open

piotrrzysko closed this Dec 23, 2024
