-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for authorizing query context params #12396
Add support for authorizing query context params #12396
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall lgtm, this seems pretty useful to be able to control access to this stuff 🤘
* using {@link #withContext(QueryContext)}. When callers use query context, they should check first | ||
* if the query holder has a valid query context using {@link #isValidContext()}. | ||
*/ | ||
public class QueryHolder<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels somewhat sad given the existence of QueryPlus
, the other wrapper, though i guess it is used for processing queries instead of the lead up to it... still, seems like we have too many wrappers...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added QueryHolder
at first because it was quite confusing which is the valid context between Query.getContext()
and QueryContext
. QueryHolder
was a stateful object indicating whether Query.getContext
is valid. QueryLifecycle
could find a valid query context using QueryHolder.isValidContext()
. A better approach is replacing the context map in Query
with QueryContext
as we can consolidate the query context stores. I didn't do it at first because it seems quite invasive. However, I was curious how invasive it would be, so went ahead and tried it. It actually doesn't seem that invasive, this PR is rather a little bit smaller than it was before.
The Query
interface now has getQueryContext()
which returns QueryContext
. This new interface is preferred over Query.getContext()
which internally simply calls QueryContext.getMergedParams()
. QueryContext
in Query
is "valid" only in the broker in a sense that defaultParams
, userParams
, and systemParams
will not be kept after serialization. All parameters will be stored in userParams
after it is deserialized. This should not cause any issue today.
* - Updated with context after authorization in {@link #doAuthorize}. | ||
*/ | ||
@MonotonicNonNull | ||
private QueryHolder<?> baseQuery; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems error prone to have two contexts like this with one hidden in the wrapper, I think it might be easier to follow what is going on here to keep this as a Query
and have a baseContext
separate from the context
.
queryId | ||
); | ||
|
||
final NonnullPair<QueryHolder<?>, QueryContext> pair = readQuery(req, in, ioReaderWriter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again this feels confusing to have two query contexts, one of which is wrapped, i'd personally probably rather just see them all separately to make it easier to follow what is what
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice 👍
@@ -43,14 +42,15 @@ | |||
*/ | |||
public class DruidConnection | |||
{ | |||
private static final Logger LOG = new Logger(DruidConnection.class); | |||
private static final Set<String> SENSITIVE_CONTEXT_FIELDS = Sets.newHashSet( | |||
public static final Set<String> SENSITIVE_CONTEXT_FIELDS = ImmutableSet.of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't used here anymore, and can be moved to DruidMeta
@@ -343,11 +359,56 @@ public void emitLogsAndMetrics( | |||
} | |||
} | |||
|
|||
public Query getQuery() | |||
/** | |||
* Returns the Query wrapped inside QueryHolder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: stale javadoc
private Query baseQuery; | ||
|
||
/** | ||
* A holder for the user query to run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: stale javadoc
return Objects.hash(getMergedParams()); | ||
} | ||
|
||
// TODO: toString? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unresolved todo?
* You can use {@code getX} methods or {@link #getMergedParams()} to compute the context params | ||
* merging 3 types of params above. | ||
* | ||
* Currently, this class is mainly used for query context parameter authorization in query entires, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Currently, this class is mainly used for query context parameter authorization in query entires, | |
* Currently, this class is mainly used for query context parameter authorization, |
* Auto-generated queryId or sqlQueryId are also set as default parameters. These default parameters can | ||
* be overridden by user or system parameters. | ||
* - User parameters. These are the params set by the user. User params override default parameters but | ||
* are overridden by system paramters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* are overridden by system paramters. | |
* are overridden by system parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. Added some minor comments.
We could also update the utility methods inside QueryContexts
that accept a map to now accept a QueryContext
itself but this can be done later.
return makeSQLQueryRequest(httpClient, query, ImmutableMap.of(), expectedStatus); | ||
} | ||
|
||
protected StatusResponseHolder makeSQLQueryRequest( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Maybe rename to something like makeSqlRequestAndVerifyStatus
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually not a new method, but I added a new parameter for this method. I would like to not rename it as this PR is already quite big.
@@ -95,8 +94,11 @@ | |||
|
|||
DateTimeZone getTimezone(); | |||
|
|||
@Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Please add a comment/javadoc about the deprecation and the alternative.
return false; | ||
} | ||
QueryContext context = (QueryContext) o; | ||
return getMergedParams().equals(context.getMergedParams()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
As we are comparing only the final merged params, two QueryContext objects that are currently equal might not be so after performing the same operation (say removeUserParam
) on the two of them.
For example:
Context1: userParam={p1=10}, systemParam={}
Context2: userParam={}, systemParam={p1=10}
These two are currently equal.
But after performing the same operation, say removeUserParam(p1)
, the two contexts will not remain equal anymore.
I guess this should be okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scenario you described sounds OK to me. defaultParams
, userParams
, or systemParams
are to track how those params are set, and should not affect the equality test of the queryContext object. Any particular concern about the behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No concern as such. Just wanted to be sure.
protected abstract void setupDatasourceAndSysTableUser() throws Exception; | ||
protected abstract void setupDatasourceAndSysAndStateUser() throws Exception; | ||
protected abstract void setupSysTableAndStateOnlyUser() throws Exception; | ||
protected abstract void setupTestSpecificHttpClients() throws Exception; | ||
protected abstract String getAuthenticatorName(); | ||
protected abstract String getAuthorizerName(); | ||
protected abstract String getExpectedAvaticaAuthError(); | ||
protected abstract Properties getAvaticaConnectionProperties(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
Maybe we should just add a new testAvaticaQuery(properties, url)
and leave the existingtestAvaticaQuery
as is?
Otherwise, we are forced to pass the properties to testAvaticaQuery
in every downstream test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was my bad that I removed this method before. I missed that implementations of this class can have a different implementation for this method. I added them back.
for testAvaticaQuery
, I think it's OK to change the method signature as the new signature makes it clearer what user's credentials are used for the test.
} else { | ||
mergedUserAndConfigContext = defaultQueryConfig.getContext(); | ||
} | ||
baseQuery.getQueryContext().addDefaultParam(BaseQuery.QUERY_ID, UUID.randomUUID().toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much cleaner now!
new QueryContext( | ||
ImmutableMap.of( | ||
PlannerConfig.CTX_KEY_USE_GROUPING_SET_FOR_EXACT_DISTINCT, | ||
"true" | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe add a private util method for this? This code is present in 4 places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if that is worth. I think we rather need to refactor and split CalciteQueryTest
since this class is too big now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on splitting up the CalciteQueryTest
final DruidStatement statement = new DruidStatement( | ||
"", | ||
0, | ||
new QueryContext(), | ||
sqlLifecycleFactory.factorize(), | ||
() -> {} | ||
).prepare(sql, -1, AllowAllAuthenticator.ALLOW_ALL_RESULT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe add a createStatement(sql)
private util method for this to avoid code repetition.
@@ -74,7 +73,7 @@ private DruidJoinRule(final PlannerContext plannerContext) | |||
operand(DruidRel.class, any()) | |||
) | |||
); | |||
this.enableLeftScanDirect = QueryContexts.getEnableJoinLeftScanDirect(plannerContext.getQueryContext()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isEnableJoinLeftScanDirect
seems too specific a use case to be a part of the base QueryContext
class itself. Why have we moved it from QueryContexts
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, now QueryContexts
should remain just as a holder of predefined context parameters. It doesn't give any more benefit over QueryContext
. getEnableJoinLeftScanDirect
is just a util method for our convenience and they all should be moved to QueryContext
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 Moving all of these methods to QueryContext
would be much better than it is now.
@@ -158,43 +156,37 @@ public boolean isUseNativeQueryExplain() | |||
return useNativeQueryExplain; | |||
} | |||
|
|||
public PlannerConfig withOverrides(final Map<String, Object> context) | |||
public PlannerConfig withOverrides(final QueryContext queryAndContext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Rename to queryContext
@@ -135,32 +139,29 @@ public SqlLifecycle( | |||
* | |||
* If successful (it will be), it will transition the lifecycle to {@link State#INITIALIZED}. | |||
*/ | |||
public String initialize(String sql, Map<String, Object> queryContext) | |||
public String initialize(String sql, QueryContext queryAndContext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
public String initialize(String sql, QueryContext queryAndContext) | |
public String initialize(String sql, QueryContext queryContext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤘
tagging |
@clintropolis maybe we can add a default implementation for |
|
Adds a default implementation of getQueryContext, which was added to the Query interface in #12396. Query is marked with @ExtensionPoint, and lately we have been trying to be less volatile on these interfaces by providing default implementations to be more chill for extension writers. The way this default implementation is done in this PR is a bit strange due to the way that getQueryContext is used (mutated with system default and system generated keys); the default implementation has a specific object that it returns, and I added another temporary default method isLegacyContext that checks if the getQueryContext returns that object or not. If not, callers fall back to using getContext and withOverriddenContext to set these default and system values. I am open to other ideas as well, but this way should work at least without exploding, and added some tests to ensure that it is wired up correctly for QueryLifecycle, including the context authorization stuff. The added test shows the strange behavior if query context authorization is enabled, mainly that the system default and system generated query context keys also need to be granted as permissions for things to function correctly. This is not great, so I mentioned it in the javadocs as well. Not sure if it needs to be called out anywhere else.
Adds a default implementation of getQueryContext, which was added to the Query interface in apache#12396. Query is marked with @ExtensionPoint, and lately we have been trying to be less volatile on these interfaces by providing default implementations to be more chill for extension writers. The way this default implementation is done in this PR is a bit strange due to the way that getQueryContext is used (mutated with system default and system generated keys); the default implementation has a specific object that it returns, and I added another temporary default method isLegacyContext that checks if the getQueryContext returns that object or not. If not, callers fall back to using getContext and withOverriddenContext to set these default and system values. I am open to other ideas as well, but this way should work at least without exploding, and added some tests to ensure that it is wired up correctly for QueryLifecycle, including the context authorization stuff. The added test shows the strange behavior if query context authorization is enabled, mainly that the system default and system generated query context keys also need to be granted as permissions for things to function correctly. This is not great, so I mentioned it in the javadocs as well. Not sure if it needs to be called out anywhere else.
Adds a default implementation of getQueryContext, which was added to the Query interface in #12396. Query is marked with @ExtensionPoint, and lately we have been trying to be less volatile on these interfaces by providing default implementations to be more chill for extension writers. The way this default implementation is done in this PR is a bit strange due to the way that getQueryContext is used (mutated with system default and system generated keys); the default implementation has a specific object that it returns, and I added another temporary default method isLegacyContext that checks if the getQueryContext returns that object or not. If not, callers fall back to using getContext and withOverriddenContext to set these default and system values. I am open to other ideas as well, but this way should work at least without exploding, and added some tests to ensure that it is wired up correctly for QueryLifecycle, including the context authorization stuff. The added test shows the strange behavior if query context authorization is enabled, mainly that the system default and system generated query context keys also need to be granted as permissions for things to function correctly. This is not great, so I mentioned it in the javadocs as well. Not sure if it needs to be called out anywhere else. Co-authored-by: Clint Wylie <[email protected]>
Description
The query context is a way that the user gives a hint to the Druid query engine, so that they enforce a certain behavior or at least let the query engine prefer a certain plan during query planning. Today, there are 3 types of query context params as below.
druid.query.default.context
in runtime properties. Any user context params can be default params.Today, any context params are allowed to users. This can cause 1) a bad UX if the context param is not matured yet or 2) even query failure or system fault in the worst case if a sensitive param is abused, ex)
maxSubqueryRows
.This PR adds an ability to limit context params per user role. That means, a query will fail if you have a context param set in the query that is not allowed to you. To do that, this PR adds a new built-in resource type,
QUERY_CONTEXT
. The resource to authorize has a name of the context param (such asmaxSubqueryRows
) and the type ofQUERY_CONTEXT
. To allow a certain context param for a user, the user should be grantedWRITE
permission on the context param resource. Here is an example of the permission.Each role can have multiple permissions for context params. Each permission should be set for different context params.
When a query is issued with a query context
X
, the query will fail if the user who issued the query does not have WRITE permission on the query contextX
. In this case,Note: there is a context param called
brokerService
that is used only by the router. This param is used to pin your query to run it in a specific broker. Because the authorization is done not in the router, but in the broker, if you havebrokerService
set in your query without a proper permission, your query will fail in the broker after routing is done. Technically, this is not right because the authorization is checked after the context param takes effect. However, this should not cause any user-facing issue and thus should be OK. The query will still fail if the user doesn’t have permission forbrokerService
.The context param authorization can be enabled using
druid.auth.authorizeQueryContextParams
. This is disabled by default to avoid any hassle when someone upgrades his cluster blindly without reading release notes.Key changed/added classes in this PR
QueryContext
tracks user params and separates them from others.QueryHolder
has a state indicating whether the context in the native query is valid.QueryLifecycle
retrieves context params from a valid source.This PR has: