-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add row-level security filter in query #17564
base: master
Are you sure you want to change the base?
Conversation
processing/src/test/java/org/apache/druid/query/RestrictedDataSourceTest.java
Fixed
Show fixed
Hide fixed
…it as the default interface for checking permission
resourceAction, | ||
authorizerMapper | ||
); | ||
if (!authResult.isAllowed()) { | ||
|
||
authResult.getPermissionErrorMessage(true).ifPresent(error -> { |
Check failure
Code scanning / CodeQL
User-controlled bypass of sensitive method High
this condition
user-controlled value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This review includes some high level design comments.
I haven't checked the tests yet (will do so in a follow up review). The things I'd be looking for in the tests are:
- we should have tests for the resources, including negative tests for resources (such as the Dart resource) that don't support policies yet. The negative tests should verify that we get an error like "this endpoint doesn't support policies".
- we should also have tests for the lower level pieces like
QueryLifecycle
, theDataSource
mapping, andRestrictedSegment
.
server/src/main/java/org/apache/druid/server/security/AuthorizationResult.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/AuthorizationUtils.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/Access.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/Access.java
Outdated
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/RestrictedDataSource.java
Outdated
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/RestrictedDataSource.java
Outdated
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/DataSource.java
Outdated
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/DataSource.java
Outdated
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/segment/RestrictedSegment.java
Outdated
Show resolved
Hide resolved
…AuthorizationResult class, dart sql, msq sql, fix bug, added restricted data source to calcite test data
if (!(base instanceof TableDataSource)) { | ||
throw new IAE("Expected a TableDataSource, got [%s]", base.getClass()); | ||
} | ||
if (Objects.isNull(policy)) { | ||
throw new IAE("Policy can't be null for RestrictedDataSource"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passer-by nits:
- It is better to use DruidExceptions instead of IAE/ISE. That would inform the readers as to whom the error message is intended for. It also guides the wording of the error message. Currently, it is not actionable. It is fine if the check is defensive, but if it is a user-facing error it would be nice to word it in a way that's easier for users to understand/take action upon.
- The interpolations
[]
can be in a different manner, where the exception is still understandable even if the interpolated phrases are removed, like: "Incorrect datasource [%s] received, expected a TableDataSource"
throw new ForbiddenException(authResult.toString()); | ||
} | ||
final AuthorizationResult authResult = authorizeAccess(resource, key, action, request); | ||
authResult.getPermissionErrorMessage(true).ifPresent(error -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its weird that we throw a 401 if there is a permission error message. There should be a function available in AuthorizationResult to directly return a boolean indicating whether user is authorized or not.
server/src/main/java/org/apache/druid/server/security/AuthorizationResult.java
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/AuthorizationUtils.java
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/AuthorizationUtils.java
Outdated
Show resolved
Hide resolved
public class RestrictedDataSource implements DataSource | ||
{ | ||
private final TableDataSource base; | ||
private final Policy policy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would work better as a List<Policy>
rather than a single Policy
. And Policy
should be an interface with the contract that each type of policy is never refined to have more security-relevant information. Instead, a new subclass should be created.
The reason is that one day we will want to add new kinds of policies, such as column-level filters. It's not going to work well to add them to this existing Policy
object, because it'll be difficult to ensure they are actually checked in all the correct places. (Some of those places may even be query types in third-party extensions.) The design needs to "blow up" if a new policy type is added that existing callers don't know about.
The list-of-policies achieves this, because if an existing caller encounters a Policy
of a type it does not recognize, it can throw an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank i didn't think about that before - planning to change Policy
to be an interface and add RowFilterPolicy
and NoRestrictionPolicy
subclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this should be a List, otherwise auth extensions are going to have to some work to what might logically be multiple policies (in the general sense at least). Like imagine an auth setup like ldap or something where there are users and groups that allows defining policies at both levels, if we don't allow a list here than that extension would have to combine all of that stuff into a single Policy.
Maybe that is fine I guess, but it does seem like it would make extensions have to do more work instead of just mapping different logical types of policies such as row filtering and column access into a single CursorBuildSpec
transformation. I guess the benefit of a single policy would be that if there were multiple row filters in place for example it could be combined into a flat and
statement instead of like a = 'x' && (b = 'y' && c = 'z')
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Policies may come from multiple sources, and all need to be combined together by Authorizers. So, this either needs to be List<Policy>
or we need to have an AndPolicy
that can wrap multiple policies.
IMO, List<Policy>
is cleaner since it makes it easier to write code that combines policies and code that walks through policies. With a List
, that's just adding elements to a list or iterating a list. With Policy
, combining two policies would require creating a new AndPolicy
(and checking the input policies to see if they are already AndPolicy
; & if so, flattening them.) Walking through policies would require an instanceof AndPolicy
check.
So my vote is to go with List<Policy>
here.
{ | ||
protected final SegmentReference delegate; | ||
@Nullable | ||
private final DimFilter filter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a List<Policy>
just like in RestrictedDataSource
, and for the same reasons: #17564 (comment)
|
||
@Nullable | ||
@Override | ||
public <T> T as(@Nonnull Class<T> clazz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a problem here: some query engines use QueryableIndex
for direct access to columns and indexes, and we can't really make a clean restricted wrapper around QueryableIndex
(we can't anticipate what columns and indexes may occur in extensions). So we need some way for query engines to say "give me the QueryableIndex, I know what I'm doing and I will apply the policies directly".
IMO, a good way to do that is to offer an as(BypassRestrictedSegment.class)
which returns one of these:
class BypassRestrictedSegment implements SegmentReference
{
private final Segment segment;
private final List<Policy> policies;
public List<Policy> getPolicies()
{
return policies;
}
@Override
public CursorFactory asCursorFactory()
{
// Bypass policies
return delegate.asCursorFactory();
}
@Override
public <T> T as(Class<T> clazz)
{
// Bypass policies
return segment.as(clazz);
}
}
Policy-aware query engines that need to use QueryableIndex
directly are expected to:
- Call
as(QueryableIndex.class)
first. - If that returns null, try
as(BypassRestrictedSegment.class)
followed byas(QueryableIndex.class)
. - Call
getPolicies()
and apply the policies when using theQueryableIndex
. - Throw an exception if any unrecognized policies are found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call as(QueryableIndex.class) first.
If that returns null, try as(BypassRestrictedSegment.class) followed by as(QueryableIndex.class)
I was thinking that maybe this should throw an exception targeted at developers to indicate that they should call as(BypassRestrictedSegment.class)
, since imagining in the future if this were rolled out with some policies then the .as/asQueryableIndex methods would go from working to suddenly returning null.
Along with throwing an exception, would also maybe a helper method to indicate if the segment supports BypassRestrictedSegment
, though the same could also work if we just return null from .as, so I don't feel too strongly either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exception targeted at developers to indicate that they should call as(BypassRestrictedSegment.class)
I was thinking that query engines might have fall-back code to do something cursor-based if QueryableIndex
isn't available. They would need this anyway if they wanted to work on real-time data. Returning null
in these cases lets the fall-back code activate.
processing/src/main/java/org/apache/druid/query/policy/Policy.java
Outdated
Show resolved
Hide resolved
|
||
@Nullable | ||
@Override | ||
public <T> T as(@Nonnull Class<T> clazz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call as(QueryableIndex.class) first.
If that returns null, try as(BypassRestrictedSegment.class) followed by as(QueryableIndex.class)
I was thinking that maybe this should throw an exception targeted at developers to indicate that they should call as(BypassRestrictedSegment.class)
, since imagining in the future if this were rolled out with some policies then the .as/asQueryableIndex methods would go from working to suddenly returning null.
Along with throwing an exception, would also maybe a helper method to indicate if the segment supports BypassRestrictedSegment
, though the same could also work if we just return null from .as, so I don't feel too strongly either way.
|
||
private final boolean allowed; | ||
private final String message; | ||
// A policy restriction on top of table-level read access. It should be empty if there are no policy restrictions | ||
// or if access is requested for an action other than reading the table. | ||
private final Optional<Policy> policy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re other comments about lists, this should also be a list of Policy
server/src/main/java/org/apache/druid/server/security/AuthorizationResult.java
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/AuthorizationResult.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/Policy.java
Outdated
Show resolved
Hide resolved
…icy class. Updated AuthorizationResult class as well.
server/src/test/java/org/apache/druid/initialization/AuthorizationResultTest.java
Fixed
Show fixed
Hide fixed
server/src/test/java/org/apache/druid/initialization/AuthorizationResultTest.java
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did another pass, still thinking about some things so will keep looking over stuff
if (!authResult.isUserWithNoRestriction()) { | ||
throw new ForbiddenException(authResult.getErrorMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is correct, since it would prevent users with policies from cancelling their own queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is weird but it is difficult to do something better without deeper changes to cancellation authorization. Ideally the way it works is that people are only allowed to cancel their own queries, unless they have STATE WRITE
, in which case they should be able to cancel any query. But that's not how cancellation works today, because we don't retain info about which user issued a query.
IMO, it'd be good to adjust this in a follow up, so we can change things so it works like the above ☝️. For the time being I think it is OK to err on the secure side, which means denying cancellation to people with policies.
server/src/main/java/org/apache/druid/server/QueryResource.java
Outdated
Show resolved
Hide resolved
* <li> no policy found | ||
* <li> the user has a no-restriction policy | ||
*/ | ||
public boolean isUserWithNoRestriction() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming nit, there is no such thing as a "user" in Druid core, that is a construct that may or may not exist in specific auth extensions.
Beyond that, I'm not quite sure this is the correct model, is it overly restrictive since many of the former callers of isAllowed
are calling this, it seems like if there are any policies involved at all then the only thing that will be allowed is query calls, every other API would be disallowed, which doesn't seem the best behavior to me.
public class RestrictedDataSource implements DataSource | ||
{ | ||
private final TableDataSource base; | ||
private final Policy policy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this should be a List, otherwise auth extensions are going to have to some work to what might logically be multiple policies (in the general sense at least). Like imagine an auth setup like ldap or something where there are users and groups that allows defining policies at both levels, if we don't allow a list here than that extension would have to combine all of that stuff into a single Policy.
Maybe that is fine I guess, but it does seem like it would make extensions have to do more work instead of just mapping different logical types of policies such as row filtering and column access into a single CursorBuildSpec
transformation. I guess the benefit of a single policy would be that if there were multiple row filters in place for example it could be combined into a flat and
statement instead of like a = 'x' && (b = 'y' && c = 'z')
.
…est for SegmentMetadataQuery.
…uld work with basic access.
processing/src/main/java/org/apache/druid/query/policy/NoRestrictionPolicy.java
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/policy/Policy.java
Outdated
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/policy/RowFilterPolicy.java
Show resolved
Hide resolved
processing/src/test/java/org/apache/druid/query/policy/RowFilterPolicyTest.java
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/policy/RowFilterPolicy.java
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/server/security/AuthorizationResult.java
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
public Map<String, Optional<Policy>> getPolicy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 notes:
- It would be clearer to name this
getPolicyMap()
. - Please include javadoc, since it's not obvious what the keys are.
- The value should be a
List<Policy>
rather thanOptional<Policy>
.
@@ -155,10 +162,6 @@ public <T> QueryResponse<T> runSimple( | |||
final QueryResponse<T> queryResponse; | |||
try { | |||
preAuthorized(authenticationResult, authorizationResult); | |||
if (!authorizationResult.isAllowed()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this? What happens now if a request is not authorized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is moved to preAuthorized
, if it didn't pass error throw DruidException
.
@@ -2184,7 +2184,7 @@ private TestHttpStatement( | |||
@Override | |||
protected void authorize( | |||
DruidPlanner planner, | |||
Function<Set<ResourceAction>, Access> authorizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some tests that use Policies to this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also include tests for the SqlTaskResource
endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added test for SqlResource, there's also test coverage in CalciteSelectQueryTest. For async query, there's also test coverage in MSQSelectTest and DartSqlResourceTest
@Override | ||
public boolean isCacheable(boolean isBroker) | ||
{ | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting that this return false
here means that when policies are in play, caching won't be possible. This is OK for this patch. We'll want to address it in a follow up by adding getCacheKey()
to Policy
and using those to compute the cache key for RestrictedDataSource
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noted
Add more tests for policy package, also updated the classes a bit. Added tests for restricted datasource in SqlResourceTest
This PR adds the ability to attach restrictions (e.x. row filters) to a query, thus restrict row-level data access for given users. Note that this feature is not directly available, and requires additional work in
Authorizer
.This PR has been tested in a local druid cluster with additional
Authorizer
changes.Description
A query follows these steps: initialize -> authorize -> execute. In the authorize step, the permissions are checked for all the required resources in the query. Before this PR, the authorize step only returns allow or deny access on a table. Granting access to a table means a user can see all data in this table. After this PR, the authorize step can return allow access along with restrictions (i.e. a row filter that must be applied to the table ), which restrict users' data access at row level. For example, customers can only see rows relevant to their company.
The
authorizeAllResourceActions
now returns aAuthorizationResult
instead ofAccess
. The main difference betweenAuthorizationResult
andAccess
is that the former contains a map of table with restriction (i.e.Policy
)In the authorize step of
QueryLifecycle
, it would add thePolicy
to tables in the datasource tree, transformTableDataSource
toRestrictedDataSource
. In the execute step,Policy
is enforced throughRestrictedSegment
andRestrictedCursorFactory
, the filter insidePolicy
would be attached to theCursorHolder
.Key changed/added classes in this PR
Policy
interface and subclasses.Policy
represents granular restrictions when reading tables.NoRestrictionPolicy
should be used by druid-internal and superuser.RowFilterPolicy
is a basic row filter restriction.Access
. AddedOptional<Policy> policy
field, which represents a restrictions returned fromauthorizer
. Also updated constructor.AuthorizationResult
. The class should be used for all the authorization calls, while theAccess
class is still used inAuthorizer
interface. It has an static variableALLOW_NO_RESTRICTION
, which should be used for all internal calls. The class contains:AuthorizationUtils
. It now consolidates all restrictions for authorizing resource actions into a restrictions map, which is included inAuthorizationResult
. Also updated javadoc for all public methods.RestrictedDataSource
, which wraps aTableDataSource
with aPolicy
.RestrictedSegment
, which represents a segment with a policy.BypassRestrictedSegment
, a backdoor to retrieve the wrapped segment and policy fromRestrictedSegment
.RestrictedCursorFactory
, enforces the policy onCursor
.DataSource
interface, added a sub type ofrestrict
, added a default methodmapWithRestriction
. Also:TableDataSource
, added the implementation ofmapWithRestriction
.JoinDataSource
can acceptRestrictedDataSource
as left-hand side datasource.QueryLifeCycle
, replacebaseQuery
withbaseQuery.withPolicyRestrictions
(except forSegmentMetadataQuery
).Caveats
UnionDataSource
doesn't work withRestrictedDataSource
, planning to fix that later.DartQueryMaker
andMSQTaskQueryMaker
, for now they would throw an error if there's any policy restrictions.This PR has: