Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add row-level security filter in query #17564

Open
wants to merge 38 commits into
base: master
Choose a base branch
from

Conversation

cecemei
Copy link
Contributor

@cecemei cecemei commented Dec 13, 2024

This PR adds the ability to attach restrictions (e.x. row filters) to a query, thus restrict row-level data access for given users. Note that this feature is not directly available, and requires additional work in Authorizer.

This PR has been tested in a local druid cluster with additional Authorizer changes.

Description

A query follows these steps: initialize -> authorize -> execute. In the authorize step, the permissions are checked for all the required resources in the query. Before this PR, the authorize step only returns allow or deny access on a table. Granting access to a table means a user can see all data in this table. After this PR, the authorize step can return allow access along with restrictions (i.e. a row filter that must be applied to the table ), which restrict users' data access at row level. For example, customers can only see rows relevant to their company.

The authorizeAllResourceActions now returns a AuthorizationResult instead of Access. The main difference between AuthorizationResult and Access is that the former contains a map of table with restriction (i.e. Policy)

In the authorize step of QueryLifecycle, it would add the Policy to tables in the datasource tree, transform TableDataSource to RestrictedDataSource. In the execute step, Policy is enforced through RestrictedSegment and RestrictedCursorFactory, the filter inside Policy would be attached to the CursorHolder.

Key changed/added classes in this PR
  1. Policy interface and subclasses.
    • a Policy represents granular restrictions when reading tables.
    • NoRestrictionPolicy should be used by druid-internal and superuser.
    • RowFilterPolicy is a basic row filter restriction.
  2. Access. Added Optional<Policy> policy field, which represents a restrictions returned from authorizer. Also updated constructor.


  3. AuthorizationResult. The class should be used for all the authorization calls, while the Access class is still used in Authorizer interface. It has an static variable ALLOW_NO_RESTRICTION, which should be used for all internal calls. The class contains:
    • permission, an enum defines the permission check result.
    • failure message if authorization deny access. This is null when auth is allowed, and is the error message of the first resource action authorization failure (there might be more failures, but we don't try further)
    • a map of table name with optional policy.
  4. AuthorizationUtils. It now consolidates all restrictions for authorizing resource actions into a restrictions map, which is included in AuthorizationResult. Also updated javadoc for all public methods.


  5. RestrictedDataSource, which wraps a TableDataSource with a Policy.
  6. RestrictedSegment, which represents a segment with a policy.
  7. BypassRestrictedSegment, a backdoor to retrieve the wrapped segment and policy from RestrictedSegment.
  8. RestrictedCursorFactory, enforces the policy on Cursor.


  9. DataSource interface, added a sub type of restrict, added a default method mapWithRestriction. Also:
    • TableDataSource, added the implementation of mapWithRestriction.
    • JoinDataSource can accept RestrictedDataSource as left-hand side datasource.
  10. QueryLifeCycle, replace baseQuery with baseQuery.withPolicyRestrictions (except for SegmentMetadataQuery).


Caveats

  • The restrictions don't apply to VIEWs.
  • UnionDataSource doesn't work with RestrictedDataSource, planning to fix that later.
  • The restrictions don't apply to DartQueryMaker and MSQTaskQueryMaker, for now they would throw an error if there's any policy restrictions.
  • In some places the access check might be overly-restrictive (e.x. cancel query).

This PR has:

  • been self-reviewed.
  • [] added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@cecemei cecemei marked this pull request as ready for review December 13, 2024 20:45
…it as the default interface for checking permission
Comment on lines 613 to 617
resourceAction,
authorizerMapper
);
if (!authResult.isAllowed()) {

authResult.getPermissionErrorMessage(true).ifPresent(error -> {

Check failure

Code scanning / CodeQL

User-controlled bypass of sensitive method High

Sensitive method may not be executed depending on a
this condition
, which flows from
user-controlled value
.
Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review includes some high level design comments.

I haven't checked the tests yet (will do so in a follow up review). The things I'd be looking for in the tests are:

  • we should have tests for the resources, including negative tests for resources (such as the Dart resource) that don't support policies yet. The negative tests should verify that we get an error like "this endpoint doesn't support policies".
  • we should also have tests for the lower level pieces like QueryLifecycle, the DataSource mapping, and RestrictedSegment.

@@ -798,6 +799,20 @@

}

@MethodSource("data")
@ParameterizedTest(name = "{index}:with context {0}")
public void testSelectRestricted(String contextName, Map<String, Object> context)

Check notice

Code scanning / CodeQL

Useless parameter Note test

The parameter 'contextName' is never used.
Comment on lines 77 to 81
if (!(base instanceof TableDataSource)) {
throw new IAE("Expected a TableDataSource, got [%s]", base.getClass());
}
if (Objects.isNull(policy)) {
throw new IAE("Policy can't be null for RestrictedDataSource");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passer-by nits:

  1. It is better to use DruidExceptions instead of IAE/ISE. That would inform the readers as to whom the error message is intended for. It also guides the wording of the error message. Currently, it is not actionable. It is fine if the check is defensive, but if it is a user-facing error it would be nice to word it in a way that's easier for users to understand/take action upon.
  2. The interpolations [] can be in a different manner, where the exception is still understandable even if the interpolated phrases are removed, like: "Incorrect datasource [%s] received, expected a TableDataSource"

throw new ForbiddenException(authResult.toString());
}
final AuthorizationResult authResult = authorizeAccess(resource, key, action, request);
authResult.getPermissionErrorMessage(true).ifPresent(error -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its weird that we throw a 401 if there is a permission error message. There should be a function available in AuthorizationResult to directly return a boolean indicating whether user is authorized or not.

public class RestrictedDataSource implements DataSource
{
private final TableDataSource base;
private final Policy policy;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would work better as a List<Policy> rather than a single Policy. And Policy should be an interface with the contract that each type of policy is never refined to have more security-relevant information. Instead, a new subclass should be created.

The reason is that one day we will want to add new kinds of policies, such as column-level filters. It's not going to work well to add them to this existing Policy object, because it'll be difficult to ensure they are actually checked in all the correct places. (Some of those places may even be query types in third-party extensions.) The design needs to "blow up" if a new policy type is added that existing callers don't know about.

The list-of-policies achieves this, because if an existing caller encounters a Policy of a type it does not recognize, it can throw an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank i didn't think about that before - planning to change Policy to be an interface and add RowFilterPolicy and NoRestrictionPolicy subclass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should be a List, otherwise auth extensions are going to have to some work to what might logically be multiple policies (in the general sense at least). Like imagine an auth setup like ldap or something where there are users and groups that allows defining policies at both levels, if we don't allow a list here than that extension would have to combine all of that stuff into a single Policy.

Maybe that is fine I guess, but it does seem like it would make extensions have to do more work instead of just mapping different logical types of policies such as row filtering and column access into a single CursorBuildSpec transformation. I guess the benefit of a single policy would be that if there were multiple row filters in place for example it could be combined into a flat and statement instead of like a = 'x' && (b = 'y' && c = 'z').

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Policies may come from multiple sources, and all need to be combined together by Authorizers. So, this either needs to be List<Policy> or we need to have an AndPolicy that can wrap multiple policies.

IMO, List<Policy> is cleaner since it makes it easier to write code that combines policies and code that walks through policies. With a List, that's just adding elements to a list or iterating a list. With Policy, combining two policies would require creating a new AndPolicy (and checking the input policies to see if they are already AndPolicy; & if so, flattening them.) Walking through policies would require an instanceof AndPolicy check.

So my vote is to go with List<Policy> here.

{
protected final SegmentReference delegate;
@Nullable
private final DimFilter filter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a List<Policy> just like in RestrictedDataSource, and for the same reasons: #17564 (comment)


@Nullable
@Override
public <T> T as(@Nonnull Class<T> clazz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a problem here: some query engines use QueryableIndex for direct access to columns and indexes, and we can't really make a clean restricted wrapper around QueryableIndex (we can't anticipate what columns and indexes may occur in extensions). So we need some way for query engines to say "give me the QueryableIndex, I know what I'm doing and I will apply the policies directly".

IMO, a good way to do that is to offer an as(BypassRestrictedSegment.class) which returns one of these:

class BypassRestrictedSegment implements SegmentReference
{
  private final Segment segment;
  private final List<Policy> policies;

  public List<Policy> getPolicies()
  {
    return policies;
  }

  @Override
  public CursorFactory asCursorFactory()
  {
    // Bypass policies
    return delegate.asCursorFactory();
  }

  @Override
  public <T> T as(Class<T> clazz)
  {
    // Bypass policies
    return segment.as(clazz);
  }
}

Policy-aware query engines that need to use QueryableIndex directly are expected to:

  • Call as(QueryableIndex.class) first.
  • If that returns null, try as(BypassRestrictedSegment.class) followed by as(QueryableIndex.class).
  • Call getPolicies() and apply the policies when using the QueryableIndex.
  • Throw an exception if any unrecognized policies are found.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call as(QueryableIndex.class) first.
If that returns null, try as(BypassRestrictedSegment.class) followed by as(QueryableIndex.class)

I was thinking that maybe this should throw an exception targeted at developers to indicate that they should call as(BypassRestrictedSegment.class), since imagining in the future if this were rolled out with some policies then the .as/asQueryableIndex methods would go from working to suddenly returning null.

Along with throwing an exception, would also maybe a helper method to indicate if the segment supports BypassRestrictedSegment, though the same could also work if we just return null from .as, so I don't feel too strongly either way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exception targeted at developers to indicate that they should call as(BypassRestrictedSegment.class)

I was thinking that query engines might have fall-back code to do something cursor-based if QueryableIndex isn't available. They would need this anyway if they wanted to work on real-time data. Returning null in these cases lets the fall-back code activate.


@Nullable
@Override
public <T> T as(@Nonnull Class<T> clazz)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call as(QueryableIndex.class) first.
If that returns null, try as(BypassRestrictedSegment.class) followed by as(QueryableIndex.class)

I was thinking that maybe this should throw an exception targeted at developers to indicate that they should call as(BypassRestrictedSegment.class), since imagining in the future if this were rolled out with some policies then the .as/asQueryableIndex methods would go from working to suddenly returning null.

Along with throwing an exception, would also maybe a helper method to indicate if the segment supports BypassRestrictedSegment, though the same could also work if we just return null from .as, so I don't feel too strongly either way.


private final boolean allowed;
private final String message;
// A policy restriction on top of table-level read access. It should be empty if there are no policy restrictions
// or if access is requested for an action other than reading the table.
private final Optional<Policy> policy;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re other comments about lists, this should also be a list of Policy

…icy class. Updated AuthorizationResult class as well.
Copy link
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did another pass, still thinking about some things so will keep looking over stuff

Comment on lines 161 to 162
if (!authResult.isUserWithNoRestriction()) {
throw new ForbiddenException(authResult.getErrorMessage());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is correct, since it would prevent users with policies from cancelling their own queries

Copy link
Contributor

@gianm gianm Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird but it is difficult to do something better without deeper changes to cancellation authorization. Ideally the way it works is that people are only allowed to cancel their own queries, unless they have STATE WRITE, in which case they should be able to cancel any query. But that's not how cancellation works today, because we don't retain info about which user issued a query.

IMO, it'd be good to adjust this in a follow up, so we can change things so it works like the above ☝️. For the time being I think it is OK to err on the secure side, which means denying cancellation to people with policies.

* <li> no policy found
* <li> the user has a no-restriction policy
*/
public boolean isUserWithNoRestriction()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming nit, there is no such thing as a "user" in Druid core, that is a construct that may or may not exist in specific auth extensions.

Beyond that, I'm not quite sure this is the correct model, is it overly restrictive since many of the former callers of isAllowed are calling this, it seems like if there are any policies involved at all then the only thing that will be allowed is query calls, every other API would be disallowed, which doesn't seem the best behavior to me.

public class RestrictedDataSource implements DataSource
{
private final TableDataSource base;
private final Policy policy;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should be a List, otherwise auth extensions are going to have to some work to what might logically be multiple policies (in the general sense at least). Like imagine an auth setup like ldap or something where there are users and groups that allows defining policies at both levels, if we don't allow a list here than that extension would have to combine all of that stuff into a single Policy.

Maybe that is fine I guess, but it does seem like it would make extensions have to do more work instead of just mapping different logical types of policies such as row filtering and column access into a single CursorBuildSpec transformation. I guess the benefit of a single policy would be that if there were multiple row filters in place for example it could be combined into a flat and statement instead of like a = 'x' && (b = 'y' && c = 'z').

}
}

public Map<String, Optional<Policy>> getPolicy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 notes:

  • It would be clearer to name this getPolicyMap().
  • Please include javadoc, since it's not obvious what the keys are.
  • The value should be a List<Policy> rather than Optional<Policy>.

@@ -155,10 +162,6 @@ public <T> QueryResponse<T> runSimple(
final QueryResponse<T> queryResponse;
try {
preAuthorized(authenticationResult, authorizationResult);
if (!authorizationResult.isAllowed()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this? What happens now if a request is not authorized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is moved to preAuthorized, if it didn't pass error throw DruidException.

@@ -2184,7 +2184,7 @@ private TestHttpStatement(
@Override
protected void authorize(
DruidPlanner planner,
Function<Set<ResourceAction>, Access> authorizer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some tests that use Policies to this class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also include tests for the SqlTaskResource endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added test for SqlResource, there's also test coverage in CalciteSelectQueryTest. For async query, there's also test coverage in MSQSelectTest and DartSqlResourceTest

@Override
public boolean isCacheable(boolean isBroker)
{
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting that this return false here means that when policies are in play, caching won't be possible. This is OK for this patch. We'll want to address it in a follow up by adding getCacheKey() to Policy and using those to compute the cache key for RestrictedDataSource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noted

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants