Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor sql authorization to get resource type from schema, resource type to be string #11692

Merged
merged 6 commits into from
Sep 17, 2021

Conversation

clintropolis
Copy link
Member

Description

This PR introduces the refactor detailed in proposal #11690, which consists of two main changes.

The first is that ResourceType enum is no longer used by Resource in favor of using a string. ResourceType has been repurposed into a registry of built-in types (and the set of known types can be extended by registering custom types with this registry).

The second is that NamedSchema has been expanded to include a new method, getSchemaResourceType which returns the newly string resource type that should be used to authorize queries against that schema, and this mapping is pushed down into SqlResourceCollectorShuttle, which has been modified to construct the set of Resource to authorize by checking for table schemas which have a non-null type instead of explicitly only looking for the 'druid' and 'view' schema. Currently only NamedDruidSchema and NamedViewSchema implement this method with a non-null return value, so the behavior should be all-around unchanged at this time.

A follow-up PR will add NamedSystemSchema to this list so that the system schema tables can be directly authorized (on top of the per row filtering that occurs based on the other permissions of the user), and NamedLookupSchema could similarly be added in the future to provide authorization against lookup queries (though additional work would need to be done to fully authorize lookups used via expressions or extraction functions).


This PR has:

  • been self-reviewed.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@@ -92,7 +93,7 @@ public void testPermissionSerdeIsChillAboutUnknownEnumStuffs() throws JsonProces
"resourceAction",
ImmutableMap.of(
"resource",
ImmutableMap.of("name", "bar", "type", "UNKNOWN"),
ImmutableMap.of("name", "bar", "type", "CUSTOM"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not use customRead here as you have used fooRead online 89 above?

@@ -179,6 +185,12 @@ public DateTimeZone getTimeZone()
return localNow.getZone();
}

@Nullable
public String getSchemaResourceType(String schema)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think there would be a use case to have multiple resource types for a given schema?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I can't really think of a use case off the top of my head, but I guess it would be easy enough to support it. The change would go here https://github.com/apache/druid/pull/11692/files#diff-efcf39975403e19080cc5cd35af9a573608177c5ead5420ba3390c2268de0da4R38 to make it return a set instead, and then Map<String, Set> through to https://github.com/apache/druid/pull/11692/files#diff-630eb92856c1b8dd3034980b486124e31952c8eef0a9195bbacb9bbe310a3023R71 which would just add all of the items in the set. https://github.com/apache/druid/pull/11692/files#diff-902bbc6a46aa1001ed51739a48b9847682c9450d7c122ce2d917370c98d4e3f9R482 would also need updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if no use case, then no worries, I couldnt think of a case either. Think ok for now, up to you if you want to add support for that here now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to return the resourceType of the resources in the schema, rather than the resourceType of the schema itself. This behavior is currently correct because all resources should be the same type in the same schema. However, I can imagine that the resources of different types can be in the same schema in the future as what other databases do. In that case, we will want to check the resourceType per resource. How about adding something like getResourceType(String schema, String resource) instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like this, changed 👍

} else if (NamedViewSchema.NAME.equals(schema)) {
resources.add(new Resource(qualifiedNameParts.get(1), ResourceType.VIEW));
final String resourceType = plannerContext.getSchemaResourceType(schema);
if (resourceType != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before it seemed that only DATASOURCE and VIEW being mapped to resources. How do we ensure this old behavior if desired?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, the default implementation for the other schema types will return null resource type. nvm, all good.

if (druidSchemaName.equals(subSchema.getName())) {
// The "druid" schema's tables represent Druid datasources which require authorization
final NamedSchema schema = namedSchemaMap.get(subSchema.getName());
if (schema != null && schema.getSchemaResourceType() != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this functions code and getAuthorizedFunctionNamesFromSubSchema look very similar. Could do a slight refactor and reduce code here it seems. Not necessary though.

@@ -32,6 +34,12 @@
*/
String getSchemaName();

@Nullable
default String getSchemaResourceType()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add javadoc here and describe where this is used and the importance of returning non-null if authorization checks are needed on the schema, if my understanding is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like getResourceType(String resource) as I suggested in my other comment? Even though most implementations will ignore the resource parameter for now, it will be useful once we allow different resource types in the same schema.

@zachjsh
Copy link
Contributor

zachjsh commented Sep 15, 2021

Looks good overall, just a few questions in comments left. Also, does this framework allow authorization for only SQL based queries, or also native queries too? Also will it work with JDBC?

@clintropolis
Copy link
Member Author

Also, does this framework allow authorization for only SQL based queries, or also native queries too? Also will it work with JDBC?

Yeah, this only impacts how the set of Resource for SQL queries are computed (including JDBC), native queries still compute this from the DataSource.

It doesn't actually change anything about the Druid authorization model though, so SQL and JSON queries should still require the same set of permissions for the same query, though SQL has some additional things like system tables and views that don't have a native equivalent (yet at least).

Any additional authorization we add in the future will need to be sure to be hooked into both SQL and native queries however, since they compute their Resource sets to check differently - SqlLifecycle.validateAndAuthorize vs QueryLifecycle.authorize (SqlLifecycle actually has one or more QueryLifecycle to run the native queries that the SQL planned to, but since they have already been authorized when validating the SQL, they use a pre-authorized path through QueryLifecycle)

Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -65,12 +65,13 @@ public SqlNode visit(SqlIdentifier id)
// this should not probably be null if the namespace was not null,
if (validatorTable != null) {
List<String> qualifiedNameParts = validatorTable.getQualifiedName();
// 'schema'.'identifier'
// 'schema'.'identifier' d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oof i bet i started typing something before switching active windows, will fix heh

Copy link
Contributor

@zachjsh zachjsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚢

Copy link
Contributor

@zachjsh zachjsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚢

@clintropolis clintropolis merged commit 392f0ca into apache:master Sep 17, 2021
@clintropolis clintropolis deleted the extensible-resource-type branch September 17, 2021 16:53
@clintropolis
Copy link
Member Author

thanks for review @zachjsh and @jihoonson!

@abhishekagarwal87 abhishekagarwal87 added this to the 0.23.0 milestone May 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants