Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

authorize query datasources with schema information #11675

Closed
clintropolis opened this issue Sep 8, 2021 · 2 comments
Closed

authorize query datasources with schema information #11675

clintropolis opened this issue Sep 8, 2021 · 2 comments

Comments

@clintropolis
Copy link
Member

clintropolis commented Sep 8, 2021

Motivation

Currently Druid authorization is handled by Authorizer implementations checking if a Resource, which is composed of a type, name, and action can be authorized given some authentication context information. For Druid datasources, we have ResourceType.DATASOURCE, and the name is the datasource name. Additionally, #10812 added ResourceType.VIEW to allow SQL view implementations to also be authorized.

However, Druid SQL also has additional schemas, sys, lookup, INFORMATION_SCHEMA, and extensions can define any number of schemas. Adding a new ResourceType for each schema is not a scalable solution.

Proposed changes

Resource will be extended with a nullable schema string property to capture the schema that a name belongs to. For SQL query authentication, we already have access to the schema because during SQL validation the table identifiers are fully qualified, we are just discarding the schema information when constructing the Resource. For native JSON query authentication, everything with a non-druid schema can be inferred from the type of datasource (e.g. lookups) and set accordingly. Query paths will be lightly modified to include datasource schema information for all schemas when authorizing resources, instead of just druid and view schemas which are current checked. The possible exception to this is perhaps INFORMATION_SCHEMA, which I think still makes sense to be always accessible and filtered on authorization to the other schemas and tables it provides information on due to its special role in providing instructions on how to construct queries.

Authorizer implementations will need to be updated to consider this parameter. Druid just supplies the Resource to validate, so no new global setting will be necessary to enable this behavior, since it will be up to any specific Authorizer implementation to act on this additional information. Additionally, a new global configuration parameter will be added, druid.auth.enableResourceSchemaAuthroization will also be added to make authorization of the newly authorized with DATASOURCE resources such as system tables and lookups opt-in. I think this should default to true, but can be disabled for existing clusters that do not have role permissions that easily map to include access to resources which they were allowed to view prior to this change, until the permissions can be migrated to include the newly authorized resources.

For druid-basic-security, this means BasicAuthorizerPermission will be updated to allow specifying a schema matching pattern, and a new configuration parameter will be added to set the default schema parameter for permissions which are defined with a null schema (which will include all existing permissions), druid.auth.authorizer.{authorizerName}.defaultResourceSchemaPattern, which should default to druid to match the existing behavior.

I am less familiar with druid-ranger-security, but i think it would make the most sense to add an additional property to the request with the schema value of the Resource along with an extension configuration to control whether or not this schema is included on the request or not.

This does not conflict with or contradict #9380 as far as I can tell, though having a separate LOOKUP resource type seems perhaps no longer necessary, as this modification can also deprecate the VIEW resource type.

Rationale

I considered if there was a way of just capturing the schema in the name property of the Resource, however, there is not a very good upgrade/downgrade path, because authorizer implementations such as druid-basic-security use pattern matching against the name. Having a separate schema property avoids encoding this information in the name in a backwards incompatible way, and allows authorizer extensions to handle the schema property in the most appropriate way, unambiguously.

Operational impact

Cluster operators will gain additional means to secure datasources, including queries on lookup tables and system tables. Operators will however need to be considerate of their authorizer implementation and plan upgrades accordingly.

Test plan (optional)

It will be important to focus on testing upgrade and downgrade paths to ensure that authorization is able to function correctly using configurations of both new and existing behavior models.

@zachjsh
Copy link
Contributor

zachjsh commented Sep 9, 2021

@clintropolis thanks for proposal, just had a few questions:

  1. Just to clarify, how to we authorize access to the other schemas like sys and lookups today?

  2. In the case that the druid.auth.enableResourceSchemaAuthroization is not enabled, what would be the behavior, if you have datasources with the same name that exist in multiple schemas? Would a user who is given datasource read access for datasource with that name affectively be given read access to all datasources with that name in all schemas? If so is this the existing behavior?

  3. One issue that we talked about that I just wanted to get recorded into comment is that with this design, the user can enable druid.auth.enableResourceSchemaAuthroization property, but really it is up to the specific authorizer implementation of whether to use this, and depending on whether the authorizer has been upgraded or not, it may or may not be taking the schema into consideration? Anyway for a user to be notified in the case that the authorizer extension doesnt support this new feature?

@clintropolis
Copy link
Member Author

closing in favor of #11690

@zachjsh I will make sure #11690 addresses any questions which are still relevant, sorry for the proposal churn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants