You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently Druid authorization is handled by Authorizer implementations checking if a Resource, which is composed of a type, name, and action can be authorized given some authentication context information. For Druid datasources, we have ResourceType.DATASOURCE, and the name is the datasource name. Additionally, #10812 added ResourceType.VIEW to allow SQL view implementations to also be authorized.
However, Druid SQL also has additional schemas, sys, lookup, INFORMATION_SCHEMA, and extensions can define any number of schemas. Adding a new ResourceType for each schema is not a scalable solution.
Proposed changes
Resource will be extended with a nullable schema string property to capture the schema that a name belongs to. For SQL query authentication, we already have access to the schema because during SQL validation the table identifiers are fully qualified, we are just discarding the schema information when constructing the Resource. For native JSON query authentication, everything with a non-druid schema can be inferred from the type of datasource (e.g. lookups) and set accordingly. Query paths will be lightly modified to include datasource schema information for all schemas when authorizing resources, instead of just druid and view schemas which are current checked. The possible exception to this is perhaps INFORMATION_SCHEMA, which I think still makes sense to be always accessible and filtered on authorization to the other schemas and tables it provides information on due to its special role in providing instructions on how to construct queries.
Authorizer implementations will need to be updated to consider this parameter. Druid just supplies the Resource to validate, so no new global setting will be necessary to enable this behavior, since it will be up to any specific Authorizer implementation to act on this additional information. Additionally, a new global configuration parameter will be added, druid.auth.enableResourceSchemaAuthroization will also be added to make authorization of the newly authorized with DATASOURCE resources such as system tables and lookups opt-in. I think this should default to true, but can be disabled for existing clusters that do not have role permissions that easily map to include access to resources which they were allowed to view prior to this change, until the permissions can be migrated to include the newly authorized resources.
For druid-basic-security, this means BasicAuthorizerPermission will be updated to allow specifying a schema matching pattern, and a new configuration parameter will be added to set the default schema parameter for permissions which are defined with a null schema (which will include all existing permissions), druid.auth.authorizer.{authorizerName}.defaultResourceSchemaPattern, which should default to druid to match the existing behavior.
I am less familiar with druid-ranger-security, but i think it would make the most sense to add an additional property to the request with the schema value of the Resource along with an extension configuration to control whether or not this schema is included on the request or not.
This does not conflict with or contradict #9380 as far as I can tell, though having a separate LOOKUP resource type seems perhaps no longer necessary, as this modification can also deprecate the VIEW resource type.
Rationale
I considered if there was a way of just capturing the schema in the name property of the Resource, however, there is not a very good upgrade/downgrade path, because authorizer implementations such as druid-basic-security use pattern matching against the name. Having a separate schema property avoids encoding this information in the name in a backwards incompatible way, and allows authorizer extensions to handle the schema property in the most appropriate way, unambiguously.
Operational impact
Cluster operators will gain additional means to secure datasources, including queries on lookup tables and system tables. Operators will however need to be considerate of their authorizer implementation and plan upgrades accordingly.
Test plan (optional)
It will be important to focus on testing upgrade and downgrade paths to ensure that authorization is able to function correctly using configurations of both new and existing behavior models.
The text was updated successfully, but these errors were encountered:
@clintropolis thanks for proposal, just had a few questions:
Just to clarify, how to we authorize access to the other schemas like sys and lookups today?
In the case that the druid.auth.enableResourceSchemaAuthroization is not enabled, what would be the behavior, if you have datasources with the same name that exist in multiple schemas? Would a user who is given datasource read access for datasource with that name affectively be given read access to all datasources with that name in all schemas? If so is this the existing behavior?
One issue that we talked about that I just wanted to get recorded into comment is that with this design, the user can enable druid.auth.enableResourceSchemaAuthroization property, but really it is up to the specific authorizer implementation of whether to use this, and depending on whether the authorizer has been upgraded or not, it may or may not be taking the schema into consideration? Anyway for a user to be notified in the case that the authorizer extension doesnt support this new feature?
Motivation
Currently Druid authorization is handled by
Authorizer
implementations checking if aResource
, which is composed of a type, name, and action can be authorized given some authentication context information. For Druid datasources, we haveResourceType.DATASOURCE
, and the name is the datasource name. Additionally, #10812 addedResourceType.VIEW
to allow SQL view implementations to also be authorized.However, Druid SQL also has additional schemas,
sys
,lookup
,INFORMATION_SCHEMA
, and extensions can define any number of schemas. Adding a newResourceType
for each schema is not a scalable solution.Proposed changes
Resource
will be extended with a nullableschema
string property to capture theschema
that aname
belongs to. For SQL query authentication, we already have access to the schema because during SQL validation the table identifiers are fully qualified, we are just discarding the schema information when constructing theResource
. For native JSON query authentication, everything with a non-druid schema can be inferred from the type of datasource (e.g. lookups) and set accordingly. Query paths will be lightly modified to include datasource schema information for all schemas when authorizing resources, instead of justdruid
andview
schemas which are current checked. The possible exception to this is perhapsINFORMATION_SCHEMA
, which I think still makes sense to be always accessible and filtered on authorization to the other schemas and tables it provides information on due to its special role in providing instructions on how to construct queries.Authorizer implementations will need to be updated to consider this parameter.
Druid just supplies theAdditionally, a new global configuration parameter will be added,Resource
to validate, so no new global setting will be necessary to enable this behavior, since it will be up to any specificAuthorizer
implementation to act on this additional information.druid.auth.enableResourceSchemaAuthroization
will also be added to make authorization of the newly authorized with DATASOURCE resources such as system tables and lookups opt-in. I think this should default to true, but can be disabled for existing clusters that do not have role permissions that easily map to include access to resources which they were allowed to view prior to this change, until the permissions can be migrated to include the newly authorized resources.For
druid-basic-security
, this meansBasicAuthorizerPermission
will be updated to allow specifying a schema matching pattern, and a new configuration parameter will be added to set the default schema parameter for permissions which are defined with a null schema (which will include all existing permissions),druid.auth.authorizer.{authorizerName}.defaultResourceSchemaPattern
, which should default todruid
to match the existing behavior.I am less familiar with
druid-ranger-security
, but i think it would make the most sense to add an additional property to the request with theschema
value of theResource
along with an extension configuration to control whether or not this schema is included on the request or not.This does not conflict with or contradict #9380 as far as I can tell, though having a separate
LOOKUP
resource type seems perhaps no longer necessary, as this modification can also deprecate theVIEW
resource type.Rationale
I considered if there was a way of just capturing the schema in the
name
property of theResource
, however, there is not a very good upgrade/downgrade path, because authorizer implementations such asdruid-basic-security
use pattern matching against the name. Having a separateschema
property avoids encoding this information in thename
in a backwards incompatible way, and allows authorizer extensions to handle theschema
property in the most appropriate way, unambiguously.Operational impact
Cluster operators will gain additional means to secure datasources, including queries on lookup tables and system tables. Operators will however need to be considerate of their authorizer implementation and plan upgrades accordingly.
Test plan (optional)
It will be important to focus on testing upgrade and downgrade paths to ensure that authorization is able to function correctly using configurations of both new and existing behavior models.
The text was updated successfully, but these errors were encountered: