Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fine grained config and state resources #9380

Open
pjain1 opened this issue Feb 20, 2020 · 26 comments
Open

Fine grained config and state resources #9380

pjain1 opened this issue Feb 20, 2020 · 26 comments

Comments

@pjain1
Copy link
Member

pjain1 commented Feb 20, 2020

Motivation

This proposal is to introduce fine grained config and state resources for better access controls. For example, for a user we may want to allow them the permission to view the status of a node but do not want to allow by default access to all other state resources like load queue, list of servers or list of tasks for a worker etc. Another example, we may want to give permissions to a user to just query the datasources but not read/change any state or config of the cluster, however even for showing the datasource list on console, a read on config is required to fetch compaction status for the datasource.

Proposed changes

The current design has the same resource name for all STATE and CONFIG resources which are state and config respectively. To achieve the desired results, different resource names will need to be set for these resources. For example, for druid/coordinator/v1/config/compaction the resource name can be COMPACTION and resource type is CONFIG so a GET on this endpoint will be a READ on COMPACTION CONFIG instead of READ on CONFIG CONFIG. Similarly appropriate names will be chosen for resources. The design always had this provision for future improvement, see here and here.

  • StateResourceFilter and ConfigResourceFilter will be changed to introspect the requested url and set the resource name accordingly before performing authorization.
  • Endpoints that do authorization in the method itself will be changed to set the resource names appropriately before authorizing.

Rationale

Having fine grained access controls.

Operational impact

If the security extension uses state and config as resource name for STATE and CONFIG resource types for auth checks then they would need to change the implementation to allow any name as valid and just check the resource type to preserve the old behaviour.

Backwards Compatibility

Its backward compatible as long as exact resource name match is not done for CONFIG and STATE resource types in the security extension.

@jon-wei
Copy link
Contributor

jon-wei commented Feb 25, 2020

StateResourceFilter and ConfigResourceFilter will be changed to introspect the requested url and set the resource name accordingly before performing authorization.

Can you describe the introspection checks in more detail?

@tsmethurst
Copy link

I'm interested in implementing this for a use case we have at Klarrio, where we want to expose the Druid UI in a multi-tenant environment. Did anyone else start working on this already? What design steps does this need to go through to reach approval? Or should I just start working on it already and make a PR?

@pjain1
Copy link
Member Author

pjain1 commented Mar 28, 2020

@jon-wei sorry for the delayed response, I will try to put some basic code out describing the checks soon.

@tsmethurst I have been working on defining resource names for the state and config resources here - https://docs.google.com/spreadsheets/d/1fXD5n9gHIL0RbAoiFnu9s_2So2c2K4W8dtQZVeCaux8/edit?usp=sharing any comments/contribution to it would be welcome and speed things up.

@tsmethurst
Copy link

@pjain1 Thanks for sharing. I'm still looking through and familiarising myself with the code, but beginning to see where the necessary changes will have to be made. The google doc is very useful in terms of finding my bearings, but before I can make any really constructive suggestions I still need to work through and compare that against the API and the http part of the server code. Looks promising though, I think this is doable.

@pjain1
Copy link
Member Author

pjain1 commented Mar 31, 2020

@jon-wei @gianm @tsmethurst I have completed the google docs sheet with what I think would be the finer permissions, any comments would be useful. I have also added possible improvement in a column.

From implementation point of view, as you can see in the sheet there is a column listing new Resource Names, so I am thinking that there would be Resource Filters corresponding to these names and they would be applied to the endpoints. These new filters will extend the existing Config and StateResourceFilter but will pass in the resource name while doing the filter. This way any extension can also extend them and provide their own resource name if needed. Here's a prototype of what I am thinking - https://github.com/apache/druid/compare/master...pjain1:finer_permissions_poc?expand=1

One question from my side is do we need to keep it backwards compatible or is it ok ?

@tsmethurst
Copy link

tsmethurst commented Apr 1, 2020

@pjain1 I've gone ahead and implemented most of the changes for STATE on this branch of our Klarrio fork:
https://github.com/apache/druid/compare/master...Klarrio:granular_permissions?expand=1
Sharing this in the interest of getting feedback. Is that what you had in mind?

Note that my preferred way of doing this would not be to create a whole bunch of new ResourceFilter implementations with just the resource name changed. Suggestions on the best way of doing this are very welcome.

@pjain1
Copy link
Member Author

pjain1 commented Apr 1, 2020

Thanks @tsmethurst ! this looks good, that's what I had in mind. Apart from these, STATE permissions are also checked while accessing system schema through sql, so will have to apply similar fix here and also check if there are any other places like these (probably not).

However, generally just waiting for feedback from others in the community that this is the right way to go.

@tsmethurst
Copy link

@pjain1 quick update before the weekend: I've added the CONFIG stuff as well, and made a few other changes to avoid code duplication and hardcoded resource name strings. Feel free to have a look and see if you have any thoughts on how I've done it :)

https://github.com/apache/druid/compare/master...Klarrio:granular_permissions?expand=1

Enjoy the weekend!

@pjain1
Copy link
Member Author

pjain1 commented Apr 14, 2020

@tsmethurst hey this looks good, lets wait for a few days for comments from @jon-wei or any other community members, otherwise lets just a raise PR and get comments there.

@tsmethurst
Copy link

@pjain1 I've opened a Work-In-Progress PR here:

#9725

@jon-wei
Copy link
Contributor

jon-wei commented Apr 29, 2020

I took some time to review the proposal, currently documenting my thoughts and I'll have something to share soon

@jon-wei
Copy link
Contributor

jon-wei commented Apr 30, 2020

I've also been thinking about the Druid security model recently and reviewed the proposal and linked spreadsheet (the breakdown of existing endpoints is really helpful!), I have the following thoughts.

I think the security model should achieve the following goals:

  • Support clear boundaries between common "personas" (described more below)
  • Have an easy-to-understand permission model

For the first goal, from my experience working with customers that use Druid security features, users think in terms of "personas" when setting up permissions, for example:

  • "Datasource reader": needs permissions to query some set of datasources and nothing else
  • "Data engineer": Someone whose role involves loading data and managing it, they need access to ingestion, compaction, and other related APIs
  • "Cluster Admin": Highest level of access

For the second goal, I think we should keep the permissions model as simple as we can in a way that still supports the first goal.

The ideal result I would hope to achieve with those goals is to make it easy for a user to answer a question like "What permissions are needed for a given persona?"

Let me know if you think those goals make sense, or if there are other goals that you had in mind.


From the perspective above, I like that this proposal introduces more resource types.

I have two high level comments on the resources:

  • We should think about dropping the STATE/CONFIG distinction
  • It could make sense to condense some of the resource types

I think it's worth considering whether we want to keep STATE/CONFIG as high level resource categories going forward.

In my experience with the current model, there's sometimes significant overlap between STATE and CONFIG, for example, enabling/disabling workers is a CONFIG action, but checking whether a worker is enabled is a STATE action, and so it makes sense to assign those permissions together generally.

Another example could be checking the cluster properties, which I feel is arguably both STATE and CONFIG (i.e., the configuration is a kind of state).

The users I've worked with have also found that distinction somewhat hard to understand.


In some cases, I think we could simplify the model by pruning some of the resource types.

The example I had in mind was the coordinator load rules;

  • If the load rules are datasource specific, this could be represented as a SET_LOAD_RULES action on the DATASOURCE resource type instead.
  • If the load rules are cluster-wide, this could fall under a base admin-esque resource, similar areas might include coordinator dynamic configs related to segment balancing.

Structuring the resources that way could also make it easier to enumerate what all the possible actions on a datasource are.

Other areas for potential simplification:

  • Compaction settings, similar to the load rule example
  • WORKERS and SERVERS (since they're all Druid cluster processes)

There might be better approaches to these specific areas, I'm using the examples here more to show my thought process.


In general, I'm thinking about the permissions with an approach where I'm focusing more on the underlying resources/associated personas+workflows and less on the specific API.

Maybe it makes sense to start by defining the "core" resource types and then mapping the various endpoints to those core types.

==================================

Additionally, do you have any thoughts on how the new security model should handle input sources for ingestion?

Currently, someone who has DATASOURCE WRITE access can submit ingestion tasks with InputSources or Firehoses that can read from any resources that the Druid server is able to access (e.g., S3 buckets). This means you can't use permissions to create different "data engineer" roles that each have different input sources they're allowed to read from.

I think it'd be good to enable this somehow, maybe via input source whitelisting.

@pjain1
Copy link
Member Author

pjain1 commented May 2, 2020

@jon-wei thanks for the comments. Before doing this proposal I had exact same goal to enable users to setup different roles/personas as per their requirement which is currently not possible because of lack of finer permission model.

The permissions listed out in spreadsheet is for gathering feedback. I agree that we should strive to simplify the permission model. You mentioned that this proposal introduces more resource types, do you mean resource names ? as resource types are still the same.

To your point

Maybe it makes sense to start by defining the "core" resource types and then mapping the various endpoints to those core types.

New resource names were defined in the spreadsheet with this goal in mind. May be having resource type viz. STATE/CONFIG along with them is redundant ? I need to think more about this, it would be helpful if we start by taking one persona like Data engineer - write down types and names for the APIs that he will be using and define permissions for him in format Aciton:Type:Name(if we are keeping the format same).

For input source security, there can be two approaches -

  1. Somehow allow input source specific user credentials to be passed to downstream system from which data is read to enable do as functionality. This seems complex and intrusive.
  2. Use Druid permission model by introducing INPUTSOURCE Resource Type and corresponding Resource Name will be provided by implementations of InputSource interface. On task submission input source permission will be checked. I have a prototype code here - https://github.com/apache/druid/compare/master...pjain1:ingestion_security?expand=1#diff-40bbee918ab4b9fe487c02c4dfd1989d

@jon-wei
Copy link
Contributor

jon-wei commented May 7, 2020

For resource types and names, I'm thinking about them as such:

  • Resource types would be a category of entities/information being accessed, like DATASOURCE or SERVER
  • If the resource type has sub-items like DATASOURCE, then additionally it'd have resource names.

Using the examples in the spreadsheet, I would probably go with something like:

  • Removing STATE/CONFIG as resource types, I think they're redundant
  • Keep DATASOURCE as a resource type
  • What were formerly resource names under the STATE/CONFIG types would now just be top-level resource types
  • Maybe we could have a single blanket resource type for INTERNAL without finer-grained resource names, I don't think there'd be much reason to set fine-grained permissions on that level given that the internal APIs aren't meant to be called by users

I need to think more about this, it would be helpful if we start by taking one persona like Data engineer - write down types and names for the APIs that he will be using and define permissions for him in format Aciton:Type:Name(if we are keeping the format same).

That sounds good, it would probably help guide our thoughts on what the core resource types should be.

Somehow allow input source specific user credentials to be passed to downstream system from which data is read to enable do as functionality. This seems complex and intrusive.

Yeah, that does seem like it could be tricky to handle.

Use Druid permission model by introducing INPUTSOURCE Resource Type and corresponding Resource Name will be provided by implementations of InputSource interface. On task submission input source permission will be checked. I have a prototype code here - https://github.com/apache/druid/compare/master...pjain1:ingestion_security?expand=1#diff-40bbee918ab4b9fe487c02c4dfd1989d

That approach seems reasonable to me at a high level, I'll take a look at the prototype.

@suneet-s suneet-s mentioned this issue Jun 9, 2020
4 tasks
@pjain1
Copy link
Member Author

pjain1 commented Jun 12, 2020

@jon-wei I have removed config and state and added following resource types - Server, Lookup, Compaction, Internal and Inputsource.

  • Endpoints which druid nodes uses to talk to each other are covered by INTERNAL
  • Endpoints related to changing node config or getting some status of server are covered by SERVER
  • Securing Inputsource at druid side can be done using INPUTSOURCE resource type, however I am not sure of this approach yet, I was thinking we should somehow expose do-as functionality where user's credentials can be used. I believe reindexing from druid should be covered by read on datasource permission.
  • Lookup and Compaction configs are protected by LOOKUP and COMPACTION resources.

Please have a look at Proposed State resources and Proposed Config resources tab in sheet here - https://docs.google.com/spreadsheets/d/1fXD5n9gHIL0RbAoiFnu9s_2So2c2K4W8dtQZVeCaux8/edit#gid=301120055

@pjain1
Copy link
Member Author

pjain1 commented Jun 15, 2020

Changed few things regarding how Inputsources are secured and compaction endpoints in the google sheet

@jon-wei
Copy link
Contributor

jon-wei commented Jun 23, 2020

@pjain1 Sorry I missed your recent comments, I was on vacation at the time, I will review the updates tomorrow.

@jon-wei
Copy link
Contributor

jon-wei commented Jul 14, 2020

@pjain1

I had a few specific comments on the spreadsheet items:

  • For treating Lookups as Datasources, I could see one day that these concepts have less distinction, but I think we currently allow for lookups and datasources to share the same name, so I think they'd need to be separate resources for now
  • The GET for GET /druid/coordinator/v1/config/compaction is filtered based on READ DATASOURCE permissions, but I think we may want to somehow represent that differently. The compaction stuff is more of a write/datasource management thing, and I think a read-datasource-only user shouldn't necessarily have access to compaction configs.
  • Similarly, I think system schema tasks/supervisors and coordinator load rule GETs shouldn't be represented as READ DATASOURCE.
  • For LOOKUP INTERNAL type resources, could that just be INTERNAL/INTERNAL?

I also have a more general question: What would the migration process to a version with the new resource definitions look like for users?

@pjain1
Copy link
Member Author

pjain1 commented Jul 17, 2020

For both compaction and load rule I thought it more intuitive to bind read with read and write with write. Should we just add a new resource type like DATASOURCE-CONFIG for representing both compaction and load/drop rules ?

Regarding lookup, they can be changed to INTERNAL INTERNAL I think.

Migration plan doesn't seem very seamless will need to think more about it.

@pjain1
Copy link
Member Author

pjain1 commented Aug 19, 2020

@jon-wei for migration we can have a flag to switch between older and newer security model, once they are ready they flip the switch. What do you think ? Can you also reply on previous comment.

@jon-wei
Copy link
Contributor

jon-wei commented Aug 20, 2020

@pjain1

For both compaction and load rule I thought it more intuitive to bind read with read and write with write. Should we just add a new resource type like DATASOURCE-CONFIG for representing both compaction and load/drop rules ?

I think a new resource type for that is fine, perhaps DATASOURCE WRITE could also be rolled into this new combined permission (ingest, set up compaction, and manage the segment distribution, is there anything else that would make sense here?).

for migration we can have a flag to switch between older and newer security model, once they are ready they flip the switch.

Prior to the switch, how would the old and new model coexist (for example, how the user would define permissions for the new model)? Maybe we could separate the old and new authorization DB (like use a separate metadata table or some kind of namespacing), I imagine it could be confusing for the user if permissions for both models exist at the same time and place.

@pjain1
Copy link
Member Author

pjain1 commented Sep 7, 2020

@jon-wei I think its a good suggestion to namespace new permissions perhaps like prepending v1 so for example read datasource will look like - v1:read:datasource:*. The new permissions can stay in namespace or other depending on the auth implementation, so the users will define the new permissions in their auth system and assign it to users. Once all the new permissions are defined they can set a flag in properties file which will tell Druid to use new permission model. This will make it seamless for users to upgrade, the only condition would be to flip the switch only after all the nodes in cluster are running the latest code.

@pjain1
Copy link
Member Author

pjain1 commented Sep 7, 2020

@jon-wei Upon thinking more it seems like why do we even need to namespace, users can add new permissions in the auth system that they are using and add them to roles as well. Then they can upgrade the code which will still keep on using the older set of permissions as the users will now have both old and new ones. Once they flip the switch the code will only consider newer set of permissions and then the older ones can be removed from auth system.

@pjain1
Copy link
Member Author

pjain1 commented Oct 27, 2020

@jon-wei read your comment again, your concern is that it might be confusing if both models exists at same place but creating namespaces totally depend on the extension implementation. Druid core can just indicate to the extension that either newer or older model to be used, basic extension in core can be modified to use namespaces.

@jon-wei
Copy link
Contributor

jon-wei commented Oct 28, 2020

@pjain1 Ah yeah, the potential confusion was my main concern.

creating namespaces totally depend on the extension implementation. Druid core can just indicate to the extension that either newer or older model to be used, basic extension in core can be modified to use namespaces.

That's a good point, I think that plan makes sense.

@pjain1 pjain1 mentioned this issue Nov 10, 2020
5 tasks
@pjain1
Copy link
Member Author

pjain1 commented Nov 10, 2020

@jon-wei #10571

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants