-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REST: honor OAuth config sent by the server #10256
Conversation
As mandated by the spec, in case the config endpoint overrides any auth-specific property, the overridden values should be taken into account when creating the catalog auth session instance.
credential = mergedProps.get(OAuth2Properties.CREDENTIAL); | ||
scope = mergedProps.getOrDefault(OAuth2Properties.SCOPE, OAuth2Properties.CATALOG_SCOPE); | ||
oauth2ServerUri = | ||
mergedProps.getOrDefault(OAuth2Properties.OAUTH2_SERVER_URI, ResourcePaths.tokens()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good idea to get certain configs from server. Thanks for doing this. Should we add these into config endpoint spec if we are doing this? Without something in the spec, we are relying on the convention. There is a minor concern that the config response from the server side may happen to have the same name with different meaning. For example, a property named scope
in the config response may mean different things. cc @danielcweeks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @flyrain thank you so much for reviewing!
Regarding the risk of config name clashes, most properties have a prefix such as s3.
, adls.
, http-client.
, ecs.
, etc. This prefix makes name clashes rather unlikely, while also clearly indicating the "scope" the property belongs to.
Unfortunately, none of the properties in OAuth2Properties
have such a prefix, which makes them rather prone to misinterpretation indeed.
This is imho a bit out of the scope if this PR, but I agree with you that we could probably do something about it, e.g. introduce a prefix auth.
that the client would then automatically "map" to properties in OAuth2Properties
. But we'd still need to support "unprefixed" properties for a long time I'm afraid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a blocker for me. The current config response is a flat key-value pairs. I think it works well since they are only for client side catalog configs. We can bring more structures into to it in the future, as you suggested, e.g., a prefix auth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think allowing override of the the auth server uri is a good idea. The default should either be the spec'ed auth server path or something that is controlled by the client. This would allow the client to be redirected to a server that they would not expect or be in control of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default should (...) be the spec'ed auth server path
I strongly disagree with this, the /v1/oauth/tokens
should, as I mentioned above, have really never become part of the Iceberg REST spec:
- Each client passes the just base64 encoded credentials to that endpoint - that proxies requests to something you don't know - and gets some token back. This means, the resource knows your credentials.
- It is the opposite of what OAuth addresses:
OAuth addresses these issues by introducing an authorization layer and separating the role of the client from that of the resource owner.
- see RFC 6749.
@@ -215,6 +215,12 @@ public void initialize(String name, Map<String, String> unresolved) { | |||
this.paths = ResourcePaths.forCatalogProperties(mergedProps); | |||
|
|||
String token = mergedProps.get(OAuth2Properties.TOKEN); | |||
// re-resolve these variables in case they were overridden by the config endpoint | |||
credential = mergedProps.get(OAuth2Properties.CREDENTIAL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is weird to me that the client gets the credential from server's config endpoint, which doesn't need authentication. Does that mean any client can visit the REST catalog? I think we still need a client to provide its own credential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that the client can already get a token
from the server's config endpoint, so why not a credential
? :-)
Taking a step back, I'd like this PR to focus on what the client is supposed to do with configuration received from the server, not what the server is supposed to send. For that, there is an ongoing discussion here where @jbonofre and others suggest many changes, e.g. the introduction of a "handshake" or "pre-flight" endpoint to facilitate client auto-configuration, including auth properties. But server-side changes will take a lot more time.
And back to the client side of things: I think it makes sense for the client to apply all configuration received from the server. It's then up to the server to decide which options make sense to send, and which don't.
One important property I'd like to see properly handled is oauth2-server-uri
. I'd also like to see the map optionalOAuthParams
correctly handled, since it may contain more important config properties in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think line 217 is to get the token from client config other than from the server config response. mergedProps
mixes properties from both client config and server config response.
Also I believe a client should only get tokens either from users' config or from token server URI response. Config endpoint shouldn't be used for passing tokens. Otherwise, every client may use the same token/credentials, which doesn't make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would second @flyrain's original comment that the client should not be exchanging credentials via this path. The client credentials flow is for auth "previously arranged with the authorization server". The token should be used if returned by the server, but credentials should client controlled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The /v1/config
endpoint - titled as "List all catalog configuration settings" - seems to require authentication. Catalog configuration settings can already contain sensitive information, or information that you do not want everybody to know (warehouse location for example) - so it makes a lot of sense to require authentication for that endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the config endpoint seems to already return sensitive information - in line 217 above (String token = mergedProps.get(OAuth2Properties.TOKEN);
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to require authentication.
There is no requirement that use of the REST protocol requires auth. For people operating their own internal REST implementations, they can choose to support auth or not.
the config endpoint seems to already return sensitive information
The config endpoint may return sensitive information or not, but I won't recommend returning sensitive information if the communication channel is not secure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resource (here: Iceberg REST endpoint) is (but should have really never been) the de-facto authentication server, because of the existence of the existence of the /v1/oauth/tokens
endpoint. So the argument that the config-resource should not return credentials feels odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And yet, that is exactly what is being done here, but from the tables endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I don't think credentials
or oauthServerUri
should be allowed to be overridden by the server. These should be client controlled configurations, which only leaves scope
. The current reference implementation only includes the catalog
scope, but I'm open to that being configurable (though I'm not convinced it's really necessary as a server side control).
Maybe the confusion is due to the I understand @snazy points, but related to the current implementation (as @danielcweeks said), it's confusing. If I may, I would propose to:
Thoughts ? |
+1 |
Not a fan of leaving Also what about If |
I'm surprised by this PR because I don't think that the auth properties should be overridden by a REST service. I'm not sure about it, but it sounds like @snazy seems to agree when he says that the OAuth2 endpoint shouldn't be part of the REST spec, when @flyrain points out that you should authenticate before hitting the config endpoint, and @danielcweeks points out that credentials are specifically called out in OAuth2 as client controlled. @adutra, what are you trying to accomplish here? Is this needed for some use case that you could, perhaps, explain? Right now, I see the assertion that the REST spec requires this, which I don't agree with (but maybe?) and also @adutra saying things like "I think it makes sense for the client to apply all configuration received" -- is that the basis for this? I think I need to understand the motivation. Otherwise, I think the argument that auth (including the auth endpoint) should be client-side configuration only. This could introduce a way to get the client's credential -- that is, secret -- and that's something we should be very careful about. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should send back the credential
to the client via the config
endpoint by default (as already discussed, we should have a dedicated endpoint like handshake
).
Correct me if I'm wrong, but it's already possible to get client credentials via v1/oauth/token
endpoint (and token).
So basically client should provide only credentials (from client to server), and server returns token. Then client should only use token.
I understand the concern about the server returning the credentials, but it should not be the default.
IMHO, the main issue is that REST Catalog itself should not do authentication by itself, it should support auth token, but the authentication should be dedicated to another layer (okta, specific endpoint, ...).
We have to be very careful in exchange credentials token introduced in #4771 as it introduced a way to get the client's credentials.
So, I would propose:
- Can we evaluate the effort to "decoupling" authentication from token use ? Like introducing a configuration with location of the authentication service (it could be the
oauth/token
endpoint or third party like okta) - Can we remove getting client's credentials from token from
oauth/token
endpoint ? It can introduce bad usage/oauth flow imho. - If 1 and 2 are do-able, we don't need this PR as it can be tackled. If not do-able (please provide reasons), then we can introduce a scope configuration in this PR to returns credentials (even if I'm not a big fan 😄 ).
Thoughts ?
@rdblue I have no other intention than the one of bringing the spec in line with the current implementation. There are two areas where spec and implementation differ with respect to authentication:
As we've seen, this is currently not true for many authentication properties, including
Note that All of this tends to suggest that authentication is not handled consistently across the board. The Such inconsistencies are confusing and make it hard to reason about the system. My intention was to bring the spec in Let me finish by addressing the security concerns you raised. While I would generally agree with you that authentication properties should not be shared between client and resource This brings me to the tacit conclusion that client and server must form a sort of trusted pair, functioning in a trusted |
I don't believe this is a safe and raises a huge security concern for me. A client should never be in a situation where it is sending credentials or valid tokens to an server that the client did not explicitly designate. This change allows redirecting the auth server which should expose sensitive information to the wrong party. |
Why should a (malicious) Iceberg REST endpoint do the more complex redirect-dance, if it can get the nearly clear-text credentials due to the |
@snazy Users can override the oauth server, so for example if they set the auth server to be an Okta endpoint and they have credentials to authenticate with Okta, the resulting token from the client credential flow would go to the REST server (this is ok). However, if the REST server then redirects the client somewhere else, any subsequent operations (included additional credential flows) would send those credentials to a second party (this is not ok). The rest server should not be redirecting a client configured auth server. That's not safe. The client should be fully in control of which auth server it uses. |
@adutra, thanks for noticing and pointing out the inconsistencies. I do think it's important to address those. In particular, I did not realize that the table session that was introduced recently used the (I also understand your reading of the We need to address these cases, but I think that this PR goes the wrong direction: instead of allowing auth properties from the config and table routes, we actually need to be more strict. A table session should never use a credential from the catalog service. I think it would help if we had a common understanding of the difference between a token and a credential. A credential is a pre-established ID and secret pair that is used for authentication. A credential is controlled and owned by the client and is typically long-lived, like a username/password. A token is similar because it can be used for authentication (i.e. bearer auth) but tokens are short-lived and can contain a lot more information added by the service that is validated by signing the token. In the REST protocol with OAuth2, Another reason why tokens are handled differently is that tokens can contain additional information that is signed by the service. That additional information can be used to check something once and use the result for the lifetime of the token. When designing the REST protocol, we thought that it was likely that tokens would be used for this purpose, so we built a couple of ways to exchange an auth token (from OAuth2) for a more specific or narrow token. For example, the service could verify that the client is not coming from a blocked IP range. Checking IP ranges is expensive, so you'd want to do it once and produce a token that shows the incoming IP has been validated. As long as the token's IP claim is the same as the incoming IP, you can trust that the validation was done and not repeat the expensive check. As you can see, there's a good reason to handle For this PR, the question is how to handle
In the end, allowing these to be overridden by catalog endpoints would weaken security and introduce extra requests (keep requirements simple!). I don't think it is a good idea to move forward with this PR. That said, good job catching that |
Hi @rdblue, thank you for your detailed answer. I am really sorry that this PR, that I thought would be a fairly consensual one, eventually cracked open a can of worms that I did not intend to open. I'm pleased to realize that, in spite of some apparent divergences, we all seem to agree that the current handling of authentication suffers from a few loopholes – such as the That being said, I agree to close this PR and open another one to fix the table routes instead. There remains, however, the broader question, mentioned a few times above, of a better separation of concerns between the conceptual OAuth 2.0 roles of "authorization" and "resource" servers. But this PR has never had the ambition of tackling that, so I'd like to request that we resume that discussion later in a more appropriate place, like our mailing list. |
See apache#10256 for context. This change disallows overriding the "credential" property in table sessions, by introducing an allow-list of auth-related properties that can be overridden in such situations. Only the "token" property and properties used to exchange one token for another ("urn:ietf:params:oauth:token-type:*") are now allowed.
See apache#10256 for context. This change disallows overriding the "credential" property in table sessions, by introducing an allow-list of auth-related properties that can be overridden in such situations. Only the "token" property and properties used to exchange one token for another ("urn:ietf:params:oauth:token-type:*") are now allowed.
See #10256 for context. This change disallows overriding the "credential" property in table sessions, by introducing an allow-list of auth-related properties that can be overridden in such situations. Only the "token" property and properties used to exchange one token for another ("urn:ietf:params:oauth:token-type:*") are now allowed.
See apache#10256 for context. This change disallows overriding the "credential" property in table sessions, by introducing an allow-list of auth-related properties that can be overridden in such situations. Only the "token" property and properties used to exchange one token for another ("urn:ietf:params:oauth:token-type:*") are now allowed.
As mandated by the spec, in case the config endpoint overrides any auth-specific property, the overridden values should be taken into account when creating the catalog auth session instance.