Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Assume issued_token_type is access_token to fully comply with RFC 6749 #10314

Merged
merged 3 commits into from
Jul 4, 2024

Conversation

adutra
Copy link
Contributor

@adutra adutra commented May 11, 2024

The REST client wrongly assumes that the issued_token_type field is present in all OAuth responses, but that isn't true: e.g. in the client_credentials flow, this field is undefined. See RFC 6749, section 4.4.3:

https://datatracker.ietf.org/doc/html/rfc6749#section-4.4.3

This causes the client to crash when creating a tokens exchange request, since the issued token type becomes the request's subject token type, which is mandatory.

This has been verified against a Keycloak 24.0 server.

This change fixes this issue by assuming that the issued token type is an access token, if the response did not specify any token type.

This change also fixes RESTCatalogAdapter: it was incorrectly including the issued_token_type field in client_credentials responses, thus masking many test failures, e.g. in testCatalogTokenRefresh.

@@ -763,11 +763,19 @@ private static AuthSession fromTokenResponse(
long startTimeMillis,
AuthSession parent,
String credential) {
// issued token type is not present in every OAuth2 response:
// assume type is access token if none provided.
// See https://datatracker.ietf.org/doc/html/rfc6749#section-4.4.3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adutra do you maybe know if Keycloak fully supports RFC 8693? The token exchange follows https://www.rfc-editor.org/rfc/rfc8693#name-successful-response, where issued_token_type is required

Copy link
Contributor

@nastra nastra May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found keycloak/keycloak#26502, which states unfortunately that Keycloak deviates from the standard.
I can see that we might add the null check as a workaround for such cases where the auth server doesn't send back an issued_token_type but I'd like to first see what other people in the community think about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, Keycloak does not fully implement it. Quoting from their Securing Applications and Services Guide:

Token exchange in Keycloak is a very loose implementation of the OAuth Token Exchange specification at the IETF. We have extended it a little, ignored some of it, and loosely interpreted other parts of the specification. It is a simple grant type invocation on a realm’s OpenID Connect token endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the problem here is wider: issued_token_type is only defined for the token exchange flow (RFC 8693). This field does not exist for standard flows from RFC 6749. So the following scenario can happen even with fully-compliant servers:

  1. client uses client_credentials to authenticate initially;
  2. server sends a successful response similar to this one; note that it does not contain issued_token_type, which is normal since it's not defined for this flow.
  3. client attempts to refresh the token using the token exchange flow;
  4. client throws IAE because subjectTokenType is null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nastra also see this comment which gives some context on why we have the problem today: #4771 (comment)

@@ -254,7 +254,6 @@ private static OAuthTokenResponse handleOAuthRequest(Object body) {
case "client_credentials":
return OAuthTokenResponse.builder()
.withToken("client-credentials-token:sub=" + request.get("client_id"))
.withIssuedTokenType("urn:ietf:params:oauth:token-type:access_token")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably leave this as-is, since issued_token_type is required according to RFC 8693

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree: this flow is not governed by RFC 8693, this is RFC 6749 section 4.4.

@adutra
Copy link
Contributor Author

adutra commented May 27, 2024

Hi @nastra any updates on this PR?

FYI here is an example of error from a Spark SQL session connected to REST catalog + external auth server:

24/05/26 21:39:19 WARN Tasks: Retrying task after failure: Invalid token type: null
java.lang.IllegalArgumentException: Invalid token type: null
        at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:218)
        at org.apache.iceberg.rest.auth.OAuth2Util.tokenExchangeRequest(OAuth2Util.java:244)
        at org.apache.iceberg.rest.auth.OAuth2Util.tokenExchangeRequest(OAuth2Util.java:235)
        at org.apache.iceberg.rest.auth.OAuth2Util.refreshToken(OAuth2Util.java:140)

@nastra
Copy link
Contributor

nastra commented May 27, 2024

@adutra I'm OOO this week, maybe @amogh-jahagirdar gets a chance to look at this PR

@adutra
Copy link
Contributor Author

adutra commented Jun 4, 2024

@nastra or @amogh-jahagirdar is it possible for you to have a another look here please? Thanks 🙏

The REST client wrongly assumes that the `issued_token_type`
field is present in all OAuth responses, but that isn't
true: e.g. in the `client_credentials` flow, this field is
undefined. See RFC 6749, section 4.4.3:

https://datatracker.ietf.org/doc/html/rfc6749#section-4.4.3

This causes the client to crash when creating a tokens
exchange request, since the issued token type becomes the
request's subject token type, which is mandatory.

This has been verified against a Keycloak 24.0 server.

This change fixes this issue by assuming that the issued
token type is an access token, if the response did not
specify any token type.

This change also fixes `RESTCatalogAdapter`: it was
incorrectly including the `issued_token_type` field in
`client_credentials` responses, thus masking many test
failures, e.g. in `testCatalogTokenRefresh`.
@adutra
Copy link
Contributor Author

adutra commented Jul 3, 2024

@nastra @amogh-jahagirdar This PR has been rebased. Could I get a review please? 🙏

@nastra
Copy link
Contributor

nastra commented Jul 3, 2024

@adutra IMO https://github.com/apache/iceberg/pull/10314/files#r1601220154 still needs to be reverted. It's ok to add a workaround for Keycloak but I don't see a good enough reason to adjust what the server (aka RESTCatalogAdapter is sending back)

@adutra
Copy link
Contributor Author

adutra commented Jul 3, 2024

@adutra IMO https://github.com/apache/iceberg/pull/10314/files#r1601220154 still needs to be reverted. It's ok to add a workaround for Keycloak but I don't see a good enough reason to adjust what the server (aka RESTCatalogAdapter is sending back)

@nastra I'm sorry but I really don't agree. This has nothing to do with Keycloak. This is not a workaround. This has to do with complying with the OAuth 2.0 spec.

Have you read the 5.1 section of RFC 6749? Quoting here so that we are on the same page:

   The authorization server issues an access token and optional refresh
   token, and constructs the response by adding the following parameters
   to the entity-body of the HTTP response with a 200 (OK) status code:

   access_token
         REQUIRED.  The access token issued by the authorization server.

   token_type
         REQUIRED.  The type of the token issued as described in
         [Section 7.1](https://datatracker.ietf.org/doc/html/rfc6749#section-7.1).  Value is case insensitive.

   expires_in
         RECOMMENDED.  The lifetime in seconds of the access token.  For
         example, the value "3600" denotes that the access token will
         expire in one hour from the time the response was generated.
         If omitted, the authorization server SHOULD provide the
         expiration time via other means or document the default value.

   refresh_token
         OPTIONAL.  The refresh token, which can be used to obtain new
         access tokens using the same authorization grant as described
         in [Section 6](https://datatracker.ietf.org/doc/html/rfc6749#section-6).

   scope
         OPTIONAL, if identical to the scope requested by the client;
         otherwise, REQUIRED.  The scope of the access token as
         described by [Section 3.3](https://datatracker.ietf.org/doc/html/rfc6749#section-3.3).

As you can see there is no issued_token_type field. We therefore must comply with that, and be prepared to receive responses that do NOT contain that field.

A client that breaks because that field is missing is not a OAuth2-compliant client.

By returning an unspecified field for a client_credentials grant, RESTCatalogAdapter is also infringing the OAuth 2.0 spec.

@nastra
Copy link
Contributor

nastra commented Jul 3, 2024

@adutra I've read RFC 6749 but as I mentioned in #10314 (comment) I'm assuming that the token exchange is following RFC 8693 (and my assumption can be wrong). It would be good to get confirmation on that though.

@adutra
Copy link
Contributor Author

adutra commented Jul 3, 2024

@adutra I've read RFC 6749 but as I mentioned in #10314 (comment) I'm assuming that the token exchange is following RFC 8693 (and my assumption can be wrong). It would be good to get confirmation on that though.

Sure, but RFC 8693 is not applicable here. All depends on the grant type being used. If the grant type is urn:ietf:params:oauth:token-type:access_token then yes, RFC 8693 applies, and yes, there is an issued_token_type field in the response, and yes it is a required field.

But if the grant type is client_credentials then RFC 6749 applies instead, and the field is not part of it at all.

@adutra
Copy link
Contributor Author

adutra commented Jul 3, 2024

@rdblue since you originally authored this part (#4771 (comment)) would you mind having a look as well? Thank you!

@snazy
Copy link
Member

snazy commented Jul 3, 2024

@nastra Alex is 100% right. I do not understand why it takes so long to review this fix for an OAuth spec-compliance.

@danielcweeks
Copy link
Contributor

I think the application of extensions referenced in RFC 8693 are a little ambiguous due to the following:

RFC 6749 section 4.1 references the response described in section 5.1

RFC 8693 describe the extensions to what is defined in the response to that same section to expand upon the format of what the access_token field contains. It states in the section 2.2.1:

The identifier access_token is used for historical reasons and the issued token need not be an OAuth access token.

I believe that the intent includes that a client credential exchange could return any of the enumerated token types defined in section 3 and applies.

I don't think it's explicitly clear either way, but I would interpret it as the latter.

Either way, I don't think it's a huge issue to default to ensure RFC 6749 compatibility, but we should just add a note as to why we're defaulting (e.g. "defaulting issued_token_type to access_token for compatibility with RFC 6749 where the issued type is omitted).

@snazy
Copy link
Member

snazy commented Jul 3, 2024

@nastra requested to revert @adutra 's fix for the client-credentials flow, only RFC 6749 applies here. This part of the code has nothing to do with RFC 8693 (token-exchange). The request to revert is IMO not justified, because the flow in question does not specify it.

@adutra
Copy link
Contributor Author

adutra commented Jul 3, 2024

I believe that the intent includes that a client credential exchange could return any of the enumerated token types defined in section 3 and applies.

RFC 8693 builds on top of RFC 6749, but does not modify any of its structs. And how could it be otherwise? An RFC cannot modify another one's structs without officially superseding it. Imho it is wrong to read section 2.2.1 of RFC 8693 as a general rewrite of RFC 6749 section 5.1, valid for all grants. The correct reading is: section 2.2.1 expands section 5.1 by adding extra context, in the scope of a token exchange grant only.

And anyways, that's how all public OAuth 2.0 servers interpret it: none of them include the field issued_token_type in a client_credentials grant response, even if they support RFC 8693.

TLDR is: if you want Iceberg REST to be interoperable with any public OAuth 2.0 server, this PR needs to be in.

@danielcweeks
Copy link
Contributor

@adutra

An RFC cannot modify another one's structs without officially superseding it.
. . . RFC 8693 as a general rewrite of RFC 6749 section 5.1

I'm not suggesting that this is a general rewrite, but was more interpreting that RFC 8693 as an extension that can apply to any access_token grant type response.

. . . in the scope of a token exchange grant only.

I think this is the best argument and I agree it makes any argument for issued_token_type for other grant flows to be weak.

And anyways, that's how all public OAuth 2.0 servers interpret it: none of them include the field issued_token_type in a client_credentials grant response, even if they support RFC 8693.

Maybe you misinterpreted my message, but I actually support adding this. I couldn't find any examples and I also understand that RFC 8693 isn't widely adopted, so we may run into a number of issues with existing implementations out there.

Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
@nastra nastra changed the title REST: assume issued token type is access token Core: Assume issued_token_type is access_token to fully comply with RFC 6749 Jul 4, 2024
@nastra nastra merged commit 4048987 into apache:main Jul 4, 2024
49 checks passed
@adutra
Copy link
Contributor Author

adutra commented Jul 4, 2024

Thank you to all involved! I'm very glad that we could reach a consensus here 🙏

@adutra adutra deleted the issued-token-type branch July 4, 2024 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants