Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flux oci support 2 #4932

Merged
merged 13 commits into from
Jun 22, 2022
Merged

Conversation

gfichtenholt
Copy link
Contributor

@gfichtenholt gfichtenholt commented Jun 17, 2022

This is an incremental checkin towards flux plug-in OCI support. I wanted to
checkpoint before the changes "get out of hand". All tests are passing
Working slower than usual due to poor health. Hope to improve that by end of next week.

Two serious unsolved questions remain:

  1. if repo is associated with SecretRef, my runtime code gets an error
Failed to read secret for repo due to: rpc error: code = PermissionDenied 
desc = Forbidden to get the secret 'helm-podinfo' due to 'secrets "helm-podinfo" 
is forbidden: User "system:serviceaccount:kubeapps:kubeapps-internal-kubeappsapis" 
cannot get resource "secrets" in API group "" in the namespace "default"'

This is true for both regular helm and helm OCI repos. Since the code runs in the context of
kubeapps-internal-kubeappsapis service account, to workaround I have a hack YAML
to put that service account into cluster admin role. Also, I see that flux source-controller itself does
not have this issue and I have yet to conclusively figure out why. Didn't spend enough
time debugging, just wanted to move on

Update: @dlaloue-vmware found the reason why flux source-controller does not run into the issue:
https://fluxcd.io/docs/security/#controller-permissions
Their solution is to run the controller as cluster-admin. Which maybe fine for them but not for us: #3899 (comment)
To be continued...

  1. OCI repository list (i.e. the catalog/GetAvailablePackageSummaries). OCI spec does
    not provide for it. No one's solved this yet to my knowledge, Sigh. Flux does not solve it, but rather
    very carefully dances around it which, honestly, left me disappointed. Also, the way it is done,
    it limits my options because the flux's HelmRepository CR does not have a place to hang a list even if we
    put the burden on the user to provide it (like we do with helm direct)
    @dlaloue-vmware suggested we solve it just for a few biggest OCI registry providers that we know of
    (e.g. harbor, github) using vendor-specific API endpoints while leaving the rest (Gcp, Azure, AWS ECR, OpenShift, Oracle OCI, kind, Rancher , etc. as an exercise for later.
    My research so far:
  • Harbor here. Verified
  • Amazon ECR
  • GitHub here. I verified it works. Too well, I think. By that I mean (pages and pages, thousands and thousands of repos) are returned. I am guessing all the repos my username (gfichtenholt) has read access to. See first page here The response MUST be filtered out on the server-side, but I don't see an API for it. Also, even if there were an API for it, say a query param or a transport header, we'd need to figure out where to hang it. Again, Flux CR HelmRepository doesn't have a place for it.

Don't know about others yet for sure. The general idea, if we have it working, that would be a really good value-add for kubeapps because no one else has it. Anyway, I am still thinking about how to do it cleanly.

Anybody and everybody is invite to review. Thanks in advance

@netlify
Copy link

netlify bot commented Jun 17, 2022

Deploy Preview for kubeapps-dev canceled.

Built without sensitive environment variables

Name Link
🔨 Latest commit d314d46
🔍 Latest deploy log https://app.netlify.com/sites/kubeapps-dev/deploys/62b26e178be3ec000807b962

@absoludity
Copy link
Contributor

This is an incremental checkin towards flux plug-in OCI support. I wanted to checkpoint before the changes "get out of hand". All tests are passing Working slower than usual due to poor health. Hope to improve that by end of next week.

Oh no - hope you're starting to feel better.

Two serious unsolved questions remain:

1. if repo is associated with SecretRef, my runtime code gets an error
Failed to read secret for repo due to: rpc error: code = PermissionDenied 
desc = Forbidden to get the secret 'helm-podinfo' due to 'secrets "helm-podinfo" 
is forbidden: User "system:serviceaccount:kubeapps:kubeapps-internal-kubeappsapis" 
cannot get resource "secrets" in API group "" in the namespace "default"'

This is true for both regular helm and helm OCI repos. Since the code runs in the context of kubeapps-internal-kubeappsapis service account, to workaround I have a hack YAML to put that service account into cluster admin role.

Yes, since the flux controller work began, we've chatted about the fact that up until now, Kubeapps has always been able to use the user's credential because actions were synchronous. But with packaging systems that reconcile in the background (asynchronously from user actions), such as Carvel and Flux, an extra credential is required - such as the service account that is associated with the installed package. That's all well and good, but for flux we also need to be able to manage package repositories asynchronously, so in the past you've added the extra RBAC and service account for flux which enables the controller to access Flux HelmRepositories (and CRDs) without users. This elevates the privileges the service has when configured with fluxv2.

So one option here would be to add the RBAC to read secrets there as well. Except, well, Flux isn't alone here: even the existing standard helm plugin needs to access AppRepositories asynchronously so always had RBAC to do so, and so when private app repositories were added, which reference secrets, we added a horrible hack to copy the secret so Kubeapps was able to index the repository, rather than enabling Kubeapps to read secrets generally across namespaces (we did this as a temporary work-around but planned longer term for the repository sync to happen in the user namespace). Now, in retrospect and with the flux use-case, rather than adding the read-secrets RBAC for flux specifically, perhaps that should instead be part of Kubeapps' default RBAC, given that it needs to access app repositories (in their different forms) across namespaces when they include secret credentials (and we could remove the horrible hack). What do you think? (not asking you to do that work - I'd create a separate issue for removing the hack, you could just add the RBAC so kubeapps-apis can read secrets across namespaces, if you agree).

2. OCI repository list (i.e. the catalog/GetAvailablePackageSummaries). OCI spec does
   not provide for it. No one's solved this yet to my knowledge, Sigh.

Yes, this has been an issue since the OCI spec was released without it solved, and everyone implemented their custom API around it :/

Flux does not solve it, but rather
very carefully dances around it which, honestly, left me disappointed. Also, the way it is done,
it limits my options because the flux's HelmRepository CR does not have a place to hang a list even if we
put the burden on the user to provide it (like we do with helm direct)

Not that you should, but an annotation could be (ab)used to hint or specify the repositories?

   @dlaloue-vmware  suggested we solve it just for a few biggest OCI registry providers that we know of
   (e.g. harbor, github) using vendor-specific API endpoints while leaving the rest (Gcp, Azure, AWS ECR, OpenShift,  Oracle OCI, kind, Rancher , etc. as an exercise for later.

...

Don't know about others yet for sure. The general idea, if we have it working, that would be a really good value-add for kubeapps because no one else has it. Anyway, I am still thinking about how to do it cleanly.

Yeah - it'd be neat, but it's a project in itself, providing that compatibility layer, with a potentially high maintenance burden. It was (from memory) one of the options at the time when the OCI support was added to Helm, but given a standard for listing repositories was not available at the time, the decision to simply specify the repositories that would be presented was taken.

I see that now the OCI Registry As a Storage (ORAS) project has a v2 experimental version with support for listing repositories - it may be worth a look. ORAS tends to develop the new features that then may be later added to the OCI distribution spec. As it is, it looks like the v2 experimental stuff still just that: experimental.

See what you think. Personally feels like it may not be worth investing in plugging that hole and instead using a work-around, but you might have other ideas.

Anybody and everybody is invite to review. Thanks in advance

I'll take a look now :)

@gfichtenholt
Copy link
Contributor Author

gfichtenholt commented Jun 21, 2022

Thanks, Michael.
re: perhaps that should instead be part of Kubeapps' default RBAC, given that it needs to access app repositories (in their different forms) across namespaces when they include secret credentials
+1

re: Not that you should, but an annotation could be (ab)used to hint or specify the repositories?
That won't solve all use cases. We can do an annotation when kubeapps creates the CRD. But when the end user bypasses kubeapps and uses flux CLI or kubectl directly, we can't.

re: Yeah - it'd be neat, but it's a project in itself, providing that compatibility layer,
I don't mind it so far. I love an interesting challenge, and this one fits the bill. Credit to @dlaloue-vmware he argued very nicely to look at this as an opportunity to add value to kubeapps rather than be a minus.

re: Personally feels like it may not be worth investing in plugging that hole and instead using a work-around.
If there is a reasonable workaround, I would certainly consider it.

limitations under the License.
*/

// a copy of fluxcd source-controller internal/transport/transport.go
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's just this one file for the transport that's been copied in: I reckon it might be worth a comment here why we're deciding to copy this file into here (why the functionality is important enough - probably obvious to you, but won't be to other contributors until they dig in). Also, I think you already checked last time, but is there a way to convince the flux project to move this (one) package out of their internal packages?

Copy link
Contributor Author

@gfichtenholt gfichtenholt Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it implements pooling. I.e. it is goodness
I am certainly not trying to blaze any trails with this

// Code from Helm Registry Client. Copied here since it belonged to a internal package.

re: is there a way to convince the flux project to move this (one) package out of their internal packages
Egghh, how would I start this conversation? What can I offer to flux guys to make it worth the effort of changing what's there? There has to be something in it for flux guys in what I am suggesting. Let me sleep on it....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add some words to clarify why I chose to copy

// This Secret may either hold "username" and "password" fields or be of the
// apiv1.SecretTypeDockerConfigJson type and hold a apiv1.DockerConfigJsonKey field with a
// complete Docker configuration. If both, "username" and "password" are
// empty, a nil LoginOption and a nil error will be returned.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering why it's not an error if the secret passed in does not contain the expected data? Or is it that you pass a number of secrets through without checking them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, at the moment my code is not checking the type of secret being passed in. So any kind of secret may be used. Whereas I am only interested in the basic auth-side of it (may be used with Opaque, basic-auth possibly other secret types). So the state when username and password == "" is NOT semantically an error. All it means is that I treat this case as if there was no secret at all. For now anyway

return false, nil
}

// ref https://github.com/distribution/distribution/blob/main/docs/spec/api.md#listing-repositories
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't realise this was now supported, but yeah, the _catalog endpoint isn't quite what we'd normally want, since it's all repositories on the registry (well, also specific to the registry implementation). That is, it doesn't allow us to ask for the repositories on a certain path, afair. Found this from a couple of years ago: #1358 (comment) )

return nil, status.Errorf(codes.Internal, "failed to create registry client: %v", err)
}
if file != "" {
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's a bit of debugging logging and commented out code here - not sure if you wanted it reviewed yet in its current state? I'm ignoring the commented out stuff, but wondering why it's here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

work in progress. No need to review commented out code, of course. These are just notes to myself. It'll be clean at the end, I pinky promise

@absoludity
Copy link
Contributor

I don't mind it so far. I love an interesting challenge, and this one fits the bill. Credit to @dlaloue-vmware he argued very nicely to look at this as an opportunity to add value to kubeapps rather than be a minus.

Excellent then :)

@gfichtenholt
Copy link
Contributor Author

I don't mind it so far. I love an interesting challenge, and this one fits the bill. Credit to @dlaloue-vmware he argued very nicely to look at this as an opportunity to add value to kubeapps rather than be a minus.

Excellent then :)

Plus I already started down this path and moved a "small sized mountain". So, the fallacy of sunk costs is playing a role
https://github.com/gfichtenholt/kubeapps/blob/7658abdcd55a7d64abacca6fad1075b6e03c4bf2/cmd/kubeapps-apis/plugins/fluxv2/packages/v1alpha1/oci_repo.go#L168 and
https://github.com/vmware-tanzu/kubeapps/pull/4932/files#diff-3d1086b8e96b9fb11e594b6d5bd478968a4c04af9cbac0c190f5e6ee01aaa20e
Thanks.

@gfichtenholt
Copy link
Contributor Author

gfichtenholt commented Jun 21, 2022

Re: I see that now the OCI Registry As a Storage (ORAS) project has a v2 experimental version with support for listing repositories - it may be worth a look
I took a look, and looks a bit "wet" to me. This is way too simplistic. I mean its a step in the right direction, but I don't see any API for server-side filtering, which I think is a MUST so
it looks like the v2 experimental stuff still just that: experimental
100% Agreed

@gfichtenholt gfichtenholt merged commit 8842964 into vmware-tanzu:main Jun 22, 2022
@gfichtenholt gfichtenholt deleted the flux-oci-support-2 branch June 22, 2022 01:53
@gfichtenholt
Copy link
Contributor Author

gfichtenholt commented Jun 22, 2022

Re: I see that now the OCI Registry As a Storage (ORAS) project has a v2 experimental version with support for listing repositories - it may be worth a look I took a look, and looks a bit "wet" to me. This is way too simplistic. I mean its a step in the right direction, but I don't see any API for server-side filtering, which I think is a MUST so it looks like the v2 experimental stuff still just that: experimental 100% Agreed

I took a second look (at the implementation this time) and I take back the comment. I think IT might be possible to re-use most of what they have. So, I am going to try to adopt it provided I can somehow shoehorn a server side repository name filter in there.

The caveat is what they have only applies to vendors that implement "Docker Registry API V2 or
OCI Distribution Specification." (e.g. GutHub). Not covered here (today) are Harbor, Azure, etc. Just pointing it out

@gfichtenholt
Copy link
Contributor Author

gfichtenholt commented Jun 23, 2022

Re: I see that now the OCI Registry As a Storage (ORAS) project has a v2 experimental version with support for listing repositories - it may be worth a look I took a look, and looks a bit "wet" to me. This is way too simplistic. I mean its a step in the right direction, but I don't see any API for server-side filtering, which I think is a MUST so it looks like the v2 experimental stuff still just that: experimental 100% Agreed

I took a second look (at the implementation this time) and I take back the comment. I think IT might be possible to re-use most of what they have. So, I am going to try to adopt it provided I can somehow shoehorn a server side repository name filter in there.

The caveat is what they have only applies to vendors that implement "Docker Registry API V2 or OCI Distribution Specification." (e.g. GutHub). Not covered here (today) are Harbor, Azure, etc. Just pointing it out

Gasp. I may have to wait a little while on this (server-side filtering of repo names):
oras-project/oras-go#196
Look like I was the first one asking...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Support for OCI registries in Flux plugin
3 participants