Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding GEP-709: Cross namespace references from Routes #711

Merged
merged 2 commits into from
Jul 15, 2021

Conversation

robscott
Copy link
Member

What type of PR is this?
/kind gep

What this PR does / why we need it:
This adds GEP #709 as the culmination of discussion around ReferencePolicy, cross namespace forwarding, and route inclusion.

Does this PR introduce a user-facing change?:

NONE

/cc @youngnick @stevesloka @hbagdi @jpeach @danehans

@robscott robscott added the kind/gep PRs related to Gateway Enhancement Proposal(GEP) label Jul 10, 2021
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 10, 2021
@k8s-ci-robot
Copy link
Contributor

@robscott: GitHub didn't allow me to request PR reviews from the following users: stevesloka.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

What type of PR is this?
/kind gep

What this PR does / why we need it:
This adds GEP #709 as the culmination of discussion around ReferencePolicy, cross namespace forwarding, and route inclusion.

Does this PR introduce a user-facing change?:

NONE

/cc @youngnick @stevesloka @hbagdi @jpeach @danehans

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 10, 2021
@robscott
Copy link
Member Author

site-src/geps/gep-709.md Outdated Show resolved Hide resolved
name: bar
namespace: bar
spec:
from:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend adding a spec.rules[], with each rule having a from and to.
That allows for defining multiple policies in a single resource.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think due to the additive nature of this API I'd rather have multiple resources than one large resource, but open to ideas. I'm concerned that a list of lists would become hard to manage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to encourage people to have more, smaller objects rather than less, larger objects as well.

I note that the from and two stanzas are slices here though, but we don't have any discussion on how they should be combined. I'll add more here on the structs themselves.

site-src/geps/gep-709.md Show resolved Hide resolved
site-src/geps/gep-709.md Outdated Show resolved Hide resolved

* Conceptually similar to NetworkPolicy.
* A separate resource enables admins to restrict who can allow cross namespace
references.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action required.
Do we need any specific commentary on our Security model that might be useful in future when we look back on this GEP? I don't think so. Food for thought for you and others.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do update anything, it's probably only to point out something like "the purpose of this is to allow the person who controls the referent object the requirement to accept that reference, explicitly, with the intent of making cross-namespace references more secure by default."

}

// ReferencePolicyTo describes trusted kinds.
type ReferencePolicyTo struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a policy per-service will likely be a bottleneck in some scenarios. For example, if we have an egress proxy, and have an external-namespace. I probably just want to say from: egress-gateway, to: any service in external-namespace. I am not sure that is re-presentable here, without 1 ReferencePolicy per service?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly would be from: anywhere

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both to and from structs don't have names. This just represents trust from resources of type a in namespace foo to resources of type b within the same namespace as the ReferencePolicy. So in this case I think the to: any service in external-namespace is quite straightforward.

On the other hand from: anywhere is intentionally not allowed. I'm not entirely convinced it's a use case we want to encourage, but if it were, I think using a namespace selector instead of a string would be the appropriate way to do it. That does feel like a potential extension point, but I'm pretty hesitant to start with it unless there are compelling use cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think one of the key parts about this proposal is that it allows access from a Group/Kind and namespace to a Group/Kind and namespace (where the to namespace is always "the namespace the object is in".) I think that if we want to make it more flexible than "named namespace", it should be a label selector across namespaces only.

As it stands, we can do "Allow from a group of things in a list of namespaces" using these structs, since from is a list.

### Benefits

* Conceptually similar to NetworkPolicy.
* A separate resource enables admins to restrict who can allow cross namespace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of when this would actually be useful? I am having a hard time thinking of a scenario where I would want to disallow a namespace from exposing themselves to other namespaces

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is conceptually similar to organizations who want to restrict the namespaces that can create Services of type LB and use quotas to achieve that. It's possible that you want users to be able to create Services without allowing them to expose them externally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case for not allowing things to expose themselves to other namespaces is any cluster where a namespace admin for one namespace is not totally trusted. In that case, allowing inbound references is a critical step in privilege escalation (another reason to only allow this interaction in one direction).

For example, I've worked before on CI clusters, where each namespace ran a step in a CI pipeline. Now, CI pipelines are essentaily remote-code-execution-as-a-service, so those workloads are very untrusted. Would someone here be able to create a Service record? Possibly, and if so, it's almost certainly good to ensure that they cannot create cross-namespace references to their namespace.

This is a little contrived, I know, but I think the principle of "relatively untrusted namespace owner" is a pretty sound usecase for this overall.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I am not sure I understand. Can you let me know where I am going wrong? This is my understanding:

Lets say we have a namespace ingress, which contains a Gateway. We have a namespace untrusted, which has a Service.

I think the statement here is we are trying to avoid letting untrusted expose their service through ingress' LoadBalancer. We are NOT trying to restrict untrusted from creating an HTTPRoute though (that is independent of this and driven by Gateway/Route selection agreement); rather, we are trying to restrict ingress from creating an HTTPRoute referencing untrusted.

So by restricting untrusted from creating a ReferencePolicy, we think we have achieved this. In reality, the ingress namespace can just point to something in its own namespace that forwards to untrusted (there are many ways to do this, for simplicity lets just say they set up a dumb TCP proxy that forwards all requests to untrusted). As a result, I would say that the use of restrictive RBAC policy on ReferencePolicy does not achieve the goal of not allowing a Service to expose itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of the ReferencePolicy is to require the owner of the untrusted namespace to accept incoming references. If they can't create ReferencePolicy objects, then they can't accept incoming references, and so say, the admin can create a more-open HTTPRoute that may reference a lot of Services, but only allow some to complete the two-way handshake.

I agree that it's a bit convoluted, but the idea here is to put the control over incoming references into the hands of the person who wons the referent. The main advantage of doing this with a separate object is that it allows us to apply this pattern to things whose spec we can't change (like Service), while still keeping that control in the hands of the object's owner. I think that the RBAC advantage of having a separate object is a minor one at best compared to that, but it may prove useful to people who are extremely careful with their RBAC policies.

I probably overemphasised this usecase previously, sorry about that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of the ReferencePolicy is to require the owner of the untrusted namespace to accept incoming references. If they can't create ReferencePolicy objects, then they can't accept incoming references, and so say, the admin can create a more-open HTTPRoute that may reference a lot of Services, but only allow some to complete the two-way handshake.

I think I understand what you are discussing now, but I don't think its a good idea. The fact its possible to do this with the current design is no problem - obviously we cannot stop users from doing obscure things, but I wouldn't want to recommend this.

Instead of the admin "handshaking" every namespace, but then restricting their ability to handshake back, it seems like they should just not "handshake" in the first place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be getting a bit side tracked here though 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think you make a fair point, that it's a pretty unlikely scenario that we already provide betters tools to deal with. But I think that having the ability to RBAC the ReferencePolicy object separately is an advantage, just not that big of one.

cross-namespace references.
* The implementation clearly documents that ReferencePolicy is not honored.

This exception is very unlikely to apply to any ingress implementations of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally agree, but this could be plausible in a single-tenant ingress. For example ingress namespace has all Gateways and Routes for the ingress. Some routes reference other namespace. There is no security risk here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is there no security risk in that scenario? Would it be possible the implementation to know that it was always going to be deployed in that way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this a bit more, I think there are some instances where ingress implementations deployed in this way could potentially deserve an exception. Fundamentally the distinction is if the implementation would be subject to some other cross-namespace restrictions such as NetworkPolicy. I'll update this section to be a bit more clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested a wording update here, I think that if we make that change, it covers this usecase already.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely reworded this section, PTAL.

Copy link
Contributor

@youngnick youngnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this API, and think it's a great evolution of the earlier work.

name: bar
namespace: bar
spec:
from:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to encourage people to have more, smaller objects rather than less, larger objects as well.

I note that the from and two stanzas are slices here though, but we don't have any discussion on how they should be combined. I'll add more here on the structs themselves.

site-src/geps/gep-709.md Show resolved Hide resolved
}

// ReferencePolicyTo describes trusted kinds.
type ReferencePolicyTo struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think one of the key parts about this proposal is that it allows access from a Group/Kind and namespace to a Group/Kind and namespace (where the to namespace is always "the namespace the object is in".) I think that if we want to make it more flexible than "named namespace", it should be a label selector across namespaces only.

As it stands, we can do "Allow from a group of things in a list of namespaces" using these structs, since from is a list.

//
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=253
Group string `json:"group"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because all three of these values are required, each from reference must fully specify a group, kind, and namespace. This is nice for implementers because it's very straightforward, but will this be flexible enough, or should we consider label selectors here?

The downside of including label selectors is that you then need two sets of fields that are both optional, so the zero values of the fields and slices of this struct really matter. This way is much simpler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been tempted to introduce selectors here but so far have avoided it due to the increased complexity they add. In v1alpha1 I think we have had a few too many selectors and it became difficult to keep track of everything. Direct references can still be confusing, but I think they're generally easier to understand. One of my primary concerns with selectors is that their default is so open. Interpreted literally, an empty selector would accept references from all namespaces.

So all of that to say that I agree with you. If we really need selectors, we can always consider adding them in the future.

### Benefits

* Conceptually similar to NetworkPolicy.
* A separate resource enables admins to restrict who can allow cross namespace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case for not allowing things to expose themselves to other namespaces is any cluster where a namespace admin for one namespace is not totally trusted. In that case, allowing inbound references is a critical step in privilege escalation (another reason to only allow this interaction in one direction).

For example, I've worked before on CI clusters, where each namespace ran a step in a CI pipeline. Now, CI pipelines are essentaily remote-code-execution-as-a-service, so those workloads are very untrusted. Would someone here be able to create a Service record? Possibly, and if so, it's almost certainly good to ensure that they cannot create cross-namespace references to their namespace.

This is a little contrived, I know, but I think the principle of "relatively untrusted namespace owner" is a pretty sound usecase for this overall.


* Conceptually similar to NetworkPolicy.
* A separate resource enables admins to restrict who can allow cross namespace
references.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do update anything, it's probably only to point out something like "the purpose of this is to allow the person who controls the referent object the requirement to accept that reference, explicitly, with the intent of making cross-namespace references more secure by default."

Comment on lines 171 to 174
ReferencePolicy. This should only be done if:
* Other mechanisms like NetworkPolicy can be used to effectively limit
cross-namespace references.
* The implementation clearly documents that ReferencePolicy is not honored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small change to make this clearer:

Suggested change
ReferencePolicy. This should only be done if:
* Other mechanisms like NetworkPolicy can be used to effectively limit
cross-namespace references.
* The implementation clearly documents that ReferencePolicy is not honored.
ReferencePolicy. This may only be done if:
* Other mechanisms like NetworkPolicy are used to effectively limit
cross-namespace references.
* The implementation must clearly document that ReferencePolicy is not honored.

I think that "may" is okay here because we are following it up with clarifying "must" clauses straight afterwards.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAY, MUST, SHOULD should be capitalized so it's obvious we are using them in LOUD RFC TALK.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a great way to do it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the CVE is live, I've reworked this entire section, including "LOUD RFC TALK" :), PTAL.

cross-namespace references.
* The implementation clearly documents that ReferencePolicy is not honored.

This exception is very unlikely to apply to any ingress implementations of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested a wording update here, I think that if we make that change, it covers this usecase already.


## TLDR

This GEP attempts to tackle both cross namespace forwarding and route inclusion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like we are actually defining route inclusion in this GEP.

We should probably say that this is: prepare the API for route inclusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW if there are references, it would be good to include to define what route inclusion means.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we only have feature request issues that can be solved with Route inclusion (#634, #695) and a couple design docs. For now I've just clarified this sentence and added a brief description of Route inclusion.

- name: bar
namespace: bar
---
kind: ReferencePolicy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to call this "InboundReferencePolicy" or something make the direction obvious

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. Any thoughts on this name change @youngnick @hbagdi @jpeach @howardjohn?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with it, although it's a little long. I had discounted it before because I don't see us adding an OutboundReferencePolicy at any point, so I didn't think the disambiguation was necessary. But I don't object if others think it makes things clearer, I'm a little deep into this design to be able to have a good idea of how the name looks at first glance.

Copy link
Contributor

@bowei bowei Jul 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ReferencePolicy is fine. (per discussion)

site-src/geps/gep-709.md Outdated Show resolved Hide resolved
site-src/geps/gep-709.md Outdated Show resolved Hide resolved
site-src/geps/gep-709.md Outdated Show resolved Hide resolved
site-src/geps/gep-709.md Outdated Show resolved Hide resolved
* Issue URL: https://github.com/kubernetes-sigs/gateway-api/issues/709
* Status: Implementable

## TLDR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we say very explicitly what fields and kinds are intended to be covered as part of "core" behavior in this proposal?

Also do we say that implementations may cover more Kinds and fields, but that is Custom behavior and there is no conformance there. (is this a good idea?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this to the proposed spec. I think non-core kinds would fall under extended conformance since the expected behavior is clear, it's just a difference of which types we expect all implementations to support.

@robscott
Copy link
Member Author

Thanks for all the great feedback on this! I think I've responded to all the feedback now, let me know if I missed anything. I just pushed a round of updates that should get this closer. It also includes a reference to the relevant CVE now: kubernetes/kubernetes#103675.


## ReferencePolicy

Anytime we allow crossing a namespace boundary, we need to be very cautious. To
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[very cautious](cite CVE)

Copy link
Contributor

@youngnick youngnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

I'll leave it up to others to make the call as to if InboundReferencePolicy or ReferencePolicy is clearer. I don't have a strong enough opinion to have it affect my call. :)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2021
@k8s-ci-robot k8s-ci-robot merged commit 8791c4f into kubernetes-sigs:master Jul 15, 2021
@youngnick
Copy link
Contributor

uh, I didn't think my lgtm would be enough to merge this, oops. Sorry.

@bowei
Copy link
Contributor

bowei commented Jul 15, 2021

doh - rob you forgot to hold it

@robscott
Copy link
Member Author

robscott commented Jul 15, 2021

No worries, that's on me, meant to add a hold on these, will open another one with next round of changes.

@youngnick
Copy link
Contributor

I think we were just down to the name of the object anyway, pretty much.

@robscott
Copy link
Member Author

Yeah agreed, think this one was pretty close

@robscott robscott deleted the referencepolicy-gep branch January 8, 2022 01:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/gep PRs related to Gateway Enhancement Proposal(GEP) lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants