
lb: endpointslice controller for externalTrafficPolicyLocal #291

Open · wants to merge 1 commit into base: main

Conversation

BartVerc

What this PR does / why we need it:
ExternalTrafficPolicy Local requires traffic to be routed only to nodes that host an endpoint, instead of to all nodes in the cluster. The EndpointSlice controller watches the tenant cluster and updates the EndpointSlices in the infra cluster accordingly.

Which issue(s) this PR fixes:
Fixes #41

Release note:

Added the enableEPSController load balancer config flag to support ExternalTrafficPolicy Local.

@kubevirt-bot kubevirt-bot added the dco-signoff: yes Indicates the PR's author has DCO signed all their commits. label Mar 12, 2024
@kubevirt-bot
Contributor

Hi @BartVerc. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo, meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign qinqon for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@qinqon
Contributor

qinqon commented Mar 12, 2024

@BartVerc can you split the PR into a pair of commits, vendoring and the rest?

Member

@davidvossel davidvossel left a comment


This is a really neat PR. I have some concerns, though, about how consistently this behavior of correlating the EndpointSlices in the tenant with the VM pod in the infra cluster will work.

This PR is using the tenant endpoint's NodeName to map an endpoint to a VM (excerpt below).

for _, slice := range tenantSlices {
	for _, endpoint := range slice.Endpoints {
		// find all unique nodes that correspond to an endpoint in a tenant slice
		nodeSet.Insert(*endpoint.NodeName)
	}
}

The issue here is that the NodeName is an optional field. Is it possible to encounter an endpointslice without a NodeName? For example, a selectorless service with custom endpointslices?
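For illustration only, a minimal guard (a hedged sketch, not code from this PR) could skip endpoints whose NodeName is unset rather than dereferencing a nil pointer:

for _, slice := range tenantSlices {
	for _, endpoint := range slice.Endpoints {
		// NodeName is optional; skip endpoints that do not carry one
		// instead of dereferencing a nil pointer.
		if endpoint.NodeName == nil || *endpoint.NodeName == "" {
			continue
		}
		nodeSet.Insert(*endpoint.NodeName)
	}
}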

Also, we recently added a new feature to cloud-provider-kubevirt that allows the Service that cloud-provider-kubevirt creates in the underlying infra cluster to be "selectorless". This was added as an escape-hatch mechanism that lets people implement their own controllers to manage the mirrored infra services, similar to what you've done in this PR. The reason I point that out is that, if we find we can't reliably support the logic you're proposing here for some edge cases, you could integrate your controller with cloud-provider-kubevirt and use the selectorless config option. That would let you manage the EndpointSlices however you see fit and still use cloud-provider-kubevirt.

@BartVerc
Author

The issue here is that the NodeName is an optional field. Is it possible to encounter an endpointslice without a NodeName? For example, a selectorless service with custom endpointslices?

That is a valid concern; however, a selectorless LoadBalancer Service could point to any endpoint, including endpoints outside the cluster. If the endpoint is inside the cluster, it is in my opinion up to the user to set the NodeName.
For endpoints created via a Service with a selector, the Kubernetes endpointslice controller derives the NodeName from pod.Spec.NodeName, which is set once the pod is running.
Also, when a tenant wants to create a selectorless LoadBalancer Service, it can always decide to set the ExternalTrafficPolicy to Cluster instead of Local.

So I don't think an endpointslice without a NodeName is a concern.

@kubevirt-bot kubevirt-bot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Mar 26, 2024
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Mar 26, 2024
@davidvossel
Member

That is a valid concern; however, a selectorless LoadBalancer Service could point to any endpoint, including endpoints outside the cluster. If the endpoint is inside the cluster, it is in my opinion up to the user to set the NodeName.
For endpoints created via a Service with a selector, the Kubernetes endpointslice controller derives the NodeName from pod.Spec.NodeName, which is set once the pod is running.
Also, when a tenant wants to create a selectorless LoadBalancer Service, it can always decide to set the ExternalTrafficPolicy to Cluster instead of Local.

Yeah, I think I get what you're saying.

NodeName is optional, but not setting NodeName on an endpointslice when externalTrafficPolicy is set to Local wouldn't make a lot of sense, so it doesn't seem to impact your logic in practice.

@qinqon
Contributor

qinqon commented Apr 4, 2024

Hey @BartVerc, you could reduce the code a lot by using the controller-runtime library for this, since we are not really using anything specific from the cloud-provider infra. What do you think?
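For reference, a minimal controller-runtime skeleton (a hedged sketch with hypothetical names; the real controller would also need separate clients for the tenant and infra clusters) could look like this:

package main

import (
	"context"

	discoveryv1 "k8s.io/api/discovery/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// EPSReconciler is a hypothetical reconciler for tenant EndpointSlices.
type EPSReconciler struct {
	client.Client
}

func (r *EPSReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var eps discoveryv1.EndpointSlice
	if err := r.Get(ctx, req.NamespacedName, &eps); err != nil {
		// The slice may already be gone; ignore not-found errors.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// Mirroring the slice into the infra cluster would happen here.
	return ctrl.Result{}, nil
}

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
	if err != nil {
		panic(err)
	}
	if err := ctrl.NewControllerManagedBy(mgr).
		For(&discoveryv1.EndpointSlice{}).
		Complete(&EPSReconciler{Client: mgr.GetClient()}); err != nil {
		panic(err)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}

With this shape, the work queue, informer wiring, and rate limiting from the PR would be handled by the manager, leaving only the Reconcile body as custom code.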

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 5, 2024
Member

@davidvossel davidvossel left a comment


Thanks for splitting the PR into multiple commits. That made it way easier to review.

I did a quick pass to try and grok the general idea in more detail. My primary concern from looking at the new controller is that I think it's possible for us to have stale EPS in the infra cluster that do not map to EPS in the tenant.

We have no guarantees that the delete callback will get called when we're watching tenant EPS. If we miss a delete (which definitely happens), then it's possible that we'll have an EPS on the infra that sticks around longer than it should.

any thoughts on how that might be improved?

Another high-level comment: before we consider merging something like this, I'd like to see a basic e2e test case that exercises this controller, something similar to what we have in test/e2e/cloud_provider_kubevirt/load_balancer_test.go, but exercising the local traffic policy.
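For illustration, one way to address the stale-slice concern above is a periodic resync that lists the slices mirrored into the infra cluster and prunes any whose tenant counterpart no longer exists. A hedged sketch, assuming a hypothetical ownership label and hypothetical annotations recording the tenant namespace and name of each mirrored slice:

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// pruneStaleSlices deletes mirrored EndpointSlices in the infra cluster whose
// tenant counterpart has disappeared, so missed delete events are eventually
// cleaned up.
func pruneStaleSlices(ctx context.Context, infra, tenant kubernetes.Interface, infraNS string) error {
	mirrored, err := infra.DiscoveryV1().EndpointSlices(infraNS).List(ctx, metav1.ListOptions{
		// hypothetical label set by the controller on every slice it creates
		LabelSelector: "cloud-provider-kubevirt/managed-by=eps-controller",
	})
	if err != nil {
		return err
	}
	for _, slice := range mirrored.Items {
		// hypothetical annotations recording which tenant slice this mirrors
		tenantNS := slice.Annotations["tenant-namespace"]
		tenantName := slice.Annotations["tenant-name"]
		_, err := tenant.DiscoveryV1().EndpointSlices(tenantNS).Get(ctx, tenantName, metav1.GetOptions{})
		if apierrors.IsNotFound(err) {
			// The tenant slice is gone; remove the stale mirror.
			if err := infra.DiscoveryV1().EndpointSlices(infraNS).Delete(ctx, slice.Name, metav1.DeleteOptions{}); err != nil {
				return err
			}
			continue
		}
		if err != nil {
			return err
		}
	}
	return nil
}

Run on a timer, or as part of every Service reconcile, this bounds how long a stale slice can survive a missed delete event.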

Comment on lines +176 to +178
if c.tenantEPSTracker.contains(eps) {
c.tenantEPSTracker.remove(eps)
klog.Infof("get Infra Service for Tenant EndpointSlice: %v/%v", eps.Namespace, eps.Name)
Member


The informer callbacks shouldn't be used to track objects like this. The issue here is that we're not guaranteed that the DeleteFunc callback will be executed (the informer list/watch is not guaranteed to see the delete).

This would result in a memory leak.

Comment on lines +92 to +96
for _, n := range t.register {
if n == name {
return true
}
}
Member


Since it looks like you're primarily using the t.register slice as a way of looking up whether an endpoint slice exists, a map would be a more efficient data structure. I bet looping through all these EPS could get kind of crazy in a cluster with a lot of endpoints.
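For illustration, a hedged sketch of that change (hypothetical field and method shapes, modelled on the tracker in the PR), using a map as a set:

import "sync"

// tenantEPSTracker keeps tenant EndpointSlice names in a map used as a set,
// so membership checks are O(1) instead of a linear scan over a slice.
type tenantEPSTracker struct {
	mu       sync.Mutex
	register map[string]struct{}
}

func (t *tenantEPSTracker) add(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.register == nil {
		t.register = make(map[string]struct{})
	}
	t.register[name] = struct{}{}
}

func (t *tenantEPSTracker) contains(name string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	_, ok := t.register[name]
	return ok
}

func (t *tenantEPSTracker) remove(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.register, name)
}

k8s.io/apimachinery/pkg/util/sets offers the same behaviour out of the box, if pulling in that helper is acceptable.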

Comment on lines +150 to +159
if c.tenantEPSTracker.contains(eps) {
klog.Infof("get Infra Service for Tenant EndpointSlice: %v/%v", eps.Namespace, eps.Name)
infraSvc, err := c.getInfraServiceFromTenantEPS(context.TODO(), eps)
if err != nil {
klog.Errorf("Failed to get Service in Infra cluster for EndpointSlice %s/%s: %v", eps.Namespace, eps.Name, err)
return
}
klog.Infof("EndpointSlice added: %v/%v", eps.Namespace, eps.Name)
c.queue.Add(newRequest(AddReq, infraSvc, nil))
}
Member


I'm not sure how accurately changes to tenant EndpointSlices are being tracked here.

The c.tenantEPSTracker object will only contain EPS from the last time the Service was reconciled. If new tenant EPS are created, this will not trigger an update on the infra side, because the new EPS won't exist in c.tenantEPSTracker yet.
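For illustration, one way around a tracker populated on a previous reconcile is to derive the owning Service directly from the slice's standard kubernetes.io/service-name label on every event. A hedged sketch with a hypothetical handler name, reusing the PR's getInfraServiceFromTenantEPS helper and work queue (assumes the usual context, klog, and k8s.io/api/discovery/v1 imports):

// enqueueForTenantEPS maps any tenant EndpointSlice event to its owning
// Service via the standard kubernetes.io/service-name label, so newly
// created slices are picked up without a separately maintained tracker.
func (c *Controller) enqueueForTenantEPS(eps *discoveryv1.EndpointSlice) {
	svcName, ok := eps.Labels[discoveryv1.LabelServiceName]
	if !ok {
		// Not managed by a Service; nothing to mirror.
		return
	}
	infraSvc, err := c.getInfraServiceFromTenantEPS(context.TODO(), eps)
	if err != nil {
		klog.Errorf("Failed to get infra Service for tenant Service %s/%s: %v", eps.Namespace, svcName, err)
		return
	}
	c.queue.Add(newRequest(AddReq, infraSvc, nil))
}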

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 9, 2024
@kubevirt-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@kubevirt-bot kubevirt-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 8, 2024
Signed-off-by: Bart Vercoulen <bartv@kumina.nl>
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 5, 2024
@BartVerc
Author

BartVerc commented Sep 6, 2024

Quick update: I found some time to work on this again, but I am having some trouble rebasing it on the main branch.
I started a new branch from main, which seems to work better. I'm not sure whether this will end up correctly in this PR, so I may need to open a new one.

@kvaps
Member

kvaps commented Sep 26, 2024

Hi @BartVerc, I refactored your PR and prepared #330 and #331.

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 7, 2024
@kubevirt-bot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Labels: dco-signoff: yes · lifecycle/rotten · needs-rebase · size/XXL
Successfully merging this pull request may close these issues: Support Local ExternalTrafficPolicy (#41)
5 participants