Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Priority expander relies now on listers instead of direct watch api #1843

Conversation

piontec
Copy link
Contributor

@piontec piontec commented Mar 29, 2019

This is a rework of this PR: #1801
The changes are as follows:

  • it uses lister now
  • unittests rely on already existing tooling
  • app is never crashed, just error is logged

Please review @MaciekPytel

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 29, 2019
@piontec
Copy link
Contributor Author

piontec commented Mar 29, 2019

/assign @MaciekPytel


func getPriorityExpandersConfigMapLister(kubeClient kube_client.Interface, stopChannel <-chan struct{},
namespace string) v1lister.ConfigMapLister {
// FIXME: how to be sure the reflector completed at least one run and switch to the line below?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaciekPytel I have a problem here: how to make sure the reflector completed at least 1 sync loop before I access the lister? I want to validate the configmap before starting.
If this can't be done easily, I probably have to give up on startup time validation and assume reflector will be ready when first call for Best comes. But even that smells like race condition.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CA loop generally takes a few seconds before it gets to expander and other listers must have already synced for a scale-up to have any chance of happening. Since you have a fallback strategy and the race is quite unlikely I'm ok to merge this without logic to explicitly make sure lister has initialized.
If you think you really need to make sure: I think the LastSyncResourceVersion() you do below is the best way unfortunately (at least the best way I know of). I don't like creating lister in here though. This file is really not the place for all those client-go internals. If you really want to sync, I'd rather you created your own configmap lister interface in utils/kubernetes/listers.go (like we do for pod listers) and expose a HasSynced() method.

return nil, errors.ToAutoscalerError(errors.InternalError, err)
}
return priority.NewStrategy(initialPriorities, priorityChangesChan, autoscalingKubeClients.LogRecorder)
// TODO: how to get proper termination info? It seems other listers do the same here
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other listers seems to be doing the same - please make sure it's ok to leave the termination channel like that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine. Currently you can't change expander at runtime and you can't stop autoscaler main loop without killing the process, so we don't have any need to ever stop listers.

@piontec
Copy link
Contributor Author

piontec commented Mar 29, 2019

And one important question here: is there any comprehensive doc arund here about how to use the client-go lib properly? Either I can't find it or it's just a bunch of blog entries of kind "do this and that". The API is pretty big, I was running around different packages like crazy trying to figure out what's what and how to use it.

@piontec piontec force-pushed the feature/priority_expander_remake_squashed branch from 142e057 to 1516d42 Compare March 29, 2019 12:37
@MaciekPytel
Copy link
Contributor

is there any comprehensive doc arund here about how to use the client-go lib properly?

I'm not aware of any such doc. I'm not an expert on client-go though, it may be worth asking directly on that repo or in kubernetes slack sig-apimachinery channel. As far as I know Kubernetes documentation is mostly limited to how to use it. Documentation for developing is minimal and the entry point for contributing is quite high.

return nil, errors.ToAutoscalerError(errors.InternalError, err)
}
return priority.NewStrategy(initialPriorities, priorityChangesChan, autoscalingKubeClients.LogRecorder)
// TODO: how to get proper termination info? It seems other listers do the same here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine. Currently you can't change expander at runtime and you can't stop autoscaler main loop without killing the process, so we don't have any need to ever stop listers.


func getPriorityExpandersConfigMapLister(kubeClient kube_client.Interface, stopChannel <-chan struct{},
namespace string) v1lister.ConfigMapLister {
// FIXME: how to be sure the reflector completed at least one run and switch to the line below?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CA loop generally takes a few seconds before it gets to expander and other listers must have already synced for a scale-up to have any chance of happening. Since you have a fallback strategy and the race is quite unlikely I'm ok to merge this without logic to explicitly make sure lister has initialized.
If you think you really need to make sure: I think the LastSyncResourceVersion() you do below is the best way unfortunately (at least the best way I know of). I don't like creating lister in here though. This file is really not the place for all those client-go internals. If you really want to sync, I'd rather you created your own configmap lister interface in utils/kubernetes/listers.go (like we do for pod listers) and expose a HasSynced() method.

@piontec
Copy link
Contributor Author

piontec commented Apr 4, 2019

OK, I didn't want to leave the code there, just make sure first how to solve this. In this case I'm in favor of not running configmap validation at startup and leave that until the first BestOption() call happens - then any potential problems will go to logs and CM's events. Let me know if you're OK with that and I can quickly change it.

@piontec
Copy link
Contributor Author

piontec commented Apr 8, 2019

ping @MaciekPytel

@MaciekPytel
Copy link
Contributor

Sorry for delay. I'm ok with what you proposed in #1843 (comment).

@piontec piontec force-pushed the feature/priority_expander_remake_squashed branch from 1516d42 to 17e4c89 Compare April 12, 2019 07:29
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 12, 2019
@piontec
Copy link
Contributor Author

piontec commented Apr 12, 2019

@MaciekPytel Done, on-init config check is removed. Please review again.

Copy link
Contributor

@MaciekPytel MaciekPytel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment, otherwise lgtm (though the nature of comment suggests you haven't actually run the code, if that's true please test before merging).

// NewConfigMapListerForNamespace builds a configmap lister for the passed namespace (including all).
func NewConfigMapListerForNamespace(kubeClient client.Interface, stopchannel <-chan struct{},
namespace string) v1lister.ConfigMapLister {
listWatcher := cache.NewListWatchFromClient(kubeClient.AppsV1().RESTClient(), "configmaps", namespace, fields.Everything())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configmap is in core/v1, not apps/v1. Does that work for you? Seems like it shouldn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my bad, I ran tests actually before removing the test code here. I was sure it was exactly the same as in Listers - it wasn't. Fixing, and I'll re-run tests once more. Thanks for catching!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaciekPytel please check now

@piontec piontec force-pushed the feature/priority_expander_remake_squashed branch from 17e4c89 to 119dc75 Compare April 15, 2019 09:20
@piontec piontec force-pushed the feature/priority_expander_remake_squashed branch from 119dc75 to 856e58d Compare April 16, 2019 11:48
@MaciekPytel
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 16, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MaciekPytel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants