
Add pod update checking to WaitForControlledPodsRunning #1035

Merged (1 commit) on Feb 11, 2020

@mborsz mborsz commented Feb 7, 2020

This implements experimental support for waiting for pod updates in WaitForControlledPodsRunning.

- name: Starting measurement for waiting for pods
  measurements:
  - Identifier: WaitForRunningPods
    Method: WaitForControlledPodsRunning
    Params:
      action: start
      apiVersion: apps/v1
      kind: Deployment
      labelSelector: l1 = v1
      checkIfPodsAreUpdated: true
      operationTimeout: 1h
- name: Create deployments with envVar set to old
  phases:
  - namespaceRange:
      min: 1
      max: 1
    replicasPerNamespace: 1
    tuningSet: 100qps
    objectBundle:
    - basename: foo
      objectTemplatePath: deployment.yaml
      templateFillMap:
        envToSet: old
- name: Waiting for objects become created
  measurements:
  - Identifier: WaitForRunningPods
    Method: WaitForControlledPodsRunning
    Params:
      action: gather

The new logic is enabled by setting the checkIfPodsAreUpdated: true flag (name TBD).
At this point, nothing special happens.

Let's trigger an update:

- name: Update Foos
  phases:
  - namespaceRange:
      min: 1
      max: 1
    replicasPerNamespace: 1
    tuningSet: 100qps
    objectBundle:
    - basename: foo
      objectTemplatePath: deployment.yaml
      templateFillMap:
        envToSet: new
- name: Waiting for objects become updated
  measurements:
  - Identifier: WaitForRunningPods
    Method: WaitForControlledPodsRunning
    Params:
      action: gather

The gather call will block here until `replicas` pods exist in the given deployment that match the deployment's spec.template.spec.
Without this change, we would only wait for any `replicas` pods to exist in the given deployment (regardless of whether they are updated or not), which is not correct.
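
To make the difference concrete, here is a minimal Go sketch of the check described above. It is not the code from this PR: `PodSpec`, `Pod`, `countUpdated`, and `done` are hypothetical, simplified stand-ins for the Kubernetes API types, and using `reflect.DeepEqual` to decide whether a pod "matches spec.template.spec" is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// PodSpec is a hypothetical, heavily simplified stand-in for a pod spec.
type PodSpec struct {
	Image string
	Env   map[string]string
}

// Pod is a hypothetical stand-in for a pod owned by the deployment.
type Pod struct {
	Name    string
	Running bool
	Spec    PodSpec
}

// countUpdated returns how many running pods have a spec equal to the template.
func countUpdated(pods []Pod, template PodSpec) int {
	updated := 0
	for _, p := range pods {
		if p.Running && reflect.DeepEqual(p.Spec, template) {
			updated++
		}
	}
	return updated
}

// done reports whether the wait could finish (assumed condition, not the PR's).
// Without the flag, any `replicas` running pods are enough; with the flag,
// at least `replicas` running pods must also match the template.
func done(pods []Pod, template PodSpec, replicas int, checkIfPodsAreUpdated bool) bool {
	running := 0
	for _, p := range pods {
		if p.Running {
			running++
		}
	}
	if running < replicas {
		return false
	}
	if !checkIfPodsAreUpdated {
		return true
	}
	return countUpdated(pods, template) >= replicas
}

func main() {
	oldSpec := PodSpec{Image: "k8s.gcr.io/pause:3.1", Env: map[string]string{"XX": "old"}}
	newSpec := PodSpec{Image: "k8s.gcr.io/pause:3.1", Env: map[string]string{"XX": "new"}}
	// The deployment template was just updated, but the pods still run the old spec.
	pods := []Pod{{"foo-a", true, oldSpec}, {"foo-b", true, oldSpec}, {"foo-c", true, oldSpec}}

	fmt.Println("without update check:", done(pods, newSpec, 3, false))
	fmt.Println("with update check:", done(pods, newSpec, 3, true))
}
```

In this sketch the first call reports completion right after the template change even though no pod has been replaced yet, which is the incorrect behavior the flag is meant to avoid.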

/assign @wojtekt-t
/assign @mm4tt

@k8s-ci-robot

@mborsz: GitHub didn't allow me to assign the following users: wojtekt-t.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide


@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 7, 2020
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 7, 2020
mborsz commented Feb 7, 2020

/hold
I still need to test this on a real clusterloader config.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 7, 2020
mborsz commented Feb 7, 2020

/assign @wojtek-t

@@ -108,6 +110,12 @@ func (w *waitForControlledPodsRunningMeasurement) Execute(config *measurement.Me
	if err != nil {
		return nil, err
	}
	// default value is set to true to let presubmit validate this change.
	// TODO(mborsz): Change default to false before submitting.
Note to myself - ensure it's removed before lgtm-ing.

We should also add a TODO to make the default true eventually.
Maybe we shouldn't do that on a Friday, but once we have enough confidence we should enable it by default, as it's the right thing to do.
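
As a rough illustration of the default being discussed, the sketch below reads an optional boolean parameter and falls back to a default when it is absent. The helper name `getBoolOrDefault` and the `map[string]interface{}` parameter shape are assumptions for this example and are not taken from the PR's diff.

```go
package main

import "fmt"

// getBoolOrDefault is a hypothetical helper: it returns params[key] as a bool,
// or def when the key is absent. A value of the wrong type is an error.
func getBoolOrDefault(params map[string]interface{}, key string, def bool) (bool, error) {
	raw, ok := params[key]
	if !ok {
		return def, nil
	}
	b, ok := raw.(bool)
	if !ok {
		return false, fmt.Errorf("param %q: expected bool, got %T", key, raw)
	}
	return b, nil
}

func main() {
	// Params as they might look for a step that does not set the flag.
	params := map[string]interface{}{"action": "start", "kind": "Deployment"}

	// With a default of false, existing configs keep their old behavior.
	check, err := getBoolOrDefault(params, "checkIfPodsAreUpdated", false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("checkIfPodsAreUpdated =", check)
}
```

Flipping the third argument to true later would enable the check for every config that does not set the flag explicitly, which is why the change of default is deferred until there is enough confidence.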

clusterloader2/pkg/measurement/util/wait_for_pods.go (outdated, resolved)
clusterloader2/pkg/measurement/util/pods.go (outdated, resolved)
mborsz commented Feb 7, 2020

I finished manual tests of this. It seems to work.

For the record, the YAMLs I used to test this:

# config.yaml
name: test
automanagedNamespaces: 1
tuningSets:
- name: 100qps
  GlobalQPSLoad:
    qps: 100
    burst: 10
- name: RandomizedUpdateTimeLimited
  RandomizedTimeLimitedLoad:
    timeLimit: 2h
steps:
- name: Starting measurement for waiting for pods
  measurements:
  - Identifier: WaitForRunningMicroservices
    Method: WaitForControlledPodsRunning
    Params:
      action: start
      apiVersion: apps/v1
      kind: Deployment
      labelSelector: l1 = v1
      operationTimeout: 1h
- name: Create Foos
  phases:
  - namespaceRange:
      min: 1
      max: 1
    replicasPerNamespace: 1
    tuningSet: 100qps
    objectBundle:
    - basename: foo
      objectTemplatePath: foo.yaml
      templateFillMap:
        env: old
- name: Waiting for objects become updated
  measurements:
  - Identifier: WaitForRunningMicroservices
    Method: WaitForControlledPodsRunning
    Params:
      action: gather
- name: Update Foos
  phases:
  - namespaceRange:
      min: 1
      max: 1
    replicasPerNamespace: 1
    tuningSet: 100qps
    objectBundle:
    - basename: foo
      objectTemplatePath: foo.yaml
      templateFillMap:
        env: new
- name: Waiting for objects become updated
  measurements:
  - Identifier: WaitForRunningMicroservices
    Method: WaitForControlledPodsRunning
    Params:
      action: gather

# foo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-foo
  labels:
    l1: v1
spec:
  selector:
    matchLabels:
      l1: v1
  replicas: 3
  template:
      metadata:
          labels:
              l1: v1
      spec:
          containers:
          - image: k8s.gcr.io/pause:3.1
            name: test
            env:
            - name: XX
              value: {{.env}}
            resources:
                limits:
                    cpu: 10m

The most interesting log entries:

I0207 14:24:00.529875   33550 wait_for_controlled_pods.go:179] WaitForControlledPodsRunning: waiting for controlled pods measurement...
I0207 14:24:05.679880   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (0 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:10.680236   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (0 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:15.680592   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (0 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:20.680859   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (0 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
E0207 14:24:25.681181   33550 wait_for_pods.go:89] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): 0 pods appeared:
I0207 14:24:25.681238   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (1 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 1 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:30.682538   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (1 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:35.682899   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (1 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:40.683325   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (1 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
E0207 14:24:45.683725   33550 wait_for_pods.go:89] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): 1 pods appeared: foo-0-69ccd484bc-cshzb
I0207 14:24:45.683782   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (2 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:50.684177   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 4 out of 3 created, 3 running (2 updated), 1 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
I0207 14:24:55.684578   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 3 out of 3 created, 3 running (3 updated), 0 pending scheduled, 0 not scheduled, 0 inactive, 1 terminating, 0 unknown, 0 runningButNotReady
I0207 14:25:00.684972   33550 wait_for_pods.go:92] WaitForControlledPodsRunning: namespace(test-94gxoi-1), labelSelector(l1=v1): Pods: 3 out of 3 created, 3 running (3 updated), 0 pending scheduled, 0 not scheduled, 0 inactive, 0 terminating, 0 unknown, 0 runningButNotReady
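
For readers following the counters in these entries, the sketch below models one status line as a Go struct. The `podsStatus` type, its field names, and the `rolloutComplete` condition are hypothetical and only mirror the wording of the log output above; they are not taken from wait_for_pods.go.

```go
package main

import "fmt"

// podsStatus is a hypothetical summary of the counters shown in one log line.
type podsStatus struct {
	Created, Desired      int
	Running, Updated      int
	PendingScheduled      int
	NotScheduled          int
	Inactive, Terminating int
	Unknown               int
	RunningButNotReady    int
}

// String renders the counters in the same shape as the log entries above.
func (s podsStatus) String() string {
	return fmt.Sprintf(
		"Pods: %d out of %d created, %d running (%d updated), %d pending scheduled, "+
			"%d not scheduled, %d inactive, %d terminating, %d unknown, %d runningButNotReady",
		s.Created, s.Desired, s.Running, s.Updated, s.PendingScheduled,
		s.NotScheduled, s.Inactive, s.Terminating, s.Unknown, s.RunningButNotReady)
}

// rolloutComplete is an assumed condition: every desired pod is running
// and every one of them already matches the new template.
func (s podsStatus) rolloutComplete() bool {
	return s.Running == s.Desired && s.Updated == s.Desired
}

func main() {
	// Mid-rollout: one surge pod pending, only one of three pods updated.
	mid := podsStatus{Created: 4, Desired: 3, Running: 3, Updated: 1, PendingScheduled: 1}
	// End of rollout: all three pods running and updated.
	end := podsStatus{Created: 3, Desired: 3, Running: 3, Updated: 3}

	for _, s := range []podsStatus{mid, end} {
		fmt.Println(s, "| complete:", s.rolloutComplete())
	}
}
```

Read this way, the log shows the updated count climbing from 0 to 3 while the running count stays at 3, so a check based on the running count alone would have passed from the first entry.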

mborsz added a commit to mborsz/perf-tests that referenced this pull request Feb 7, 2020
This is unnecessary (the daemonset controller adds this toleration anyway) and breaks kubernetes#1035
mborsz commented Feb 7, 2020

/hold cancel

I tested this and changed the default back to false.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 7, 2020
wojtek-t commented Feb 7, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 7, 2020
wojtek-t commented Feb 7, 2020

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 7, 2020
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mborsz, wojtek-t


mborsz commented Feb 10, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 10, 2020
mborsz commented Feb 11, 2020

/retest
