-
Notifications
You must be signed in to change notification settings - Fork 39.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding restart kubelet flag on e2e test #97028
Conversation
/test pull-kubernetes-e2e-gce-ubuntu-containerd |
@@ -221,6 +221,8 @@ type NodeTestContextType struct { | |||
// the node e2e test. If empty, the default one (system.DefaultSpec) is | |||
// used. The system specs are in test/e2e_node/system/specs/. | |||
SystemSpecName string | |||
// RestartKubelet restarts Kubelet unit when the process is killed. | |||
RestartKubelet bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we should make the framework test context aware of such an option.
in fact a lot of options should be trimmed out of it if we eventually want to treat it as an API.
/cc @oomichi @alejandrox1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@neolit123 As you stated, this still can be applied to other options, only used on Node E2E scope, inside the NodeTestContextType
and even TestContextType
directly. I don't see any other immediate place (used in the Node E2E suite) to set up and share the variable without reading the flag on init(), any suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given a number of flags already use the framework context for storage, this one can go there as well.
kubernetes/test/e2e_node/e2e_node_suite_test.go
Lines 81 to 87 in 00da68d
flags.BoolVar(&framework.TestContext.NodeConformance, "conformance", false, "If true, the test suite will not start kubelet, and fetch system log (kernel, docker, kubelet log etc.) to the report directory.") | |
flags.BoolVar(&framework.TestContext.PrepullImages, "prepull-images", true, "If true, prepull images so image pull failures do not cause test failures.") | |
flags.BoolVar(&framework.TestContext.RestartKubelet, "restart-kubelet", true, "If true, restart Kubelet unit when the process is killed.") | |
flags.StringVar(&framework.TestContext.ImageDescription, "image-description", "", "The description of the image which the test will be running on.") | |
flags.StringVar(&framework.TestContext.SystemSpecName, "system-spec-name", "", "The name of the system spec (e.g., gke) that's used in the node e2e test. The system specs are in test/e2e_node/system/specs/. This is used by the test framework to determine which tests to run for validating the system requirements.") | |
flags.Var(cliflag.NewMapStringString(&framework.TestContext.ExtraEnvs), "extra-envs", "The extra environment variables needed for node e2e tests. Format: a list of key=value pairs, e.g., env1=val1,env2=val2") | |
flags.StringVar(&framework.TestContext.SriovdpConfigMapFile, "sriovdp-configmap-file", "", "The name of the SRIOV device plugin Config Map to load.") |
but long term this should be decoupled.
ideally the PR should pass review by SIG Node / e2e_node owners.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the ln#49-76,
TestContextType
contains test settings and global state. It is due to historical reasons that it is a mixture of items managed by the test framework itself, cloud providers and individual tests, the end goal is to move anything not required by the framework into the code which uses the settings.
The recommendation for those settings is:
- They are stored in their own context structure or local variables. (In
test/e2e_node/e2e_node_suite_test.go
as suggested) - The standard
flag
package is used to register them. - The flag name should follow the pattern
<part1>.<part2>....<partn>
where the prefix is unlikely to conflict with other tests or standard packages and each part is in lower camel case. (eg:node.kubelet.restartOnExit
?)
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergeyKanzhelev this is an open discussion, do reusing the TestContext for node tests a topic to be reviewed? Should we start on this PR?
I think @alejandrox1 started a document for this refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the suggested refactoring, but I agree it is outside the scope of this PR. I think @neolit123 agreed with this. @ike-ma any concerns with making refactoring separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergeyKanzhelev this is an open discussion, do reusing the TestContext for node tests to be reviewed, starting on this PR?
I think @alejandrox1 started a document for this refactoring.
@alejandrox1 @knabben Do we have an issue for the refactoring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -80,6 +80,7 @@ func registerNodeFlags(flags *flag.FlagSet) { | |||
// It is hard and unnecessary to deal with the complexity inside the test suite. | |||
flags.BoolVar(&framework.TestContext.NodeConformance, "conformance", false, "If true, the test suite will not start kubelet, and fetch system log (kernel, docker, kubelet log etc.) to the report directory.") | |||
flags.BoolVar(&framework.TestContext.PrepullImages, "prepull-images", true, "If true, prepull images so image pull failures do not cause test failures.") | |||
flags.BoolVar(&framework.TestContext.RestartKubelet, "restart-kubelet", true, "If true, restart Kubelet unit when the process is killed.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems fine as long as sig node are OK with the change.
@kubernetes/sig-node-pr-reviews
Adding
--restart-kubelet
flag on E2E Node test suite
please explain what the flag does in the release note too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the flag was already approved here:
#96775 (review)
@@ -302,6 +302,7 @@ func (e *E2EServices) startKubelet() (*server, error) { | |||
} | |||
|
|||
cmd := exec.Command(cmdArgs[0], cmdArgs[1:]...) | |||
restartOnExit := framework.TestContext.RestartKubelet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can another variable be used to store the flag state instead of framework.TestContext.RestartKubelet ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading the flag from the file init is one option, I'm not sure if is a pattern we want to propagate here:
func init() {
flag.Var(&kubeletArgs, "kubelet-flags", "Kubelet flags passed to kubelet, this will override default kubelet flags in the test. Flags specified in multiple kubelet-flags will be concatenate.")
flag.BoolVar(&genKubeletConfigFile, "generate-kubelet-config-file", true, "The test runner will generate a Kubelet config file containing test defaults instead of passing default flags to the Kubelet.")
flag.BoolVar(&restartKubelet, "restart-kubelet", true, "If true, restart Kubelet unit when the process is killed.")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be the recommended way to do init()
here and use a local variable for the restart-kubelet
flag based on test/e2e/framework/test_context.go
ln#49-76?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is to decouple specific node flags from the main e2e framework context. But adding these flags on the init() here without further context it's only augmenting the mess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@neolit123 we opened an issue to move this TestContext.RestartKubelet to a proper place, for this specific PR, we are keeping in the global one, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
/retest |
/reopen |
/assign @ike-ma |
@SergeyKanzhelev: GitHub didn't allow me to assign the following users: ike-ma. Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Friendly ping? :) |
Ack. Review in progress. |
/test pull-kubernetes-node-e2e-containerd What's the current status of this PR? |
/assign @cynepco3hahue to help @ike-ma to review |
I currently do not see a lot of issues adding the flag under the |
/lgtm |
Sorry for my late response. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: knabben, oomichi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test pull-kubernetes-e2e-gce-ubuntu-containerd |
@knabben: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
This flag was added to the test suite in kubernetes/kubernetes#97028 However as it turns out, in order to test the exit on lock contention, it is not needed, hence removing the flag from the test job. Signed-off-by: Imran Pochi <imran@kinvolk.io>
This flag was added to the test suite in kubernetes/kubernetes#97028 However as it turns out, in order to test the exit on lock contention, it is not needed, hence removing the flag from the test job. Signed-off-by: Imran Pochi <imran@kinvolk.io>
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adding a
--restart-kubelet
flag on Node E2E suite, the flag can be used to control the restart/not of Kubelet when it is killed.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Kubelet --exit-on-lock-contention will exit Kubelet when a new lock file is created, this flag is used in the suite to not bring this up via systemctl.
Does this PR introduce a user-facing change?: