VM Instance Initialization: Driver.InitializeMachine #898

Merged
merged 10 commits into from
Apr 3, 2024

Conversation

elankath
Contributor

@elankath elankath commented Jan 31, 2024

What this PR does / why we need it:

See #871

Enhance the Machine Controller triggerCreationFlow to correctly handle post-creation instance initialization steps with appropriate retry handling.

Which issue(s) this PR fixes:
Fixes part of #871

Special notes for your reviewer:

Release note:

 New provider method Driver.InitializeMachine added for Post-Creation VM Instance Initialization steps.
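As a rough illustration of the idea in the release note, the sketch below models a provider driver exposing an InitializeMachine step and a caller that reattempts it across reconcile cycles. Every type and name here (the trimmed-down Driver interface, the request/response structs, flakyDriver, errUninitialized, initializeWithRetry, demo) is a simplified assumption for illustration and not the code merged in this PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical, simplified request/response types for illustration only.
type InitializeMachineRequest struct{ MachineName string }
type InitializeMachineResponse struct{ ProviderID string }

// Driver is a trimmed-down stand-in for a provider driver interface.
type Driver interface {
	InitializeMachine(ctx context.Context, req *InitializeMachineRequest) (*InitializeMachineResponse, error)
}

// errUninitialized is an assumed sentinel meaning "the VM exists but its
// post-creation initialization did not complete".
var errUninitialized = errors.New("VM instance could not be initialized")

// flakyDriver fails initialization a fixed number of times, then succeeds.
type flakyDriver struct{ failuresLeft int }

func (d *flakyDriver) InitializeMachine(_ context.Context, req *InitializeMachineRequest) (*InitializeMachineResponse, error) {
	if d.failuresLeft > 0 {
		d.failuresLeft--
		return nil, errUninitialized
	}
	return &InitializeMachineResponse{ProviderID: "fake:///" + req.MachineName}, nil
}

// initializeWithRetry models successive reconcile cycles: initialization is
// reattempted while the driver reports errUninitialized, and any other error
// stops the loop immediately. It returns the number of cycles used.
func initializeWithRetry(ctx context.Context, d Driver, name string, maxCycles int) (int, error) {
	var err error
	for cycle := 1; cycle <= maxCycles; cycle++ {
		_, err = d.InitializeMachine(ctx, &InitializeMachineRequest{MachineName: name})
		if err == nil {
			return cycle, nil
		}
		if !errors.Is(err, errUninitialized) {
			return cycle, err
		}
	}
	return maxCycles, err
}

// demo runs the retry loop against a driver that fails `failures` times.
func demo(failures, maxCycles int) (int, error) {
	d := &flakyDriver{failuresLeft: failures}
	return initializeWithRetry(context.Background(), d, "machine-0", maxCycles)
}

func main() {
	cycles, err := demo(2, 5)
	fmt.Println(cycles, err) // prints: 3 <nil>
}
```

In a real controller the loop would be replaced by requeueing the machine object, so that each attempt happens in a separate reconcile cycle rather than in a tight loop.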

@elankath elankath requested a review from a team as a code owner January 31, 2024 08:16
@gardener-robot gardener-robot added the needs/review Needs review label Jan 31, 2024
@elankath elankath self-assigned this Jan 31, 2024
@gardener-robot gardener-robot added the size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) label Jan 31, 2024
@elankath elankath added do-not-merge/work-in-progress and removed size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Jan 31, 2024
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 31, 2024
@gardener-robot gardener-robot added the size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) label Mar 5, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 5, 2024
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 5, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 5, 2024
Contributor

@rishabh-11 rishabh-11 left a comment

some changes requested. please address them

@@ -407,6 +410,7 @@ func (c *controller) triggerCreationFlow(ctx context.Context, createMachineReque
klog.Errorf("Error while creating machine %s: %s", machine.Name, err.Error())
return c.machineCreateErrorHandler(ctx, machine, createMachineResponse, err)
}
newlyCreatedMachine = true
Contributor

Yes. I would suggest moving newlyCreatedMachine = true to the end of the block.

@@ -407,6 +410,7 @@ func (c *controller) triggerCreationFlow(ctx context.Context, createMachineReque
klog.Errorf("Error while creating machine %s: %s", machine.Name, err.Error())
return c.machineCreateErrorHandler(ctx, machine, createMachineResponse, err)
}
newlyCreatedMachine = true
Contributor

We should remove the newlyCreatedMachine variable and just use uninitializedMachine. Set uninitializedMachine to true after the VM creation is successful.
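One way to picture the single-flag suggestion is sketched below. It is not the controller code from this PR: fakeProvider, vm, errInit and reconcile are hypothetical names, and the in-memory provider only stands in for real create and initialize calls. The point it tries to show is that one flag, set only after VM creation succeeds (or when an existing VM is found uninitialized), is enough to decide whether initialization must run.

```go
package main

import (
	"errors"
	"fmt"
)

// vm is a hypothetical record of a provider instance.
type vm struct{ initialized bool }

// fakeProvider is an in-memory stand-in for a cloud provider.
type fakeProvider struct {
	vms      map[string]*vm
	failInit bool // when true, initialize always fails
}

var errInit = errors.New("initialization failed")

func (p *fakeProvider) create(name string) { p.vms[name] = &vm{} }

func (p *fakeProvider) initialize(name string) error {
	if p.failInit {
		return errInit
	}
	p.vms[name].initialized = true
	return nil
}

// reconcile sketches one pass of a creation flow that tracks a single flag
// instead of separate "newly created" and "uninitialized" variables.
func reconcile(p *fakeProvider, name string) error {
	uninitializedMachine := false

	v, exists := p.vms[name]
	if !exists {
		p.create(name)
		// Set only after VM creation has succeeded.
		uninitializedMachine = true
	} else if !v.initialized {
		// The VM survives from an earlier cycle whose initialization failed.
		uninitializedMachine = true
	}

	if uninitializedMachine {
		return p.initialize(name)
	}
	return nil
}

func main() {
	p := &fakeProvider{vms: map[string]*vm{}, failInit: true}
	fmt.Println(reconcile(p, "machine-0")) // first cycle: VM created, init fails
	p.failInit = false
	fmt.Println(reconcile(p, "machine-0")) // second cycle: only init is retried
	fmt.Println(len(p.vms), p.vms["machine-0"].initialized)
}
```

Note that the second cycle does not create a second VM; it only reattempts initialization of the existing one.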

@gardener-robot gardener-robot added the needs/changes Needs (more) changes label Mar 8, 2024
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 13, 2024
@gardener-robot gardener-robot added size/l Size of pull request is large (see gardener-robot robot/bots/size.py) needs/second-opinion Needs second review by someone else and removed size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Mar 14, 2024
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 14, 2024
Contributor

@rishabh-11 rishabh-11 left a comment

minor changes needed. Rest looks good.

@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/l Size of pull request is large (see gardener-robot robot/bots/size.py) labels Mar 26, 2024
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 26, 2024
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 26, 2024
Contributor

@unmarshall unmarshall left a comment

minor comments

@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 3, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 3, 2024
Contributor

@unmarshall unmarshall left a comment

/lgtm

@unmarshall unmarshall self-requested a review April 3, 2024 04:15
@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/changes Needs (more) changes needs/review Needs review needs/second-opinion Needs second review by someone else labels Apr 3, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 3, 2024
Contributor

@rishabh-11 rishabh-11 left a comment

/lgtm

@rishabh-11 rishabh-11 merged commit 0e30203 into gardener:master Apr 3, 2024
8 checks passed
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Apr 3, 2024
@@ -267,6 +325,7 @@ If the conditions defined below are encountered, the provider MUST return the sp
| 13 INTERNAL | Major error | Means some invariants expected by underlying system has been broken. If you see one of these errors, something is very broken. | Needs manual intervension to fix this | N |
| 14 UNAVAILABLE | Not Available | Unavailable indicates the service is currently unavailable. | Retry operation after sometime | Y |
| 16 UNAUTHENTICATED | Missing provider credentials | Request does not have valid authentication credentials for the operation | Fix the provider credentials | N |
| 17 UNINITIALIZED | Failed Initialization| VM Instance could not be initializaed | Initialization is reattempted in next reconcile cycle | N |
Contributor

@kon-angelo kon-angelo Apr 3, 2024

Typo: "initializaed" should be "initialized".

Comment on lines +199 to +200
| 13 INTERNAL | Major error | Means some invariants expected by underlying system has been broken. | Needs investigation and possible intervention to fix this | Y |
| 17 UNINITIALIZED | Failed Initialization| VM Instance could not be initializaed | Initialization is reattempted in next reconcile cycle | Y |
Contributor

| 13 INTERNAL | Major error | Means some invariants expected by underlying system has been broken. | Needs investigation and possible intervention to fix this | Y |

In practice, few will distinguish between uninitialized and internal. Why would they, since both errors are retryable? A difference in the requeue time might justify it, but otherwise not. Saying "Needs investigation and possible intervention to fix this" while still allowing retries does not make sense. I think we could live without the "uninitialized" error in the spec.

Contributor Author

@elankath elankath Apr 3, 2024

Hi Konstantinos, We need the Uninitialized error code to disambiguate the specific case of an un-initialized machine. The Internal error code is more like a generic error code in the MCM presently.

Maybe you are right and we can remove codes.Internal from the response error codes for InitializeMachine to be more strict and not lenient.
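For illustration only, the distinction discussed here could be sketched as follows. The numeric values 13 and 17 come from the error-code table quoted above; everything else (the local Code and StatusError types, the action names and nextAction) is a hypothetical stand-in and not the actual MCM codes package or controller logic.

```go
package main

import (
	"errors"
	"fmt"
)

// Code mirrors the numeric error codes listed in the table (assumed subset).
type Code int

const (
	Internal      Code = 13
	Uninitialized Code = 17
)

// StatusError is a hypothetical error type carrying a provider error code.
type StatusError struct {
	Code Code
	Msg  string
}

func (e *StatusError) Error() string {
	return fmt.Sprintf("code %d: %s", int(e.Code), e.Msg)
}

// Action names the follow-up a controller might take (illustrative only).
type Action string

const (
	ActionNone           Action = "none"
	ActionReattemptInit  Action = "reattempt-initialization"
	ActionGenericFailure Action = "generic-error-handling"
)

// nextAction shows why a dedicated code helps: Uninitialized says the VM
// exists and only the initialization step needs to run again, whereas any
// other code (including Internal) says nothing about which step failed.
func nextAction(err error) Action {
	if err == nil {
		return ActionNone
	}
	var se *StatusError
	if errors.As(err, &se) && se.Code == Uninitialized {
		return ActionReattemptInit
	}
	return ActionGenericFailure
}

func main() {
	fmt.Println(nextAction(&StatusError{Code: Uninitialized, Msg: "init step failed"}))
	fmt.Println(nextAction(&StatusError{Code: Internal, Msg: "invariant broken"}))
	fmt.Println(nextAction(nil))
}
```

With a single generic code, the controller in this sketch could not tell "retry only the initialization" apart from "something else went wrong".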

Contributor

But how should I, as an extension maintainer, differentiate? What is an example use case? Maybe it can be added to the description.

Labels
needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) status/closed Issue is closed (either delivered or triaged)
8 participants