Check if nodepool created before returning error #10137

modular-magician
Collaborator

The current fix for the issue from #6287, where GKE_STOCKOUT is not captured in state, doesn't seem to work.
The error during apply is as follows:
Error: Error waiting for creating GKE NodePool: Google Compute Engine: Not all instances running in IGM after 1m6.219216581s. Expected 40, running 1, transitioning 39. Current errors: [GCE_STOCKOUT]: Instance 'instance' creation failed: The zone 'projects/redacted/zones/us-west1-b' does not have enough resources available to fulfill the request. '(resource type:compute)'.; (truncated).

GCP's behavior is to create the node pool in an error state and keep waiting for resources to become available. As a result, the node pool is created, but this is not captured in state because an error is returned.

To fix this bug, I'm proposing to re-check whether the node pool exists instead of simply assuming that it doesn't. This approach handles any situation of this kind, since the whole flow will be as follows:

  • ensure the node pool doesn't exist
  • create the node pool
  • if there is an error, check whether the node pool exists
  • if it exists, capture that in state
  • if it doesn't, return the error
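
The flow above can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: `NodePoolAPI`, `State`, `createNodePool`, `fakeAPI`, and `errStockout` are hypothetical names introduced here, and returning `nil` once the pool is found after a failed create is an assumption, since the description only says the pool should be captured in state.

```go
package main

import (
	"errors"
	"fmt"
)

// NodePoolAPI is a hypothetical stand-in for the GKE calls the provider makes.
// It is not the real provider or GKE client API.
type NodePoolAPI interface {
	// Exists reports whether the named node pool is present on the GKE side.
	Exists(name string) (bool, error)
	// Create requests the node pool and waits for the operation to finish.
	Create(name string) error
}

// State is a hypothetical stand-in for Terraform resource state.
type State struct {
	ID string // empty means the resource is not tracked
}

// errStockout imitates the stockout error from the report (illustrative only).
var errStockout = errors.New("GCE_STOCKOUT: zone does not have enough resources")

// createNodePool follows the proposed flow step by step.
func createNodePool(api NodePoolAPI, st *State, name string) error {
	// 1. Ensure the node pool doesn't exist yet.
	exists, err := api.Exists(name)
	if err != nil {
		return err
	}
	if exists {
		return fmt.Errorf("node pool %q already exists", name)
	}

	// 2. Create the node pool.
	createErr := api.Create(name)
	if createErr == nil {
		st.ID = name
		return nil
	}

	// 3. On error, check whether the node pool was created anyway.
	exists, checkErr := api.Exists(name)
	if checkErr != nil {
		return fmt.Errorf("%v (re-check also failed: %v)", createErr, checkErr)
	}
	if exists {
		// 4. It exists (e.g. in an error state): capture that in state.
		// Assumption: the create error is not returned in this case.
		st.ID = name
		return nil
	}

	// 5. It doesn't exist: return the original error.
	return createErr
}

// fakeAPI is an in-memory NodePoolAPI used only to exercise the sketch.
type fakeAPI struct {
	pools     map[string]bool
	createErr error // returned by Create
	persist   bool  // if true, Create leaves the pool behind even when it errors
}

func (f *fakeAPI) Exists(name string) (bool, error) { return f.pools[name], nil }

func (f *fakeAPI) Create(name string) error {
	if f.createErr == nil || f.persist {
		f.pools[name] = true
	}
	return f.createErr
}
```

With `fakeAPI{createErr: errStockout, persist: true}` the pool ends up recorded in `State` despite the failed create, which is the stockout case described above. With `persist: false` the original error is returned and the state stays empty.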

If this PR is for Terraform, I acknowledge that I have:

  • Searched through the issue tracker for an open issue that this either resolves or contributes to, commented on it to claim it, and written "fixes {url}" or "part of {url}" in this PR description. If there were no relevant open issues, I opened one and commented that I would like to work on it (not necessary for very small changes).
  • Generated Terraform, and ran make test and make lint to ensure it passes unit and linter tests.
  • Ensured that all new fields I added that can be set by a user appear in at least one example (for generated resources) or third_party test (for handwritten resources or update tests).
  • Ran relevant acceptance tests (If the acceptance tests do not yet pass or you are unable to run them, please let your reviewer know).
  • Read the Release Notes Guide before writing my release note below.

Release Note Template for Downstream PRs (will be copied)

container: fixed an issue where a node pool created with error (eg. GKE_STOCKOUT) would not be captured in state

Derived from GoogleCloudPlatform/magic-modules#5225

Signed-off-by: Modular Magician <magic-modules@google.com>
@modular-magician modular-magician merged commit 25e1e8a into hashicorp:master Sep 22, 2021
@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 23, 2021