Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fatal error when AWS_EC2_METADATA_DISABLED is true #1027

Closed
sjenning opened this issue Sep 24, 2024 · 4 comments · Fixed by #1028
Closed

Fatal error when AWS_EC2_METADATA_DISABLED is true #1027

sjenning opened this issue Sep 24, 2024 · 4 comments · Fixed by #1028
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@sjenning
Copy link
Contributor

#990 introduced a new log line that tries to get information from the instance metadata service. However, when AWS_EC2_METADATA_DISABLED is true, GetInstanceIdentityDocument() returns an fatal error.

F0924 09:21:48.998037       1 main.go:104] Cloud provider could not be initialized: could not init cloud provider "aws": error creating AWS metadata client: "unable to get instance identity document: EC2MetadataRequestError: failed to get EC2 instance identity document\ncaused by: RequestCanceled: EC2 IMDS access disabled via AWS_EC2_METADATA_DISABLED env var"

This variable is commonly set is situations where the AWS CCM is running on an instance in an AWS account different from the account in which it operates/acts.

There are two fixes that come to mind:

  1. Make the logging best-effort by ignoring any error from GetInstanceIdentityDocument() or
  2. Skip the logging block if the AWS_EC2_METADATA_DISABLED is true

I'm not that confident that AWS_EC2_METADATA_DISABLED being true is the only situation in which GetInstanceIdentityDocument() fails, so I'm leaning toward option 1.

cc @dims

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Sep 24, 2024
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

sjenning added a commit to sjenning/cloud-provider-aws that referenced this issue Sep 24, 2024
@ialidzhikov
Copy link
Contributor

I can only add that #990 prevents running the CCM out-of-cluster. CCMs should be able to run out-of-cluster and should not rely running in-cluster.

Hence, +1 for making this logging optional in #1028.

@ialidzhikov
Copy link
Contributor

@dims @cartermckinnon can you cut v1.31.1 with #1028?

@ialidzhikov
Copy link
Contributor

/kind bug
/kind regression

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. labels Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants