cdk synth/deploy: tries to describe default CDKToolkit bootstrap stack even when a qualifier and custom toolkit stack name are used #26588

Closed
polarizeme opened this issue Aug 1, 2023 · 2 comments · Fixed by #26925
Labels
bug This issue is a bug. effort/medium Medium work item – several days of effort p1 package/tools Related to AWS CDK Tools or CLI

polarizeme commented Aug 1, 2023

Describe the bug

If your cdk.json contains the following config line:
"@aws-cdk/core:bootstrapQualifier": "example-qualifier"

And you bootstrapped an account with:
cdk bootstrap aws://<accountID>/us-east-1 --toolkit-stack-name "some-custom-stack-name" --qualifier "example-qualifier" --cloudformation-execution-policies "<policyARN>"

When you run cdk synth or cdk deploy, the CLI still looks for the default CDKToolkit bootstrap/toolkit stack, which is entirely unnecessary when that stack never existed in the first place. In some cases, like ours, it completely breaks the ability to deploy anything.

Expected Behavior

If you're using a custom qualifier and a custom stack name, cdk commands should NOT run anything against a stack that never existed, even if its name is the default. The CLI should check for a custom qualifier FIRST, use that to determine the bootstrap/toolkit stack name, and THEN run the necessary actions.

Current Behavior

Synth start example:

[14:23:01] [trace] SdkProvider#withAwsCliCompatibleDefaults()
[14:23:01] Toolkit stack: CDKToolkit

Deploy start example:

[13:58:12] [trace] SdkProvider#withAwsCliCompatibleDefaults()
[13:58:12] Toolkit stack: CDKToolkit

Deploy ending example:

 deploying... [1/1]
[13:58:27] [trace] SdkProvider#resolveEnvironment()
[13:58:27] [trace]   SdkProvider#defaultAccount()
[13:58:27] [trace] SdkProvider#baseCredentialsPartition()
[13:58:27] [trace]   SdkProvider#resolveEnvironment()
[13:58:27] [trace]   SdkProvider#obtainBaseCredentials()
[13:58:27] [trace]     SdkProvider#defaultAccount()
[13:58:27] [trace]     SdkProvider#defaultCredentials()
[13:58:27] [trace]   SDK#currentAccount()
[13:58:27] [trace]     SDK#forceCredentialRetrieval()
[13:58:27] Retrieved account ID 303699070191 from disk cache
[13:58:27] [trace] SDK#cloudFormation()
[13:58:27] [trace]   SDK#wrapServiceErrorHandling()
[13:58:27] Waiting for stack CDKToolkit to finish creating or updating...
[13:58:27] [AWS cloudformation 400 0.406s 0 retries] describeStacks({ StackName: 'CDKToolkit' })
[13:58:27] [trace] SDK#makeDetailedException()
[13:58:27] Call failed: describeStacks({"StackName":"CDKToolkit"}) => Stack with id CDKToolkit does not exist (code=ValidationError)
[13:58:27] Stack CDKToolkit does not exist
[13:58:27] The environment aws://<accountID>/us-east-1 doesn't have the CDK toolkit stack (CDKToolkit) installed. Use cdk bootstrap "aws://<accountID>/us-east-1" to setup your environment for use with the toolkit.
[13:58:27] [trace] SDK#ssm()
[13:58:27] [trace]   SDK#wrapServiceErrorHandling()
[13:58:28] [AWS ssm 200 0.408s 0 retries] getParameter({ Name: '/cdk-bootstrap/example-qualifier/version' })

Now, from what I can gather, in most cases this doesn't seem to matter. The process determines that there's a custom stack name (likely based on the qualifier and its SSM parameter, which points to the cfn stack specified when bootstrapping). BUT there are cases, like ours, where it completely breaks the ability to deploy anything.

We have a large organization with lots of roles that are guarded by a combination of SCPs and Permission Boundaries. In this specific case, multiple teams bootstrap each account with their own custom qualifier and custom stack name so that we don't step on each other's toes. There are things in place to ensure that folks from team A cannot touch a bootstrap stack created by team B, and this extends to the role(s) created by team A's bootstrap. Specifically, we have rules in place so that no one can do anything with CDKToolkit as a cfn stack, so that we stay entirely away from that default stack name.

And now we arrive at our issue. Whenever we try to run a deploy, we get this error:

[20:17:23] Waiting for stack CDKToolkit to finish creating or updating...
[20:17:23] [AWS cloudformation 403 0.05s 0 retries] describeStacks({ StackName: 'CDKToolkit' })
[20:17:23] [trace] SDK#makeDetailedException()
[20:17:23] Call failed: describeStacks({"StackName":"CDKToolkit"}) => User: arn:aws:sts::<accountID>:assumed-role/<cdk deploy role> is not authorized to perform: cloudformation:DescribeStacks on resource: arn:aws:cloudformation:us-east-1:<accountID>:stack/CDKToolkit with an explicit deny in a service control policy (code=AccessDenied)

This makes it look like the SCP is the issue, but we have no stack with this name to begin with, so cdk should NOT be trying to describe it in the first place.

Reproduction Steps

  1. Bootstrap an account with:
    cdk bootstrap aws://<accountID>/us-east-1 --toolkit-stack-name "some-custom-stack-name" --qualifier "example-qualifier" --cloudformation-execution-policies "<policyARN>"

  2. Make sure the context in your cdk.json contains the following config line:
    "@aws-cdk/core:bootstrapQualifier": "example-qualifier"

  3. Try to run cdk deploy against the bootstrapped account; use the -vvv flag to see the debug output, which contains the attempts to use the default toolkit stack name.
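
Before running step 3, it can help to confirm which qualifier the project will actually use. The following is a minimal sketch, not part of the CDK CLI: the helper names `read_bootstrap_qualifier` and `version_parameter_name` are hypothetical, the parameter path is copied from the `getParameter` call in the deploy log above, and the fallback value `hnb659fds` is assumed to be the default qualifier used by `cdk bootstrap`.

```python
import json
from pathlib import Path

CONTEXT_KEY = "@aws-cdk/core:bootstrapQualifier"
# Assumption: the qualifier `cdk bootstrap` uses when none is given.
DEFAULT_QUALIFIER = "hnb659fds"


def read_bootstrap_qualifier(cdk_json_path):
    """Return the qualifier from the `context` block of cdk.json (hypothetical helper)."""
    config = json.loads(Path(cdk_json_path).read_text())
    return config.get("context", {}).get(CONTEXT_KEY, DEFAULT_QUALIFIER)


def version_parameter_name(qualifier):
    """Build the SSM parameter name in the form seen in the deploy log."""
    return f"/cdk-bootstrap/{qualifier}/version"
```

Calling `version_parameter_name(read_bootstrap_qualifier("cdk.json"))` in the project directory should give the parameter name that the deploy is expected to read, which can then be compared with the `-vvv` output.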

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.86.0

Framework Version

No response

Node.js Version

18

OS

MacOS 13.2

Language

Python

Language Version

3.11.3

Other information

No response

@polarizeme polarizeme added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 1, 2023
@github-actions github-actions bot added the package/tools Related to AWS CDK Tools or CLI label Aug 1, 2023
@pahud pahud added needs-review p1 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Aug 2, 2023

pahud commented Aug 2, 2023

Thank you for the report. We'll discuss this with the team.

@pahud pahud removed the needs-review label Aug 2, 2023
@mergify mergify bot closed this as completed in #26925 Sep 7, 2023
mergify bot pushed a commit that referenced this issue Sep 7, 2023
…#26925)

The CLI always looks up the default bootstrap stack, for backwards compatibility reasons: in case the attributes introduced by the V2 `DefaultStackSynthesizer` that tell it what SSM parameter to use and what bucket to write assets to are not present, it needs to fall back to the default bootstrap stack found in CloudFormation.

The code happily survives a `StackNotFound` error, but is not prepared to deal with an `AccessDenied` error, which is what the customer in #26588 got because of how their AWS account was configured.

The essence of the fix here is to catch all errors when looking up the toolkit stack, because they only become relevant if any of the properties of the toolkit stack are ever accessed. 
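
The idea of deferring lookup errors until a property is actually used can be sketched as follows. This is an illustration in Python under stated assumptions, not the CLI's TypeScript code: `lookup_toolkit`, `FoundToolkit`, `MissingToolkit`, and `StackLookupError` are hypothetical names, and `describe_stack` stands in for the `describeStacks` call.

```python
class StackLookupError(Exception):
    """Raised only when a property of an unavailable toolkit stack is accessed."""


class FoundToolkit:
    """Toolkit stack that was looked up successfully."""
    found = True

    def __init__(self, bucket_name):
        self._bucket_name = bucket_name

    @property
    def bucket_name(self):
        return self._bucket_name


class MissingToolkit:
    """Placeholder for a failed lookup; remembers the cause instead of raising."""
    found = False

    def __init__(self, stack_name, cause):
        self._stack_name = stack_name
        self._cause = cause

    @property
    def bucket_name(self):
        # The original failure only surfaces here, when the value is needed.
        raise StackLookupError(
            f"toolkit stack {self._stack_name} is not usable: {self._cause}"
        )


def lookup_toolkit(describe_stack, stack_name="CDKToolkit"):
    """Look up the toolkit stack, deferring any failure (hypothetical helper)."""
    try:
        outputs = describe_stack(stack_name)
    except Exception as err:  # deliberately broad: not found, access denied, ...
        return MissingToolkit(stack_name, err)
    return FoundToolkit(outputs["BucketName"])
```

With this shape, an explicit deny on `describeStacks` no longer aborts a deploy that never reads anything from the default stack; the error is reported only if the fallback is really needed.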

The customer also made the point that the lookup didn't even need to happen in the first place, because all information was already there. This is fair, and the organization of the code in this area has been a thorn in my side for a while now. There is some code that doesn't need to be on `ToolkitInfo` (which is the ancient name for the Bootstrap Stack), but is there for legacy reasons.

This PR includes a refactor that introduces a new class, `EnvironmentResources`, which manages interaction with the bootstrap resources in a particular environment. We can now pass `EnvironmentResources` everywhere we used to pass `ToolkitInfo`, and the actual lookup of the Bootstrap Stack is only triggered if the need arises (which hopefully should be never).
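
The "lookup only if the need arises" behaviour is a lazy, cached lookup. Below is a rough Python sketch of that pattern; the real `EnvironmentResources` class lives in the CLI's TypeScript code and its methods are not shown in this thread, so the class name `EnvironmentResourcesSketch`, its methods, and the `manifest_bucket` parameter are all assumptions made for illustration.

```python
class EnvironmentResourcesSketch:
    """Hypothetical stand-in showing lazy lookup of the bootstrap stack."""

    def __init__(self, lookup_bootstrap_stack):
        # Callable that performs the (possibly failing) stack lookup.
        self._lookup = lookup_bootstrap_stack
        self._cached = None
        self.lookup_count = 0

    def bootstrap_stack(self):
        """Perform the lookup on first use and cache the result."""
        if self._cached is None:
            self.lookup_count += 1
            self._cached = self._lookup()
        return self._cached

    def asset_bucket(self, manifest_bucket=None):
        """Prefer the bucket already named by the synthesized output.

        Only when that information is missing does this fall back to the
        bootstrap stack, which triggers the lookup.
        """
        if manifest_bucket is not None:
            return manifest_bucket
        return self.bootstrap_stack()["BucketName"]
```

When the synthesized output already names the bucket, `asset_bucket` returns without calling `describeStacks` at all, which matches the reporter's point that the lookup did not need to happen.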

Closes #26588.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
github-actions bot commented Sep 7, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

mikewrighton pushed a commit that referenced this issue Sep 14, 2023
…#26925)

3 participants