feat(cli): cdk rollback #31684

Merged
merged 4 commits into from
Oct 7, 2024

@rix0rrr rix0rrr commented Oct 7, 2024

This is a re-draft of #31407. All description and motivation of the previous PR still apply.

The previous PR caused a regression because some CREATE_IN_PROGRESS events for CloudFormation do not have a PhysicalResourceId.

This PR fixes that issue. It also updates the existing unit test that was supposed to catch the problem: the test did not set a `ResourceStatus`, so the event was skipped for the wrong reason.
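To illustrate the class of bug described above, here is a minimal sketch of event handling that tolerates a missing `PhysicalResourceId`. The `StackEventLike` type and the `collectPhysicalIds` function are hypothetical names made up for this example; they are not the CDK CLI's actual code, and the real fix may look different.

```typescript
// Hypothetical, minimal shape of a CloudFormation stack event.
// Every field is optional on purpose: some CREATE_IN_PROGRESS events
// arrive without a PhysicalResourceId.
interface StackEventLike {
  LogicalResourceId?: string;
  PhysicalResourceId?: string;
  ResourceStatus?: string;
}

// Collect logical ID -> physical ID mappings from a list of events.
// Events without a status or a logical ID are skipped. Events without a
// physical ID are recorded as "not yet known" instead of causing an error.
function collectPhysicalIds(events: StackEventLike[]): Map<string, string | undefined> {
  const ids = new Map<string, string | undefined>();
  for (const event of events) {
    if (!event.ResourceStatus || !event.LogicalResourceId) {
      continue; // not a usable resource event
    }
    // Keep an earlier known physical ID if a later event omits it.
    const known = ids.get(event.LogicalResourceId);
    ids.set(event.LogicalResourceId, event.PhysicalResourceId ?? known);
  }
  return ids;
}
```

A test fixture for code like this needs to set `ResourceStatus` on its events. Otherwise the event is dropped by the first check and never reaches the branch that deals with the missing physical ID, which is the "skipped for the wrong reason" situation mentioned above.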


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

Add a CLI feature to roll a stuck change back.

This is mostly useful for deployments performed using `--no-rollback`: if a failure occurs, the stack gets stuck in an `UPDATE_FAILED` state from which there are 2 options:

- Try again using a new template
- Roll back to the last stable state

There used to be no way to perform the second operation using the CDK CLI, but there now is.

`cdk rollback` works in 2 situations:

- A paused fail state; it will initiate a fresh rollback (on `CREATE_FAILED`, `UPDATE_FAILED`).
- A paused rollback state; it will retry the rollback, optionally skipping some resources (on `UPDATE_ROLLBACK_FAILED` -- it seems there is no way to continue a rollback in `ROLLBACK_FAILED` state).
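The two situations above amount to a small decision table keyed on the stack status. The following is a sketch only: `RollbackAction` and `chooseRollbackAction` are hypothetical names for illustration, not the CLI's real implementation, and the pairing of each state with a CloudFormation API is inferred from the description in this PR.

```typescript
// Hypothetical result type: which CloudFormation operation to use, if any.
type RollbackAction =
  | { kind: 'start-rollback' }     // would call cloudformation:RollbackStack
  | { kind: 'continue-rollback' }  // would call cloudformation:ContinueUpdateRollback
  | { kind: 'not-possible'; reason: string };

function chooseRollbackAction(stackStatus: string): RollbackAction {
  switch (stackStatus) {
    // Paused fail states: start a fresh rollback.
    case 'CREATE_FAILED':
    case 'UPDATE_FAILED':
      return { kind: 'start-rollback' };
    // Paused rollback state: retry the rollback, optionally skipping resources.
    case 'UPDATE_ROLLBACK_FAILED':
      return { kind: 'continue-rollback' };
    // Per the description, there seems to be no way to continue from here.
    case 'ROLLBACK_FAILED':
      return { kind: 'not-possible', reason: 'a rollback in ROLLBACK_FAILED state cannot be continued' };
    default:
      return { kind: 'not-possible', reason: `stack is in state ${stackStatus}, which is not a paused fail or rollback state` };
  }
}
```

Keeping the decision in a pure function like this makes the state handling easy to unit test separately from the API calls.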

`cdk rollback --orphan <logicalid>` can be used to skip resource rollbacks that are causing problems.

`cdk rollback --force` will look up all failed resources and continue skipping them until the rollback has finished.
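The `--force` behavior described above can be pictured as a loop that keeps growing the set of skipped resources. This is a rough sketch under stated assumptions: the `RollbackClient` interface, its method names, and the `maxAttempts` limit are all hypothetical and stand in for whatever the CLI does internally.

```typescript
// Hypothetical abstraction over the CloudFormation calls; not the CDK CLI's real interface.
interface RollbackClient {
  // Logical IDs of resources that currently block the rollback.
  listFailedResources(): Promise<string[]>;
  // Continue the rollback while skipping the given logical IDs.
  // Resolves to the resulting stack status.
  continueRollback(resourcesToSkip: string[]): Promise<string>;
}

// Look up failed resources, skip them, and retry until the rollback completes.
// Returns the logical IDs that ended up being skipped.
async function forceRollback(client: RollbackClient, maxAttempts = 10): Promise<string[]> {
  const skipped = new Set<string>();
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    for (const id of await client.listFailedResources()) {
      skipped.add(id);
    }
    const status = await client.continueRollback(Array.from(skipped));
    if (status === 'UPDATE_ROLLBACK_COMPLETE') {
      return Array.from(skipped);
    }
  }
  throw new Error(`Rollback did not finish after ${maxAttempts} attempts`);
}
```

The attempt limit is an assumption added here so the sketch cannot loop forever. Skipped resources are not rolled back by CloudFormation, so their actual state may no longer match what the stack records.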

This change requires new bootstrap permissions, so the bootstrap stack is updated to add the following IAM permissions to the `deploy-action` role:

```
                  - cloudformation:RollbackStack
                  - cloudformation:ContinueUpdateRollback
```

These are necessary to call the 2 CloudFormation APIs that start and continue a rollback. 

Relates to (but does not close yet) #30546.

@github-actions github-actions bot added the p2 label Oct 7, 2024
@aws-cdk-automation aws-cdk-automation requested a review from a team October 7, 2024 11:52
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Oct 7, 2024

@aws-cdk-automation aws-cdk-automation left a comment

The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.

A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed add Clarification Request to a comment.

@aws-cdk-automation aws-cdk-automation added the pr/needs-cli-test-run This PR needs CLI tests run against it. label Oct 7, 2024
@rix0rrr rix0rrr added pr-linter/exempt-integ-test The PR linter will not require integ test changes pr-linter/cli-integ-tested Assert that any CLI changes have been integ tested labels Oct 7, 2024
@aws-cdk-automation aws-cdk-automation dismissed their stale review October 7, 2024 12:24

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

@aws-cdk-automation aws-cdk-automation removed the pr/needs-cli-test-run This PR needs CLI tests run against it. label Oct 7, 2024

mrgrain commented Oct 7, 2024

Can you point out the fix in this PR compared to the original failed attempt?

Edit: This is it https://github.com/aws/aws-cdk/pull/31684/files/efdf9b59bdda923b4814fdb102f72ead17479992..7bc575f45b5f301f76af55e312bb9161aaf123cd

mergify bot commented Oct 7, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 9b52116
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify mergify bot merged commit 3e40edc into main Oct 7, 2024
11 of 12 checks passed
@mergify mergify bot deleted the huijbers/cdk-rollback-2 branch October 7, 2024 13:23

github-actions bot commented Oct 7, 2024

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 7, 2024
@rix0rrr rix0rrr self-assigned this Oct 7, 2024
Labels
contribution/core This is a PR that came from AWS. p2 pr-linter/cli-integ-tested Assert that any CLI changes have been integ tested pr-linter/exempt-integ-test The PR linter will not require integ test changes
3 participants