Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s3|ecr: auto-delete-[objects|images] breaks on cloudformation rollback (suspected) #27199

Closed
kaizencc opened this issue Sep 19, 2023 · 2 comments · Fixed by #29581
Closed

s3|ecr: auto-delete-[objects|images] breaks on cloudformation rollback (suspected) #27199

kaizencc opened this issue Sep 19, 2023 · 2 comments · Fixed by #29581
Assignees
Labels
@aws-cdk/aws-ecr Related to Amazon Elastic Container Registry @aws-cdk/aws-s3 Related to Amazon S3 @aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. effort/small Small work item – less than a day of effort p1

Comments

@kaizencc
Copy link
Contributor

kaizencc commented Sep 19, 2023

Based on the way the custom resource is implemented, it is likely that unexpected behavior happens on Cloudformation rollback, i.e. the custom resource will prematurely delete the objects.

Consider the following scenario:

UPDATE target resource (replacement, creates a new resource)
UPDATE custom resource (old -> new, objects in old bucket are deleted)
(...stuff happens...)
ERROR, triggers a rollback
UPDATE custom resource (new -> old)
DELETE target resource (deletes the new resource, remembers the existing one)

We will have deleted objects in the bucket that has been rolled back to in this scenario.


The correct way to handle this is what was done in synthetics:

async function onUpdate(event: AWSLambda.CloudFormationCustomResourceEvent) {
const updateEvent = event as AWSLambda.CloudFormationCustomResourceUpdateEvent;
const newCanaryName = updateEvent.ResourceProperties?.CanaryName;
// If the name of the canary has changed, CloudFormation will delete the canary
// and create a new one with the new name. Returning a PhysicalResourceId that
// differs from the event's PhysicalResourceId will trigger a `Delete` event
// for this custom resource. Here, if `newCanaryName` differs from `event.PhysicalResourceId`
// then this will trigger a `Delete` event.
return { PhysicalResourceId: newCanaryName };
}

As opposed to

async function onUpdate(event: AWSLambda.CloudFormationCustomResourceEvent) {
const updateEvent = event as AWSLambda.CloudFormationCustomResourceUpdateEvent;
const oldBucketName = updateEvent.OldResourceProperties?.BucketName;
const newBucketName = updateEvent.ResourceProperties?.BucketName;
const bucketNameHasChanged = newBucketName != null && oldBucketName != null && newBucketName !== oldBucketName;
/* If the name of the bucket has changed, CloudFormation will try to delete the bucket
and create a new one with the new name. So we have to delete the contents of the
bucket so that this operation does not fail. */
if (bucketNameHasChanged) {
return onDelete(oldBucketName);
}
}

@kaizencc kaizencc self-assigned this Sep 19, 2023
@kaizencc kaizencc changed the title Fix Custom Resource update workflow for auto-delete-[objects|images] s3|ecr: auto-delete-[objects|images] breaks on cloudformation rollback (suspected) Sep 19, 2023
@kaizencc kaizencc added p1 @aws-cdk/aws-s3 Related to Amazon S3 @aws-cdk/aws-ecr Related to Amazon Elastic Container Registry effort/small Small work item – less than a day of effort labels Sep 19, 2023
@kaizencc kaizencc assigned kaizencc and unassigned kaizencc Sep 19, 2023
@pahud pahud added @aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. labels Mar 14, 2024
@GavinZZ GavinZZ self-assigned this Mar 21, 2024
@GavinZZ
Copy link
Contributor

GavinZZ commented Mar 21, 2024

I can confirm that this is a reproducible issue. Here's the cdk code I used

const bucket = new s3.Bucket(this, 'Bucket', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      bucketName: <a bucket name that's not used>,
      autoDeleteObjects: true,
    });

    // Intentionally failure since `mybucket-1` exists
    const bucket2 = new s3.Bucket(this, 'Bucket2', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      bucketName: <a bucket name that's not used>,
    });

    bucket2.node.addDependency(bucket);

Once the deployment is successful, add some random content to the bucket, then update the code so that the first bucket's bucketName is updated to another valid name. Update the second bucket's bucketName to be an existing bucket name, which will trigger a deployment failure hence roll back.

The stack is rolled back, but the content is the bucket is now deleted.

GavinZZ added a commit that referenced this issue Apr 19, 2024
…cloudformation rollback (#29581)

### Issue # (if applicable)

Closes #27199

### Reason for this change

Based on the way the custom resource is implemented, it is likely that
unexpected behavior happens on Cloudformation rollback, i.e. the custom
resource will prematurely delete the objects.

Consider the following scenario:

```
UPDATE target resource (replacement, creates a new resource)
UPDATE custom resource (old -> new, objects in old bucket are deleted)
(...stuff happens...)
ERROR, triggers a rollback
UPDATE custom resource (new -> old)
DELETE target resource (deletes the new resource, remembers the existing one)
```

We will have deleted objects in the bucket that has been rolled back to
in this scenario, but the content is now gone.

### Description of changes

Instead of deleting it right during update, we send back
`PhysicalResourceId` in the event handler which if the id changes, it
will let CFN to empty and delete the bucket at the end of the
deployment.

### Description of how you validated changes

New & updated tests. Also manually tested with deploying a template 
```
const bucket = new s3.Bucket(this, 'Bucket', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      bucketName: <a bucket name that's not used>,
      autoDeleteObjects: true,
    });

    // Intentionally failure since `mybucket-1` exists
    const bucket2 = new s3.Bucket(this, 'Bucket2', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      bucketName: <a bucket name that's not used>,
    });

    bucket2.node.addDependency(bucket);
```

Once the deployment is successful, add some random content to the
bucket, then update the code so that the first bucket's bucketName is
updated to another valid name. Update the second bucket's bucketName to
be an existing bucket name, which will trigger a deployment failure
hence roll back.

After the change, the content will stay there if a deployment failure
happens. The content & bucket will be deleted if deployment is
successful.


### Checklist
- [x] My code adheres to the [CONTRIBUTING
GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and
[DESIGN
GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache-2.0 license*
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-ecr Related to Amazon Elastic Container Registry @aws-cdk/aws-s3 Related to Amazon S3 @aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. effort/small Small work item – less than a day of effort p1
Projects
None yet
3 participants