
[WIP] Proposal to modify provider meta to enable specifying the region at each resource #31517

Open
wants to merge 25 commits into main

Conversation

brittandeyoung (Collaborator)

Description

The pain around multi-region deployments in Terraform is an area where I feel improving this functionality could have a high impact. Today you have to specify a separate provider per region, use provider aliases, and define the same resource at least once per region you want to deploy it to (repeated code). I have been thinking about this issue from two sides.
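
For context, the status quo requires one provider block per region, an alias for each non-default region, and a copy of each resource per region. A minimal illustration (the resource type and names here are just examples):

provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "use2"
  region = "us-east-2"
}

# The same resource must be written once per target region.
resource "aws_lightsail_bucket" "use1" {
  name      = "example-bucket-use1"
  bundle_id = "small_1_0"
}

resource "aws_lightsail_bucket" "use2" {
  provider  = aws.use2
  name      = "example-bucket-use2"
  bundle_id = "small_1_0"
}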

This pull request is a proof of concept, with modifications to the core provider meta, the tags_interceptor, and two resources to demonstrate the functionality. These modifications will allow resources to be updated so that the region can be specified at the resource level, with a default region used when one is not specified.

This is accomplished by replacing the single AWSClient inside the provider meta with a map of all regions to their AWSClients. This allows either defaulting to the region provided in the provider configuration or specifying a region at the resource level to determine the region the resource will be deployed and managed in.

The map of regions can be limited by providing a list to the newly added allowed_regions provider setting, e.g. allowed_regions = ["us-east-1", "us-east-2"]. If this setting is not provided, a client will be established for every region available to the account.
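
For illustration, a provider configuration under this proposal might look like the following (allowed_regions is the new setting added in this PR; everything else is standard provider configuration):

provider "aws" {
  # Default region, used by any resource that does not specify one.
  region = "us-east-1"

  # New setting from this PR: clients are only established for these regions.
  allowed_regions = ["us-east-1", "us-east-2"]
}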

This will accomplish the following:

  1. Enable each resource to be modified so that a region can be specified per resource, and even allow the same resource to be deployed to multiple regions using a for_each.
  2. Preserve the current behavior of defaulting to the provider-configured region, so as not to introduce a breaking change.
  3. Allow deploying to multiple regions with only a single default aws provider configuration.

So far, this PR has modified only two resources for testing:

  1. aws_lightsail_bucket - Modified to allow passing the region at the resource level (as sketched below); tests have been added to verify that this works as intended when a region other than the default is passed.
  2. aws_lightsail_disk - Modified to work with the new provider meta without adding support for region at the resource level. This lets us verify that we can introduce the new meta while keeping existing functionality for all resources, and roll out resource-level region support at its own pace.
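
As a sketch of what the modified aws_lightsail_bucket looks like in use (the region argument is the new resource-level setting from this PR; other values are illustrative):

resource "aws_lightsail_bucket" "test" {
  name      = "example-bucket"
  bundle_id = "small_1_0"

  # New in this PR: deploy to a region other than the provider default.
  region = "us-east-2"
}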

I am not an expert on how the provider meta is configured, so I am sure optimizations could be made, but I feel a solution similar to this would provide the best end-user experience when deploying to multiple regions. There may well be more modifications needed before this proposal is ready to merge. Additionally, every resource would need to be updated to use the new provider meta (I only did the two for testing, but this is as easy as a massive find-and-replace).

Relations

Relates #25308

References

Output from Acceptance Testing

$ make testacc TESTARGS='-run=TestAccLightsailBucket_' PKG=lightsail     
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./internal/service/lightsail/... -v -count 1 -parallel 20  -run=TestAccLightsailBucket_ -timeout 180m
=== RUN   TestAccLightsailBucket_basic
=== PAUSE TestAccLightsailBucket_basic
=== RUN   TestAccLightsailBucket_region
=== PAUSE TestAccLightsailBucket_region
=== RUN   TestAccLightsailBucket_BundleId
=== PAUSE TestAccLightsailBucket_BundleId
=== RUN   TestAccLightsailBucket_disappears
=== PAUSE TestAccLightsailBucket_disappears
=== RUN   TestAccLightsailBucket_tags
=== PAUSE TestAccLightsailBucket_tags
=== CONT  TestAccLightsailBucket_basic
=== CONT  TestAccLightsailBucket_disappears
=== CONT  TestAccLightsailBucket_tags
=== CONT  TestAccLightsailBucket_BundleId
=== CONT  TestAccLightsailBucket_region
--- PASS: TestAccLightsailBucket_disappears (128.44s)
--- PASS: TestAccLightsailBucket_basic (162.29s)
--- PASS: TestAccLightsailBucket_BundleId (264.69s)
--- PASS: TestAccLightsailBucket_region (272.90s)
--- PASS: TestAccLightsailBucket_tags (362.18s)
PASS
ok      github.com/hashicorp/terraform-provider-aws/internal/service/lightsail  365.437s


$ make testacc TESTARGS='-run=TestAccLightsailDisk_' PKG=lightsail 
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./internal/service/lightsail/... -v -count 1 -parallel 20  -run=TestAccLightsailDisk_ -timeout 180m
=== RUN   TestAccLightsailDisk_basic
=== PAUSE TestAccLightsailDisk_basic
=== RUN   TestAccLightsailDisk_Tags
=== PAUSE TestAccLightsailDisk_Tags
=== RUN   TestAccLightsailDisk_disappears
=== PAUSE TestAccLightsailDisk_disappears
=== CONT  TestAccLightsailDisk_basic
=== CONT  TestAccLightsailDisk_disappears
=== CONT  TestAccLightsailDisk_Tags
--- PASS: TestAccLightsailDisk_disappears (160.14s)
--- PASS: TestAccLightsailDisk_basic (179.56s)
--- PASS: TestAccLightsailDisk_Tags (380.70s)
PASS
ok      github.com/hashicorp/terraform-provider-aws/internal/service/lightsail  383.867s

...

@github-actions

Community Note

Voting for Prioritization

  • Please vote on this pull request by adding a 👍 reaction to the original post to help the community and maintainers prioritize this pull request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request.

For Submitters

  • Review the contribution guide relating to the type of change you are making to ensure all of the necessary steps have been taken.
  • For new resources and data sources, use skaff to generate scaffolding with comments detailing common expectations.
  • Whether or not the branch has been rebased will not impact prioritization, but doing so is always a welcome surprise.

@github-actions bot added the size/XL, client-connections, provider, service/ec2, service/lightsail, tests, verify, and needs-triage labels on May 21, 2023
@brittandeyoung added the proposal label on May 21, 2023
@AdamTylerLynch (Collaborator)

In the case of pilot light multi-region deployments, I think the approach presented here would generally work.

There are some interesting concerns around resource types that inherently support multi-region via primary/replica attributes or global attributes.

I also would want to address how Terraform would destroy resources in a region if the practitioner removes that region. Say they want to leave ap-south-1: how could they ensure that all resources were removed?

See #27758 for some good comments.

@brittandeyoung (Collaborator, Author) commented May 22, 2023

@AdamTylerLynch In the case of resources that support multiple regions, I think this will simplify those implementations. Looking at the example of aws_dynamodb_table_replica: currently the code creates a new session to the target region, whereas with this update it could instead select the correct client for the target region, removing the need to configure these clients within each multi-region resource.

If a user wanted to remove all resources from a region, they would need to remove the Terraform configuration for those resources and allow Terraform to destroy them, or run a targeted destroy.

If a user wanted to move from, say, ap-south-1 to us-west-1, they would simply update the region value for each resource (hopefully they would have the target region in a local so it only requires updating a single value), and Terraform would handle destroying and recreating them all in the new region. Since the value is determined at the resource, dynamic values like locals, for_each, and count can be used.

In the basic example below, a user would simply remove a region from the local.regions list, and Terraform would handle destroying those resources.

locals {
  regions = [
    "us-east-1",
    "us-east-2",
  ]
}

resource "aws_lb" "test" {
  for_each           = toset(local.regions)
  region             = each.key
  name               = "test-lb-tf"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.allow_tls.id]
  subnets            = data.aws_subnets.public.ids

  tags = {
    Environment = "production"
  }
}

In the event that the user updated the region value for the provider, Terraform would behave as it does today and simply error out, stating that it could not find the resources. A destroy would need to be performed before updating the default region for the provider.

One thing I am still trying to figure out is the case where a user has been providing a region value and decides to remove it: the provider needs to be smart enough to fall back to the provider's default region. I am still working out how to make that work technically.

@justinretzolk removed the needs-triage label on May 22, 2023
@github-actions bot added the create and flex labels on May 24, 2023
@brittandeyoung (Collaborator, Author)

@AdamTylerLynch I have additionally added a central flex function that returns a proper error to the client when they attempt to deploy to a region that is not in the allowed_regions provider setting. This should help force the behavior of either updating the region or destroying the resource before the region is removed from the allowed list.
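
For example, with the provider limited to two regions, a configuration like the following would now fail with a proper error at plan/apply time rather than silently lacking a client (values are illustrative; the exact error wording is defined in the PR):

provider "aws" {
  region          = "us-east-1"
  allowed_regions = ["us-east-1", "us-east-2"]
}

resource "aws_lightsail_bucket" "invalid" {
  name      = "example-bucket"
  bundle_id = "small_1_0"
  region    = "eu-west-1" # not in allowed_regions, so the central flex function returns an error
}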

@gdavison (Contributor)

Thanks for all the thought and work you've put into this, @brittandeyoung. We're going to have some discussions on our team around the approach to take with this.

We see some definite benefits to multiple regions with one provider, but also some areas for caution, especially around how regionally separated AWS services are.

@brittandeyoung (Collaborator, Author)

> Thanks for all the thought and work you've put into this, @brittandeyoung. We're going to have some discussions on our team around the approach to take with this.
>
> We see some definite benefits to multiple regions with one provider, but also some areas for caution, especially around how regionally separated AWS services are.

@gdavison great! I will wait to hear from the provider team before I put any more work into this.

@YakDriver (Member)

YakDriver commented Jun 1, 2023

I personally like this idea.

  1. It could help with performance since you wouldn't need multiple providers (especially for people who support a lot of regions)
  2. It would simplify configurations 🎉 (e.g., replications, clusters, modules)
  3. There may be other things that could work this way (I dunno, but maybe assume_role or profile?)

@breathingdust added the external-maintainer label on Jul 25, 2023
@apparentlymart (Member)

apparentlymart commented Sep 1, 2023

Hello! I work on Terraform Core rather than on the AWS provider so I'm just an interested onlooker here and not intending to speak on behalf of the AWS provider team. I've had a quiet hope for the AWS provider to adopt a model like this for a long time, so I'm very excited to see this proposal/prototype. 😀

This new approach could potentially also help with another long-standing oddity: if one changes the region argument in the provider configuration today after some remote objects already exist, the provider typically tries to refresh the existing objects in the wrong region, gets a "not found" error, and assumes that the objects have been deleted. Unless the user checks the provider's work and rejects the plan, they could end up creating duplicate objects in the new region and leaving behind forgotten objects in the old region.

Here's one way I could imagine improving that region change situation:

  • The region argument in each resource type is declared as Optional: true, Computed: true, so the provider can populate it if the module author doesn't.
  • If per-resource region is unset in the configuration, the provider automatically populates it during planning (i.e. CustomizeDiff in the SDK) with the same region as configured in the corresponding provider "aws" block, which now serves as the "default" region for any resource that doesn't specify one.
    • To allow upgrading from state objects that were created with earlier versions of the provider, each resource type would need an upgrade rule to retrofit a value for region into any resource instance that doesn't already have one, based on the current provider configuration. For correct results, the author would need to leave the provider-level region unchanged while resolving the upgrade so that the correct per-resource value can be retrofitted and then the configuration and state will match during the first plan after upgrading.
    • I think this rule would effectively replace the empty-region-handling DiffSuppressFunc in the current PR, since there would no longer be any case where a resource would have no region populated except during the state upgrade step, which always runs before any planning steps.
  • When refreshing existing objects and when destroying, the provider uses the region attribute on the resource to decide which region to make the query to, ignoring the "default region" in the provider "aws" block.
  • When diffing, if a resource instance region in configuration is different than the prior state, signal that replacement is required ("Force new" in SDK terminology) so that Terraform will propose to delete the object in the old region and create a new object in the new region.
  • If the region in the provider "aws" block changes when remote objects already exist, the provider treats that as a change to desired state of all resources which don't explicitly set region, making it now possible to mass-migrate many objects from one region to another by replacement.

The above is just one design idea and I listed out these details only in case they are useful; some parts of this are probably challenging to implement with the current capabilities of the plugin SDK, but I think it is all valid within and consistent with the spirit of Terraform's resource instance change lifecycle. The main thing I'd love to see, details aside, is that changing region in the provider "aws" block would no longer cause the provider to be confused by "not found" errors and instead have it treat the region change like a normal change to desired state.
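
To make that lifecycle concrete, here is a hypothetical configuration with comments paraphrasing the behavior sketched above (this illustrates the design idea only, not actual provider output):

provider "aws" {
  region = "us-west-2" # serves as the "default" region in this design
}

resource "aws_lightsail_bucket" "example" {
  name      = "example-bucket"
  bundle_id = "small_1_0"

  # region omitted: because the attribute would be Optional + Computed, the
  # provider would populate region = "us-west-2" during planning.
}

# If the provider-level region later changes to, say, "us-east-1", the
# populated region on this resource changes too. Because region would be
# "Force new", Terraform would plan a replacement: destroy in us-west-2,
# create in us-east-1, instead of being confused by "not found" errors.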

Again, I'm not speaking on behalf of the AWS provider team and so please don't take any actions based on this until the provider team has said something about it! I don't want anyone to spend time on implementing something like this if the provider team would eventually oppose it on principle anyway.
