Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent mutating existing 'id' property on schema #521

Merged
merged 1 commit into from
May 24, 2022
Merged

Conversation

AaronFriel
Copy link
Contributor

Fixes pulumi/pulumi#9611 which was a bug caused by this roundabout sequence.

Bug

Examples code generation should be deterministic. Instead, for the Hetzner Cloud (hcloud) provider, the id properties on several functions flapped between a string or numeric representation in the docs.

Root cause analysis

The Hetzner Cloud hcloud_network & hcloud_networks upstream examples contain a bug, the latter docs refer to hcloud_network (singular) in the examples and not plural.

This is relevant because our own code has a data race. We intermingle schema binding (and mutating the schema) with example code generation. Depending on whether getExample or getExamples runs first, we either:

  1. Generate example code for getExample (correctly)
  2. Generate example code for getExamples (broken due to upstream)
  3. Mutate getExample to have an id: string output.

Or:

  1. Generate example code for getExamples (broken due to upstream)
  2. Mutate getExample to have an id: string output.
  3. Generate example code for getExample (incorrectly, due to mutation of id property)

The change in this PR ensures that if an "id" is present, it is not overridden. Here's the mutation:

schemas.TFRes.Schema().Set("id", (&schema.Schema{Type: shim.TypeString, Computed: true}).Shim())

While root causing this, I found this earlier code which seems to make that line redundant, because we also "make up" a string output if one doesn't prior to doing any code generation, up front:

// If the data source's schema doesn't expose an id property, make one up since we'd like to expose it for data
// sources.
if id, has := ds.Schema().GetOk("id"); !has || id.Removed() != "" {
cust := &tfbridge.SchemaInfo{}
rawdoc := "The provider-assigned unique ID for this managed resource."
idSchema := &schema.Schema{Type: shim.TypeString, Computed: true}
fun.rets = append(fun.rets,
propertyVariable("id", idSchema.Shim(), cust, "", rawdoc, true /*out*/, entityDocs))
}

As I'm new to this code base, I don't know why either of these blocks is needed, as I think they create phantom "id" properties on functions that do not have an "id" output, such as in the getNetworks example.

If either or both of the quoted code blocks can be safely removed, I think we should favor doing so over this PR.

@AaronFriel AaronFriel requested a review from pgavlin May 18, 2022 23:56
@AaronFriel
Copy link
Contributor Author

AaronFriel commented May 19, 2022

@stack72 @pgavlin was there a change to the upstream providers where "id" was once an implicit property that we had add on to the schema, but due to changes upstream it now explicit and we can expect it to be present?

Another example of that is cases where the id is a number not a string, such as in the getNetwork function:
https://www.pulumi.com/registry/packages/hcloud/api-docs/getnetwork/#id_yaml

I tested removing both code blocks I highlighted in my PR and running tfgen against the hcloud provider, and it removed all of the spurious and inaccurate "id" properties from the schema & SDKs like this one:

https://www.pulumi.com/registry/packages/hcloud/api-docs/getnetworks/#id_yaml

All of the plural getter functions in hcloud like getVolumes had the same issue, a phantom "id" property that I think would always be left undefined/unset.

@AaronFriel
Copy link
Contributor Author

CC @guineveresaenger, re:

@guineveresaenger
Copy link
Contributor

guineveresaenger commented May 21, 2022

Acceptance tests on hcloud using this change have passed: https://github.com/pulumi/pulumi-hcloud/actions/runs/2361216953

I agree with Friel though - would very much like to see if we can stop generating fake ID fields in the first place and clean up some of these logic twists.

@pgavlin
Copy link
Member

pgavlin commented May 23, 2022

TF has been pretty inconsistent in its treatment of id attributes. The best understanding I have is that the id attribute it always valid for a resource, even if it is not present in that resource's schema. That is why we synthesize an id property for resources: without it, binding fails and the HCL -> PCL conversion does not continue. For data sources, though, my understanding is murkier. I am not sure whether or not the id property should be synthesized for data sources. It may be that the id property used to be valid for arbitrary data sources, but not longer is--I'm honestly uncertain and the TF docs are not very illuminating. I think it is worth attempting to remove the id synthesis for data sources, but I believe that we need to keep it for resources.

@AaronFriel
Copy link
Contributor Author

@guineveresaenger I think based on Pat's answer, we merge this PR as is, then follow up with a 2nd PR to remove both blocks that create "id" properties on data sources. That seems riskier and more likely to affect our tier 1 providers, so I want to do it separately.

@stack72
Copy link
Contributor

stack72 commented May 23, 2022

In the new plug-in SDK it only injects an Id if d.SetId is used so we need to be careful here

@pgavlin
Copy link
Member

pgavlin commented May 23, 2022

In the new plug-in SDK it only injects an Id if d.SetId is used so we need to be careful here

I think that an HCL expression of the form foo.id is still valid even if SetId is not called by the resource provider. This code is only concerned with handling HCL expressions, so I think we're in the clear.

Copy link
Contributor

@stack72 stack72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guineveresaenger has checked this on pulumi-aws and make tfgen showed no schema changes

@stack72 stack72 merged commit c666b97 into master May 24, 2022
@stack72 stack72 deleted the friel/pulumi-9611 branch May 24, 2022 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Indeterminate schemagen for typescript examples "description" key
4 participants