Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CloudSQL support for "create from clone" #2429

Merged

Conversation

jasonvigil
Copy link
Collaborator

Change description

Depends on #2393

  • Run make ready-pr to ensure this PR is ready for review.
  • Perform necessary E2E testing for changed resources.

Copy link
Collaborator

@maqiuyujoyce maqiuyujoyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM except for the potential reference fields. I assume you'll add test cases for this feature, right?

apis/sql/v1beta1/sqlinstance_types.go Outdated Show resolved Hide resolved

/* DatabaseNames: (SQL Server only) Clone only the specified databases from the
source instance. Clone all databases if empty. */
DatabaseNames []string `json:"databaseNames,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can potentially be references to SQLDatabase resources, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed! Thank you

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we have gone back on this decision, and are thinking it's overkill to use refs here. Switched back to just a list of names. #2429 (comment)

Copy link
Collaborator

@maqiuyujoyce maqiuyujoyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall lgtm with some nits and open questions, i.e. none of the comment is a blocker.

Have you got the chance to test the clone locally? Do you plan to support test coverage and mockgcp coverage in this PR?

apis/sql/v1beta1/sqlinstance_types.go Outdated Show resolved Hide resolved
pkg/controller/direct/sql/mapping.go Show resolved Hide resolved
pkg/controller/direct/sql/mapping.go Show resolved Hide resolved
sqlDatabaseRefs := obj.Spec.CloneSource.SQLDatabaseRefs
sqlDatabases := make([]string, len(sqlDatabaseRefs))

for i, sqlDatabaseRef := range sqlDatabaseRefs {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but I believe we also want to check and ensure the list contains all external values or name values, but not a mix of both.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? I think it should be ok to mix

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuwenma proposed it and suggested it will help us avoid complicated edge cases. Yuwen, could you elaborate on it a bit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it is not really complicated to implement this, and customer may want the functionality. There could be some databases managed by kcc, others managed externally, but customer might want to clone all of them.

Copy link
Collaborator

@maqiuyujoyce maqiuyujoyce Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG!

Personally, I like the flexibility of supporting a mix of both. Implementation-wise, it's fine to start with a stricter rule (don't allow the mix) and then loosen it (allow the mix) if we think this provides a better UX, but ideally not the other way around.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more complicated to add a check to ensure that all references in the list are either external or kcc-managed. That requires more code. However, am not sure about the scenario you mentioned where it could cause "complicated edge cases". Do you remember the details?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @maqiuyujoyce for raising this.
I put the rationale and some thoughts here. The no-mix is a recommendation ("shall" rather than "must", see RFC2119) . And that it is for GCP unique list.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just spoke w/ @justinsb about this. We are thinking it is overkill to use refs for these SQLDatabases anyway, since they are associated with the parent SQLInstanceRef. Decided to make this field just a list of strings of database names.

@@ -380,6 +382,21 @@ func validateImmutableFieldsForLoggingLogMetricResource(oldSpec, spec map[string
return allowedResponse
}

func validateImmutableFieldsForSQLInstanceResource(oldSpec, spec map[string]interface{}) admission.Response {
ImmutableFields := []string{"cloneSource"}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's out of scope, though we probably need to think about how to handle use cases when a user realizes they made a mistake and want to fix it in-place after the creation/clone failed. Making the field strictly immutable means a user needs to delete the object, edit the manifest, and reapply, and could be a bit troublesome for some GitOps workflow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should require delete + recreate, rather than build in some special other mechanism, but maybe we can wait for product feedback on this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed it is worth debating and we should wait for more customer feedback. Starting with strict immutability is good.
We did receive UX feedback previously about troubles like this in general.

Copy link
Collaborator Author

@jasonvigil jasonvigil Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more idiomatic to require delete + recreate. But maybe this could be partially solved by educating customer about k8s best practices. Or, perhaps there is some k8s-native pattern here the customer had suggested that I am unaware of.

pkg/webhook/immutable_fields_validator.go Outdated Show resolved Hide resolved
@jasonvigil
Copy link
Collaborator Author

Overall lgtm with some nits and open questions, i.e. none of the comment is a blocker.

Have you got the chance to test the clone locally? Do you plan to support test coverage and mockgcp coverage in this PR?

Yes, tested locally. Added a basic mockgcp test.

Copy link
Collaborator

@maqiuyujoyce maqiuyujoyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

/hold
I wondered if @yuwenma wants to take another look.

Besides, can we add a sample about it? https://github.com/GoogleCloudPlatform/k8s-config-connector/blob/master/README.Samples.md

type SQLInstanceSpec struct {
/* Create this database as a clone of a source instance. Immutable. */
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="spec.cloneSource is immutable"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A silly question: Does the type of the field matter for the validation? E.g. is it able to understand "equals" for object type here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, equality works for maps: https://github.com/google/cel-spec/blob/master/doc/langdef.md#equality. I also tested this manually.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little wary of introducing CEL (1) on a high profile resource and (2) accidentally i.e. as part of a wider PR. (2) is in case we need to revert and so that we can focus on CEL.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the short term, it'll be up to the team to decide the preferred behavior of the types/CRDs. I personally am okay with either CEL or webhook checks as long as there is an immutability check. @yuwenma as you proposed the CEL option, do you have any thoughts on it?
In the long term, I feel it's less error-prone if we automate the type creation as much as possible including the CEL and have proper tests to guarantee that we are supporting the right behavior.

Copy link
Collaborator

@yuwenma yuwenma Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed using CEL for immutable check in the CR Validation doc and put the reasons on how the decision is made:

  • Kubernetes supports CEL since 1.26 which is the oldest GKE cluster version. Using CEL shall not cause GKE cluster version issues, but we shall reevaluate this when the condition changes.
  • Some KCC resources use webhook to validate immutable fields. We can continue using that, but for external contributors CEL is much easier for self-learning.
  • We only apply CEL on immutable fields. More complicated CEL validation requires further discussion.

Regarding Justin's concerns, +1 to split the immutable check in a separate PR. Because 1. we do want to use CEL for immutable field check and makes it as the trends. 2. users may hit the CEL-supportability problem when they install KCC in a non-GKE cluster and the k8s is very old. Though the possibility of this case itself is low, SQL is a high profile resource like Justin mentioned. So splitting the PR is a good idea, we can still use CEL and keep us with the clean rollback options.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, moving this CEL validation to a separate PR.

apiVersion: sql.cnrm.cloud.google.com/v1beta1
kind: SQLInstance
metadata:
name: postgres-clone-${uniqueId}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's out of scope, so bringing it up just for discussion.

I personally like consistent naming for easier validation, maintenance, education (contributor may ask about how to name the resource), and potential automation. That's why we had this guide: https://github.com/GoogleCloudPlatform/k8s-config-connector/blob/master/docs/archive/README.NewResourceFromTerraform.md#naming-conventions-special-variables-and-constants-in-testdata

Now we are moving to direct controllers. Do you find any preferred way of naming the test resources? Or do you find any other purposes (other than consistency, maintenance easiness, etc) more important when coming up with the names?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it doesn't really matter that much, as long as the test resource names are unique and meaningful / understandable at first glance. Much like variable names.

backupConfiguration:
enabled: true
pointInTimeRecoveryEnabled: true
# Location preference is not actually a required field. However, setting it for tests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

type SQLInstanceSpec struct {
/* Create this database as a clone of a source instance. Immutable. */
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="spec.cloneSource is immutable"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little wary of introducing CEL (1) on a high profile resource and (2) accidentally i.e. as part of a wider PR. (2) is in case we need to revert and so that we can focus on CEL.

@@ -626,7 +626,7 @@ func MergeDesiredSQLInstanceWithActual(desired *krm.SQLInstance, refs *SQLInstan
}

if desired.Spec.Settings.DiskAutoresize != nil {
if desired.Spec.Settings.DiskAutoresize != actual.Settings.StorageAutoResize {
if direct.ValueOf(desired.Spec.Settings.DiskAutoresize) != direct.ValueOf(actual.Settings.StorageAutoResize) {
// Change disk autoresize
updateRequired = true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might also be able to look at the updateTime...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It felt like this was a more reliable way to do it, but am planning to switch to use the mapper / generator code soon anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though maybe the mapper generator code wouldn't really help here. Can you elaborate a bit on what you mean by using updateTime? Not sure I understand completely.

}
sqlDatabases[i] = sqlDatabaseName
} else {
return nil, fmt.Errorf("must specify either spec.settings.cloneSource.sqlDatabaseRefs[].external or spec.settings.cloneSource.sqlDatabaseRefs[].name")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, k8s has fieldpath to help with this problem (writing annoyingly long field paths!)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share a link to an example of this?

pkg/controller/direct/sql/sqlinstance_controller.go Outdated Show resolved Hide resolved
@yuwenma
Copy link
Collaborator

yuwenma commented Aug 20, 2024

/lgtm

Looks great! You may want to resolve the comments before merging.

Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, maqiuyujoyce

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [justinsb,maqiuyujoyce]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jasonvigil
Copy link
Collaborator Author

/unhold (everyone has reviewed at this point)

@jasonvigil
Copy link
Collaborator Author

/hold cancel

@google-oss-prow google-oss-prow bot merged commit 1d4eebe into GoogleCloudPlatform:master Aug 23, 2024
12 of 13 checks passed
@jasonvigil jasonvigil deleted the cloudsql-clone branch August 23, 2024 20:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants