Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support DeleteSnapshot Retries in SnapshotWriter #8456

Merged
merged 4 commits into from
Jul 11, 2023

Conversation

sam-heilbron
Copy link
Contributor

@sam-heilbron sam-heilbron commented Jul 11, 2023

Description

Enhance the DeleteSnapshot method of the SnapshotWriter to consume the same retryOptions that the WriteSnapshot operation uses

Context

Why is this necessary?

When deleting resources in bulk, we may delete a resource and while gloo is processing that deleting, we delete another resource that references it. This will result in Gloo rejecting the deletion of the second resource. In a perfect world, Gloo could handle these bulk updates. However, for now, we mirror how we handle writing resources in bulk in tests, we just retry the whole operation. This retry gives time for Gloo to process the changes, and eventually accept the proposed deletion.

How is this tested?

I wrote a kube2e test which configures a resource that does not exist. In the past, this would have thrown an error. We now swallow "does not exist" errors, to ensure we can retry operations. The test demonstrates that we successfully swallow these errors and continue deleting the snapshot.

This does NOT test that we retry correctly. However, this logic is identical to the writeSnapshot logic, so I decided it was ok to not have full test coverage. Also, if we see that flakes are still existing, then it will be a signal that there is a gap in the functionality.

Checklist:

  • I included a concise, user-facing changelog (for details, see https://github.com/solo-io/go-utils/tree/main/changelogutils) which references the issue that is resolved.
  • If I updated APIs (our protos) or helm values, I ran make -B install-go-tools generated-code to ensure there will be no code diff
  • I followed guidelines laid out in the Gloo Edge contribution guide
  • I opened a draft PR or added the work in progress label if my PR is not ready for review
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

@solo-changelog-bot
Copy link

Issues linked to changelog:
https://github.com/solo-io/solo-projects/issues/5162

@github-actions github-actions bot added the keep pr updated signals bulldozer to keep pr up to date with base branch label Jul 11, 2023
@github-actions
Copy link

github-actions bot commented Jul 11, 2023

Visit the preview URL for this PR (updated for commit 32355b1):

https://gloo-edge--pr8456-stability-snapshot-d-fdbcpb0n.web.app

(expires Tue, 18 Jul 2023 16:49:53 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 77c2b86e287749579b7ff9cadb81e099042ef677

@ashishb-solo
Copy link
Contributor

this also looks like it should fix https://github.com/solo-io/solo-projects/issues/5156

@soloio-bulldozer soloio-bulldozer bot merged commit 052564a into main Jul 11, 2023
13 checks passed
@soloio-bulldozer soloio-bulldozer bot deleted the stability/snapshot-delete-retries branch July 11, 2023 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
keep pr updated signals bulldozer to keep pr up to date with base branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants