Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle 410 Gone response when watching resources #478

Merged
merged 5 commits into from
Aug 29, 2024

Conversation

florianvazelle
Copy link
Contributor

Closes #476

This MR:

  • raise a new exception GoneError when the watch request return a 410 code,
  • and refresh the object when a GoneError is raise, while we waiting conditions on the object

@github-actions github-actions bot added the kr8s label Aug 21, 2024
Copy link

codecov bot commented Aug 21, 2024

Codecov Report

Attention: Patch coverage is 70.58824% with 5 lines in your changes missing coverage. Please review.

Project coverage is 94.98%. Comparing base (87063fc) to head (e5e35aa).
Report is 111 commits behind head on main.

Files with missing lines Patch % Lines
kr8s/_api.py 70.58% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #478      +/-   ##
==========================================
+ Coverage   94.61%   94.98%   +0.36%     
==========================================
  Files          29       30       +1     
  Lines        3141     3909     +768     
==========================================
+ Hits         2972     3713     +741     
- Misses        169      196      +27     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great thanks! Usually I would ask for tests but I can't think of a clean way to test this without mocking out the whole call.

I left one small nitpick comment, but otherwise this loks good to go.

kr8s/_api.py Outdated Show resolved Hide resolved
Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking some more about this. I'm not confident we are handling this in the right place.

The guidance from Kubernetes is that clients should restart the watch from the most recent resource version when it gets a 410.

However, this proposal is to just raise an exception in watch() and handle it in wait(). This doesn't account for folks using watch() directly, they will just get a GoneError every now and then that they need to handle, but really we should be handling it for them.

I think we should move the while True into async_watch() and never raise a GoneError.

kr8s/_api.py Outdated Show resolved Hide resolved
kr8s/tests/test_api.py Outdated Show resolved Hide resolved
kr8s/_api.py Outdated Show resolved Hide resolved
Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like tests are failing on older Python versions. I think you want to use anyio.TaskGroup() instead of asyncio.TaskGroup() for backward compatibility (and trio support).

@florianvazelle
Copy link
Contributor Author

@jacobtomlinson thanks for your help, it was more complicated than I thought 😅

@github-actions github-actions bot removed the tests label Aug 29, 2024
@jacobtomlinson jacobtomlinson changed the title fix: handle 410 response on watch request Handle 410 Gone response when watching resources Aug 29, 2024
Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really appreciate you taking the time to get really deep with that test. However I'm worried that it digs so far into kr8s internals with mocking that it will become brittle in the future when we want to change things.

It's a little annoying that the default etcd behaviour is to cache for 5 mins, so we would need a test that runs for 5 mins to test this properly without mocks.

I think it's probably best to just merge this without a test to save maintenance issues in the future. I've also added some debug logging so that we can verify it is restarting for ourselves.

@jacobtomlinson jacobtomlinson marked this pull request as ready for review August 29, 2024 09:31
@jacobtomlinson jacobtomlinson merged commit 721dc09 into kr8s-org:main Aug 29, 2024
12 checks passed
@jacobtomlinson jacobtomlinson added the bug Something isn't working label Sep 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working kr8s
Projects
None yet
Development

Successfully merging this pull request may close these issues.

410 response while waiting for jobs
2 participants