add method to delete namespaced PVC in spawner base class #475
Conversation
Thanks for submitting your first pull request! You are awesome! 🤗

@consideRatio @minrk If either of you have time, could you please review this PR? Thanks!
Do you think it's practical to add a test for this new feature, especially as it may not be tested routinely?
There's an existing test which checks that secrets are deleted
kubespawner/tests/test_spawner.py, lines 278 to 299 in c215e24:

```python
# stop the pod
await spawner.stop()

# verify pod is gone
pods = kube_client.list_namespaced_pod(kube_ns).items
pod_names = [p.metadata.name for p in pods]
assert "jupyter-%s" % spawner.user.name not in pod_names

# verify service and secret are gone
# it may take a little while for them to get cleaned up
for i in range(5):
    secrets = kube_client.list_namespaced_secret(kube_ns).items
    secret_names = {s.metadata.name for s in secrets}
    services = kube_client.list_namespaced_service(kube_ns).items
    service_names = {s.metadata.name for s in services}
    if secret_name in secret_names or service_name in service_names:
        time.sleep(1)
    else:
        break
assert secret_name not in secret_names
assert service_name not in service_names
```
so maybe that structure could be copied?
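The retry-then-assert structure of that test can be generalized. Here is a minimal, framework-agnostic sketch of the pattern; the `wait_until_deleted` helper, the `list_names` callable, and the resource names are illustrative, not part of KubeSpawner's test suite:

```python
import time


def wait_until_deleted(list_names, target, attempts=5, delay=1):
    """Poll list_names() until target no longer appears in the result.

    Mirrors the retry loop in the secrets test above: poll a few times,
    sleeping between attempts, because resource cleanup in the cluster
    may lag slightly behind the delete API call.
    """
    for _ in range(attempts):
        if target in list_names():
            time.sleep(delay)
        else:
            return True
    # one final check after exhausting the retries
    return target not in list_names()
```

A PVC test copying this structure would pass a lambda listing PVC names (e.g. from `list_namespaced_persistent_volume_claim`) and assert the helper returns `True` after `delete_forever` runs.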
Hi @manics @consideRatio I have fixed the test case. Can you please review it?
Thank you so much, @nsshah1288, for working to get this supported in JupyterHub itself, which was the proper way of implementing this in KubeSpawner!

Content-wise, this PR looks good to me as it is!

Could you, as a final change, also squash/fixup the 36 commits into one or two commits, so it becomes less complicated to inspect the history of changes later?
Hi @consideRatio, thank you so much for the review! If this is not sufficient, please let me know and I am willing to try whatever you need. Thank you!
I gave it a try as I was eager to learn more about this. It becomes complicated whenever there are multiple merges from various stages of master. I tried various things that failed, but then I got the following to work as I wanted. Want to give it a try? Below is the sequence of commands you can use.

```shell
# Bonus detail:
# this makes your commit be associated with your username on GitHub,
# but you could also add the regn.net email to your user account
git config --global user.email "nsshah1288@users.noreply.github.com"

git remote add upstream https://github.com/jupyterhub/kubespawner
git fetch upstream
git reset --hard upstream/master
git merge --squash 0836260  # this is the latest commit in this PR
git commit -m "add delete_forever hook to cleanup user's PVC on user deletion"

# verify it seems right
git log
git diff HEAD~1

# update this branch
git push --force <name of fork remote> feature/shahn3_pvcDeletion
```
Amazing @consideRatio! Thank you so much.
kubespawner/spawner.py
Outdated
```python
try:
    self.api.read_namespaced_persistent_volume_claim(
        self.pvc_name, self.namespace
    )
except ApiException as e:
    if e.status == 404:
        self.log.warning(
            "Could not delete %s. This PVC does not exist.", self.pvc_name
        )
    else:
        raise
else:
    self.api.delete_namespaced_persistent_volume_claim(
        self.pvc_name, self.namespace
    )
```
@minrk do you have any suggested changes to this section?
I'm a bit uncertain about the use of async without any await statement, and that we don't have any timeouts defined, for example.
For reference, this is the async stop() method:
```python
async def stop(self, now=False):
    delete_options = client.V1DeleteOptions()
    if now:
        grace_seconds = 0
    else:
        grace_seconds = self.delete_grace_period
    delete_options.grace_period_seconds = grace_seconds
    ref_key = "{}/{}".format(self.namespace, self.pod_name)
    await exponential_backoff(
        partial(
            self._make_delete_pod_request,
            self.pod_name,
            delete_options,
            grace_seconds,
            self.k8s_api_request_timeout,
        ),
        f'Could not delete pod {ref_key}',
        timeout=self.k8s_api_request_retry_timeout,
    )
    try:
        await exponential_backoff(
            lambda: self.pod_reflector.pods.get(ref_key, None) is None,
            'pod %s did not disappear in %s seconds!'
            % (ref_key, self.start_timeout),
            timeout=self.start_timeout,
        )
    except TimeoutError:
        self.log.error(
            "Pod %s did not disappear, restarting pod reflector", ref_key
        )
        self._start_watching_pods(replace=True)
        raise
```
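For readers unfamiliar with the `exponential_backoff` helper used above: it retries a check function with increasing waits until it returns a truthy value or a timeout is hit. Below is a simplified re-implementation to illustrate its semantics; it is not the actual `jupyterhub.utils.exponential_backoff`, and the keyword defaults here are assumptions:

```python
import asyncio
import random


async def exponential_backoff(pass_func, fail_message, *, start_wait=0.2,
                              scale=2, max_wait=5, timeout=10):
    """Retry pass_func until it returns a truthy value, doubling the wait
    between attempts, and raise TimeoutError(fail_message) on timeout."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    wait = start_wait
    while True:
        result = pass_func()
        if asyncio.iscoroutine(result):
            result = await result
        if result:
            return result
        if loop.time() + wait >= deadline:
            raise TimeoutError(fail_message)
        # jitter the sleep so many retrying clients don't synchronize
        await asyncio.sleep(wait * (0.5 + random.random() / 2))
        wait = min(wait * scale, max_wait)
```

This is why `stop()` wraps `_make_delete_pod_request` in `partial(...)`: the helper needs a zero-argument callable it can invoke repeatedly.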
> I'm a bit uncertain about the use of async without any await

This in principle is fine, and correct if you are implementing a generally async API but don't have anything to wait for. In this case, however, all k8s API requests should be async via `self.asynchronize`, in which case there will be awaits. Following the timeout/conflict/error handling pattern in our other API requests makes sense.
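The "asynchronize" pattern being referred to can be sketched as follows: run a blocking client call in a thread executor so the surrounding coroutine has something to await. This is illustrative only; KubeSpawner's own helper wraps the synchronous kubernetes client methods:

```python
import asyncio
from functools import partial


async def asynchronize(method, *args, **kwargs):
    """Run a blocking callable in the default thread executor and await it.

    This turns a synchronous API call into an awaitable, which is why
    the surrounding async methods do end up containing awaits.
    """
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, partial(method, *args, **kwargs))
```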
Thanks @minrk! Okay let's follow the pattern used already for timeout/conflict/error.
Suggested changes
Thank you for your work on this @nsshah1288! For reliability and performance of this feature it would be good if we can do the following as well:
- I think we should stop making a get request for the PVC, and instead just try to delete it straight away, using a 404 response as the indicator that it's already gone.
- Mimic the pattern for deleting a pod above using exponential_backoff, with the same timeouts.
- Skip the part about verifying the PVC was removed (an X reflector monitors and caches the information about X in the k8s cluster, and we don't have a PVC reflector).
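A sketch of the first suggestion, deleting directly and treating a 404 as success; the `ApiException` stand-in and the `make_delete_pvc_request` name are illustrative, modeled on the kubernetes client, not KubeSpawner's actual code:

```python
class ApiException(Exception):
    """Stand-in for kubernetes.client.rest.ApiException (illustrative)."""

    def __init__(self, status):
        super().__init__(status)
        self.status = status


def make_delete_pvc_request(api, pvc_name, namespace, log=print):
    """Issue the delete straight away; a 404 means the PVC is already
    gone, which also counts as success. Returning True lets this slot
    into an exponential_backoff-style retry loop as the pass function."""
    try:
        api.delete_namespaced_persistent_volume_claim(pvc_name, namespace)
    except ApiException as e:
        if e.status == 404:
            log("No pvc %s to delete. Assuming already deleted." % pvc_name)
            return True
        raise  # other errors propagate so the retry loop can decide
    return True
```

Compared to the read-then-delete version above, this avoids a race where the PVC disappears between the get and the delete.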
Hi @consideRatio, thanks to you and @minrk for the suggestions! I tried to implement what you outlined above. I tested the changes again with sqlite-pvc and it worked well. Please let me know if any other changes are required.
Hi @consideRatio, do the changes look OK?
@nsshah1288 thanks for your work on this! 🎉
If the tests pass and the changes below look good to you, I think we are ready to merge this.
- I cleaned up the merge commit (reset everything to the current master branch, then manually cherry-picked all but the merge commit, solving some trivial merge conflicts).
- By doing the above, I force-pushed to this PR's branch, backing up its previous state in https://github.com/consideRatio/kubespawner/tree/feature/shahn3_pvcDeletion-backup
- I added ea12ea7 because the function didn't have the correct signature matching JupyterHub's Spawner base class.
- I opted not to default to delete_grace_period for the PVC deletion request, because it is explicitly described as being for pod deletion requests:
```python
delete_grace_period = Integer(
    1,
    config=True,
    help="""
    Time in seconds for the pod to be in `terminating` state before
    is forcefully killed.

    Increase this if you need more time to execute a `preStop` lifecycle hook.

    See https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
    for more information on how pod termination works.

    Defaults to `1`.
    """,
)
```
Thank you so much for your detailed help on this @consideRatio! The changes look good to me.
Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
to opt-out of new pvc deletion logic and avoid deleting shared PVCs when deleting named servers
I added a condition and clarified a docstring:
```python
    "No pvc %s to delete. Assuming already deleted.",
    pvc_name,
)
# If there isn't a PVC to delete, that's ok too!
```
I've got a question about this and the warning a couple of lines above. In our deployment we're currently using kubespawner 0.15 and jupyterhub 1.3. I'm looking to upgrade to z2jh 1.2.0, jupyterhub 1.5.0, and kubespawner 1.1.2. We use a single PVC that is backed by cloud object storage, so all of the singleuser-server pods share the same PVC. We leave the default of `storage_pvc_ensure=False`, which from looking at the code means kubespawner won't actually create a PVC named `self.pvc_name` in `_start`. So shouldn't `delete_forever` also be checking `storage_pvc_ensure`? Maybe not, if it's assumed that someone could pre-create the PVC using the same `pvc_name_template`; in that case we just try to delete the PVC, and if it doesn't exist we get a 404 and log the warning.

I guess in our case the point is I don't want to see the warnings when users get culled (which happens after 5 days of inactivity for us), so to avoid that we just need to set `delete_pvc=False`.
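For that shared-PVC scenario, the opt-out would look like this in `jupyterhub_config.py`. This is a sketch assuming a KubeSpawner version that has the `delete_pvc` trait discussed above:

```python
# jupyterhub_config.py (sketch)
# All singleuser pods share one pre-created PVC backed by object storage,
# so never delete it when an individual user is deleted:
c.KubeSpawner.delete_pvc = False
```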
I extracted this to #543 to avoid losing track of this comment! Please follow up there rather than here so we don't disperse the discussion about this in two places.
PR Summary by Erik
This PR implements a `delete_forever` hook for KubeSpawner. The `delete_forever` hook will be respected by JupyterHub 1.4.1 and later and was implemented in jupyterhub/jupyterhub#3337. The idea of this hook is to allow Spawners to perform cleanup of a user's associated resources at the time a JupyterHub user is deleted.

As KubeSpawner can create PVCs for users, this KubeSpawner implementation checks for a PVC that it could have created for a user and tries to delete it. This behavior should be documented properly in the CHANGELOG.md for the next KubeSpawner release. I'm not sure it should be labelled a "breaking" change, but it is a change in the user experience that is quite important to be aware of.
This PR closes #446.
Original PR message
I have tested with sqlite-pvc.