Instructions for adding GPUs and increasing shared memory #1358

Merged on Aug 16, 2019 (1 commit)

tlkh
Contributor

@tlkh tlkh commented Aug 16, 2019

This adds brief instructions to the docs for how to:

  • allocate GPUs to users
  • increase the shared memory (SHM) allocation for users, which is beneficial for deep learning workloads and even required for functions like PyTorch's DataLoader to run
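
As a rough illustration of the two settings above, here is a minimal sketch that builds a Helm values fragment as a Python dict and prints it as JSON (which is also valid YAML). It assumes the chart's `singleuser.extraResource.limits`, `singleuser.storage.extraVolumes`, and `singleuser.storage.extraVolumeMounts` keys and the `nvidia.com/gpu` resource name; the helper name `build_singleuser_values` and the volume name `shm-volume` are hypothetical and chosen only for this example.

```python
import json


def build_singleuser_values(gpu_count=1, shm_volume_name="shm-volume"):
    """Return a Helm values fragment (as a plain dict) that requests GPUs
    and mounts a memory-backed emptyDir volume at /dev/shm.

    Assumption: the key layout follows the chart's `singleuser` section;
    check it against the chart version you actually deploy.
    """
    return {
        "singleuser": {
            # GPU limits are passed as an extended resource; the count is a string.
            "extraResource": {
                "limits": {"nvidia.com/gpu": str(gpu_count)},
            },
            "storage": {
                # A memory-backed emptyDir replaces the small default /dev/shm.
                "extraVolumes": [
                    {"name": shm_volume_name, "emptyDir": {"medium": "Memory"}},
                ],
                "extraVolumeMounts": [
                    {"name": shm_volume_name, "mountPath": "/dev/shm"},
                ],
            },
        }
    }


if __name__ == "__main__":
    # JSON is a subset of YAML, so this output can be pasted into a values file.
    print(json.dumps(build_singleuser_values(), indent=2))
```

The dict is generated rather than hand-written only so the volume name stays consistent between the volume and its mount; in practice the same fragment would normally live directly in the values file.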

I noticed #994, but I also think it's beneficial to start adding basic information and instructions into the documentation so users can have something to refer to as a starting point.

Any feedback or comments appreciated!

@betatim
Member

betatim commented Aug 16, 2019

Looks good to me.

Are there any downsides to the increased SHM or weird side effects it could have? If there is a resource we could link to for "I changed my SHM size and now stuff is breaking" that could be useful.

Should we also link to https://cloud.google.com/kubernetes-engine/docs/how-to/gpus as the GKE-specific docs for getting GPUs set up on your node pool there? As we find them, we could add links for other cloud providers as well.

@tlkh
Contributor Author

tlkh commented Aug 16, 2019

@betatim good idea!

The only downsides I'm aware of are the following (and you're right that we should mention them):

  • SHM usage counts towards the pod's memory limit
  • if SHM usage exceeds the pod's memory limit, the pod will be evicted

Otherwise, increasing SHM is recommended by NVIDIA and is also built into Kubeflow's notebook spawner, for the same reasons I mentioned.
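
To make the memory-limit caveat concrete, here is a small sketch for checking from inside a pod how much of the memory limit is currently taken up by SHM. It assumes SHM is mounted at `/dev/shm` and that you already know the pod's memory limit in bytes; the function names and the 2 GiB figure are hypothetical and only for illustration.

```python
import shutil


def current_shm_used(path="/dev/shm"):
    """Return the number of bytes currently used on the SHM mount."""
    return shutil.disk_usage(path).used


def shm_fraction_of_limit(shm_used_bytes, memory_limit_bytes):
    """Return SHM usage as a fraction of the pod's memory limit."""
    if memory_limit_bytes <= 0:
        raise ValueError("memory_limit_bytes must be positive")
    return shm_used_bytes / memory_limit_bytes


if __name__ == "__main__":
    # Assumed limit for illustration only: 2 GiB.
    limit = 2 * 1024**3
    fraction = shm_fraction_of_limit(current_shm_used(), limit)
    print(f"SHM uses {fraction:.1%} of the memory limit")
```

A value approaching 100% means SHM alone is close to exhausting the limit, before counting the memory the notebook process itself uses.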

I'll add in links for allocating GPUs on GKE, and the equivalents for AWS and Azure.

I'll amend the commit and push over.

@tlkh
Contributor Author

tlkh commented Aug 16, 2019

Pushed the improvements!

@betatim betatim merged commit 154913a into jupyterhub:master Aug 16, 2019
@betatim
Member

betatim commented Aug 16, 2019

Thanks for helping improve the docs, and for the fast turnaround!