This repository has been archived by the owner on Sep 3, 2024. It is now read-only.

Jupyter Enterprise Gateway proposal #11

Merged
merged 4 commits on Oct 14, 2017

Conversation

lresende
Collaborator

This is an Incubator proposal for Jupyter Enterprise Gateway.

Thanks, @rgbkrk and @parente for volunteering to be our Sponsors/Advocates.

@rgbkrk
Member

rgbkrk commented Sep 28, 2017

The enterprise gateway builds on top of the kernel gateway in ways that I had originally hoped that the kernel gateway would come to provide, mainly multitenancy. The other big boon for this is the support for running kernels on a YARN cluster. Admittedly, this may seem like it overlaps with JupyterHub -- it's more of a complementary approach that would make notebook job scheduling simpler and allow building different kinds of applications on top of jupyter protocols.

@yuvipanda

This is awesome and I'm excited to hear more details about it :)

I'm curious about overlap with / complementation of JupyterHub. From my understanding, this is mostly intended for consumption by other services, and possibly directly launches kernels rather than notebook servers. Is that accurate?

@rgbkrk
Member

rgbkrk commented Sep 28, 2017

possibly directly launches kernels rather than notebook servers

That's correct.

Side disclaimer: @lresende, @kevin-bates, @ckadner, @liukun1016, @akchinSTC, @sherryxg, and @frreiss came to Netflix (or were on our video call) to demo to me and @rdblue, to see if we're interested in using it, and I suggested they make a proposal for incubation into Jupyter.

@yuvipanda

Awesome. Looking forward to seeing how this evolves :)

Is this meant to be spark / YARN specific? I'm also guessing not, and you'd want to have a layer of abstraction that allows plugging in other executors? Is supporting that a primary or secondary goal?

@yuvipanda

(am also not sure if this is the right place to ask questions, so feel free to tell me if it is not!)

@lresende
Collaborator Author

@yuvipanda The resource manager is abstracted and pluggable; we currently have a version that works in Yarn cluster mode and "pseudo-distributed" in Yarn client mode. We will certainly look into other RMs in the near future.

## Other options

We are not aware of alternatives that provide the same set of capabilities. The closest we could find is JupyterHub, but that aims to spawn multiple Jupyter Notebook servers across a given cluster, where the kernel processes are still co-located with the launching server.


That doesn't seem correct? JupyterHub can spin up kernels remotely via docker, docker swarm or any custom remote spawner. Am I missing something?

Member


Consider listing some related projects here and noting how enterprise gateway differs so that the community gets a sense of where it fits in the ecosystem. Here are a few relevant projects:

Contributor


@dhirschfeld - thank you for your comment.
JupyterHub spins up Notebook servers, while Enterprise Gateway inherits Kernel Gateway's model of spinning up kernels, typically launched on behalf of a Notebook server via NB2KG.

When the kernels are remoted, they can be launched in one of two ways (although this is pluggable):

  1. As direct kernels running on hosts other than Enterprise Gateway or
  2. As Yarn cluster applications where the Yarn resource manager determines on which host the application (i.e., the kernel) is launched.

This form of remoting places the kernel (which is typically a Spark driver application) closer to the workers and spreads kernels out across the cluster, thereby reducing the resource bottleneck on the kernel-launching server.

JupyterHub could get you something similar in the Docker swarm case, but even then, the Notebook server is what is spawned, so all kernels launched from that notebook server are local and none are capable of running as actual resource-managed applications.
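The pluggable launch mechanism mentioned above can be pictured as a "process proxy" chosen per kernelspec. A minimal sketch of that idea, with hypothetical class names and metadata keys (not Enterprise Gateway's actual API):

```python
# Hedged sketch of pluggable kernel launching: each kernelspec selects a
# proxy that knows how to start the kernel somewhere. All names here are
# illustrative, not Enterprise Gateway's real interfaces.

class LocalProcessProxy:
    """Launch the kernel on a designated host (mode 1 above)."""
    def launch(self, cmd):
        raise NotImplementedError

class YarnClusterProcessProxy:
    """Submit the kernel as a Yarn application; Yarn picks the host (mode 2)."""
    def launch(self, cmd):
        raise NotImplementedError

PROXIES = {"local": LocalProcessProxy, "yarn": YarnClusterProcessProxy}

def proxy_for(kernelspec):
    """Select the launch mechanism from kernelspec metadata (assumed key)."""
    name = kernelspec.get("metadata", {}).get("process_proxy", "local")
    return PROXIES[name]()
```

The gateway itself stays unchanged; only the proxy implementation decides where the kernel process ends up.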

@parente - great idea. I figured there'd be JupyterHub-related questions, but it's good to know about these others. At a quick glance, these appear to tackle the remoting at the kernelspec level, whereas Enterprise Gateway provides a mechanism to plug in process lifecycle management, among other things, within the gateway itself.


@kevin-bates I think editing the proposal to include this comment about hub vs this proposal will be useful. I too had the same initial reaction as @dhirschfeld


I'm just a little concerned about fracturing the ecosystem: having two projects which do very similar things means development of either would be slower than if everyone pooled their resources. Everyone is free to scratch their own itch, but I'm still unconvinced that JupyterHub couldn't satisfy the requirements above with much less effort than a whole new project.

@yuvipanda or @minrk would probably know better than me, but it seems it would be pretty easy to get JupyterHub spinning up kernels rather than notebook servers. In my setup, JupyterHub spins up a remote Docker container with Anaconda installed, and my config tells JupyterHub to pass `jupyter labhub *args` to the container. To spin up a remote kernel instead, I would just change the container command to `ipykernel *args`. That might require some minor changes to JupyterHub so it doesn't interact with the kernel, I'm not sure, but I think it's certainly possible. As for Yarn, it seems that could potentially be supported with a custom YarnSpawner.
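The setup described here would look roughly like this as a jupyterhub_config.py fragment. A hedged sketch: the DockerSpawner options are real configuration traits, but the image name and commands are placeholders for illustration.

```python
# jupyterhub_config.py fragment; `c` is the config object JupyterHub
# injects into this file. Image name and commands are placeholders.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "my-org/anaconda:latest"  # assumed image name

# Today: spawn a full Lab server inside the container
c.Spawner.cmd = ["jupyter", "labhub"]

# The variant suggested above: spawn a bare kernel instead (would likely
# need Hub-side changes, since the proxy expects an HTTP server behind it)
# c.Spawner.cmd = ["python", "-m", "ipykernel_launcher"]
```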

With the new named servers feature in JupyterHub I can imagine it being capable of managing remote kernels for not just notebook instances in the browser but desktop apps such as JupyterLab, nteract or even Spyder. @minrk might have a thing or two to say about scope creep but that was the direction I thought things were heading...

Anyway, just wanted to make sure all options had been properly considered and dismissing JupyterHub as incapable of supporting remote kernels seemed to be starting from a questionable premise.

Collaborator Author

lresende commented Oct 2, 2017


@dhirschfeld, thanks for taking the time to review and comment on the proposal. As @rgbkrk mentioned in his first comment, I see "Enterprise Gateway" as really being an enhancement to the existing Kernel Gateway, with the target audience being "Bring Your Own Notebook" scenarios, where data scientists run notebooks on their desktops, on the other side of a firewall, and thus need a "gateway" to share the Spark runtime. That is a very specific scenario, but one that is, to a certain extent, common in enterprise and cloud environments. In that light, if JKG wasn't fracturing the JupyterHub community, I am certain that Enterprise Gateway won't either.


Sure, if it's useful and fills a niche that's great. I'm not sure about the name "Enterprise" though - to me Spark isn't synonymous with Enterprise so perhaps a more targeted name would be appropriate?

For background, I'm deploying JupyterHub in the "Enterprise", using it to spin up JupyterLab running in containers on our Windows HPC cluster. My hope is that I will be able to use the same infrastructure to support spinning up remote kernels for any application, but JupyterLab, nteract, and Spyder in particular, as those are the ones we're using day-to-day.

Collaborator Author


@dhirschfeld The name "Enterprise Gateway" was chosen to distinguish the project from "Kernel Gateway". It's mostly an adjective on the "gateway" noun: the "enterprise" version provides extra functionality that is usually needed in enterprise/cloud deployments.

@ellisonbg
Contributor

Thanks for getting this started! I haven't looked at the details, but my main questions are these:

Can this be done instead as part of the existing Kernel Gateway or JupyterHub? We are already struggling to develop and maintain those two, and I am a bit hesitant to introduce a new, separate thing in that space. JupyterHub already has a pretty general set of abstractions for:

  • Authentication
  • Spawning things (right now the focus is on full notebook servers, but I don't see any reason the same abstractions couldn't be modified to support standalone kernels).
  • Proxy/routing

From an abstract perspective, are there things that the kernel gateway and this new proposal do that are not covered by these abstractions? I know the implementations might need work to address the particular use cases; I just want to understand the differences at a general abstraction level.

@lresende
Collaborator Author

lresende commented Oct 4, 2017

@ellisonbg Enterprise Gateway has great synergy with the Kernel Gateway, but there is active discussion about moving the required JKG functionality directly into Jupyter Notebook (see JKG-259), and even some work underway (see JUPYTER-2644), which might make JKG obsolete since the functionality would be available directly from the Notebook server. That is one of the reasons we chose to bootstrap "Enterprise Gateway".

Please see these slides, starting at slide 10, for some more details about Enterprise Gateway.

@rgbkrk
Member

rgbkrk commented Oct 4, 2017

I'm going to provide some leading questions, since I've experienced some of the pain around setting up Spark drivers per kernel and know we have gaps to fill. This is hopefully a good follow-up to what @ellisonbg is asking:

  • What prevents JupyterHub from running on YARN in a way that works well for cluster operators?
  • Who does it benefit to run the kernels in this manner?
  • How will others integrate with a running kernel gateway?
  • What's the disconnect between how JupyterHub and the notebook runs and how a Spark Application runs on a cluster?

@parente
Member

parente commented Oct 5, 2017

Surfacing a few notes I've made during private conversations about kernel gateway ...

Regarding the request to move features from kernel gateway to notebook server

Moving features from KG to notebook is something that'll probably take place slowly over time as we find ways to do it cleanly. At some point, KG may evaporate completely, leaving enterprise gateway dependent directly on the notebook server. This would be a good thing, IMHO.

The main sticking point is the "personality" support in KG (http://jupyter-kernel-gateway.readthedocs.io/en/latest/http-mode.html, http://jupyter-kernel-gateway.readthedocs.io/en/latest/plug-in.html). It does have some use in the community (e.g., https://github.com/natbusa/autoscience, maybe pixiedust/pixiedust#450). If enough of the KG features are absorbed into notebook, perhaps what's left can be rebranded (jupyter-personality? jupyter-build-your-own-api?) or deprecated.

Regarding JupyterHub, scaling, and multi-tenancy

The kernel gateway incubator proposal (#3) started life describing a system that would manage kernel scale-out and multi-tenancy via a pluggable cluster resource manager. We ultimately backed away from having the KG worry about those aspects because Binder was on a similar path at the same time. Instead, we settled on a simple component that bridges websocket-to-ZeroMQ connections to kernels and is meant to be scaled out itself by some other system (e.g., JupyterHub, Binder, Mesos, ...). The diagram here depicts that concept using tmpnb as an example: http://jupyter-kernel-gateway.readthedocs.io/en/latest/uses.html. That design still holds true for the KG today.

@lresende
Collaborator Author

lresende commented Oct 9, 2017

Thank you all for the feedback (and sorry for the delayed response; I am on the road). I have updated the incubator proposal with some details based on the discussion here. Let me also try to answer the questions from @ellisonbg and @rgbkrk in more detail.

IMHO, one of the main differences between JupyterHub and EG is that JupyterHub provides multitenancy by "authorizing" and routing users to an application running and managed by JupyterHub (e.g., a Jupyter Notebook spawned and managed by JupyterHub), while EG aims to act as a gateway and enable external applications (local or outside of the cluster) to attach themselves and share computing resources from a compute cluster running on premises or in the cloud. And while JupyterHub provides some abstractions such as spawners, these don't help much given the principal design distinction described above.

Another thing is that JupyterHub needs to manage some of the cluster resources, which usually causes issues with DevOps teams, particularly when it is a shared computing cluster or there are other resource managers involved that are not aware of each other and start competing for the same pool of resources. In the same vein, by managing some of the resources itself and using the local filesystem as writable storage for users (see the [Docker Spawner example](https://github.com/jupyterhub/jupyterhub/blob/master/examples/bootstrap-script/bootstrap.sh)), recovering from hardware failures becomes more difficult. EG mitigates most of these problems by currently leveraging Yarn as the resource manager for the Spark cluster and also providing integration with HDFS, a distributed filesystem. When EG requests a new kernel from Yarn, Yarn decides on which node to launch the application based on resource load/availability and then calls back to EG with the connection profile used to connect to the kernel.
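That launch flow can be sketched as follows. All names here are hypothetical; the actual Enterprise Gateway interfaces differ:

```python
# Hedged sketch of the launch flow: the gateway submits a kernel to the
# resource manager, the RM picks a node by load/availability, and the
# gateway receives the connection profile for the kernel's ZMQ channels.

def launch_remote_kernel(resource_manager, kernel_spec):
    """Ask the RM to place the kernel; return its connection profile."""
    app_id = resource_manager.submit(kernel_spec)        # RM picks the node
    return resource_manager.wait_for_connection(app_id)  # callback/poll

class FakeYarn:
    """Stand-in resource manager used only to exercise the sketch."""
    def submit(self, kernel_spec):
        return "application_0001"
    def wait_for_connection(self, app_id):
        return {"ip": "worker-7", "shell_port": 52100, "key": "..."}
```

The key point is that the gateway never chooses the host itself; it only learns the connection profile after the resource manager has made its placement decision.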

As for common patterns related to how external applications will integrate with EG, I would say the two most common scenarios are:

  • External Jupyter Notebook servers will use NB2KG to reach Enterprise Gateway as a gateway to a Spark cluster, with EG integrating with the Spark resource manager to request new kernels.
  • External applications could use programmatic clients to interact with Enterprise Gateway as an interactive gateway to Spark.
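For the second scenario, a programmatic client would speak the standard Jupyter kernels REST API against the gateway. A hedged sketch (the gateway URL is a placeholder; the actual POST would be made with an HTTP client such as requests):

```python
# Build the request a client would POST to the gateway to start a
# kernel; the /api/kernels path follows the Jupyter kernels REST API.
import json
from urllib.parse import urljoin

GATEWAY_URL = "http://gateway-host:8888/"  # placeholder

def start_kernel_request(kernel_name):
    """Return (url, body) for a POST asking the gateway for a new kernel."""
    url = urljoin(GATEWAY_URL, "api/kernels")
    body = json.dumps({"name": kernel_name})
    return url, body

url, body = start_kernel_request("python3")
# POSTing `body` to `url` returns a kernel id, after which the client
# connects to the kernel's websocket channels for execution.
```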

Having said that, I believe there are scenarios that would benefit from integration between JupyterHub and Enterprise Gateway, and we are definitely open to collaborating with JupyterHub and the Jupyter community in general to make the best use of the different components.

@rgbkrk
Member

rgbkrk commented Oct 9, 2017

... JupyterHub provides multitenancy by “authorizing” and routing users to an application running and managed by JupyterHub ... while EG aims to act as a gateway and enable external applications (locally/outside of the cluster) to attach themselves and share computing resources...

Another thing is that JupyterHub needs to manage some of the cluster resources, which usually causes issues with DevOps teams, particularly when this is a shared computing cluster or there are other resource managers involved and they are not aware of each other and start competing for the same pool of resources

These two alone pretty well indicate to me why this development / direction is a necessity within the ecosystem. It sounds like you're all happy to explore these directions in tandem with the JupyterHub team (cc @willingc, @minrk). One possible exploration I'd recommend is seeing about a spawner that completely defers to Yarn as the resource manager. Then again, given the concerns you've listed above, maybe this isn't as clean as we'd hope.

When EG requests a new kernel from Yarn, Yarn decides on which node to launch the application based on resource load/availability and then calls back to EG with the connection profile used to connect to the kernel.

Yep, that's exactly what I'd be hoping for: that a kernel would be launched in the Yarn cluster (so you're not locked to a single notebook -- the resources provided are per kernel). I've faced this time and time again with our cluster, where we can only realistically have two Scala (Toree) kernels running at the same time per user (mostly because of the size of the data).

Thanks for clearing things up, I'm in favor of this incubation project. 👍

@ellisonbg
Contributor

ellisonbg commented Oct 9, 2017 via email

@parente
Member

parente commented Oct 10, 2017

@lresende thanks for the clarification and updates to the proposal. I'm also in favor of EG entering the incubator, if only because it fits all of the criteria for incubation in our governance docs:

  • Significant unanswered technical questions or uncertainties that require exploration.
  • Entirely new directions, scopes or ideas that haven't been vetted with the community.
  • Significant, already existing code bases where it is not clear how the Subproject will integrate with the rest of Jupyter.

with the following potential benefits for having it in the jupyter-incubator org:

  • Contributors can quickly and easily get their code exposed to the Jupyter community while complying with individual and organizational contribution restrictions.
  • Contributors can work with the community and Steering Council and gather feedback early and often that will help them develop and refine a clear and concise integration proposal.
  • Allow the community to distinguish between officially supported Subprojects and experimental Subprojects pursued by members of the community.

Personally, I view the incubator as a place for people to explore new ideas without worrying too much about long-term plans just yet. I think exposure to the community and time for collaborative development are healthy ways to figure out how a project should shape up eventually.

@rgbkrk
Member

rgbkrk commented Oct 10, 2017

Thanks for outlining that Peter!

@lresende
Collaborator Author

Thank you all for the feedback. What are the next steps here? We would like to move the repository and start talking more openly about the project... Could someone please give me the necessary permissions, or assist with the move and any other required steps?

Thank you.

@rgbkrk
Member

rgbkrk commented Oct 13, 2017

I'll go ahead and add you to the incubator org @lresende so you can transfer the repository.

@ellisonbg
Contributor

ellisonbg commented Oct 14, 2017 via email


@willingc willingc left a comment


Thanks!

@rgbkrk rgbkrk merged commit 800fb7b into jupyter-incubator:master Oct 14, 2017