This repository has been archived by the owner on Sep 3, 2024. It is now read-only.

Jupyter Enterprise Gateway proposal #11

Merged
merged 4 commits on Oct 14, 2017

Conversation

lresende
Collaborator

This is an Incubator proposal for Jupyter Enterprise Gateway.

Thanks, @rgbkrk and @parente for volunteering to be our Sponsors/Advocates.

@rgbkrk
Member

rgbkrk commented Sep 28, 2017

The enterprise gateway builds on top of the kernel gateway in ways that I had originally hoped that the kernel gateway would come to provide, mainly multitenancy. The other big boon for this is the support for running kernels on a YARN cluster. Admittedly, this may seem like it overlaps with JupyterHub -- it's more of a complementary approach that would make notebook job scheduling simpler and allow building different kinds of applications on top of jupyter protocols.

@yuvipanda

This is awesome and I'm excited to hear more details about it :)

I'm curious about overlap with / complementation of JupyterHub. From my understanding, this is mostly intended for consumption by other services, and possibly directly launches kernels rather than notebook servers. Is that accurate?

@rgbkrk
Member

rgbkrk commented Sep 28, 2017

possibly directly launches kernels rather than notebook servers

That's correct.

Side disclaimer: @lresende, @kevin-bates, @ckadner, @liukun1016, @akchinSTC, @sherryxg, and @frreiss came to Netflix (or were on our video call) to demo to me and @rdblue, to see if we're interested in using it, and I suggested they make a proposal for incubation into Jupyter.

@yuvipanda

Awesome. Looking forward to seeing how this evolves :)

Is this meant to be spark / YARN specific? I'm also guessing not, and you'd want to have a layer of abstraction that allows plugging in other executors? Is supporting that a primary or secondary goal?

@yuvipanda

(am also not sure if this is the right place to ask questions, so feel free to tell me if it is not!)

@lresende
Collaborator Author

@yuvipanda The resource manager is abstracted and pluggable; we currently have a version that works in Yarn cluster mode and "pseudo-distributed" in Yarn client mode. We will certainly look into other RMs in the near future.

## Other options

We are not aware of alternatives that provide the same set of capabilities. The closest we could find is JupyterHub, but that aims to spawn multiple Jupyter Notebook servers across a given cluster, where the kernel processes are still co-located with the launching server.


That doesn't seem correct? JupyterHub can spin up kernels remotely via docker, docker swarm or any custom remote spawner. Am I missing something?

Member


Consider listing some related projects here and noting how enterprise gateway differs so that the community gets a sense of where it fits in the ecosystem. Here are a few relevant projects:

Contributor


@dhirschfeld - thank you for your comment.
JupyterHub spins up Notebook servers, while Enterprise Gateway inherits Kernel Gateway's model of spinning up kernels, typically launched on behalf of a Notebook server via NB2KG.

When the kernels are remoted, they can be launched in one of two ways (although this is pluggable):

  1. As direct kernels running on hosts other than Enterprise Gateway or
  2. As Yarn cluster applications where the Yarn resource manager determines on which host the application (i.e., the kernel) is launched.

This form of remoting places the kernel (which is typically a Spark driver application) closer to the workers and spreads kernels out across the cluster, thereby reducing the resource bottleneck on the kernel-launching server.

JupyterHub could get you something similar in the Docker swarm case, but even then, the Notebook server is what is spawned, so all kernels launched from that notebook server are local and none are capable of running as actual resource-managed applications.
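The pluggable launch mechanism mentioned above can be pictured as a "process proxy" chosen per kernelspec. A minimal sketch of that idea, with hypothetical class names and metadata keys (not Enterprise Gateway's actual API):

```python
# Hedged sketch of pluggable kernel launching: each kernelspec selects a
# proxy that knows how to start the kernel somewhere. All names here are
# illustrative, not Enterprise Gateway's real interfaces.

class LocalProcessProxy:
    """Launch the kernel on a designated host (mode 1 above)."""
    def launch(self, cmd):
        raise NotImplementedError

class YarnClusterProcessProxy:
    """Submit the kernel as a Yarn application; Yarn picks the host (mode 2)."""
    def launch(self, cmd):
        raise NotImplementedError

PROXIES = {"local": LocalProcessProxy, "yarn": YarnClusterProcessProxy}

def proxy_for(kernelspec):
    """Select the launch mechanism from kernelspec metadata (assumed key)."""
    name = kernelspec.get("metadata", {}).get("process_proxy", "local")
    return PROXIES[name]()
```

The gateway itself stays unchanged; only the proxy implementation decides where the kernel process ends up.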

@parente - great idea. I figured there'd be JupyterHub-related questions, but it's good to know about these others. At a quick glance, these appear to tackle the remoting at the kernelspec level, whereas Enterprise Gateway provides a mechanism to plug in process lifecycle management, among other things, within the gateway itself.


@kevin-bates I think editing the proposal to include this comment about hub vs this proposal will be useful. I too had the same initial reaction as @dhirschfeld


I'm just a little concerned about fracturing the ecosystem: having two projects which do very similar things means development of either would be slower than if everyone pooled their resources. Everyone is free to scratch their own itch, but I'm still unconvinced that JupyterHub couldn't satisfy the requirements above with much less effort than a whole new project.

@yuvipanda or @minrk would probably know better than me, but it seems it would be pretty easy to get JupyterHub spinning up kernels rather than notebook servers. In my setup, JupyterHub spins up a remote Docker container with Anaconda installed, and my config tells JupyterHub to pass `jupyter labhub *args` to the container. To spin up a remote kernel instead, I would just change the container command to `ipykernel *args`. That might require some minor changes to JupyterHub so it doesn't interact with the kernel, I'm not sure, but I think it's certainly possible. As for Yarn, it seems that could potentially be supported with a custom YarnSpawner.
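The setup described here would look roughly like this as a jupyterhub_config.py fragment. A hedged sketch: the DockerSpawner options are real configuration traits, but the image name and commands are placeholders for illustration.

```python
# jupyterhub_config.py fragment; `c` is the config object JupyterHub
# injects into this file. Image name and commands are placeholders.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "my-org/anaconda:latest"  # assumed image name

# Today: spawn a full Lab server inside the container
c.Spawner.cmd = ["jupyter", "labhub"]

# The variant suggested above: spawn a bare kernel instead (would likely
# need Hub-side changes, since the proxy expects an HTTP server behind it)
# c.Spawner.cmd = ["python", "-m", "ipykernel_launcher"]
```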

With the new named servers feature in JupyterHub I can imagine it being capable of managing remote kernels for not just notebook instances in the browser but desktop apps such as JupyterLab, nteract or even Spyder. @minrk might have a thing or two to say about scope creep but that was the direction I thought things were heading...

Anyway, just wanted to make sure all options had been properly considered and dismissing JupyterHub as incapable of supporting remote kernels seemed to be starting from a questionable premise.

Collaborator Author

lresende commented Oct 2, 2017


@dhirschfeld, thanks for taking the time to review and comment on the proposal. As @rgbkrk mentioned in his first comment, I see "Enterprise Gateway" as really being an enhancement to the existing Kernel Gateway, with the target audience being "Bring Your Own Notebook" scenarios, where data scientists run notebooks on their desktops, on the other side of a firewall, and thus need a "gateway" to share the Spark runtime. That is a very specific scenario, but one that is, to a certain extent, common in enterprise and cloud environments. In that light, if JKG wasn't fracturing the JupyterHub community, I am certain that Enterprise Gateway won't either.


Sure, if it's useful and fills a niche that's great. I'm not sure about the name "Enterprise" though - to me Spark isn't synonymous with Enterprise so perhaps a more targeted name would be appropriate?

For background, I'm deploying JupyterHub in the "Enterprise", using it to spin up JupyterLab running in containers on our Windows HPC cluster. My hope is that I will be able to use the same infrastructure to support spinning up remote kernels for any application, but JupyterLab, nteract, and Spyder in particular, as those are the ones we're using day-to-day.

Collaborator Author


@dhirschfeld The name "Enterprise Gateway" was chosen to distinguish the project from "Kernel Gateway". It's mostly an adjective on the "gateway" noun: the "enterprise" version provides extra functionality that is usually needed in enterprise/cloud deployments.

@ellisonbg
Contributor

Thanks for getting this started! I haven't looked at the details, but my main questions are these:

Can this be done instead as part of the existing Kernel Gateway or JupyterHub? We are already struggling to develop and maintain those two, and I am a bit hesitant to introduce a new, separate thing in that space. JupyterHub already has a pretty general set of abstractions for:

  • Authentication
  • Spawning things (right now the focus is on full notebook servers, but I don't see any reason the same abstractions couldn't be modified to support standalone kernels).
  • Proxy/routing

From an abstract perspective, are there things that the kernel gateway and this new proposal do that are not covered by these abstractions? I know the implementations might need work to address the particular use cases; I just want to understand the differences at a general abstraction level.

@lresende
Collaborator Author

lresende commented Oct 4, 2017

@ellisonbg Enterprise Gateway has great synergy with the Kernel Gateway, but there is active discussion about moving the required JKG functionality directly into Jupyter Notebook (see JKG-259), and even some work underway (see JUPYTER-2644), which might make JKG obsolete since the functionality would be available directly from the Notebook server. That is one of the reasons we chose to bootstrap "Enterprise Gateway".

Please see these slides, starting at slide 10, for some more details about Enterprise Gateway.

@rgbkrk
Member

rgbkrk commented Oct 4, 2017

I'm going to provide some leading questions, since I've experienced some of the pain around setting up Spark drivers per kernel and know we have gaps to fill. This is hopefully a good follow-up to what @ellisonbg is asking:

  • What prevents JupyterHub from running on YARN in a way that works well for cluster operators?
  • Who does it benefit to run the kernels in this manner?
  • How will others integrate with a running kernel gateway?
  • What's the disconnect between how JupyterHub and the notebook runs and how a Spark Application runs on a cluster?

@parente
Member

parente commented Oct 5, 2017

Surfacing a few notes I've made during private conversations about kernel gateway ...

Regarding the request to move features from kernel gateway to notebook server

Moving features from KG to notebook is something that'll probably take place slowly over time as we find ways to do it cleanly. At some point, KG may evaporate completely, leaving enterprise gateway dependent directly on the notebook server. This would be a good thing, IMHO.

The main sticking point is the "personality" support in KG (http://jupyter-kernel-gateway.readthedocs.io/en/latest/http-mode.html, http://jupyter-kernel-gateway.readthedocs.io/en/latest/plug-in.html). It does have some use in the community (e.g., https://github.com/natbusa/autoscience, maybe pixiedust/pixiedust#450). If enough of the KG features are absorbed into notebook, perhaps what's left can be rebranded (jupyter-personality? jupyter-build-your-own-api?) or deprecated.

Regarding JupyterHub, scaling, and multi-tenancy

The kernel gateway incubator proposal (#3) started life describing a system that would manage kernel scale-out and multi-tenancy via a pluggable cluster resource manager. We ultimately backed away from having the KG worry about those aspects because Binder was on a similar path at the same time. Instead, we settled on a simple component that bridges websocket-to-ZeroMQ connections to kernels and is meant to be scaled out itself by some other system (e.g., JupyterHub, Binder, Mesos, ...). The diagram here depicts that concept using tmpnb as an example: http://jupyter-kernel-gateway.readthedocs.io/en/latest/uses.html. That design still holds true for the KG today.

@lresende
Collaborator Author

lresende commented Oct 9, 2017

Thank you all for the feedback (and sorry for the delayed response; I am on the road). I have updated the incubator proposal with some details based on the discussion here. Let me also try to answer the questions from @ellisonbg and @rgbkrk in more detail.

IMHO, one of the main differences between JupyterHub and EG is that JupyterHub provides multitenancy by "authorizing" and routing users to an application running and managed by JupyterHub (e.g., a Jupyter Notebook spawned and managed by JupyterHub), while EG aims to act as a gateway and enable external applications (local or outside of the cluster) to attach themselves and share computing resources from a compute cluster running on premises or in the cloud. And while JupyterHub provides some abstractions such as spawners, these don't help much given the principal design distinction described above.

Another thing is that JupyterHub needs to manage some of the cluster resources, which usually causes issues with DevOps teams, particularly when it is a shared computing cluster or there are other resource managers involved that are not aware of each other and start competing for the same pool of resources. In the same vein, by managing some of the resources itself and using the local filesystem as writable storage for users (see the [Docker Spawner example](https://github.com/jupyterhub/jupyterhub/blob/master/examples/bootstrap-script/bootstrap.sh)), recovering from hardware failures becomes more difficult. EG mitigates most of these problems by currently leveraging Yarn as the resource manager for the Spark cluster and also providing integration with HDFS, a distributed filesystem. When EG requests a new kernel from Yarn, Yarn decides on which node to launch the application based on resource load/availability and then calls back to EG with the connection profile used to connect to the kernel.
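That launch flow can be sketched as follows. All names here are hypothetical; the actual Enterprise Gateway interfaces differ:

```python
# Hedged sketch of the launch flow: the gateway submits a kernel to the
# resource manager, the RM picks a node by load/availability, and the
# gateway receives the connection profile for the kernel's ZMQ channels.

def launch_remote_kernel(resource_manager, kernel_spec):
    """Ask the RM to place the kernel; return its connection profile."""
    app_id = resource_manager.submit(kernel_spec)        # RM picks the node
    return resource_manager.wait_for_connection(app_id)  # callback/poll

class FakeYarn:
    """Stand-in resource manager used only to exercise the sketch."""
    def submit(self, kernel_spec):
        return "application_0001"
    def wait_for_connection(self, app_id):
        return {"ip": "worker-7", "shell_port": 52100, "key": "..."}
```

The key point is that the gateway never chooses the host itself; it only learns the connection profile after the resource manager has made its placement decision.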

As for common patterns related to how external applications will integrate with EG, I would say the two most common scenarios are:

  • External Jupyter Notebook servers will use NB2KG to reach Enterprise Gateway as a gateway to a Spark cluster, with EG integrating with the Spark resource manager to request new kernels.
  • External applications could use programmatic clients to interact with Enterprise Gateway as an interactive gateway to Spark.
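For the second scenario, a programmatic client would speak the standard Jupyter kernels REST API against the gateway. A hedged sketch (the gateway URL is a placeholder; the actual POST would be made with an HTTP client such as requests):

```python
# Build the request a client would POST to the gateway to start a
# kernel; the /api/kernels path follows the Jupyter kernels REST API.
import json
from urllib.parse import urljoin

GATEWAY_URL = "http://gateway-host:8888/"  # placeholder

def start_kernel_request(kernel_name):
    """Return (url, body) for a POST asking the gateway for a new kernel."""
    url = urljoin(GATEWAY_URL, "api/kernels")
    body = json.dumps({"name": kernel_name})
    return url, body

url, body = start_kernel_request("python3")
# POSTing `body` to `url` returns a kernel id, after which the client
# connects to the kernel's websocket channels for execution.
```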

Having said that, I believe there are scenarios that would benefit from integration between JupyterHub and Enterprise Gateway, and we are definitely open to collaborating with JupyterHub and the Jupyter community in general to make the best use of the different components.

@rgbkrk
Member

rgbkrk commented Oct 9, 2017

... JupyterHub provides multitenancy by “authorizing” and routing users to an application running and managed by JupyterHub ... while EG aims to act as a gateway and enable external applications (locally/outside of the cluster) to attach themselves and share computing resources...

Another thing is that JupyterHub needs to manage some of the cluster resources, which usually causes issues with DevOps teams, particularly when this is a shared computing cluster or there are other resource managers involved and they are not aware of each other and start competing for the same pool of resources

These two alone pretty well indicate to me why this development / direction is a necessity within the ecosystem. It sounds like you're all happy to explore these directions in tandem with the JupyterHub team (cc @willingc, @minrk). One possible exploration I'd recommend is seeing about a spawner that completely defers to Yarn as the resource manager. Then again, given the concerns you've listed above, maybe this isn't as clean as we'd hope.

When EG requests a new kernel from Yarn, Yarn decides on which node to launch the application based on resource load/availability and then calls back to EG with the connection profile used to connect to the kernel.

Yep, that's exactly what I'd be hoping for: that a kernel would be launched in the Yarn cluster (so you're not locked to a single notebook -- the resources provided are per kernel). I've faced this time and time again with our cluster, where we can only realistically have two Scala (Toree) kernels running at the same time per user (mostly because of the size of the data).

Thanks for clearing things up, I'm in favor of this incubation project. 👍

@ellisonbg
Contributor

ellisonbg commented Oct 9, 2017 via email

@parente
Member

parente commented Oct 10, 2017

@lresende thanks for the clarification and updates to the proposal. I'm also in favor of EG entering the incubator, if only because it fits all of the criteria for incubation in our governance docs:

  • Significant unanswered technical questions or uncertainties that require exploration.
  • Entirely new directions, scopes or ideas that haven't been vetted with the community.
  • Significant, already existing code bases where it is not clear how the Subproject will integrate with the rest of Jupyter.

with the following potential benefits for having it in the jupyter-incubator org:

  • Contributors can quickly and easily get their code exposed to the Jupyter community while complying with individual and organizational contribution restrictions.
  • Contributors can work with the community and Steering Council and gather feedback early and often that will help them develop and refine a clear and concise integration proposal.
  • Allow the community to distinguish between officially supported Subprojects and experimental Subprojects pursued by members of the community.

Personally, I view the incubator as a place for people to explore new ideas without worrying too much about long-term plans just yet. I think exposure to the community and time for collaborative development are healthy ways to figure out how a project should shape up eventually.

@rgbkrk
Member

rgbkrk commented Oct 10, 2017

Thanks for outlining that Peter!

@lresende
Collaborator Author

Thank you all for the feedback. What are the next steps here? We would like to move the repository and start talking more openly about the project... Could someone please give me the necessary permissions, or assist with the move and any other required steps?

Thank you.

@rgbkrk
Member

rgbkrk commented Oct 13, 2017

I'll go ahead and add you to the incubator org @lresende so you can transfer the repository.

@ellisonbg
Contributor

ellisonbg commented Oct 14, 2017 via email


@willingc willingc left a comment


Thanks!

@rgbkrk rgbkrk merged commit 800fb7b into jupyter-incubator:master Oct 14, 2017