Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate storage mechanism for routes #81

Merged
merged 8 commits into from
Sep 30, 2016

Conversation

pseudomuto
Copy link
Contributor

@minrk

The Problem

Currently the code that handles persistence of routes is embedded directly in the proxy code. This makes it difficult to choose a different storage method (e.g. redis, memcached, db, etc.).

This issue becomes apparent when you're running multiple JupyterHub instances behind nginx (or some other proxy). In this scenario, each JH instance will spawn their own proxy and maintain their own route table (in the node proc).

When a request comes in at /user/xyz, depending on the host that nginx routes to, the local proxy will either have the route, or not.

😢

Initially I patched the JH Proxy object to update all routes and lowered the last_activity_timeout param so it would happen often, but this seemed like a total hack and doesn't always work (still a small window of time where the error can occur).

The Solution

Ideally, storage should be configurable/pluggable. I'm thinking something like passing a config option to change the persistence method or something.

In order to keep this PR small and reviewable, I've simply moved the storage from configproxy.js into it's own store.js module. The trie is still used, and all storage is still in memory (i.e. nothing changes).

Next Steps

If this seems reasonable, then I'll add implementations for other providers in a separate PR.

@pseudomuto
Copy link
Contributor Author

CI failure seems to be flaky tests. This one for instance passed on the retry, but failed initially

@rgbkrk
Copy link
Member

rgbkrk commented Sep 28, 2016

Great, thanks for bringing this up! I agree that we need to have a way to persist routes reliably across many nodes.

@minrk
Copy link
Member

minrk commented Sep 28, 2016

Thanks! I think it's fine to add this functionality. I'm not sure the best way to go about it. The things I'd be concerned about are:

  1. what's the best way to add providers? I don't want CHP to grow too much, and adding a provider in general shouldn't have to be a PR against this repo, and I don't love adding dependencies that people don't use.
  2. I would prefer to avoid route lookup becoming an async operation, but it would presumably have to be for any external storage, and therefore also become an immediately-evaluated async API for the in-memory case, as well.

I don't know how to achieve 1 best, given how npm bundles dependencies. I'd also be worried about growing optional dependencies, but at least the redis and memjs clients aren't problematic (other memcache clients appear to be).

@pseudomuto
Copy link
Contributor Author

@minrk I agree with all of that!

I'm just working on making the storage async as we speak. As for adding deps/providers in this repo, I'm not entirely sure what the best move would be just yet. One of those "I'll burn that bridge when I get there" kinda things 😄

Maybe peer dependencies could be used somehow?

@minrk
Copy link
Member

minrk commented Sep 28, 2016

I don't think the tests have been flaky recently, so that seems like there might be something amiss in the changes here, especially since every run is failing in the same way, and a push from master just now is all green. This PR does pass all tests when I run on my laptop, though, which makes debugging harder.

Re: peerDependencies, maybe! I'm far from an expert on npm packaging, and mostly get frustrated the more I try to solve nontrivial problems with it. My first impression is that peerDeps would solve it well where configproxy is used as a library, but may not work well for the npm install -g-style application used most often. I would love for someone else to figure out the best way to deal with that. Let us know what you figure out!

@pseudomuto pseudomuto force-pushed the separate_route_storage branch 3 times, most recently from e39e45b to 9f03146 Compare September 29, 2016 18:16
@pseudomuto
Copy link
Contributor Author

It's 🍏 !!!

I've updated the code to now use an async API for storage. Not gonna lie, it was a little rough. This is much bigger than I originally intended, so Iet me apologize in advance 😄

I don't think the tests have been flaky recently, so that seems like there might be something amiss in the changes here

This was indeed my changes. The first was actually a lint error, and the 0.x versions didn't support Object.assign.

@minrk
Copy link
Member

minrk commented Sep 30, 2016

You're right, this is really huge! The async changes are unfortunate, but I guess there's no way around it.

It apparently also contains a bunch of aesthetic changes? I wish that didn't happen, as the PR is big enough without those, especially since in most cases I prefer the before to the after. Was there a reason for those? Or did you use an auto-formatter?


if (inactive_since) {
var keys = Object.keys(routes).filter(function (key) {
return routes[key].last_activity < inactive_since;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous implementation was more efficient because it iterated over the routes only once, rather than twice.


this._routes.getTarget(base_path + decodeURIComponent(req.url), function (route) {
timer.stop();
var result = route ? { prefix: route.prefix, target: route.data.target } : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to leave the less-dense format above.

];

r.get(api_url + "?inactive_since=endoftheuniverse", function (error, res, body) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check was removed. Can it come back?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved it to it's own test (above)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed that, thanks!

},
{ name: 'long ago', since: long_ago, expected: {} },
{ name: 'an hour ago', since: hour_ago, expected: { '/yesterday': true } },
{ name: 'the future', since: hour_from_now, expected: { '/yesterday': true, '/today': true } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I would prefer this kind of style change to not happen (before is better than after, for me).

var now = new Date();
var yesterday = new Date(now.getTime() - (24 * 3.6e6));
var long_ago = new Date(1);
var hour_ago = new Date(now.getTime() - 3.6e6);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done aligned assignments generally, and would prefer them not to be applied to existing code without any other changes. Not a big deal, just something for the future.

@pseudomuto
Copy link
Contributor Author

Updated to remove all of the style-only changes (align assignments, etc.). I've also rebase on the latest master with @rgbkrk's changes

@minrk minrk merged commit 3ec36da into jupyterhub:master Sep 30, 2016
@pseudomuto pseudomuto deleted the separate_route_storage branch September 30, 2016 14:13
@@ -62,7 +62,12 @@ function MemoryStore () {
},
update: {
value: function (path, data, cb) {
Object.assign(routes[path], data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably go back to this to simplify the logic in a later PR.

@willingc
Copy link
Contributor

Thank you @pseudomuto 😎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants