Use historical nodes to host shared cache #7570

Open
leventov opened this issue Apr 29, 2019 · 4 comments

Comments

@leventov
Member

leventov commented Apr 29, 2019

There is an idea that, instead of memcached, historical nodes themselves could host a shared cache, not only their local caches. Since there are a lot of historical nodes, it would be enough to devote only a small fraction of each node's memory to the shared cache.

Such colocation would also simplify Druid setup, because no separate fleet of memcached nodes would be required.
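
As a rough back-of-the-envelope illustration of the sizing argument above, the sketch below estimates how much usable shared cache a fleet of historicals could provide. The function name and every number in it (node count, memory per node, cache fraction, replication factor) are hypothetical and not taken from any real Druid deployment.

```python
def shared_cache_capacity_gib(num_historicals, memory_per_node_gib,
                              fraction_for_cache, replication=1):
    """Usable shared cache size across the fleet, in GiB.

    Each cached entry is stored `replication` times, so the usable
    capacity is the raw capacity divided by the replication factor.
    """
    if not 0 <= fraction_for_cache <= 1:
        raise ValueError("fraction_for_cache must be between 0 and 1")
    if replication < 1:
        raise ValueError("replication must be at least 1")
    raw_gib = num_historicals * memory_per_node_gib * fraction_for_cache
    return raw_gib / replication


# Hypothetical fleet: 200 historicals with 256 GiB each, 2% devoted to the cache.
print(shared_cache_capacity_gib(200, 256, 0.02))
print(shared_cache_capacity_gib(200, 256, 0.02, replication=2))
```

The point of the sketch is only that a small per-node fraction adds up across a large fleet; the right fraction for a real cluster would depend on its workload.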

@sascha-coenen

Awesome idea!
I was playing around with this using Apache Ignite:
https://ignite.apache.org

Since then, my mind keeps coming back to the thought of how great it would be for many use cases if Druid had an underlying colocated cache.
Brokers could leave their individual queries or query state in the cache, and one could then easily consult that cache to see the total number of executing queries. ZooKeeper is old-fashioned, and using REST endpoints is cumbersome, but a ready-made distributed cache, or better to say an underlying generic compute grid like Ignite, would be quite a nice platform foundation. Distributed closures, data streamers... all so low-level that one could build on it.
With Ignite I could get the distributed colocated cache described above, and one can also optionally configure Ignite to write through to a remote section of itself, giving a second-level cache, which in turn can be configured with optional persistence. It's quite nice to use such a product as a foundational component...

@gianm
Contributor

gianm commented Apr 29, 2019

It seems for this case it would be nice to choose a 'small' (few dependencies) and embeddable (same JVM as the historical) distributed cache. Is Ignite like that? (Or are there others that are?)

@leventov
Member Author

leventov commented May 1, 2019

Historical nodes are designed for restarts, including restarts of large groups of nodes at nearly the same time. We don't want such restarts to disrupt the quality of service, so the cache should be replicated at least 2x. This requirement makes a simple local embeddable cache insufficient (unless we want to basically solve again all the hard problems of reliable replicated cluster services).

So Ignite looks to me like a more reasonable solution.

@devozerov, I would appreciate your opinion.
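
To make the 2x replication requirement above concrete, here is a minimal sketch of one way to place each cache key on two distinct nodes, so that a lookup still succeeds while one of them is restarting. It uses rendezvous (highest-random-weight) hashing, which is my choice for illustration only; it is not what Ignite or Druid does, and the node names and function names are hypothetical. It also ignores the hard parts mentioned above (membership changes, re-replication, consistency).

```python
import hashlib


def replica_nodes(key, nodes, replication=2):
    """Pick `replication` distinct nodes for `key` via rendezvous hashing."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")

    def weight(node):
        # Deterministic per-(node, key) score; the highest scores win.
        digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=weight, reverse=True)[:replication]


def lookup_node(key, nodes, live_nodes, replication=2):
    """Return the first live replica for `key`, or None if all are down."""
    for node in replica_nodes(key, nodes, replication):
        if node in live_nodes:
            return node
    return None


# Hypothetical fleet of ten historicals.
fleet = [f"historical-{i}" for i in range(10)]
primary, backup = replica_nodes("segment-result-key", fleet)

# While the primary is restarting, the lookup falls back to the backup.
live = set(fleet) - {primary}
print(lookup_node("segment-result-key", fleet, live) == backup)
```

With this placement, removing a node that does not hold a given key leaves that key's replicas unchanged, which is why a restart only affects the keys the restarting node actually hosts.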

@leventov
Member Author

A related paper: https://blog.acolyer.org/2019/06/24/fast-key-value-stores/


3 participants