Peons put high load on zookeeper on disconnects #1970

Closed
nishantmonu51 opened this issue Nov 13, 2015 · 0 comments · Fixed by #2015

@nishantmonu51

Realtime index tasks need to know when their segments have been handed off to the historical nodes.
To get this information, each realtime index task keeps a FilteredBrokerserverView and reads data from ZooKeeper for all the segments in the cluster. After a ZK disconnect, every task reconnects to ZooKeeper and re-reads the distribution of segments in the cluster, which puts a high load on ZooKeeper as the number of tasks grows.

One solution is to make the overlord responsible for keeping the state of segments in the cluster; the realtime tasks would then ask the overlord whether a segment has been handed off to a historical node instead of each reading that state from ZooKeeper.
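A minimal sketch of the idea, assuming a single central process keeps the segment state and tasks only ask one question of it. The class and method names (`ClusterSegmentView`, `segmentLoaded`, `segmentDropped`, `isHandedOff`) are hypothetical and are not Druid's actual classes:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical central view of which historical servers serve which segments.
// Only the central process watches ZooKeeper; tasks query this view instead.
final class ClusterSegmentView {
    // segment identifier -> names of the historical servers currently serving it
    private final Map<String, Set<String>> serversBySegment = new ConcurrentHashMap<>();

    // Called by the central process when a historical announces a segment.
    void segmentLoaded(String segmentId, String historicalServer) {
        serversBySegment
            .computeIfAbsent(segmentId, id -> ConcurrentHashMap.newKeySet())
            .add(historicalServer);
    }

    // Called by the central process when a historical drops a segment.
    void segmentDropped(String segmentId, String historicalServer) {
        serversBySegment.computeIfPresent(segmentId, (id, servers) -> {
            servers.remove(historicalServer);
            // Returning null removes the mapping once no server holds the segment.
            return servers.isEmpty() ? null : servers;
        });
    }

    // The only question a realtime task needs answered.
    boolean isHandedOff(String segmentId) {
        return serversBySegment.containsKey(segmentId);
    }
}
```

With this shape, a ZK disconnect costs one re-read by the central process rather than one re-read per task. The trade-off is that the central process holds the whole view in memory.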

@nishantmonu51 nishantmonu51 self-assigned this Nov 13, 2015
nishantmonu51 added a commit to metamx/druid that referenced this issue Nov 24, 2015
…zk load apache#1970

- fixes apache#1970
- extract out segment handoff callbacks in SegmentHandoffNotifier which
is responsible for tracking segment handoffs and doing callbacks when
handoff is complete.
- Overlord now maintains a view of segments in the cluster; this will
affect the JVM heap requirements for the overlord for large clusters.
- realtime Index Tasks now use SegmentHandoffCheckAction to check for
handoffs.
- Realtime Nodes still use the old way of maintaining a
FilteredServerView to check for handoffs.
- Add tests for individual components
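
The callback pattern described in the list above could look roughly like the following. This is an illustrative sketch only, not the SegmentHandoffNotifier from the commit: `PollingHandoffNotifier`, `registerCallback`, and `checkForHandoffs` are assumed names, and the handoff check is abstracted as a `Predicate` that would stand in for the remote call:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical notifier: tracks pending segments and fires a callback once
// the supplied check reports that a segment has been handed off.
final class PollingHandoffNotifier {
    // Stand-in for the remote "is this segment handed off?" call (assumption).
    private final Predicate<String> handoffCheck;
    // segment identifier -> callback to run when handoff completes
    private final Map<String, Runnable> pending = new ConcurrentHashMap<>();

    PollingHandoffNotifier(Predicate<String> handoffCheck) {
        this.handoffCheck = handoffCheck;
    }

    void registerCallback(String segmentId, Runnable onHandoff) {
        pending.put(segmentId, onHandoff);
    }

    // One polling pass over the pending segments; returns how many callbacks fired.
    int checkForHandoffs() {
        int fired = 0;
        Iterator<Map.Entry<String, Runnable>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Runnable> entry = it.next();
            if (handoffCheck.test(entry.getKey())) {
                Runnable callback = entry.getValue();
                it.remove();      // stop tracking before running the callback
                callback.run();
                fired++;
            }
        }
        return fired;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

A caller would typically run `checkForHandoffs()` periodically, for example from a `ScheduledExecutorService`, so each task issues a small query per pending segment instead of holding a full view of the cluster.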

fix broker test after merging
@nishantmonu51 nishantmonu51 changed the title from "Peons puts high load on zookeeper on disconnects" to "Peons put high load on zookeeper on disconnects" Nov 24, 2015
nishantmonu51 added a commit to metamx/druid that referenced this issue Dec 4, 2015
…dpoint for handoffs

- fixes apache#1970
- extracted out segment handoff callbacks in SegmentHandoffNotifier
which is responsible for tracking segment handoffs and doing callbacks
when handoff is complete.
- Coordinator now maintains a view of segments in the cluster; this
will affect the JVM heap requirements for the overlord for large
clusters.
- Realtime index tasks and nodes now use HTTP endpoints exposed by the
coordinator to get the serverView
TODO:
- Add more tests for individual components
- Add docs

Add tests and more logging.
nishantmonu51 added a commit to metamx/druid that referenced this issue Dec 7, 2015
…dpoint for handoffs

- fixes apache#1970
- extracted out segment handoff callbacks in SegmentHandoffNotifier
which is responsible for tracking segment handoffs and doing callbacks
when handoff is complete.
- Coordinator now maintains a view of segments in the cluster; this
will affect the JVM heap requirements for the overlord for large
clusters.
- Realtime index tasks and nodes now use HTTP endpoints exposed by the
coordinator to get the serverView

review comment

fix realtime node guice injection

review comments

make test not rely on scheduled exec

fix compilation

fix import
nishantmonu51 added a commit to metamx/druid that referenced this issue Dec 8, 2015
…dpoint for handoffs

- fixes apache#1970
- extracted out segment handoff callbacks in SegmentHandoffNotifier
which is responsible for tracking segment handoffs and doing callbacks
when handoff is complete.
- Coordinator now maintains a view of segments in the cluster; this
will affect the JVM heap requirements for the overlord for large
clusters.
- Realtime index tasks and nodes now use HTTP endpoints exposed by the
coordinator to get the serverView

review comment

fix realtime node guice injection

review comments

make test not rely on scheduled exec

fix compilation

fix import

review comment

introduce immutableSegmentLoadInfo

fix json reading

remove unnecessary logging
@xvrl xvrl closed this as completed in #2015 Dec 8, 2015
nishantmonu51 added a commit to metamx/druid that referenced this issue Dec 9, 2015
…dpoint for handoffs

- fixes apache#1970
- extracted out segment handoff callbacks in SegmentHandoffNotifier
which is responsible for tracking segment handoffs and doing callbacks
when handoff is complete.
- Coordinator now maintains a view of segments in the cluster; this
will affect the JVM heap requirements for the overlord for large
clusters.
- Realtime index tasks and nodes now use HTTP endpoints exposed by the
coordinator to get the serverView

review comment

fix realtime node guice injection

review comments

make test not rely on scheduled exec

fix compilation

fix import

review comment

introduce immutableSegmentLoadInfo

fix json reading

remove unnecessary logging