Motivation
Loading primary replicants for Druid segments is one of the most important things the Coordinator does: without a primary replicant available on the cluster, a segment is not available for querying. Today, the Coordinator performs primary replicant loading within the set of Coordinator duties that relate to Historical Management. Because of this grouping, the Coordinator can spend a lot of time on other work, such as loading non-primary replicants or balancing segments, before it gets back to loading primary replicants. The side effect of waiting on other Coordinator jobs is that data stays unavailable for longer than it otherwise would have to, which is a poor end-user experience. Breaking primary replicant loading out into its own scheduled runnable group can guarantee that primary replicants are loaded more regularly.
Proposed changes
I am proposing an optional new DutiesRunnable in the DruidCoordinator. Operators can choose whether or not to break primary replicant loading out into its own DutiesRunnable. If they leave the dedicated primary replicant loading disabled, their coordinator will function just as it always has. If they enable it, their coordinator will add a scheduled DutiesRunnable dedicated to executing the matching LoadRule for each segment, performing only the primary replicant load for that LoadRule when run. The HistoricalManagement DutiesRunnable will continue all other Historical Management duties, including non-primary replicant loading and replicant dropping, while executing a matched LoadRule for a segment.
My POC implementation of the proposal exposes two new Coordinator runtime configurations for operators: druid.coordinator.loadPrimaryReplicantSeparately and druid.coordinator.period.primaryReplicantLoaderPeriod. If the first is enabled, a scheduled executor with a configurable period is set up for loading primary replicants.
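A minimal runtime.properties fragment enabling the dedicated loader might look like the following. The period value is illustrative; I am assuming primaryReplicantLoaderPeriod takes an ISO-8601 duration like the other druid.coordinator.period.* configs.

```properties
# Enable the dedicated primary replicant loading runnable (POC config).
druid.coordinator.loadPrimaryReplicantSeparately=true

# How often the dedicated loader runs; PT30S is an illustrative value,
# not a recommended default.
druid.coordinator.period.primaryReplicantLoaderPeriod=PT30S
```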
The new DutiesRunnable would consist of two duties: UpdateCoordinatorStateAndPrepareCluster and RunRules.
There is an open TODO on analyzing the negative effects of having two DutiesRunnables that each run UpdateCoordinatorStateAndPrepareCluster. It may be that only one of the two should execute the full duty while the other runs a scaled-down version.
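To make the shape of the new runnable concrete, here is a hedged sketch of how its duty list could be assembled when the config flag is set. The method and string names here are illustrative stand-ins; the real duty classes live in Druid's coordinator code and differ in detail.

```java
import java.util.List;

public class PrimaryReplicantDuties {
    // Hypothetical helper: which duties the optional dedicated runnable
    // would execute, gated on druid.coordinator.loadPrimaryReplicantSeparately.
    static List<String> dutiesFor(boolean loadPrimaryReplicantSeparately) {
        if (!loadPrimaryReplicantSeparately) {
            // Flag unset: no extra runnable; the coordinator behaves as today.
            return List.of();
        }
        // The dedicated runnable refreshes cluster state first, then runs
        // rules in a primary-replicant-only mode.
        return List.of("UpdateCoordinatorStateAndPrepareCluster", "RunRules");
    }

    public static void main(String[] args) {
        System.out.println(dutiesFor(true));
        // prints [UpdateCoordinatorStateAndPrepareCluster, RunRules]
    }
}
```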
RunRules and LoadRule will each need a mode associated with them. RunRules will execute in one of two modes: one executes only the LoadRule rules that match, while the other runs every matched Rule. LoadRule is similar and needs three modes: one that loads only the primary replicant (for the dedicated primary replicant load), one that skips the primary replicant load, and one that runs all of LoadRule without regard to replicant type.
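The three LoadRule modes described above could be sketched as an enum plus a small dispatch, shown below. The enum values and the replicantsToLoad helper are hypothetical names for illustration, not the actual Druid API.

```java
public class LoadRuleModes {
    // Hypothetical modes for LoadRule execution.
    enum Mode { PRIMARY_ONLY, SKIP_PRIMARY, ALL }

    // Given the configured replicant count for a tier, how many replicants
    // this pass of LoadRule should attempt to assign under each mode.
    static int replicantsToLoad(Mode mode, int configuredReplicants) {
        switch (mode) {
            case PRIMARY_ONLY:
                // Dedicated runnable: load only the primary replicant.
                return configuredReplicants > 0 ? 1 : 0;
            case SKIP_PRIMARY:
                // Historical-management runnable: the primary is handled
                // elsewhere, so load only the remaining replicants.
                return Math.max(configuredReplicants - 1, 0);
            case ALL:
            default:
                // Legacy behavior: load everything in one pass.
                return configuredReplicants;
        }
    }

    public static void main(String[] args) {
        System.out.println(replicantsToLoad(Mode.PRIMARY_ONLY, 3)); // 1
        System.out.println(replicantsToLoad(Mode.SKIP_PRIMARY, 3)); // 2
        System.out.println(replicantsToLoad(Mode.ALL, 3));          // 3
    }
}
```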
Rationale
I think the biggest benefit here is giving the operator more control to ensure that primary replicant loading runs as often as needed. On large clusters that do lots of balancing and non-primary replicant loading as servers come in and out of the cluster, primary replicant loading can get blocked often enough that users ask why their new segments aren't becoming available in a timely manner after batch indexing finishes.
As for alternative approaches, I have not thought of any similar ways to achieve this elevated priority for loading primary replicants at this time. I am definitely open to suggestions though.
Operational impact
This section should describe how the proposed changes will impact the operation of existing clusters. It should answer questions such as:
Is anything going to be deprecated or removed by this change? How will we phase out old behavior?
N/A
Is there a migration path that cluster operators need to be aware of?
Enabling this requires coordinator config changes and a restart.
Will there be any effect on the ability to do a rolling upgrade, or to do a rolling downgrade if an operator wants to switch back to a previous version?
A rolling upgrade to the first version that includes this change would not require any action, because leaving the configs unset keeps the coordinator working as it does today. An operator can enable the feature after upgrading if they so choose.
Downgrading should not have any impact. The configs, even if specified by the operator, would be ignored, and the coordinator would go back to how it operated before there was a dedicated primary replicant loader.
Test plan (optional)
TBD
Future work (optional)
TBD
If this feature allows an admin to set a configuration property on the (dynamic) coordinator config to control primary replicant loading specifically, it could be extremely valuable. It could potentially improve realtime task publishing time (which in turn can prevent pending or even failed tasks, and therefore ingest lag), and also prevent segments in limbo that have been published but are not queryable (which may relate to query failure rates, too).
This is such a critical aspect of the system, and it deserves as much of an interface as non-primary replicants currently have in the coordinator dynamic config. +1