
Proposal: Separate Primary Replicant loading from the rest of HistoricalManagementDuties #10606

Open

capistrant opened this issue Nov 25, 2020 · 1 comment

Motivation

Loading primary replicants for Druid segments is one of the most important things the Coordinator does: without a primary replicant available on the cluster, a segment cannot be queried. Today the Coordinator performs primary replicant loading within the set of Coordinator duties related to Historical Management. Because of this grouping, the Coordinator can spend a lot of time on other work, such as loading non-primary replicants and balancing segments, before it gets back to loading primary replicants. A side effect of waiting on these other Coordinator jobs is that data stays unavailable longer than it otherwise would, which is a poor end-user experience. Breaking primary replicant loading out into its own scheduled runnable group can guarantee that primary replicants are loaded more regularly.

Proposed changes

POC Code Link

I am proposing an optional new DutiesRunnable in the DruidCoordinator. Operators can choose whether to break primary replicant loading out into its own DutiesRunnable. If they leave dedicated primary replicant loading disabled, their Coordinator functions just as it always has. If they enable it, their Coordinator adds a scheduled DutiesRunnable dedicated to executing the matching LoadRule for each segment, performing only the primary replicant load when the rule is run. The HistoricalManagement DutiesRunnable will continue all other HistoricalManagement duties, including non-primary replicant loading and replicant dropping, while executing a matched LoadRule for a segment.

My POC implementation for the proposal exposes two new Coordinator runtime configurations for operators: druid.coordinator.loadPrimaryReplicantSeparately and druid.coordinator.period.primaryReplicantLoaderPeriod. If the first is enabled, a scheduled executor with a configurable period is set up for loading primary replicants.
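For example, an operator opting in might add something like the following to the Coordinator's runtime.properties (the property names are the ones proposed above; the period value is illustrative):

```properties
# Opt in to a dedicated primary replicant loading runnable
druid.coordinator.loadPrimaryReplicantSeparately=true
# How often the dedicated primary replicant loader runs (illustrative value)
druid.coordinator.period.primaryReplicantLoaderPeriod=PT60S
```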

The new DutiesRunnable would consist of two duties: UpdateCoordinatorStateAndPrepareCluster and RunRules.

  • There is an open TODO to analyze the negative effects of having two DutiesRunnables that both run UpdateCoordinatorStateAndPrepareCluster. It is possible that only one of the two should execute the full duty while the other runs a scaled-down version.
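As a rough sketch of the opt-in wiring (the class and method names below are hypothetical illustrations, not actual DruidCoordinator code), the flag would simply add a second duty group with its own, typically shorter, period:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not actual Druid code. Models the proposal's opt-in
// second DutiesRunnable alongside the existing HistoricalManagement group.
public class CoordinatorSchedulingSketch {
    static final class DutyGroup {
        final String name;
        final long periodMillis;

        DutyGroup(String name, long periodMillis) {
            this.name = name;
            this.periodMillis = periodMillis;
        }
    }

    /**
     * Builds the duty groups the coordinator would schedule. When the
     * loadPrimaryReplicantSeparately flag is set, an extra group running
     * UpdateCoordinatorStateAndPrepareCluster + RunRules (primary-only mode)
     * is added with its own period.
     */
    static List<DutyGroup> buildDutyGroups(
            boolean loadPrimaryReplicantSeparately,
            long historicalPeriodMillis,
            long primaryLoaderPeriodMillis
    ) {
        List<DutyGroup> groups = new ArrayList<>();
        groups.add(new DutyGroup("HistoricalManagementDuties", historicalPeriodMillis));
        if (loadPrimaryReplicantSeparately) {
            groups.add(new DutyGroup("PrimaryReplicantLoaderDuties", primaryLoaderPeriodMillis));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Disabled: coordinator behaves exactly as before (one group).
        System.out.println(buildDutyGroups(false, 60_000, 30_000).size()); // prints 1
        // Enabled: a dedicated primary replicant loader group is scheduled too.
        System.out.println(buildDutyGroups(true, 60_000, 30_000).size());  // prints 2
    }
}
```

The point of the sketch is that the disabled path builds exactly the groups the Coordinator builds today, so the feature is strictly additive.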

RunRules and LoadRule will need a mode associated with them. RunRules would execute in one of two modes: one executes only the LoadRule rules that match, the other runs every matched Rule. LoadRule is similar and needs three modes: one that loads only a primary replicant (for the dedicated primary replicant load), one that skips the primary replicant load, and one that runs all of LoadRule without regard to replicant type.
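The mode split described above might look like the following enum-based sketch (names such as RuleExecutionMode are hypothetical and do not exist in Druid; the counting logic is a simplified illustration of the proposed behavior):

```java
// Hypothetical sketch of the proposed LoadRule execution modes;
// these names are illustrative only, not existing Druid classes.
public class LoadRuleModeSketch {
    enum RuleExecutionMode {
        PRIMARY_ONLY, // dedicated loader: ensure a primary replicant exists, nothing more
        SKIP_PRIMARY, // historical management: load everything except the primary replicant
        ALL           // current behavior: load all replicants the rule calls for
    }

    /**
     * Number of replicants a LoadRule run should assign for one segment,
     * given the rule's target, the current count, and the execution mode.
     */
    static int replicantsToLoad(int target, int current, RuleExecutionMode mode) {
        switch (mode) {
            case PRIMARY_ONLY:
                // Only make the segment queryable; leave the remaining
                // replicants to the HistoricalManagement runnable.
                return current == 0 ? 1 : 0;
            case SKIP_PRIMARY: {
                // Leave the primary replicant (the first copy) to the
                // dedicated loader when none exists yet.
                int deficit = Math.max(0, target - current);
                return current == 0 ? Math.max(0, deficit - 1) : deficit;
            }
            default: // ALL
                return Math.max(0, target - current);
        }
    }

    public static void main(String[] args) {
        // An unavailable segment with a target of 3 replicants:
        System.out.println(replicantsToLoad(3, 0, RuleExecutionMode.PRIMARY_ONLY)); // prints 1
        System.out.println(replicantsToLoad(3, 0, RuleExecutionMode.SKIP_PRIMARY)); // prints 2
        System.out.println(replicantsToLoad(3, 0, RuleExecutionMode.ALL));          // prints 3
    }
}
```

Together, PRIMARY_ONLY and SKIP_PRIMARY cover the same work as ALL, so running both runnables yields the same end state as today, just with the primary copy loaded sooner.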

Rationale

I think the biggest benefit here is more control for the operator to ensure that primary replicant loading runs as often as needed. On large clusters that do a lot of balancing and non-primary replicant loading as servers come in and out of the cluster, primary replicant loading can get blocked often enough that users ask why their new segments aren't becoming available in a timely manner after batch indexing finishes.

As for alternative approaches, I have not yet thought of another way to achieve this elevated priority for primary replicant loading, but I am definitely open to suggestions.

Operational impact

This section describes how the proposed changes will impact the operation of existing clusters:

  • Is anything going to be deprecated or removed by this change? How will we phase out old behavior?
    • N/A
  • Is there a migration path that cluster operators need to be aware of?
    • Enabling this requires coordinator config changes and a restart.
  • Will there be any effect on the ability to do a rolling upgrade, or to do a rolling downgrade if an operator wants to switch back to a previous version?
    • A rolling upgrade to the first version that includes this change would not require any action; leaving the new configs unset leaves the Coordinator as-is. An operator can enable the feature after upgrading if they so choose.
    • Downgrading should not have any impact. The configs, even if specified by the operator, would be ignored, and the Coordinator would go back to how it operated before there was a dedicated primary replicant loader.

Test plan (optional)

TBD

Future work (optional)

TBD

@OurNewestMember

+1

If this feature allows an admin to set a configuration property on the (dynamic) Coordinator config to control primary replicant loading specifically, it could be extremely valuable. It could improve realtime task publishing time (which in turn can prevent pending or even failed tasks, and therefore ingest lag, etc.) and also prevent segments in limbo that have been published but are not yet queryable (which may relate to query failure rates as well).

This is such a critical aspect of the system, and it deserves as much of an interface as non-primary replicants currently have in the coordinator dynamic config. +1
