-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Auto scaler pendingTaskBased provisioning strategy to handle when there are no currently running worker node better #11440
Changes from 5 commits
789ea9c
b1d132e
2459031
f3955ce
d6bb9a8
06441b9
f5ed8df
298146c
cf39ec2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -80,8 +80,8 @@ public GceAutoScaler( | |
@JsonProperty("envConfig") GceEnvironmentConfig envConfig | ||
) | ||
{ | ||
Preconditions.checkArgument(minNumWorkers > 0, | ||
"minNumWorkers must be greater than 0"); | ||
Preconditions.checkArgument(minNumWorkers >= 0, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the overlord should fail to start if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's is only if you are using PendingTaskBasedWorkerProvisioningStrategy. You can have minNumWorkers = 0 and the new config not set if you use SimpleWorkerProvisioningStrategy. I think it makes more sense to have the check done in the PendingTaskBasedWorkerProvisioningStrategy, instead of having a check in every AutoScaler (EC2, GCE, etc) to check if the ProvisioningStategy == PendingTaskBasedWorkerProvisioningStrategy and minNumWorkers = 0 and the new config is not set. However, this means that the failure would only show when the auto scaler cycle runs. I think this is ok as the PendingTaskBasedWorkerProvisioningStrategy#getDefaultWorkerBehaviorConfig already have some check that make sure the config/AutoScaler set is valid. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah sure, I didn't mean that the check code should be in each autoScaler because it's not possible as autoScaler doesn't know about provisioningStrategyConfig. The check code should be in
Hmm, the check can throw an exception in the constructor of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The PendingTaskBasedWorkerProvisioningStrategy gets created once at startup which at that time may not have an auto scaler configured or if auto scaler was configured the minWorkerCount can change later. This is because the auto scaler along with the minWorkerCount can be configured dynamically after the cluster is already up and running. Hence, if the check is in the constructor of PendingTaskBasedWorkerProvisioningStrategy , then we can miss the cases I mentioned above. I think it is better to have the check in PendingTaskBasedWorkerProvisioningStrategy#getDefaultWorkerBehaviorConfig which is every time we do terminate/provision, etc. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah yeah you are right. Thanks for the explanation. |
||
"minNumWorkers must be greater than or equal to 0"); | ||
this.minNumWorkers = minNumWorkers; | ||
Preconditions.checkArgument(maxNumWorkers > 0, | ||
"maxNumWorkers must be greater than 0"); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -246,13 +246,16 @@ private int getScaleUpNodeCount( | |
log.info("Min/max workers: %d/%d", minWorkerCount, maxWorkerCount); | ||
final int currValidWorkers = getCurrValidWorkers(workers); | ||
|
||
// If there are no worker, spin up minWorkerCount, we cannot determine the exact capacity here to fulfill the need | ||
// since we are not aware of the expectedWorkerCapacity. | ||
int moreWorkersNeeded = currValidWorkers == 0 ? minWorkerCount : getWorkersNeededToAssignTasks( | ||
// If there are no worker and workerCapacityFallback config is not set (-1) or invalid (<= 0), then spin up minWorkerCount | ||
// as we cannot determine the exact capacity here to fulfill the need. | ||
// However, if there are no worker but workerCapacityFallback config is set (>0), then we can | ||
// determine the number of workers needed using workerCapacityFallback config as expected worker capacity | ||
int moreWorkersNeeded = currValidWorkers == 0 && config.getWorkerCapacityFallback() <= 0 ? minWorkerCount : getWorkersNeededToAssignTasks( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: this line is too long. Please break the line. One example would be int moreWorkersNeeded = currValidWorkers == 0 && config.getWorkerCapacityFallback() <= 0
? minWorkerCount
: getWorkersNeededToAssignTasks(
remoteTaskRunnerConfig,
workerConfig,
pendingTasks,
workers,
config.getWorkerCapacityFallback()
); There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
remoteTaskRunnerConfig, | ||
workerConfig, | ||
pendingTasks, | ||
workers | ||
workers, | ||
config.getWorkerCapacityFallback() | ||
); | ||
log.debug("More workers needed: %d", moreWorkersNeeded); | ||
|
||
|
@@ -280,7 +283,8 @@ private int getWorkersNeededToAssignTasks( | |
final WorkerTaskRunnerConfig workerTaskRunnerConfig, | ||
final DefaultWorkerBehaviorConfig workerConfig, | ||
final Collection<Task> pendingTasks, | ||
final Collection<ImmutableWorkerInfo> workers | ||
final Collection<ImmutableWorkerInfo> workers, | ||
final int workerCapacityFallback | ||
) | ||
{ | ||
final Collection<ImmutableWorkerInfo> validWorkers = Collections2.filter( | ||
|
@@ -295,7 +299,7 @@ private int getWorkersNeededToAssignTasks( | |
} | ||
WorkerSelectStrategy workerSelectStrategy = workerConfig.getSelectStrategy(); | ||
int need = 0; | ||
int capacity = getExpectedWorkerCapacity(workers); | ||
int capacity = getExpectedWorkerCapacity(workers, workerCapacityFallback); | ||
log.info("Expected worker capacity: %d", capacity); | ||
|
||
// Simulate assigning tasks to dummy workers using configured workerSelectStrategy | ||
|
@@ -441,12 +445,18 @@ private int getCurrValidWorkers(Collection<ImmutableWorkerInfo> workers) | |
return currValidWorkers; | ||
} | ||
|
||
private static int getExpectedWorkerCapacity(final Collection<ImmutableWorkerInfo> workers) | ||
private static int getExpectedWorkerCapacity(final Collection<ImmutableWorkerInfo> workers, final int workerCapacityFallback) | ||
{ | ||
int size = workers.size(); | ||
if (size == 0) { | ||
// No existing workers assume capacity per worker as 1 | ||
return 1; | ||
// No existing workers | ||
if (workerCapacityFallback > 0) { | ||
// Return workerCapacityFallback if it is set in config | ||
return workerCapacityFallback; | ||
} else { | ||
// Assume capacity per worker as 1 | ||
return 1; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Based on this function I don't think you need to detect the case where There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Having user set the config to 1 and the default value being 1 should not be the same due to https://github.com/apache/druid/pull/11440/files#diff-86276fbfe645ed104be05c850062fdb8e89fce43c656e41725c1e6d6ed671c25R253 If you have no worker and the new config NOT SET then you do the current behavior which is scale to minNumWorkers (this basically disregard capacity and pending tasks you have) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah I see. Sorry I missed that earlier |
||
} else { | ||
// Assume all workers have same capacity | ||
return workers.iterator().next().getWorker().getCapacity(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"fallback" in the name does not seem intuitive to me. How about
workerCapacityHint
? Also, it would be nice to document how to set this value. For example, the doc can say "this value should be typically equal todruid.worker.capacity
when you have a homogeneous cluster". For heterogeneous clusters, does it make sense to set this to the average capacity?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the config name to
workerCapacityHint
. Updated the docs. I think for heterogeneous clusters, it does make sense to set this to the average capacity. I think the auto scaler should use the average capacity when there is running worker nodes too but that can be a separate change/discussion.