scheduler: Debounce commit events #1287

Merged
merged 1 commit into from
Aug 1, 2016
scheduler: Debounce commit events
When loading a state that contained large numbers of nodes and tasks,
but no ready nodes that could accept the tasks, swarmd used large
amounts of CPU repeatedly trying to schedule the full set of tasks. The
allocator caused many commits on startup (see #1286), and this produced
a large backlog of commit events, each one of which caused a full
scheduling pass.

To avoid this pathological behavior, debounce the commit events
similarly to how the dispatcher's Tasks loop debounces events. The first
commit event starts a 50 ms countdown, and each further commit event
resets it; the scheduling pass runs when the countdown expires. If commit
events keep arriving and resetting the timer, the scheduler runs the
scheduling pass anyway once a second has passed.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
aaronlehmann committed Aug 1, 2016
commit 77c62dbdce8e5417fab57d1355b2ac38a77bfbc1
47 changes: 40 additions & 7 deletions manager/scheduler/scheduler.go
@@ -109,8 +109,31 @@ func (s *Scheduler) Run(ctx context.Context) error {
 	// Queue all unassigned tasks before processing changes.
 	s.tick(ctx)
 
+	const (
+		// commitDebounceGap is the amount of time to wait between
+		// commit events to debounce them.
+		commitDebounceGap = 50 * time.Millisecond
+		// maxLatency is a time limit on the debouncing.
+		maxLatency = time.Second
+	)
+	var (
+		debouncingStarted     time.Time
+		commitDebounceTimer   *time.Timer
+		commitDebounceTimeout <-chan time.Time
+	)
+
 	pendingChanges := 0
 
+	schedule := func() {
+		if len(s.preassignedTasks) > 0 {
+			s.processPreassignedTasks(ctx)
+		}
+		if pendingChanges > 0 {
+			s.tick(ctx)
+			pendingChanges = 0
+		}
+	}
+
 	// Watch for changes.
 	for {
 		select {
@@ -131,15 +154,25 @@ func (s *Scheduler) Run(ctx context.Context) error {
 			case state.EventDeleteNode:
 				s.nodeHeap.remove(v.Node.ID)
 			case state.EventCommit:
-				if len(s.preassignedTasks) > 0 {
-					s.processPreassignedTasks(ctx)
-				}
-				if pendingChanges > 0 {
-					s.tick(ctx)
-					pendingChanges = 0
+				if commitDebounceTimer != nil {
+					if time.Since(debouncingStarted) > maxLatency {
+						commitDebounceTimer.Stop()
+						commitDebounceTimer = nil
+						commitDebounceTimeout = nil
+						schedule()
+					} else {
+						commitDebounceTimer.Reset(commitDebounceGap)
+					}
+				} else {
+					commitDebounceTimer = time.NewTimer(commitDebounceGap)
+					commitDebounceTimeout = commitDebounceTimer.C
+					debouncingStarted = time.Now()
 				}
 			}
-
+		case <-commitDebounceTimeout:
+			schedule()
+			commitDebounceTimer = nil
+			commitDebounceTimeout = nil
 		case <-s.stopChan:
 			return nil
 		}