Introduce global, per-tenant flags and interval to evaluation threshold to control rule evaluation concurrency #8146

Merged 17 commits on Jul 27, 2024
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -51,6 +51,12 @@
* [ENHANCEMENT] Expose a new `s3.trace.enabled` configuration option to enable detailed logging of operations against S3-compatible object stores. #8690
* [ENHANCEMENT] memberlist: locally-generated messages (e.g. ring updates) are sent to the gossip network before forwarded messages. Introduced the `-memberlist.broadcast-timeout-for-local-updates-on-shutdown` option to modify how long to wait until the queue with locally-generated messages is empty when shutting down. Previously this was hard-coded to 10s, and the wait included all messages (locally-generated and forwarded). Now it defaults to 10s, and 0 means no timeout. Increasing this value may help to avoid the problem where ring updates on shutdown are not propagated to other nodes, and the ring entry is left in a wrong state. #8761
* [ENHANCEMENT] Querier: allow using both raw numbers of seconds and duration literals in queries where previously only one or the other was permitted. For example, `predict_linear` now accepts a duration literal (eg. `predict_linear(..., 4h)`), and range vector selectors now accept a number of seconds (eg. `rate(metric[2])`). #8780
* [ENHANCEMENT] Ruler: Add `-ruler.max-independent-rule-evaluation-concurrency` to allow independent rules of a tenant to be evaluated concurrently. The amount of concurrency per tenant is controlled via the `-ruler.max-independent-rule-evaluation-concurrency-per-tenant` limit. A `-ruler.max-independent-rule-evaluation-concurrency` value of `0` disables the feature for all tenants, which is the default. A rule is eligible for concurrent evaluation as long as it doesn't depend on any other rules, doesn't have any other rules that depend on it, and belongs to a rule group whose total runtime exceeds 50% of its evaluation interval by default. The threshold can be adjusted with `-ruler.independent-rule-evaluation-concurrency-min-duration-percentage`. #8146
* This work introduces the following metrics:
* `cortex_ruler_independent_rule_evaluation_concurrency_slots_in_use`
* `cortex_ruler_independent_rule_evaluation_concurrency_attempts_started_total`
* `cortex_ruler_independent_rule_evaluation_concurrency_attempts_incomplete_total`
* `cortex_ruler_independent_rule_evaluation_concurrency_attempts_completed_total`
* [BUGFIX] Ruler: add support for draining any outstanding alert notifications before shutting down. This can be enabled with the `-ruler.drain-notification-queue-on-shutdown=true` CLI flag. #8346
* [BUGFIX] Query-frontend: fix `-querier.max-query-lookback` enforcement when `-compactor.blocks-retention-period` is not set, and vice versa. #8388
* [BUGFIX] Ingester: fix sporadic `not found` error causing an internal server error if label names are queried with matchers during head compaction. #8391
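The ruler concurrency entry in the changelog excerpt above lists three conditions for a rule to be evaluated concurrently: it has no dependencies, it has no dependents, and its rule group's last runtime exceeded the configured percentage of the evaluation interval. The following Go sketch illustrates that check; the type and field names are illustrative assumptions, not identifiers from this PR.

```go
// Illustrative only: type and field names are assumptions, not the PR's actual identifiers.
package main

import (
	"fmt"
	"time"
)

type ruleInfo struct {
	hasDependencies bool          // the rule reads series produced by other rules
	hasDependents   bool          // other rules read series produced by this rule
	groupRuntime    time.Duration // last recorded runtime of the whole rule group
	groupInterval   time.Duration // evaluation interval of the rule group
}

// eligibleForConcurrency mirrors the eligibility criteria described in the changelog entry:
// independent rules whose group runtime exceeds minDurationPercentage of the interval.
func eligibleForConcurrency(r ruleInfo, minDurationPercentage float64) bool {
	if r.hasDependencies || r.hasDependents {
		return false
	}
	threshold := time.Duration(float64(r.groupInterval) * minDurationPercentage / 100.0)
	return r.groupRuntime > threshold
}

func main() {
	r := ruleInfo{groupRuntime: 40 * time.Second, groupInterval: time.Minute}
	fmt.Println(eligibleForConcurrency(r, 50)) // true: 40s exceeds 50% of 1m
}
```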
33 changes: 33 additions & 0 deletions cmd/mimir/config-descriptor.json
@@ -4375,6 +4375,17 @@
"fieldType": "string",
"fieldCategory": "experimental"
},
{
"kind": "field",
"name": "ruler_max_independent_rule_evaluation_concurrency_per_tenant",
"required": false,
"desc": "Maximum number of independent rules that can run concurrently for each tenant. Depends on ruler.max-independent-rule-evaluation-concurrency being greater than 0. Ideally this flag should be a lower value. 0 to disable.",
"fieldValue": null,
"fieldDefaultValue": 4,
"fieldFlag": "ruler.max-independent-rule-evaluation-concurrency-per-tenant",
"fieldType": "int",
"fieldCategory": "experimental"
},
{
"kind": "field",
"name": "store_gateway_tenant_shard_size",
@@ -12556,6 +12567,28 @@
],
"fieldValue": null,
"fieldDefaultValue": null
},
{
"kind": "field",
"name": "max_independent_rule_evaluation_concurrency",
"required": false,
"desc": "Number of rules rules that don't have dependencies that we allow to be evaluated concurrently across all tenants. 0 to disable.",
"fieldValue": null,
"fieldDefaultValue": 0,
"fieldFlag": "ruler.max-independent-rule-evaluation-concurrency",
"fieldType": "int",
"fieldCategory": "experimental"
},
{
"kind": "field",
"name": "threshold_independent_rule_evaluation_concurrency",
"required": false,
"desc": "Minimum threshold of the interval to last rule group runtime duration to allow a rule to be evaluated concurrency. By default, the rule group runtime duration must exceed 50.0% of the evaluation interval.",
"fieldValue": null,
"fieldDefaultValue": 50,
"fieldFlag": "ruler.independent-rule-evaluation-concurrency-min-duration-percentage",
"fieldType": "float",
"fieldCategory": "experimental"
}
],
"fieldValue": null,
6 changes: 6 additions & 0 deletions cmd/mimir/help-all.txt.tmpl
@@ -2623,6 +2623,12 @@ Usage of ./cmd/mimir/mimir:
This grace period controls which alerts the ruler restores after a restart. Alerts with "for" duration lower than this grace period are not restored after a ruler restart. This means that if the alerts have been firing before the ruler restarted, they will now go to pending state and then to firing again after their "for" duration expires. Alerts with "for" duration greater than or equal to this grace period that have been pending before the ruler restart will remain in pending state for at least this grace period. Alerts with "for" duration greater than or equal to this grace period that have been firing before the ruler restart will continue to be firing after the restart. (default 2m0s)
-ruler.for-outage-tolerance duration
Max time to tolerate outage for restoring "for" state of alert. (default 1h0m0s)
-ruler.independent-rule-evaluation-concurrency-min-duration-percentage float
[experimental] Minimum percentage of the evaluation interval that the last rule group runtime duration must exceed for a rule to be evaluated concurrently. By default, the rule group runtime duration must exceed 50.0% of the evaluation interval. (default 50)
-ruler.max-independent-rule-evaluation-concurrency int
[experimental] Number of rules that don't have dependencies that we allow to be evaluated concurrently across all tenants. 0 to disable.
-ruler.max-independent-rule-evaluation-concurrency-per-tenant int
[experimental] Maximum number of independent rules that can run concurrently for each tenant. Depends on ruler.max-independent-rule-evaluation-concurrency being greater than 0. Ideally this flag should be set to a lower value than ruler.max-independent-rule-evaluation-concurrency. 0 to disable. (default 4)
-ruler.max-rule-groups-per-tenant int
Maximum number of rule groups per-tenant. 0 to disable. (default 70)
-ruler.max-rule-groups-per-tenant-by-namespace value
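The help text above documents the three new CLI flags and their defaults. As a rough illustration of how such flags map onto a configuration struct, here is a sketch using Go's standard flag package; the struct and method names are assumptions for illustration, not Mimir's actual configuration code.

```go
// Minimal sketch, assuming a config struct registered with a flag.FlagSet;
// names other than the flag strings themselves are illustrative.
package main

import (
	"flag"
	"fmt"
	"os"
)

type concurrencyConfig struct {
	MaxConcurrency          int64
	MaxConcurrencyPerTenant int64
	MinDurationPercentage   float64
}

// RegisterFlags wires the three experimental flags to the struct fields with the defaults shown above.
func (c *concurrencyConfig) RegisterFlags(f *flag.FlagSet) {
	f.Int64Var(&c.MaxConcurrency, "ruler.max-independent-rule-evaluation-concurrency", 0,
		"Number of rules that don't have dependencies that we allow to be evaluated concurrently across all tenants. 0 to disable.")
	f.Int64Var(&c.MaxConcurrencyPerTenant, "ruler.max-independent-rule-evaluation-concurrency-per-tenant", 4,
		"Maximum number of independent rules that can run concurrently for each tenant.")
	f.Float64Var(&c.MinDurationPercentage, "ruler.independent-rule-evaluation-concurrency-min-duration-percentage", 50,
		"Minimum percentage of the evaluation interval that the last rule group runtime duration must exceed.")
}

func main() {
	var cfg concurrencyConfig
	fs := flag.NewFlagSet("mimir", flag.ExitOnError)
	cfg.RegisterFlags(fs)
	_ = fs.Parse(os.Args[1:])
	fmt.Printf("%+v\n", cfg)
}
```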
4 changes: 4 additions & 0 deletions docs/sources/mimir/configure/about-versioning.md
@@ -63,6 +63,10 @@ The following features are currently experimental:
- `-ruler.max-rule-groups-per-tenant-by-namespace`
- Allow protecting rule groups from modification by namespace. Rule groups can always be read, and you can use the `X-Mimir-Ruler-Override-Namespace-Protection` header with namespace names as values to override protection from modification.
- `-ruler.protected-namespaces`
- Allow independent rules to be evaluated concurrently as long as their rule group's last runtime duration exceeds a certain threshold relative to the evaluation interval. There is a limit on the number of rules that can be evaluated concurrently, both per ruler and per tenant:
- `-ruler.max-independent-rule-evaluation-concurrency`
- `-ruler.max-independent-rule-evaluation-concurrency-per-tenant`
- `-ruler.independent-rule-evaluation-concurrency-min-duration-percentage`
- Distributor
- Metrics relabeling
- `-distributor.metric-relabeling-enabled`
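The two limits listed above act at different levels: one caps concurrent rule evaluations per ruler, the other per tenant, and a rule only runs concurrently when both budgets have a free slot. A simplified sketch of that two-level budget follows; it illustrates the idea only and is not the controller implemented in this PR.

```go
// Simplified sketch of a two-level concurrency budget (global + per-tenant);
// not the actual controller implemented in this PR.
package main

import (
	"fmt"
	"sync"
)

type twoLevelLimiter struct {
	mu        sync.Mutex
	global    int64            // remaining global slots for this ruler
	perTenant map[string]int64 // remaining slots per tenant
	tenantMax int64
}

func newTwoLevelLimiter(globalMax, tenantMax int64) *twoLevelLimiter {
	return &twoLevelLimiter{global: globalMax, perTenant: map[string]int64{}, tenantMax: tenantMax}
}

// tryAcquire returns true only if both the global and the tenant budget have a free slot.
func (l *twoLevelLimiter) tryAcquire(tenant string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, ok := l.perTenant[tenant]; !ok {
		l.perTenant[tenant] = l.tenantMax
	}
	if l.global == 0 || l.perTenant[tenant] == 0 {
		return false
	}
	l.global--
	l.perTenant[tenant]--
	return true
}

// release returns a slot to both budgets once the rule evaluation has finished.
func (l *twoLevelLimiter) release(tenant string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.global++
	l.perTenant[tenant]++
}

func main() {
	lim := newTwoLevelLimiter(10, 4)
	fmt.Println(lim.tryAcquire("tenant-a")) // true: both budgets have slots
	lim.release("tenant-a")
}
```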
17 changes: 17 additions & 0 deletions docs/sources/mimir/configure/configuration-parameters/index.md
@@ -2081,6 +2081,17 @@ tenant_federation:
# then these rule groups will be skipped during evaluations.
# CLI flag: -ruler.tenant-federation.enabled
[enabled: <boolean> | default = false]

# (experimental) Number of rules that don't have dependencies that we allow to
# be evaluated concurrently across all tenants. 0 to disable.
# CLI flag: -ruler.max-independent-rule-evaluation-concurrency
[max_independent_rule_evaluation_concurrency: <int> | default = 0]

# (experimental) Minimum percentage of the evaluation interval that the last
# rule group runtime duration must exceed for a rule to be evaluated
# concurrently. By default, the rule group runtime duration must exceed 50.0%
# of the evaluation interval.
# CLI flag: -ruler.independent-rule-evaluation-concurrency-min-duration-percentage
[threshold_independent_rule_evaluation_concurrency: <float> | default = 50]
```

### ruler_storage
@@ -3522,6 +3533,12 @@ The `limits` block configures default and per-tenant limits imposed by component
# CLI flag: -ruler.protected-namespaces
[ruler_protected_namespaces: <string> | default = ""]

# (experimental) Maximum number of independent rules that can run concurrently
# for each tenant. Depends on ruler.max-independent-rule-evaluation-concurrency
# being greater than 0. Ideally this flag should be set to a lower value than
# ruler.max-independent-rule-evaluation-concurrency. 0 to disable.
# CLI flag: -ruler.max-independent-rule-evaluation-concurrency-per-tenant
[ruler_max_independent_rule_evaluation_concurrency_per_tenant: <int> | default = 4]

# The tenant's shard size, used when store-gateway sharding is enabled. Value of
# 0 disables shuffle sharding for the tenant, that is all tenant blocks are
# sharded across all store-gateway replicas.
15 changes: 14 additions & 1 deletion pkg/mimir/modules.go
@@ -869,11 +869,24 @@ func (t *Mimir) initRuler() (serv services.Service, err error) {
queryFunc = rules.EngineQueryFunc(eng, queryable)
}
}

var concurrencyController ruler.MultiTenantRuleConcurrencyController
concurrencyController = &ruler.NoopConcurrencyController{}
if t.Cfg.Ruler.MaxIndependentRuleEvaluationConcurrency > 0 {
concurrencyController = ruler.NewMultiTenantConcurrencyController(
util_log.Logger,
t.Cfg.Ruler.MaxIndependentRuleEvaluationConcurrency,
t.Cfg.Ruler.IndependentRuleEvaluationConcurrencyMinDurationPercentage,
t.Registerer,
t.Overrides,
)
}
managerFactory := ruler.DefaultTenantManagerFactory(
t.Cfg.Ruler,
t.Distributor,
embeddedQueryable,
queryFunc,
concurrencyController,
t.Overrides,
t.Registerer,
)
@@ -889,7 +902,7 @@ func (t *Mimir) initRuler() (serv services.Service, err error) {
)

dnsResolver := dns.NewProvider(util_log.Logger, dnsProviderReg, dns.GolangResolverType)
manager, err := ruler.NewDefaultMultiTenantManager(t.Cfg.Ruler, managerFactory, t.Registerer, util_log.Logger, dnsResolver)
manager, err := ruler.NewDefaultMultiTenantManager(t.Cfg.Ruler, managerFactory, t.Registerer, util_log.Logger, dnsResolver, concurrencyController)
if err != nil {
return nil, err
}
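The wiring above selects a no-op controller when `-ruler.max-independent-rule-evaluation-concurrency` is 0 and otherwise builds a real controller from the logger, the global limit, the threshold percentage, the registerer, and the per-tenant overrides. The interface definition itself is not part of this diff, so the following sketch of a possible method set is an assumption for illustration only.

```go
// Hypothetical shape of the controller types referenced above; the actual interface
// is not shown in this diff, so these method signatures are assumptions.
package main

import (
	"context"
	"fmt"
)

// MultiTenantRuleConcurrencyController decides whether a given tenant's rule may be
// evaluated concurrently right now.
type MultiTenantRuleConcurrencyController interface {
	// Allow reports whether the named rule may start a concurrent evaluation.
	Allow(ctx context.Context, tenant, ruleName string) bool
	// Done releases the slot acquired by a previous successful Allow call.
	Done(ctx context.Context, tenant string)
}

// NoopConcurrencyController never grants concurrency, matching the behaviour when
// -ruler.max-independent-rule-evaluation-concurrency is 0.
type NoopConcurrencyController struct{}

func (NoopConcurrencyController) Allow(context.Context, string, string) bool { return false }
func (NoopConcurrencyController) Done(context.Context, string)               {}

func main() {
	var c MultiTenantRuleConcurrencyController = NoopConcurrencyController{}
	fmt.Println(c.Allow(context.Background(), "tenant-a", "my_rule")) // false: no-op controller
}
```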
3 changes: 3 additions & 0 deletions pkg/ruler/compat.go
@@ -151,6 +151,7 @@ type RulesLimits interface {
RulerAlertingRulesEvaluationEnabled(userID string) bool
RulerSyncRulesOnChangesEnabled(userID string) bool
RulerProtectedNamespaces(userID string) []string
RulerMaxIndependentRuleEvaluationConcurrencyPerTenant(userID string) int64
}

func MetricsQueryFunc(qf rules.QueryFunc, queries, failedQueries prometheus.Counter, remoteQuerier bool) rules.QueryFunc {
@@ -269,6 +270,7 @@ func DefaultTenantManagerFactory(
p Pusher,
queryable storage.Queryable,
queryFunc rules.QueryFunc,
concurrencyController MultiTenantRuleConcurrencyController,
overrides RulesLimits,
reg prometheus.Registerer,
) ManagerFactory {
Expand Down Expand Up @@ -336,6 +338,7 @@ func DefaultTenantManagerFactory(
// to metric that haven't been forwarded to Mimir yet.
return overrides.EvaluationDelay(userID)
},
RuleConcurrencyController: concurrencyController,
})
}
}
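The new `RulerMaxIndependentRuleEvaluationConcurrencyPerTenant` method added to `RulesLimits` above lets the controller look up the per-tenant limit at evaluation time. Mimir's actual overrides type is not shown in this diff; the following is a stand-in sketch of how such a lookup might be implemented.

```go
// Illustrative implementation of the new RulesLimits method; Mimir's real overrides
// type is not part of this diff, so this is a stand-in sketch.
package main

import "fmt"

type tenantLimits struct {
	RulerMaxIndependentRuleEvaluationConcurrencyPerTenant int64
}

type overrides struct {
	defaults  tenantLimits
	perTenant map[string]tenantLimits
}

// RulerMaxIndependentRuleEvaluationConcurrencyPerTenant returns the tenant's limit,
// falling back to the default when the tenant has no explicit override.
func (o *overrides) RulerMaxIndependentRuleEvaluationConcurrencyPerTenant(userID string) int64 {
	if l, ok := o.perTenant[userID]; ok {
		return l.RulerMaxIndependentRuleEvaluationConcurrencyPerTenant
	}
	return o.defaults.RulerMaxIndependentRuleEvaluationConcurrencyPerTenant
}

func main() {
	o := &overrides{
		defaults:  tenantLimits{RulerMaxIndependentRuleEvaluationConcurrencyPerTenant: 4},
		perTenant: map[string]tenantLimits{"team-a": {RulerMaxIndependentRuleEvaluationConcurrencyPerTenant: 8}},
	}
	fmt.Println(o.RulerMaxIndependentRuleEvaluationConcurrencyPerTenant("team-a")) // 8: explicit override
	fmt.Println(o.RulerMaxIndependentRuleEvaluationConcurrencyPerTenant("team-b")) // 4: default
}
```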
6 changes: 3 additions & 3 deletions pkg/ruler/compat_test.go
@@ -509,7 +509,7 @@ func TestDefaultManagerFactory_CorrectQueryableUsed(t *testing.T) {
// create and use manager factory
pusher := newPusherMock()
pusher.MockPush(&mimirpb.WriteResponse{}, nil)
managerFactory := DefaultTenantManagerFactory(cfg, pusher, federatedQueryable, queryFunc, options.limits, nil)
managerFactory := DefaultTenantManagerFactory(cfg, pusher, federatedQueryable, queryFunc, &NoopConcurrencyController{}, options.limits, nil)

manager := managerFactory(context.Background(), userID, notifierManager, options.logger, nil)

@@ -614,7 +614,7 @@ func TestDefaultManagerFactory_ShouldInjectReadConsistencyToContextBasedOnRuleDe

// Create the manager from the factory.
queryable := &storage.MockQueryable{MockQuerier: querier}
managerFactory := DefaultTenantManagerFactory(cfg, pusher, queryable, rules.EngineQueryFunc(eng, queryable), options.limits, nil)
managerFactory := DefaultTenantManagerFactory(cfg, pusher, queryable, rules.EngineQueryFunc(eng, queryable), &NoopConcurrencyController{}, options.limits, nil)
manager := managerFactory(context.Background(), userID, notifierManager, options.logger, nil)

// Load rules into manager.
@@ -710,7 +710,7 @@ func TestDefaultManagerFactory_ShouldInjectStrongReadConsistencyToContextWhenQue

// Create the manager from the factory.
queryable := &storage.MockQueryable{MockQuerier: querier}
managerFactory := DefaultTenantManagerFactory(cfg, pusher, queryable, rules.EngineQueryFunc(eng, queryable), options.limits, nil)
managerFactory := DefaultTenantManagerFactory(cfg, pusher, queryable, rules.EngineQueryFunc(eng, queryable), &NoopConcurrencyController{}, options.limits, nil)
manager := managerFactory(context.Background(), userID, notifierManager, options.logger, nil)

// Load rules into manager.
24 changes: 13 additions & 11 deletions pkg/ruler/manager.go
@@ -33,9 +33,10 @@ import (
)

type DefaultMultiTenantManager struct {
cfg Config
notifierCfg *config.Config
managerFactory ManagerFactory
cfg Config
notifierCfg *config.Config
managerFactory ManagerFactory
concurrencyController MultiTenantRuleConcurrencyController

mapper *mapper

@@ -60,7 +61,7 @@ type DefaultMultiTenantManager struct {
rulerIsRunning atomic.Bool
}

func NewDefaultMultiTenantManager(cfg Config, managerFactory ManagerFactory, reg prometheus.Registerer, logger log.Logger, dnsResolver cache.AddressProvider) (*DefaultMultiTenantManager, error) {
func NewDefaultMultiTenantManager(cfg Config, managerFactory ManagerFactory, reg prometheus.Registerer, logger log.Logger, dnsResolver cache.AddressProvider, concurrencyController MultiTenantRuleConcurrencyController) (*DefaultMultiTenantManager, error) {
refreshMetrics := discovery.NewRefreshMetrics(reg)
ncfg, err := buildNotifierConfig(&cfg, dnsResolver, refreshMetrics)
if err != nil {
@@ -73,13 +74,14 @@ func NewDefaultMultiTenantManager(cfg Config, managerFactory ManagerFactory, reg
}

return &DefaultMultiTenantManager{
cfg: cfg,
notifierCfg: ncfg,
managerFactory: managerFactory,
notifiers: map[string]*rulerNotifier{},
mapper: newMapper(cfg.RulePath, logger),
userManagers: map[string]RulesManager{},
userManagerMetrics: userManagerMetrics,
cfg: cfg,
notifierCfg: ncfg,
managerFactory: managerFactory,
concurrencyController: concurrencyController,
notifiers: map[string]*rulerNotifier{},
mapper: newMapper(cfg.RulePath, logger),
userManagers: map[string]RulesManager{},
userManagerMetrics: userManagerMetrics,
managersTotal: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
Namespace: "cortex",
Name: "ruler_managers_total",
6 changes: 3 additions & 3 deletions pkg/ruler/manager_test.go
@@ -46,7 +46,7 @@ func TestDefaultMultiTenantManager_SyncFullRuleGroups(t *testing.T) {
user2Group1 = createRuleGroup("group-1", user2, createRecordingRule("sum:metric_1", "sum(metric_1)"))
)

m, err := NewDefaultMultiTenantManager(Config{RulePath: t.TempDir()}, managerMockFactory, nil, logger, nil)
m, err := NewDefaultMultiTenantManager(Config{RulePath: t.TempDir()}, managerMockFactory, nil, logger, nil, nil)
require.NoError(t, err)

// Initialise the manager with some rules and start it.
@@ -132,7 +132,7 @@ func TestDefaultMultiTenantManager_SyncPartialRuleGroups(t *testing.T) {
user2Group1 = createRuleGroup("group-1", user2, createRecordingRule("sum:metric_1", "sum(metric_1)"))
)

m, err := NewDefaultMultiTenantManager(Config{RulePath: t.TempDir()}, managerMockFactory, nil, logger, nil)
m, err := NewDefaultMultiTenantManager(Config{RulePath: t.TempDir()}, managerMockFactory, nil, logger, nil, nil)
require.NoError(t, err)
t.Cleanup(m.Stop)

@@ -309,7 +309,7 @@ func TestDefaultMultiTenantManager_WaitsToDrainPendingNotificationsOnShutdown(t
NotificationTimeout: 10 * time.Second,
DrainNotificationQueueOnShutdown: true,
}
m, err := NewDefaultMultiTenantManager(cfg, managerMockFactory, nil, logger, nil)
m, err := NewDefaultMultiTenantManager(cfg, managerMockFactory, nil, logger, nil, nil)
require.NoError(t, err)

m.SyncFullRuleGroups(ctx, map[string]rulespb.RuleGroupList{