[ML] Alerting rule for Anomaly Detection jobs monitoring #106084

Merged

Conversation

@darnautov commented Jul 19, 2021

Summary

Part of #101028

Adds a new alerting rule type for monitoring anomaly detection jobs and job groups.

This PR only contains the check for the datafeed state. The other health checks will be added in follow-up PRs.
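
As a rough illustration of what a datafeed state check involves, the sketch below flags datafeeds that are not in the `started` state while honoring an include/exclude job selection. All type and function names here (`DatafeedStatus`, `JobSelection`, `findNotStartedDatafeeds`) are hypothetical and are not taken from this PR. Treating an empty include list as "all jobs" is also an assumption.

```typescript
// Hypothetical shapes for illustration only; the actual Kibana ML plugin types may differ.
type DatafeedState = "started" | "stopped" | "starting" | "stopping";

interface DatafeedStatus {
  jobId: string;
  datafeedId: string;
  state: DatafeedState;
}

interface JobSelection {
  includeJobIds: string[]; // assumption: an empty list means "all jobs"
  excludeJobIds: string[];
}

// Returns the datafeeds of the selected jobs that are not currently started.
function findNotStartedDatafeeds(
  statuses: DatafeedStatus[],
  selection: JobSelection
): DatafeedStatus[] {
  return statuses.filter((status) => {
    const included =
      selection.includeJobIds.length === 0 ||
      selection.includeJobIds.indexOf(status.jobId) !== -1;
    const excluded = selection.excludeJobIds.indexOf(status.jobId) !== -1;
    return included && !excluded && status.state !== "started";
  });
}

// Usage: job-c is excluded, so only job-b's stopped datafeed is reported.
const sampleStatuses: DatafeedStatus[] = [
  { jobId: "job-a", datafeedId: "datafeed-job-a", state: "started" },
  { jobId: "job-b", datafeedId: "datafeed-job-b", state: "stopped" },
  { jobId: "job-c", datafeedId: "datafeed-job-c", state: "stopped" },
];

const notStarted = findNotStartedDatafeeds(sampleStatuses, {
  includeJobIds: [],
  excludeJobIds: ["job-c"],
});
```

A rule executor built along these lines could raise one alert per returned entry and include the job and datafeed IDs in the alert context.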

Checklist

@darnautov added the :ml, Feature:Anomaly Detection, v8.0.0, release_note:feature, auto-backport, v7.15.0, and Feature:Alerting/RuleTypes labels Jul 19, 2021
@darnautov self-assigned this Jul 19, 2021
@darnautov requested a review from a team as a code owner July 19, 2021 11:42
@elasticmachine

Pinging @elastic/ml-ui (:ml)

@@ -38,6 +42,8 @@ export type SharedServices = JobServiceProvider &
ResultsServiceProvider &
MlAlertingServiceProvider;

export type MlServicesProviders = JobsHealthServiceProvider;
Member

There's a bit of inconsistency with the naming; perhaps this should be called internalServices or something like that?
Something that is more obviously distinct from SharedServices above.

Contributor Author

Renamed in 5f886cb

@darnautov requested a review from a team as a code owner July 21, 2021 11:13
Member

@jgowdyelastic left a comment

Added one nitpick, but on the whole the code LGTM.

x-pack/plugins/ml/server/lib/alerts/jobs_health_service.ts (outdated, resolved)
Contributor

@lcawl left a comment

UI text LGTM!

Contributor

@peteharverson left a comment

Tested latest edit and LGTM

@darnautov enabled auto-merge (squash) July 22, 2021 11:41
@darnautov
Contributor Author

@elasticmachine merge upstream

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

id before after diff
ml 1729 1733 +4

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
ml 5.9MB 5.9MB +26.2KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 421.7KB 421.8KB +111.0B
ml 64.6KB 64.4KB -137.0B
total -26.0B

Unknown metric groups

async chunk count

id before after diff
ml 22 23 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @darnautov

@darnautov disabled auto-merge July 22, 2021 15:30
@darnautov merged commit 10ef0e9 into elastic:master Jul 22, 2021
@darnautov deleted the ml-101028-operational-alerting-rule branch July 22, 2021 15:30
@kibanamachine
Contributor

💔 Backport failed

Status Branch Result
7.x Commit could not be cherrypicked due to conflicts

To backport manually run:
node scripts/backport --pr 106084

darnautov added a commit to darnautov/kibana that referenced this pull request Jul 26, 2021
)

* [ML] init job health alerting rule type

* [ML] add health checks selection ui

* [ML] define schema

* [ML] support all jobs selection

* [ML] jobs health service

* [ML] add logger

* [ML] add context message

* [ML] fix default message for i18n

* [ML] check response size

* [ML] add exclude jobs control

* [ML] getResultJobsHealthRuleConfig

* [ML] change naming for shared services

* [ML] fix excluded jobs filtering

* [ML] check for execution results

* [ML] update context fields

* [ML] unit tests for getResultJobsHealthRuleConfig

* [ML] refactor and job ids check

* [ML] rename datafeed

* [ML] fix translation messages

* [ML] hide non-implemented tests

* [ML] remove job ids join from the getJobs call

* [ML] add validation for the tests config

* [ML] fix excluded jobs update

* [ML] update jobIdsDescription message

* [ML] allow selection all jobs only for include

* [ML] better ux for excluded jobs setup

* [ML] change rule type name

* [ML] fix typo

* [ML] change instances names

* [ML] fix messages

* [ML] hide error callout, show health checks error in EuiFormRow

* [ML] add check for job state

* [ML] add alertingRules key to the doc links

* [ML] update types

* [ML] remove redundant type

* [ML] fix job and datafeed states check

* [ML] fix job and datafeed states check, add comments

* [ML] add unit tests
# Conflicts:
#	src/core/public/doc_links/doc_links_service.ts
darnautov added a commit that referenced this pull request Jul 26, 2021
…106675)

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Labels
auto-backport, Feature:Alerting/RuleTypes, Feature:Anomaly Detection, :ml, release_note:feature, v7.15.0, v8.0.0
7 participants