[ML] Improves bucket span estimator stability. #21282

Merged: walterra merged 4 commits into elastic:master from ml-bucket-span-estimator-fixes on Jul 27, 2018

@walterra walterra commented Jul 26, 2018

Fixes #18163.

  • Fixes the bucket span estimator when median is selected as a detector function. agg.type.name is median and is therefore not usable as an Elasticsearch aggregation name. agg.type.dslName is percentile, which is the correct mapping. .dslName is also used for the aggregations behind the preview charts.
  • 7.0 will introduce a search.max_buckets setting which defaults to 10000. Bucket span estimation could then fail, because the values used to create the required aggregations could produce more buckets than that limit. This PR fixes that by taking search.max_buckets into account when calculating the time range used for the estimation. (The setting has been available since 6.2, so this is being backported to the current unreleased minor releases 6.4 and 6.5.)
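
To illustrate the first point, here is a minimal sketch of keying an aggregation by `dslName` rather than `name`. It is not the code from this PR: the `medianAgg` object, the `buildMetricAgg` helper and the `responsetime` field are hypothetical, and the sketch uses `percentiles`, which is the name of the Elasticsearch aggregation, with the median expressed as the 50th percentile.

```javascript
// Hypothetical detector aggregation; only `type.name` and `type.dslName`
// are mentioned in the PR description, everything else is assumed.
const medianAgg = {
  type: { name: 'median', dslName: 'percentiles' },
};

// Build the metric aggregation body for a field, keyed by the DSL name
// instead of the display name.
function buildMetricAgg(agg, fieldName) {
  const dslName = agg.type.dslName;
  const body = { field: fieldName };
  if (dslName === 'percentiles') {
    // The median is the 50th percentile.
    body.percents = [50];
  }
  return { [dslName]: body };
}

const request = buildMetricAgg(medianAgg, 'responsetime');
console.log(JSON.stringify(request));
```

Keying by `agg.type.name` would instead produce a `median` aggregation, which Elasticsearch did not offer at the time of this PR.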

@elasticmachine

Pinging @elastic/ml-ui

@peteharverson peteharverson left a comment

LGTM - just one minor typo! Not sure if 250 hours should be left as the maximum.

// only run the tests over the last 250 hours of data
// determine durations for bucket span estimation
// taking into account the clusters' search.max_buckets settings
// the polled_data_checker uses an aggretation interval of 1 minute

Typo - should be aggregation
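
For context on the comments quoted above, this is a rough sketch of how a duration might be capped so that a 1 minute aggregation interval stays within `search.max_buckets`. The function name `cappedDurationMs` and its parameters are assumptions for illustration, and the merged implementation may differ. The values (1 minute interval, 250 hour maximum, 10000 bucket default) are the ones mentioned in this thread.

```javascript
// Assumed values taken from the thread: a 1 minute aggregation interval,
// a 250 hour ceiling, and the 10000 default of search.max_buckets.
const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;

// Return the longest duration (in ms) whose date histogram stays within
// maxBuckets buckets at the given interval, capped at maxDurationMs.
function cappedDurationMs(maxBuckets, intervalMs, maxDurationMs) {
  const bucketLimitedMs = maxBuckets * intervalMs;
  return Math.min(bucketLimitedMs, maxDurationMs);
}

// 250 hours at 1 minute per bucket would need 15000 buckets, so the
// bucket limit wins and the duration shrinks to 10000 minutes.
const durationMs = cappedDurationMs(10000, MINUTE_MS, 250 * HOUR_MS);
console.log(durationMs / MINUTE_MS, 'minutes');
```

With a larger limit, for example 100000 buckets, the 250 hour ceiling would apply instead.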

@elasticmachine

💚 Build Succeeded

@jgowdyelastic jgowdyelastic left a comment

LGTM

@elasticmachine

💔 Build Failed

@walterra

retest

@elasticmachine

💚 Build Succeeded

@walterra walterra merged commit 3b6c9e3 into elastic:master Jul 27, 2018
@walterra walterra deleted the ml-bucket-span-estimator-fixes branch July 27, 2018 08:44
walterra added a commit to walterra/kibana that referenced this pull request Jul 27, 2018
walterra added a commit to walterra/kibana that referenced this pull request Jul 27, 2018
walterra added a commit that referenced this pull request Jul 27, 2018
walterra added a commit that referenced this pull request Jul 27, 2018
@lcawl lcawl added the bug label Oct 29, 2018
Labels
bug, Feature:Anomaly Detection, :ml, v6.4.0, v6.5.0, v7.0.0
Development

Successfully merging this pull request may close these issues.

[ML] Estimate bucket span fails for median function