[ML] lat_long anomaly not detected anymore #2162

Open
pheyos opened this issue Dec 17, 2021 · 0 comments
pheyos commented Dec 17, 2021

Summary

A job configuration that used to produce an anomaly no longer does. We'd like to assess whether this is an expected change or at least one we want to tolerate.

Steps to reproduce

  1. Install the Kibana ecommerce sample data

  2. Create and run the anomaly detection lookback job (synchronize Kibana saved objects if needed)

    Config
    PUT _ml/anomaly_detectors/ecommerce-geo
    {
      "analysis_config" : {
        "bucket_span":"15m",
        "detectors": [
          {
            "detector_description": "Unusual coordinates by user",
            "function": "lat_long",
            "field_name": "geoip.location",
            "by_field_name": "user"
          }
        ],
        "influencers": [
          "geoip.country_iso_code",
          "day_of_week",
          "category.keyword",
          "user"
        ]
      },
      "data_description" : {
        "time_field": "order_date"
      },
      "datafeed_config": {
        "datafeed_id": "datafeed-ecommerce-geo",
        "indices": ["kibana_sample_data_ecommerce"],
        "query": {
          "bool": {
            "must": [
              {
                "match_all": {}
              }
            ]
          }
        }
      }
    }
    
  3. View the job results
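
For step 3, the results can also be checked without the UI. A minimal sketch, assuming the job ID `ecommerce-geo` from the config above and that the lookback has already finished; it lists the job's anomaly records, highest score first, so a record for the affected user would show up near the top if one was produced:

```
GET _ml/anomaly_detectors/ecommerce-geo/results/records
{
  "sort": "record_score",
  "desc": true
}
```

An empty `records` array here would match the behaviour described in the summary.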

Additional information

  • This job config was used to create screenshots for the docs, e.g. the first one in this section:
    image
  • The data for user jackson contains 114 documents: 113 share the same geoip.location near Los Angeles, and one has a geoip.location near New York.
    image
  • We can also see this by running a high precision geohash grid aggregation:
    GET kibana_sample_data_ecommerce/_search
    {
      "query": {
        "simple_query_string": {
          "query": "jackson",
          "fields": ["user"]
        }
      },
      "aggs": {
        "locations": {
          "geohash_grid": {"field": "geoip.location", "precision": 12}
        }
      },
      "size": 0
    }
    
    which returns:
    [...]
      "locations" : {
        "buckets" : [
          {
            "key" : "9q5cyr9qukez",
            "doc_count" : 113
          },
          {
            "key" : "dr5rs14yejbs",
            "doc_count" : 1
          }
        ]
      }
    
  • So from this data, the originally detected anomaly seems correct.
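
To read the two locations as coordinates rather than geohash keys, a `geo_centroid` sub-aggregation could be added to the same search. This is a sketch under the same assumptions as the query above (same index, field, and user filter); the aggregation name `centroid` is an arbitrary label chosen for illustration:

```
GET kibana_sample_data_ecommerce/_search
{
  "query": {
    "simple_query_string": {
      "query": "jackson",
      "fields": ["user"]
    }
  },
  "aggs": {
    "locations": {
      "geohash_grid": {"field": "geoip.location", "precision": 12},
      "aggs": {
        "centroid": {
          "geo_centroid": {"field": "geoip.location"}
        }
      }
    }
  },
  "size": 0
}
```

Each bucket should then carry a `centroid.location` with `lat` and `lon`, which makes it easier to confirm which bucket is the outlier.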
@pheyos pheyos added the :ml label Dec 17, 2021