Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Lens] Implement rare terms #121500

Merged
merged 15 commits into from
Jan 18, 2022
Merged

[Lens] Implement rare terms #121500

merged 15 commits into from
Jan 18, 2022

Conversation

flash1293
Copy link
Contributor

@flash1293 flash1293 commented Dec 17, 2021

This PR adds support for rare terms to aggconfigs and Lens.

Be aware of the following things:

  • No sorting
  • No "size" parameter (only max doc count like supported by API)
  • No other bucket, no missing values
  • No multi fields

If a precision warning is shown and the current sorting is ascending count, there's a call to action button to turn it into rare terms:
Screenshot 2021-12-17 at 11 08 02

Title changes automatically to "Rare values of"
Screenshot 2021-12-17 at 11 08 44

Sorting can be enabled from the flyout by picking "Rarity" in the ranking options:
Screenshot 2021-12-17 at 11 09 18

If it's picked, the "add field" button, the "size" input, order direction and existing advanced options are greyed out because they are not applicable. A new input is shown for the max doc count. Max doc count is defaulting to 1 which is also the min value. The maximum value is 100 (both values set by Elasticsearch in the same way)

@flash1293 flash1293 added release_note:enhancement Team:Visualizations Visualization editors, elastic-charts and infrastructure Team:AppServicesSv backport:skip This commit does not require backporting v8.1.0 labels Dec 17, 2021
@flash1293
Copy link
Contributor Author

flash1293 commented Dec 17, 2021

cc @MichaelMarcialis tests are failing, but I think we can start thinking about the open UI questions already

@flash1293
Copy link
Contributor Author

@ghudgins @MichaelMarcialis Just discussed this in the dev sync and @dej611 suggested to make this a new function instead of hacking it in as a rank option (there would be "Rare values" and "Top values"). In the rare values config UI there's just max doc count, no disabled features.

@ghudgins
Copy link

i'm speaking purely user interface here but "top values" implies the top of some list ordered by something. I think rare terms fits better in the top values function for this reason....if we have "rare values" then we could apply this logic and have "alphabetical values" and it gets too verbose. if there's a good technical reason to keep it separate then we can adjust but using Lens as a translation for these technical details was something I was hoping we could work in. just my opinion though... @MichaelMarcialis?

@flash1293
Copy link
Contributor Author

flash1293 commented Dec 17, 2021

No technical reason to keep them apart, it doesn't matter much either way - our operations architecture is flexible enough to handle both ways well

@dej611
Copy link
Contributor

dej611 commented Dec 17, 2021

i'm speaking purely user interface here but "top values" implies the top of some list ordered by something. I think rare terms fits better in the top values function for this reason....if we have "rare values" then we could apply this logic and have "alphabetical values" and it gets too verbose. if there's a good technical reason to keep it separate then we can adjust but using Lens as a translation for these technical details was something I was hoping we could work in. just my opinion though... @MichaelMarcialis?

I think I agree with you for the general case. I have no strong opinion here, but just thought it may be useful to discuss this peculiar case.
The particular case of Rare terms vs other ranking logics here is mostly a UI/UX problem, where the user has everything disabled or "reduced" just picking a Rank by option: while for a regular single term case this means losing few bits like the Other and Missing categories, in case of multi terms the Rank by option "Rarity" gives the illusion to work but effectively will remove the full list of terms already configured. That was the lead for the proposal of a specific function.

@flash1293
Copy link
Contributor Author

flash1293 commented Dec 17, 2021

I don't have a strong opinion here - given that this is more of an expert feature anyway (an everyday user won't look at the rare terms) I'm leaning slightly towards keeping it as a special sorting option so it's not bloating the list of operations with "weird" things.

@MichaelMarcialis
Copy link
Contributor

I don't have a particularly strong opinion here regarding whether this should be a standalone quick function versus a sorting option within the existing top values function. My only concern with adding it as a sorting option to the top values function is discoverability. However, if we're comfortable making the feature a bit less discoverable, then it's a non-issue (which I assume is indeed the case, given Joe's description of it being a more expert feature).

@flash1293 flash1293 marked this pull request as ready for review January 10, 2022 17:55
@flash1293 flash1293 requested review from a team as code owners January 10, 2022 17:55
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-vis-editors @elastic/kibana-vis-editors-external (Team:VisEditors)

@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app-services (Team:AppServicesSv)

@flash1293
Copy link
Contributor Author

Not sure why docs build is failing - the build contains this line:

17:18:34 runbld>>> There were errors while communicating with Elasticsearch.   There may be data missing in the Build Stats cluster.   Infra has been notified.

@flash1293
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@Dosant Dosant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rare terms agg addition lgtm

@flash1293
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@dej611 dej611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and it works.
Left some minor comments, and found some edge case problem with the labelling.

When the label is cleared the placeholder is not handling the Rarity flag properly:
Screenshot 2022-01-17 at 14 59 56

label={i18n.translate('xpack.lens.indexPattern.terms.maxDocCount', {
defaultMessage: 'Max doc count per term',
})}
maxValue={100}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where this value come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved it into a constant, magic numbers are bad

@flash1293
Copy link
Contributor Author

Good catch about the label thing - the problem is that the label input is storing the "old" value internally but it doesn't refresh if the order by is changed. Fixed it by re-rendering the label input in case the default label changes and the user hasn't changed it yet. This is fixing the same issue for the case if the user is changing the field name, then clearing the label input. @dej611 could you review again?

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
data 518 520 +2

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
data 2758 2766 +8

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
lens 1.0MB 1.0MB +2.7KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 291.1KB 291.2KB +72.0B
data 446.5KB 448.8KB +2.3KB
visTypeHeatmap 10.3KB 10.3KB +42.0B
visTypeMetric 8.7KB 8.7KB +14.0B
visTypePie 7.9KB 7.9KB +28.0B
visTypeTable 15.0KB 15.1KB +28.0B
visTypeVislib 18.7KB 18.7KB +28.0B
visTypeXy 41.1KB 41.3KB +168.0B
total +2.7KB
Unknown metric groups

API count

id before after diff
data 3355 3363 +8

References to deprecated APIs

id before after diff
data 477 485 +8

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Contributor

@dej611 dej611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested again and it worked fine 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Lens release_note:enhancement Team:Visualizations Visualization editors, elastic-charts and infrastructure v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants