statistics: fix repetitive selectivity accounting and stabilify the result (#15536) #16052

Merged
merged 1 commit into from
Apr 3, 2020

@sre-bot sre-bot commented Apr 3, 2020

cherry-pick #15536 to release-3.1


What problem does this PR solve?

Problem Summary:

  • in Selectivity, the order of indexes in coll.Indices is non-deterministic, so the greedy search algorithm may return different results across runs. This confuses users, since the stats have not changed at all;
  • the greedy search algorithm sometimes accounts for the same selectivity more than once. For example, if the filter is t.a = 1 and t.b > 1 and t.c > 1, and there are two indexes idx1(a,b) and idx2(a,c), the greedy algorithm chooses both indexes and multiplies their separately computed selectivities. This is wrong, because the selectivity of t.a = 1 is accounted for twice.
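To make the double accounting concrete, here is a small arithmetic sketch. The per-predicate selectivities are made-up numbers, and the predicates are assumed to be independent; none of this comes from the actual stats code.

```go
package main

import "fmt"

// Hypothetical per-predicate selectivities (assumed independent).
const (
	selA = 0.1 // t.a = 1
	selB = 0.5 // t.b > 1
	selC = 0.4 // t.c > 1
)

// doubleCounted multiplies the selectivities of idx1(a,b) and idx2(a,c),
// so the factor for t.a = 1 enters the product twice.
func doubleCounted() float64 {
	idx1 := selA * selB
	idx2 := selA * selC
	return idx1 * idx2
}

// countedOnce accounts for each predicate exactly once.
func countedOnce() float64 {
	return selA * selB * selC
}

func main() {
	fmt.Printf("both indexes multiplied: %.4f\n", doubleCounted())
	fmt.Printf("each filter once:        %.4f\n", countedOnce())
}
```

Under these assumptions the double-counted estimate is too small by exactly the selectivity of t.a = 1.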

What is changed and how it works?

What's Changed:

  • do not choose indexes whose covered filters overlap;
  • sort the StatsNode slice before greedy search;
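A minimal sketch of the first change, assuming each candidate carries a bitmask of the filters it covers. The `node` type, the `pickNonOverlapping` function, and the scoring rule (most filters covered wins) are illustrative assumptions, not the code in this PR.

```go
package main

import (
	"fmt"
	"math/bits"
)

// One bit per filter expression in the example query.
const (
	filterA = 1 << iota // t.a = 1
	filterB             // t.b > 1
	filterC             // t.c > 1
)

// node is a hypothetical stand-in for StatsNode: mask has a bit set for
// every filter the node covers.
type node struct {
	name string
	mask uint64
}

// pickNonOverlapping repeatedly takes the unused node that covers the most
// filters, skipping any node that shares a filter with one already chosen.
func pickNonOverlapping(nodes []node) []node {
	var chosen []node
	var covered uint64
	used := make([]bool, len(nodes))
	for {
		best, bestCount := -1, 0
		for i, n := range nodes {
			if used[i] || n.mask&covered != 0 {
				continue // already taken, or overlaps a chosen node
			}
			if c := bits.OnesCount64(n.mask); c > bestCount {
				best, bestCount = i, c
			}
		}
		if best < 0 {
			return chosen
		}
		used[best] = true
		covered |= nodes[best].mask
		chosen = append(chosen, nodes[best])
	}
}

func main() {
	nodes := []node{
		{"col c", filterC},
		{"idx1(a,b)", filterA | filterB},
		{"idx2(a,c)", filterA | filterC},
	}
	for _, n := range pickNonOverlapping(nodes) {
		fmt.Println(n.name)
	}
}
```

In this sketch idx1(a,b) is chosen first, idx2(a,c) is then rejected because it shares t.a = 1, and the column on c covers the remaining filter. Ties are broken by position in the slice, which is why a deterministic input order matters.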

How it Works:

Note that how we sort the StatsNode slice affects the greedy search result. I put the PK at the end of the slice, indexes in the middle, and columns at the front, to enforce the heuristic rule that the PK is preferred over indexes in estimation, and indexes are preferred over columns.
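The ordering described above could be sketched as follows. The constant values, the `statsNode` fields, the `typeRank` helper, and the tie-break on ID are all assumptions made for illustration; the point is only that the sort order comes from a separate ranking function, so the type constants themselves stay untouched.

```go
package main

import (
	"fmt"
	"sort"
)

// Placeholder type tags. Their values are assumptions; the sketch never
// relies on their numeric order.
const (
	indexType = iota
	pkType
	colType
)

// statsNode is a hypothetical, trimmed-down stand-in for StatsNode.
type statsNode struct {
	tp int
	id int64
}

// typeRank maps a type tag to its slot in the sorted slice:
// columns first, indexes in the middle, PK last.
func typeRank(tp int) int {
	switch tp {
	case colType:
		return 0
	case indexType:
		return 1
	default: // pkType
		return 2
	}
}

// sortNodes orders nodes by type rank, then by ID (an assumed tie-break),
// so the result no longer depends on the incoming order.
func sortNodes(nodes []statsNode) {
	sort.Slice(nodes, func(i, j int) bool {
		if ri, rj := typeRank(nodes[i].tp), typeRank(nodes[j].tp); ri != rj {
			return ri < rj
		}
		return nodes[i].id < nodes[j].id
	})
}

func main() {
	nodes := []statsNode{{pkType, 1}, {indexType, 2}, {colType, 3}, {indexType, 1}}
	sortNodes(nodes)
	fmt.Println(nodes) // column, then the two indexes by ID, then the PK
}
```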

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test

Side effects

  • Performance regression: the change in the selectivity result may change the generated plan.
  • Breaking backward compatibility: no. To keep compatibility, I introduced the compareType function instead of changing the values of IndexType / PkType / ColType, because feedback encoding uses these constants.

Release note

sre-bot commented Apr 3, 2020

/run-all-tests

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 3, 2020
@lzmhhh123 lzmhhh123 left a comment

LGTM

@lzmhhh123 lzmhhh123 added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Apr 3, 2020
@zz-jason zz-jason merged commit 879c0e2 into pingcap:release-3.1 Apr 3, 2020
Labels
component/statistics status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug. type/3.1-cherry-pick

4 participants