*: re-implement partition pruning for better performance #14679

Merged
merged 15 commits into from
Feb 14, 2020

Conversation

tiancaiamao
@tiancaiamao tiancaiamao commented Feb 7, 2020

What problem does this PR solve?

Re-implement partition pruning for better performance.

The old partition pruning implementation uses an algorithm we call constraint propagation, which is powerful but slow.

It works like this:

For each partition:
    construct EXPR =  (partition's expression AND query's expression)
    if FixPoint(constraint propagate EXPR) == AlwaysFalse
            Prune

It's powerful because it can derive facts such as a > b and b > 3 => a > 3, and it also includes propagation rules for functions, such as x > const1, f(x) < const2, f is monotonic => false.

It can handle cases like:

create table t (a int, b int) partition by (a) ...
select * from t where a > b and b > 3;

But the process is slow because constructing new expressions involves many object allocations, and the constraint propagation itself is also heavy. If there are many partitions, 2048 for example, the total time spent on partition pruning becomes significant.

What is changed and how it works?

The new partition pruning algorithm is much faster because it uses binary search and avoids constructing expressions.

The query expression is abstracted to f(col) op const, where op is one of > < = >= <=; in the code it is represented as dataForPrune.

The 'partition p0 less than xxx, partition p1 less than xxx, ...' clauses form an array [p0 p1 p2 ... maxvalue], represented as lessThanData.

The new algorithm uses a binary search, pruneUseBinarySearch, to locate const in the array.
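The idea of locating const in the array of upper bounds can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: the name pruneByBinarySearch, the string operators, the int64 bounds, and the half-open partitionRange layout are all hypothetical, the column is used directly (no f), and MAXVALUE is not handled.

```go
package main

import (
	"fmt"
	"sort"
)

// partitionRange is a half-open interval [start, end) of partition indexes.
// The half-open layout is an assumption of this sketch.
type partitionRange struct{ start, end int }

// pruneByBinarySearch (hypothetical name) returns the partitions that may
// contain rows satisfying "val op c". lessThan holds the sorted upper bounds
// of the partitions: partition i stores lessThan[i-1] <= val < lessThan[i].
func pruneByBinarySearch(lessThan []int64, op string, c int64) partitionRange {
	n := len(lessThan)
	// pos is the partition that would store c: the first bound greater than c.
	// pos == n means c lies beyond the last bound.
	pos := sort.Search(n, func(i int) bool { return lessThan[i] > c })
	switch op {
	case "=":
		if pos == n {
			return partitionRange{0, 0} // no partition can store c
		}
		return partitionRange{pos, pos + 1}
	case "<", "<=":
		end := pos + 1
		// For "<", partition pos holds nothing below c when c is exactly
		// its lower bound.
		if op == "<" && pos > 0 && lessThan[pos-1] == c {
			end = pos
		}
		if end > n {
			end = n
		}
		return partitionRange{0, end}
	case ">", ">=":
		return partitionRange{pos, n}
	}
	// Unknown operator: keep every partition rather than prune incorrectly.
	return partitionRange{0, n}
}

func main() {
	bounds := []int64{10, 20, 30} // p0 < 10, p1 < 20, p2 < 30
	fmt.Println(pruneByBinarySearch(bounds, "=", 15)) // {1 2}
	fmt.Println(pruneByBinarySearch(bounds, "<", 20)) // {0 2}
	fmt.Println(pruneByBinarySearch(bounds, ">", 25)) // {2 3}
}
```

Each lookup costs O(log n) comparisons and allocates nothing, which is where the speedup over building one expression per partition would come from.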

A simple benchmark.
Before:

session git:(partition-prune) ✗ go test -test.bench BenchmarkPartitionPruning -test.run Ignore
BenchmarkPartitionPruning-4   	       8	 128010294 ns/op

After:

session git:(partition-prune) ✗ go test -test.bench BenchmarkPartitionPruning -test.run Ignore
BenchmarkPartitionPruning-4   	      21	  57979203 ns/op

Some side notes:

  1. Because null values are all located in the first partition, the first partition's expression in the old implementation is val < xxx or val is null. The or val is null condition is hard to eliminate, so the old implementation can't prune the first partition in many cases.
    The new implementation doesn't consider null values, so the first partition can be pruned.

  2. There is a relaxation step during pruning, and it may lead to less accurate pruning results.
    For example,

create table t (a datetime) partition by to_days(a) ...

This condition doesn't always hold:

a < const => to_days(a) < to_days(const)

A counterexample is:

2020-02-12 10:08:00 < 2020-02-12 23:59:59 => to_days(2020-02-12 10:08:00) = to_days(2020-02-12 23:59:59)

So we have to relax < to <= when the column is wrapped in a function:

a < const => to_days(a) <= to_days(const)

Check List

Tests

  • Unit test

Related changes

  • Need to cherry-pick to the release branch

Release note

  • Write release note for bug-fix or new feature.

The new partition pruning algorithm is not as powerful as the original one.
However, it is much faster because it uses binary search and avoids constructing expressions.
@tiancaiamao tiancaiamao requested review from a team as code owners February 7, 2020 14:42
@ghost ghost requested review from eurekaka and winoros and removed request for a team February 7, 2020 14:42
@tiancaiamao tiancaiamao added sig/planner SIG: Planner type/performance type/enhancement The issue or PR belongs to an enhancement. labels Feb 7, 2020
@tiancaiamao tiancaiamao changed the title *: re-implement partition pruning for better performance *: re-implement partition pruning for better performance [WIP] Feb 7, 2020
@tiancaiamao tiancaiamao changed the title *: re-implement partition pruning for better performance [WIP] *: re-implement partition pruning for better performance Feb 11, 2020
@tiancaiamao

/run-unit-test

PTAL @imtbkcat @zz-jason

@imtbkcat imtbkcat self-requested a review February 12, 2020 03:15
@SunRunAway SunRunAway removed the request for review from a team February 13, 2020 08:06
@imtbkcat imtbkcat left a comment

LGTM

@imtbkcat imtbkcat added the status/LGT1 Indicates that a PR has LGTM 1. label Feb 14, 2020
@tiancaiamao

tiancaiamao commented Feb 14, 2020

I just ran a benchmark on my own laptop and compared the results
(latency in milliseconds, lower is better).

The old version:

Mean:  3650.8462739778993
Quantile 80:  4975.331340000001
Quantile 95:  7505.103714999999
Quantile 99:  9457.780304
Total Succ: 181

and the new version:

Mean:  2072.399426448846
Quantile 80:  3150.99298025
Quantile 95:  4735.244528499999
Quantile 99:  6287.017979
Total Succ: 303

@lysu lysu left a comment

rest LGTM

func fullRange(end int) partitionRangeOR {
	var reduceAllocation [3]partitionRange
	reduceAllocation[0] = partitionRange{0, end}
	return partitionRangeOR(reduceAllocation[:1])
Contributor

Suggested change
- return partitionRangeOR(reduceAllocation[:1])
+ return reduceAllocation[:1]

// Let M = intersection, U = union, then
// a M (b U c) == (a M b) U (a M c)
ret := or[:0]
for _, r1 := range or {

@lysu lysu Feb 14, 2020

Just a question: can we make sure or is sorted?

If so, maybe we can optimize further to avoid the {6, 7} U {3, 9} call in {0, 4}, {6, 7}, {8, 11} U {3, 9}, or the {3, +INF} U {6, 7} call in {3, +INF} U {0, 4}, {6, 7}?

It seems leaf expressions (e.g. a > 3 in a > 3 and b < 5) can benefit from this (although intersectionRange is fast).

Contributor Author

Maintaining the sorted condition is more restrictive.
The or array is usually small, so I guess looping over the whole array versus maintaining a sorted array and breaking early doesn't make a big difference.
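
For readers following this thread, the loop over the whole array can be sketched as below. This is a minimal illustration, not the PR's code: the method name intersectWith and the half-open [start, end) layout are assumptions; only the type names partitionRange and partitionRangeOR and the ret := or[:0] reuse trick come from the snippets quoted in the review.

```go
package main

import "fmt"

// partitionRange is assumed to be a half-open interval [start, end).
type partitionRange struct{ start, end int }

// partitionRangeOR is a union of ranges; it is not required to be sorted.
type partitionRangeOR []partitionRange

// intersectWith (hypothetical name) applies a M (b U c) == (a M b) U (a M c):
// every member of the union is intersected with r, and empty results are
// dropped. The backing array of or is reused, so the caller must not use the
// old slice afterwards.
func (or partitionRangeOR) intersectWith(r partitionRange) partitionRangeOR {
	ret := or[:0] // writes never overtake reads, so in-place filtering is safe
	for _, r1 := range or {
		start, end := r1.start, r1.end
		if r.start > start {
			start = r.start
		}
		if r.end < end {
			end = r.end
		}
		if start < end {
			ret = append(ret, partitionRange{start, end})
		}
	}
	return ret
}

func main() {
	or := partitionRangeOR{{0, 4}, {6, 7}, {8, 11}}
	fmt.Println(or.intersectWith(partitionRange{3, 9})) // [{3 4} {6 7} {8 9}]
}
```

With a sorted array the loop could stop at the first range that starts at or beyond r.end, but for a handful of elements the saving is small, which matches the reasoning in the reply above.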

@lysu lysu left a comment

LGTM

@tiancaiamao tiancaiamao merged commit 9543a0f into pingcap:master Feb 14, 2020
@tiancaiamao tiancaiamao deleted the partition-prune branch February 14, 2020 11:18
tiancaiamao added a commit to tiancaiamao/tidb that referenced this pull request Mar 13, 2020
tiancaiamao added a commit to tiancaiamao/tidb that referenced this pull request Mar 24, 2020
tiancaiamao added a commit to tiancaiamao/tidb that referenced this pull request Mar 25, 2020
tiancaiamao added a commit to tiancaiamao/tidb that referenced this pull request Apr 23, 2020
Labels
sig/planner SIG: Planner status/LGT1 Indicates that a PR has LGTM 1. type/enhancement The issue or PR belongs to an enhancement. type/performance