*: support to execute CTE on MPP side #42296

winoros · 2023-03-16T02:06:00Z

What problem does this PR solve?

Issue Number: close #43333

Problem Summary:

This pull request adds support for executing CTEs on the MPP side.
There's a detailed design in this doc.

What is changed and how it works?

Refer to the detailed design mentioned above to see how the code works.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Support CTE on MPP side

ti-chi-bot · 2023-03-16T02:06:02Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

AilinKid
time-and-fate

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

ti-chi-bot · 2023-03-16T02:06:02Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

store/copr/mpp.go

AilinKid · 2023-04-12T01:57:56Z

distsql/select_result.go

 				recorededPlanIDs[r.ctx.GetSessionVars().StmtCtx.RuntimeStatsColl.
-					RecordOneCopTask(planID, r.storeType.Name(), callee, detail)] = 0
+					RecordOneCopTask(-1, r.storeType.Name(), callee, detail)] = 0


Why change the old code here?

It's emmm...
I spent a lot of time debugging but couldn't find the reason why the original code panicked. Passing -1 always works, though.

planner/core/fragment.go

AilinKid · 2023-04-12T04:54:58Z

planner/core/fragment.go

 	for _, task := range tasks {
 		addr := task.Meta.GetAddress()
 		// for upper fragment, the task num is equal to address num covered by lower tasks
 		_, ok := addressMap[addr]
+		if _, okk := cteAddrMap[addr]; !okk && len(cteAddrMap) > 0 {


How should I understand this? If a child task's address is not in cteProducerAddrs and cteProducerAddrs is not empty, then this task is skipped?

The possible workers are decided from bottom to top, so any address that appears in the child fragments must also appear in the parent fragments.

I'm still confused here. Take this case:

3 TiFlash nodes: A, B, C
One CTE producer runs on A and B

When the shared CTE is one side of a join, the base table side should be left unmoved and the other side is broadcast.
join
+-- base table (un-moved)
+-- receiver2 (cte task from A,B)

I see: so your code here means that when the operator consuming the CTE is a join or something similar, we can always leave one side unmoved and keep the CTE as close to the data as possible?

After reading the design again, I see this is about aligning the workers here. Makes sense.

Yes, the current approach is easy to implement, but it doesn't use all the nodes of our MPP cluster.
We could support an enhanced n:m sending strategy so that most data is computed on the local node while all MPP nodes are still used, but that is not part of this PR.
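
For readers following this thread, here is a minimal sketch of the worker alignment rule discussed above. It is not the code from fragment.go: `alignWorkers` and its parameters are hypothetical names, and it assumes the branch guarded by the `cteAddrMap` condition skips the task, as the question suggests.

```go
package main

import (
	"fmt"
	"sort"
)

// alignWorkers keeps only the child task addresses that also host a CTE
// producer. An empty producer set means there is no CTE constraint, so every
// address is kept. All names are illustrative, not TiDB's real API.
func alignWorkers(childAddrs []string, cteAddrs map[string]struct{}) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, addr := range childAddrs {
		// Same shape as the condition in the diff: skip nodes without a producer.
		if _, ok := cteAddrs[addr]; !ok && len(cteAddrs) > 0 {
			continue
		}
		// One upper task per distinct address.
		if _, dup := seen[addr]; dup {
			continue
		}
		seen[addr] = struct{}{}
		out = append(out, addr)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Three TiFlash nodes A, B, C; the CTE producer runs on A and B only.
	producers := map[string]struct{}{"A": {}, "B": {}}
	fmt.Println(alignWorkers([]string{"A", "B", "C", "A"}, producers))
}
```

With the A/B/C case from this thread, node C is dropped, which is why this simple scheme does not use every MPP node.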

time-and-fate

Please add some test cases at least to show the new execution plan.

time-and-fate · 2023-04-14T09:11:42Z

go.mod

@@ -281,5 +281,6 @@ replace (
 	// fix potential security issue(CVE-2020-26160) introduced by indirect dependency.
 	github.com/dgrijalva/jwt-go => github.com/form3tech-oss/jwt-go v3.2.6-0.20210809144907-32ab6a8243d7+incompatible
 	github.com/pingcap/tidb/parser => ./parser
+	github.com/pingcap/tipb => github.com/pingcap/tipb v0.0.0-20230328072712-dd18a6bb40f1


Why modify here?

You can view the changes in plan_to_pb.go.

I mean, why are you using replace?

The PR on that side hasn't been merged yet, and you have not reviewed fragment.go yet,
so the pb might change as a result of your review.

time-and-fate · 2023-04-14T10:09:39Z

planner/core/optimizer.go

+func DoOptimize(ctx context.Context, sctx sessionctx.Context, flag uint64, logic LogicalPlan) (PhysicalPlan, float64, error) {
+	_, finalPlan, cost, err := DoOptimizeAndLogicAsRet(ctx, sctx, flag, logic)
+	return finalPlan, cost, err


Why do we need a DoOptimize and a DoOptimizeAndLogicAsRet?
I think DoOptimizeAndLogicAsRet can do all.

Many tests use DoOptimize; keeping both functions reduces the unnecessary changes in this PR. :joy:
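
The trade-off in this thread is the usual thin-wrapper pattern: the extended function returns one extra value, and the old entry point keeps its signature by discarding it. A minimal sketch with hypothetical stand-in types and function names (not the planner's real signatures):

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the planner's logical and physical plan types.
type logicalPlan string
type physicalPlan string

// optimizeAndReturnLogic plays the role of the extended function: it also
// returns the optimized logical plan alongside the physical plan and cost.
func optimizeAndReturnLogic(p logicalPlan) (logicalPlan, physicalPlan, float64, error) {
	if p == "" {
		return "", "", 0, errors.New("empty plan")
	}
	return p + "_optimized", physicalPlan("phys(" + string(p) + ")"), 1.0, nil
}

// optimize keeps the old three-value signature so existing callers and tests
// compile unchanged; it simply drops the extra return value.
func optimize(p logicalPlan) (physicalPlan, float64, error) {
	_, final, cost, err := optimizeAndReturnLogic(p)
	return final, cost, err
}

func main() {
	final, cost, err := optimize("scan")
	fmt.Println(final, cost, err)
}
```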

planner/core/physical_plans.go

time-and-fate · 2023-04-14T12:58:50Z

planner/core/find_best_task.go

@@ -478,6 +587,107 @@ END:
 	return bestTask, cntPlan, nil
 }

+// findBestTask implements LogicalPlan interface.
+func (p *LogicalSequence) findBestTask(prop *property.PhysicalProperty, planCounter *PlanCounterTp, opt *physicalOptimizeOp) (bestTask task, cntPlan int64, err error) {


What's the difference between (p *baseLogicalPlan) findBestTask and (p *LogicalSequence) findBestTask?

Seems the same as baseLogicalPlan.findBestTask.
+1

time-and-fate · 2023-04-14T13:31:08Z

util/plancodec/id.go

@@ -133,6 +133,8 @@ const (
 	TypeForeignKeyCheck = "Foreign_Key_Check"
 	// TypeForeignKeyCascade is the type of FKCascade
 	TypeForeignKeyCascade = "Foreign_Key_Cascade"
+	// TypeSequence
+	TypeSequence = "Sequence"


Please make sure the new execution plan can be displayed correctly in slow log, stmt summary, dashboard...

time-and-fate · 2023-04-14T14:31:30Z

planner/core/find_best_task.go

@@ -289,6 +289,115 @@ func (p *baseLogicalPlan) enumeratePhysicalPlans4Task(physicalPlans []PhysicalPl
 	return bestTask, cntPlan, nil
 }

+func (p *LogicalSequence) enumeratePhysicalPlans4Task(physicalPlans []PhysicalPlan,


(p *LogicalSequence) enumeratePhysicalPlans4Task and (p *baseLogicalPlan) enumeratePhysicalPlans4Task are 80% the same. I think it's better not to copy it.

But is there a good way to merge them? I haven't come up with one, so the code stays as it is for now.

You can just check if p.self is a LogicalSequence in (p *baseLogicalPlan) enumeratePhysicalPlans4Task.
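
A minimal sketch of that suggestion, using hypothetical stand-in types rather than the planner's real ones: the shared method lives on the base struct and branches on the concrete type of `self` only where Sequence needs different handling.

```go
package main

import "fmt"

// logicalPlan is a hypothetical stand-in for the planner's LogicalPlan interface.
type logicalPlan interface{ name() string }

// basePlan mimics a base struct that keeps a reference back to the concrete plan.
type basePlan struct{ self logicalPlan }

type logicalJoin struct{ basePlan }

func (*logicalJoin) name() string { return "Join" }

type logicalSequence struct{ basePlan }

func (*logicalSequence) name() string { return "Sequence" }

// enumerate is the single shared implementation. A type assertion on self
// selects the Sequence-specific branch instead of duplicating the method.
func (p *basePlan) enumerate() string {
	if _, ok := p.self.(*logicalSequence); ok {
		return "sequence-specific handling"
	}
	return "default handling"
}

func newJoin() *logicalJoin {
	j := &logicalJoin{}
	j.self = j
	return j
}

func newSequence() *logicalSequence {
	s := &logicalSequence{}
	s.self = s
	return s
}

func main() {
	fmt.Println(newJoin().enumerate())
	fmt.Println(newSequence().enumerate())
}
```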

time-and-fate · 2023-04-14T14:42:54Z

planner/property/physical_property.go

+type cteConsumerStatus int
+
+const (
+	NoCTE cteConsumerStatus = iota
+	SomeCTEFailedMpp
+	AllCTECanMpp
+)


I think:

1. The name should be "producer" status instead of "consumer" status.
2. Two values are enough.

planner/core/exhaust_physical_plans.go

AilinKid · 2023-04-17T09:46:21Z

planner/core/find_best_task.go

@@ -478,6 +587,107 @@ END:
 	return bestTask, cntPlan, nil
 }

+// findBestTask implements LogicalPlan interface.
+func (p *LogicalSequence) findBestTask(prop *property.PhysicalProperty, planCounter *PlanCounterTp, opt *physicalOptimizeOp) (bestTask task, cntPlan int64, err error) {


Seems the same as baseLogicalPlan.findBestTask.
+1

AilinKid · 2023-04-17T09:50:48Z

planner/core/find_best_task.go

-func (p *LogicalCTE) findBestTask(prop *property.PhysicalProperty, _ *PlanCounterTp, _ *physicalOptimizeOp) (t task, cntPlan int64, err error) {
+func (p *LogicalCTE) findBestTask(prop *property.PhysicalProperty, counter *PlanCounterTp, pop *physicalOptimizeOp) (t task, cntPlan int64, err error) {
+	if len(p.children) > 0 {
+		return p.baseLogicalPlan.findBestTask(prop, counter, pop)


When should we set the children for LogicalCTE? (children is a field it already has; did we just not use it before?)

Currently, we use whether it has children to tell whether it is a producer or a consumer.

Got it, makes sense. It would be better to add a comment about this above.

time-and-fate

If I understand it correctly, "CTE storage" and "CTE producer" are the same thing, and "CTE reader" and "CTE consumer" are the same thing.
I think unifying the naming is better for understanding.

time-and-fate · 2023-05-05T15:52:41Z

planner/core/fragment.go

 	return tasks, nil
 }

+// flipCTEReader fix the plan tree. Before we enter the func. The plan tree is like ParentPlan->CTEConsumer->ExchangeReceiver.
+// The CTEConsumer has no real meaning in MPP's execution. We prune it to make the plan become ParentPlan->ExchangeReceiver.
+// But the Recevier needs a schema since itself doesn't hold the schema. So the final plan become ParentPlan->ExchangeRecevier->CTEConsumer.


typo of "receive"

I think the result is just "ParentPlan->ExchangeRecevier".
I couldn't find where we put the PhysicalCTE under the PhysicalExchangeReceiver.

It's in generateTasksForCTEReader.

Please make that clearer in the comments then.
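
To make the transformation described in the quoted comment concrete, here is a small sketch on a toy plan tree. `planNode`, `flipCTEConsumer`, and `chain` are hypothetical names; the real logic is spread across flipCTEReader and generateTasksForCTEReader and may differ in detail.

```go
package main

import "fmt"

// planNode is a toy plan node; kind stands in for the operator type.
type planNode struct {
	kind     string
	children []*planNode
}

// flipCTEConsumer rewrites Parent->CTEConsumer->ExchangeReceiver into
// Parent->ExchangeReceiver->CTEConsumer, so the receiver sits where the
// parent expects input and the consumer stays below it to supply the schema.
func flipCTEConsumer(parent *planNode) {
	for i, child := range parent.children {
		if child.kind == "CTEConsumer" && len(child.children) == 1 &&
			child.children[0].kind == "ExchangeReceiver" {
			receiver := child.children[0]
			child.children = receiver.children
			receiver.children = []*planNode{child}
			parent.children[i] = receiver
		}
		flipCTEConsumer(parent.children[i])
	}
}

// chain renders the leftmost path of the tree for easy inspection.
func chain(n *planNode) string {
	s := n.kind
	for len(n.children) > 0 {
		n = n.children[0]
		s += "->" + n.kind
	}
	return s
}

func main() {
	root := &planNode{kind: "ParentPlan", children: []*planNode{
		{kind: "CTEConsumer", children: []*planNode{{kind: "ExchangeReceiver"}}},
	}}
	flipCTEConsumer(root)
	fmt.Println(chain(root))
}
```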

winoros · 2023-05-08T07:29:17Z

If I understand it correctly, "CTE storage" and "CTE producer" are the same thing, and "CTE reader" and "CTE consumer" are the same thing.
I think unifying the naming is better for understanding.

I want to do that in the next PR.

winoros · 2023-05-17T10:28:33Z

/review default

ti-chi-bot · 2023-05-17T10:28:36Z

@winoros:
Sorry, failed to send message to OpenAI server!

In response to this:

/review default

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

winoros · 2023-05-24T13:26:45Z

/merge

ti-chi-bot · 2023-05-24T13:26:49Z

This pull request has been accepted and is ready to merge.

Commit hash: 34864c8

winoros added 11 commits February 15, 2023 12:23

tmp

3f91dd3

fix the mpp prop

95375d7

Merge branch 'master' into add-sequence-operator

e50927c

Merge branch 'master' into add-sequence-operator

6285fb0

planner: support sending cte mpp task

5584c9a

fix panics

6044c08

some codes updates

6c69b8a

update the codes

d6c27bf

change style and clean

9133d92

Merge branch 'master' into add-sequence-operator

2dc1de9

clean the debugging info, make it ready for review

6c81152

winoros added 3 commits March 28, 2023 15:33

Merge branch 'master' into add-sequence-operator

380b4d1

push sequence down

e4010a8

Merge remote-tracking branch 'origin/master' into add-sequence-operator

699e39d

winoros force-pushed the add-sequence-operator branch from ec4d11d to 699e39d Compare April 4, 2023 11:39

ti-chi-bot added and removed the needs-rebase label ("Indicates a PR cannot be merged because it has merge conflicts with HEAD.") Apr 4, 2023

AilinKid reviewed Apr 12, 2023

winoros marked this pull request as ready for review April 12, 2023 09:07

ti-chi-bot removed the do-not-merge/work-in-progress label ("Indicates that a PR should not merge because it is a work in progress.") Apr 12, 2023

time-and-fate reviewed Apr 14, 2023

AilinKid reviewed Apr 17, 2023

time-and-fate reviewed May 5, 2023

winoros added 3 commits May 9, 2023 22:48

Merge branch 'master' into add-sequence-operator

1c8994f

fix the aggregation's bad case

e5879cd

Merge branch 'master' into add-sequence-operator

af5c066

ti-chi-bot bot removed the needs-rebase label ("Indicates a PR cannot be merged because it has merge conflicts with HEAD.") May 10, 2023

winoros added 3 commits May 17, 2023 17:18

Merge branch 'master' into add-sequence-operator

98868c0

address comments

84e8ac8

fix bazel_prepare

9004ef0

remove debug log

9088ad4

ti-chi-bot bot added the needs-rebase label ("Indicates a PR cannot be merged because it has merge conflicts with HEAD.") May 22, 2023

winoros added 2 commits May 23, 2023 20:53

address comments

06f4b3b

Merge branch 'master' into add-sequence-operator

f7026cd

ti-chi-bot bot removed the needs-rebase label ("Indicates a PR cannot be merged because it has merge conflicts with HEAD.") May 23, 2023

fix lint

51a1178

time-and-fate approved these changes May 24, 2023

ti-chi-bot bot added the status/LGT2 label ("Indicates that a PR has LGTM 2.") and removed the status/LGT1 label ("Indicates that a PR has LGTM 1.") May 24, 2023

fix tests

34864c8

ti-chi-bot bot added the status/can-merge label ("Indicates a PR has been approved by a committer.") May 24, 2023

qw4990 approved these changes May 24, 2023

ti-chi-bot bot merged commit 610ca18 into pingcap:master May 24, 2023

winoros deleted the add-sequence-operator branch May 24, 2023 14:21

winoros mentioned this pull request Jul 25, 2023

optimizer: add docs for shared cte and explain enhancement pingcap/docs-cn#14634

Merged

17 tasks

AilinKid mentioned this pull request Nov 17, 2023

planner: fix issue 48643 that aggDesc modification will change the referrence #48662

Merged

13 tasks
