Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fold column equal conditions when simplifying outer join #7679

Closed
eurekaka opened this issue Sep 12, 2018 · 1 comment
Closed

fold column equal conditions when simplifying outer join #7679

eurekaka opened this issue Sep 12, 2018 · 1 comment
Assignees
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@eurekaka
Copy link
Contributor

eurekaka commented Sep 12, 2018

Feature Request

Is your feature request related to a problem? Please describe:

Currently, for the following 4 kinds of query, outer join is generated, but they can be converted to inner join semantically.

create table t1(id primary key, a int, b int);
create table t2(id primary key, a int, b int);

mysql> explain select * from t1 left join t2 on t1.a=t2.a where t1.b = t2.b and t1.b = 2;
+---------------------------+----------+------+-------------------------------------------------------------------------+
| id                        | count    | task | operator info                                                           |
+---------------------------+----------+------+-------------------------------------------------------------------------+
| Selection_7               | 10.00    | root | eq(test.t1.b, test.t2.b)                                                |
| └─HashLeftJoin_8          | 12.50    | root | left outer join, inner:TableReader_13, equal:[eq(test.t1.a, test.t2.a)] |
|   ├─TableReader_11        | 10.00    | root | data:Selection_10                                                       |
|   │ └─Selection_10        | 10.00    | cop  | eq(test.t1.b, 2)                                                        |
|   │   └─TableScan_9       | 10000.00 | cop  | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo             |
|   └─TableReader_13        | 10000.00 | root | data:TableScan_12                                                       |
|     └─TableScan_12        | 10000.00 | cop  | table:t2, range:[-inf,+inf], keep order:false, stats:pseudo             |
+---------------------------+----------+------+-------------------------------------------------------------------------+
7 rows in set (0.01 sec)

mysql> explain select * from t1 left join t2 on t1.a = 1 where t1.a = t2.a;
+-------------------------+--------------+------+---------------------------------------------------------------------+
| id                      | count        | task | operator info                                                       |
+-------------------------+--------------+------+---------------------------------------------------------------------+
| Selection_7             | 80000000.00  | root | eq(test.t1.a, test.t2.a)                                            |
| └─HashLeftJoin_8        | 100000000.00 | root | left outer join, inner:TableReader_12, left cond:[eq(test.t1.a, 1)] |
|   ├─TableReader_10      | 10000.00     | root | data:TableScan_9                                                    |
|   │ └─TableScan_9       | 10000.00     | cop  | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo         |
|   └─TableReader_12      | 10000.00     | root | data:TableScan_11                                                   |
|     └─TableScan_11      | 10000.00     | cop  | table:t2, range:[-inf,+inf], keep order:false, stats:pseudo         |
+-------------------------+--------------+------+---------------------------------------------------------------------+
6 rows in set (0.00 sec)

mysql> explain select * from t1 left join t2 on t1.a = t2.a where t1.b = t2.b and t1.b > 1;
+---------------------------+----------+------+-------------------------------------------------------------------------+
| id                        | count    | task | operator info                                                           |
+---------------------------+----------+------+-------------------------------------------------------------------------+
| Selection_7               | 3333.33  | root | eq(test.t1.b, test.t2.b)                                                |
| └─HashLeftJoin_8          | 4166.67  | root | left outer join, inner:TableReader_13, equal:[eq(test.t1.a, test.t2.a)] |
|   ├─TableReader_11        | 3333.33  | root | data:Selection_10                                                       |
|   │ └─Selection_10        | 3333.33  | cop  | gt(test.t1.b, 1)                                                        |
|   │   └─TableScan_9       | 10000.00 | cop  | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo             |
|   └─TableReader_13        | 10000.00 | root | data:TableScan_12                                                       |
|     └─TableScan_12        | 10000.00 | cop  | table:t2, range:[-inf,+inf], keep order:false, stats:pseudo             |
+---------------------------+----------+------+-------------------------------------------------------------------------+
7 rows in set (0.00 sec)

mysql> explain select * from t1 left join t2 on t1.a > 1 where t1.a = t2.a;
+-------------------------+--------------+------+---------------------------------------------------------------------+
| id                      | count        | task | operator info                                                       |
+-------------------------+--------------+------+---------------------------------------------------------------------+
| Selection_7             | 80000000.00  | root | eq(test.t1.a, test.t2.a)                                            |
| └─HashLeftJoin_8        | 100000000.00 | root | left outer join, inner:TableReader_12, left cond:[gt(test.t1.a, 1)] |
|   ├─TableReader_10      | 10000.00     | root | data:TableScan_9                                                    |
|   │ └─TableScan_9       | 10000.00     | cop  | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo         |
|   └─TableReader_12      | 10000.00     | root | data:TableScan_11                                                   |
|     └─TableScan_11      | 10000.00     | cop  | table:t2, range:[-inf,+inf], keep order:false, stats:pseudo         |
+-------------------------+--------------+------+---------------------------------------------------------------------+
6 rows in set (0.00 sec)

If we can convert them to inner join, we can apply more predicate pushdown optimizations, and we have more choices when exploring physical join operators.

Describe the feature you'd like:

rewrite the query to a semantically equivalent query by constant propagation, then this new query is able to be converted to inner join in function simplifyOuterJoin. For the above 4 kinds of query, the query after rewrite is:

select * from t1 left join t2 on t1.a=t2.a where t1.b = t2.b and t1.b = 2;
=> select * from t1 left join t2 on t1.a = t2.a where t2.b = 2 and t1.b = 2;

select * from t1 left join t2 on t1.a = 1 where t1.a = t2.a;
=> select * from t1 left join t2 on t1.a = 1 where 1 = t2.a;

select * from t1 left join t2 on t1.a = t2.a where t1.b = t2.b and t1.b > 1;
=> select * from t1 left join t2 on t1.a = t2.a where t1.b = t2.b and t1.b > 1 and t2.b > 1;

select * from t1 left join t2 on t1.a > 1 where t1.a = t2.a;
=> select * from t1 left join t2 on t1.a > 1 where t1.a = t2.a and t2.a > 1;

This enhancement should be a small step forward to tackle #7559

@eurekaka eurekaka added the sig/planner SIG: Planner label Sep 12, 2018
@eurekaka eurekaka self-assigned this Sep 12, 2018
@eurekaka eurekaka added the type/enhancement The issue or PR belongs to an enhancement. label Sep 13, 2018
@eurekaka
Copy link
Contributor Author

I just came to realize that we do not need to rewrite the filters, instead, we can enhance foldConstant to handle null op column expressions.

@eurekaka eurekaka changed the title propagate constant before simplify outer join fold column equal conditions when simplifying outer join Sep 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

1 participant