planner: put all cardinality estimation code into a single separate package #46358

Closed
qw4990 opened this issue Aug 23, 2023 · 0 comments
Labels
epic/cardinality-estimation the optimizer cardinality estimation sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

qw4990 commented Aug 23, 2023

Enhancement

Currently, the cardinality estimation (CE) code is coupled with other modules and scattered across:

  1. The stats package: HistColl.Selectivity, HistColl.GetRowCountByXXX, HistColl.crossValidationSelectivity, etc.
  2. Physical optimization: fullJoinRowCountHelper.estimate, DataSource.getOriginalPhysicalTableScan, etc.
  3. Logical optimization: baseLogicalPlan.recursiveDeriveStats, LogicalPlan.DeriveStats, etc.

This coupling makes the CE module hard to maintain and evolve.

An ideal architecture is shown below, where boundaries between the CE module and other modules are clear:
(image: architecture diagram)

  1. At the bottom, the Stats module provides low-level interfaces that estimate cardinality for a range or a point directly on statistical structures such as Hist, TopN, etc.
  2. Built on Stats, the CE module provides high-level interfaces to estimate cardinality for CNF, Join, Agg, etc. All estimation strategies should live in this separate package, e.g.:
    2.1. How to estimate for Join (fullJoinRowCountHelper.estimate)?
    2.2. How to handle ModifyCnt?
    2.3. How to handle out-of-range estimation (outOfRangeEQSelectivity)?
    2.4. How to prioritize index statistics and column statistics?
    2.5. How to combine multiple kinds of statistics to make the estimation result more accurate (crossValidationSelectivity)?
    2.6. ...
  3. At the top, all other modules do cardinality estimation only through the interfaces provided by CE.

We have decided to refactor (reorganize) the related code to follow the design above.

@qw4990 qw4990 added type/enhancement The issue or PR belongs to an enhancement. sig/planner SIG: Planner epic/cardinality-estimation the optimizer cardinality estimation labels Aug 23, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 23, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 24, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 24, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 25, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 25, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 28, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 28, 2023
…l-optimization package into cardinality package (#46442)

ref #46358
ti-chi-bot bot pushed a commit that referenced this issue Aug 29, 2023
ti-chi-bot bot pushed a commit that referenced this issue Aug 29, 2023
@qw4990 qw4990 closed this as completed Aug 30, 2023