Minor: Document LogicalPlan tree node transformations (#10010)
* Document LogicalPlan tree node transformations

* Add exists

* touchups, add apply_subqueries, map_subqueries
alamb authored Apr 10, 2024
1 parent 843caea commit 5820507
Showing 4 changed files with 52 additions and 12 deletions.
10 changes: 7 additions & 3 deletions datafusion/core/src/lib.rs
@@ -296,11 +296,15 @@
//! A [`LogicalPlan`] is a Directed Acyclic Graph (DAG) of other
//! [`LogicalPlan`]s, each potentially containing embedded [`Expr`]s.
//!
//! [`Expr`]s can be rewritten using the [`TreeNode`] API and simplified using
//! [`ExprSimplifier`]. Examples of working with and executing `Expr`s can be found in the
//! [`expr_api`.rs] example
//! `LogicalPlan`s can be rewritten with the [`TreeNode`] API; see the
//! [`tree_node module`] for more details.
//!
//! [`Expr`]s can also be rewritten with the [`TreeNode`] API and simplified using
//! [`ExprSimplifier`]. Examples of working with and executing `Expr`s can be
//! found in the [`expr_api`.rs] example.
//!
//! [`TreeNode`]: datafusion_common::tree_node::TreeNode
//! [`tree_node module`]: datafusion_expr::logical_plan::tree_node
//! [`ExprSimplifier`]: crate::optimizer::simplify_expressions::ExprSimplifier
//! [`expr_api`.rs]: https://github.com/apache/arrow-datafusion/blob/main/datafusion-examples/examples/expr_api.rs
//!
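The doc text above says expressions can be rewritten through a tree API and then simplified. As a rough illustration of what a bottom-up rewrite looks like, here is a minimal sketch on a toy expression type. `ToyExpr`, its `transform_up` method, and `fold_constants` are hypothetical names invented for this note; they are not DataFusion's `Expr`, `TreeNode`, or `ExprSimplifier`, and the real APIs have different signatures.

```rust
// Toy stand-in for an expression tree. This is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Literal(i64),
    Column(String),
    Add(Box<ToyExpr>, Box<ToyExpr>),
}

impl ToyExpr {
    /// Rewrites the tree bottom-up: children are rewritten first, then `f`
    /// is applied to the rebuilt node. Taking `self` by value lets
    /// unchanged subtrees be moved rather than cloned.
    fn transform_up<F>(self, f: &F) -> ToyExpr
    where
        F: Fn(ToyExpr) -> ToyExpr,
    {
        let rebuilt = match self {
            ToyExpr::Add(l, r) => ToyExpr::Add(
                Box::new((*l).transform_up(f)),
                Box::new((*r).transform_up(f)),
            ),
            leaf => leaf,
        };
        f(rebuilt)
    }
}

/// Hypothetical simplification rule: fold `Literal + Literal` into one literal.
fn fold_constants(expr: ToyExpr) -> ToyExpr {
    match expr {
        ToyExpr::Add(l, r) => match (*l, *r) {
            (ToyExpr::Literal(a), ToyExpr::Literal(b)) => ToyExpr::Literal(a + b),
            (l, r) => ToyExpr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}
```

Calling `expr.transform_up(&fold_constants)` on the toy tree for `a + (1 + 2)` yields the tree for `a + 3`, because the inner addition is folded before the outer node is examined.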
2 changes: 1 addition & 1 deletion datafusion/expr/src/logical_plan/mod.rs
@@ -22,7 +22,7 @@ pub mod dml;
mod extension;
mod plan;
mod statement;
mod tree_node;
pub mod tree_node;

pub use builder::{
build_join_schema, table_scan, union, wrap_projection_for_join_if_necessary,
20 changes: 19 additions & 1 deletion datafusion/expr/src/logical_plan/plan.rs
@@ -68,6 +68,11 @@ pub use datafusion_common::{JoinConstraint, JoinType};
/// an output relation (table) with a (potentially) different
/// schema. A plan represents a dataflow tree where data flows
/// from leaves up to the root to produce the query result.
///
/// # See also:
/// * [`tree_node`]: visiting and rewriting API
///
/// [`tree_node`]: crate::logical_plan::tree_node
#[derive(Clone, PartialEq, Eq, Hash)]
pub enum LogicalPlan {
/// Evaluates an arbitrary list of expressions (essentially a
@@ -238,7 +243,10 @@ impl LogicalPlan {
}

/// Returns all expressions (non-recursively) evaluated by the current
/// logical plan node. This does not include expressions in any children
/// logical plan node. This does not include expressions in any children.
///
/// Note this method `clone`s all the expressions. When possible, the
/// [`tree_node`] API should be used instead of this API.
///
/// The returned expressions do not necessarily represent or even
/// contribute to the output schema of this node. For example,
@@ -248,6 +256,8 @@ impl LogicalPlan {
/// The expressions do contain all the columns that are used by this plan,
/// so if there are columns not referenced by these expressions then
/// DataFusion's optimizer attempts to optimize them away.
///
/// [`tree_node`]: crate::logical_plan::tree_node
pub fn expressions(self: &LogicalPlan) -> Vec<Expr> {
let mut exprs = vec![];
self.apply_expressions(|e| {
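The note added to `expressions` recommends the visiting API because the accessor clones every expression. The difference can be sketched on a toy node as follows. `ToyNode` and `count_references` are hypothetical, and the toy closure returns nothing, whereas the real `apply_expressions` closure also returns a value that steers the traversal.

```rust
// Toy node holding expressions as strings. This is NOT DataFusion's `LogicalPlan`.
struct ToyNode {
    exprs: Vec<String>,
}

impl ToyNode {
    /// Clone-based accessor: allocates a new Vec and clones every expression.
    fn expressions(&self) -> Vec<String> {
        self.exprs.clone()
    }

    /// Borrowing visitor: calls `f` on each expression without cloning.
    fn apply_expressions<F: FnMut(&str)>(&self, mut f: F) {
        for e in &self.exprs {
            f(e.as_str());
        }
    }
}

/// Counts expressions that mention `column`, using only borrowed access.
fn count_references(node: &ToyNode, column: &str) -> usize {
    let mut count = 0;
    node.apply_expressions(|e| {
        if e.contains(column) {
            count += 1;
        }
    });
    count
}
```

A read-only question such as "how many expressions mention this column" never needs ownership, so the borrowing form avoids the allocation and the per-expression clone that `expressions()` would incur.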
@@ -773,10 +783,16 @@ impl LogicalPlan {
/// Returns a new `LogicalPlan` based on `self` with inputs and
/// expressions replaced.
///
/// Note this method creates an entirely new node, which requires a large
/// amount of cloning. When possible, the [`tree_node`] API should be used
/// instead of this API.
///
/// The exprs correspond to the same order of expressions returned
/// by [`Self::expressions`]. This function is used by optimizers
/// to rewrite plans using the following pattern:
///
/// [`tree_node`]: crate::logical_plan::tree_node
///
/// ```text
/// let new_inputs = optimize_children(..., plan, props);
///
@@ -1367,6 +1383,7 @@ macro_rules! handle_transform_recursion_up {
}

impl LogicalPlan {
/// Visits a plan similarly to [`Self::visit`], but including embedded subqueries.
pub fn visit_with_subqueries<V: TreeNodeVisitor<Node = Self>>(
&self,
visitor: &mut V,
@@ -1380,6 +1397,7 @@ impl LogicalPlan {
.visit_parent(|| visitor.f_up(self))
}

/// Rewrites a plan similarly to [`Self::rewrite`], but including embedded subqueries.
pub fn rewrite_with_subqueries<R: TreeNodeRewriter<Node = Self>>(
self,
rewriter: &mut R,
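The `*_with_subqueries` variants differ from the plain ones only in that they also descend into plans embedded inside expressions. A minimal sketch of that distinction, assuming a toy plan type: `MiniPlan`, `MiniExpr`, and `visited_names` are hypothetical, and the visit order used here (node, then its subqueries, then its inputs) is a choice made for this sketch rather than a statement about DataFusion's traversal.

```rust
// Toy types. These are NOT DataFusion's `LogicalPlan` / `Expr`.
enum MiniExpr {
    Column(String),
    /// An expression that embeds a whole plan, like `IN (SELECT ...)`.
    Subquery(Box<MiniPlan>),
}

struct MiniPlan {
    name: String,
    exprs: Vec<MiniExpr>,
    inputs: Vec<MiniPlan>,
}

impl MiniPlan {
    /// Pre-order visit of this node and its inputs only.
    fn visit<F: FnMut(&MiniPlan)>(&self, f: &mut F) {
        f(self);
        for input in &self.inputs {
            input.visit(&mut *f);
        }
    }

    /// Pre-order visit that also descends into plans embedded in expressions.
    fn visit_with_subqueries<F: FnMut(&MiniPlan)>(&self, f: &mut F) {
        f(self);
        for expr in &self.exprs {
            if let MiniExpr::Subquery(plan) = expr {
                plan.visit_with_subqueries(&mut *f);
            }
        }
        for input in &self.inputs {
            input.visit_with_subqueries(&mut *f);
        }
    }
}

/// Collects the names of the nodes reached by either traversal.
fn visited_names(plan: &MiniPlan, with_subqueries: bool) -> Vec<String> {
    let mut out = Vec::new();
    let mut record = |p: &MiniPlan| out.push(p.name.clone());
    if with_subqueries {
        plan.visit_with_subqueries(&mut record);
    } else {
        plan.visit(&mut record);
    }
    out
}
```

For a filter whose predicate embeds a subquery, the plain traversal reaches only the filter and its input, while the subquery-aware traversal also reaches the embedded plan.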
32 changes: 25 additions & 7 deletions datafusion/expr/src/logical_plan/tree_node.rs
@@ -15,17 +15,35 @@
// specific language governing permissions and limitations
// under the License.

//! Tree node implementation for logical plan

//! [`TreeNode`] based visiting and rewriting for [`LogicalPlan`]s
//!
//! Visiting (read only) APIs:
//! * [`LogicalPlan::visit`]: recursively visit the node and all of its inputs
//! * [`LogicalPlan::visit_with_subqueries`]: recursively visit the node and all of its inputs, including subqueries
//! * [`LogicalPlan::apply_children`]: recursively visit all inputs of this node
//! * [`LogicalPlan::apply_expressions`]: (non recursively) visit all expressions of this node
//! * [`LogicalPlan::apply_subqueries`]: (non recursively) visit all subqueries of this node
//! * [`LogicalPlan::apply_with_subqueries`]: recursively visit all inputs and embedded subqueries
//! * [`LogicalPlan::exists`]: search for an expression in a plan
//!
//! Rewriting (update) APIs:
//! * [`LogicalPlan::rewrite`]: recursively rewrite the node and all of its inputs
//! * [`LogicalPlan::map_children`]: recursively rewrite all inputs of this node
//! * [`LogicalPlan::map_expressions`]: (non recursively) rewrite all expressions of this node
//! * [`LogicalPlan::map_subqueries`]: (non recursively) rewrite all subqueries of this node
//! * [`LogicalPlan::rewrite_with_subqueries`]: recursively rewrite the node and all of its inputs, including subqueries
//!
//! (Re)creation APIs (these require substantial cloning and thus are slow):
//! * [`LogicalPlan::with_new_exprs`]: Create a new plan with different expressions
//! * [`LogicalPlan::expressions`]: Return a copy of the plan's expressions
use crate::{
Aggregate, Analyze, CreateMemoryTable, CreateView, CrossJoin, DdlStatement, Distinct,
DistinctOn, DmlStatement, Explain, Extension, Filter, Join, Limit, LogicalPlan,
Prepare, Projection, RecursiveQuery, Repartition, Sort, Subquery, SubqueryAlias,
Union, Unnest, Window,
dml::CopyTo, Aggregate, Analyze, CreateMemoryTable, CreateView, CrossJoin,
DdlStatement, Distinct, DistinctOn, DmlStatement, Explain, Extension, Filter, Join,
Limit, LogicalPlan, Prepare, Projection, RecursiveQuery, Repartition, Sort, Subquery,
SubqueryAlias, Union, Unnest, Window,
};
use std::sync::Arc;

use crate::dml::CopyTo;
use datafusion_common::tree_node::{
Transformed, TreeNode, TreeNodeIterator, TreeNodeRecursion,
};
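The module docs above contrast the rewriting APIs with the (re)creation APIs, which are described as slow because they clone. One way to see why an owned rewrite avoids cloning is the sketch below. `ToyPlan`, `Rewritten`, `rewrite_up`, and `remove_true_filter` are hypothetical names for illustration; `Rewritten` only loosely mirrors the idea of pairing a result with a "changed" flag and is not DataFusion's `Transformed` type.

```rust
// Toy types. These are NOT DataFusion's `LogicalPlan` or `Transformed`.
#[derive(Debug, PartialEq)]
enum ToyPlan {
    Scan(String),
    Filter { predicate: String, input: Box<ToyPlan> },
}

/// Pairs a rewritten value with a flag saying whether anything changed.
#[derive(Debug, PartialEq)]
struct Rewritten<T> {
    data: T,
    changed: bool,
}

impl ToyPlan {
    /// Rewrites direct inputs only, consuming `self` so nothing is cloned.
    fn map_children<F>(self, f: &mut F) -> Rewritten<ToyPlan>
    where
        F: FnMut(ToyPlan) -> Rewritten<ToyPlan>,
    {
        match self {
            ToyPlan::Filter { predicate, input } => {
                let child = f(*input);
                Rewritten {
                    data: ToyPlan::Filter { predicate, input: Box::new(child.data) },
                    changed: child.changed,
                }
            }
            leaf => Rewritten { data: leaf, changed: false },
        }
    }

    /// Recursive bottom-up rewrite built on top of `map_children`.
    fn rewrite_up<F>(self, f: &mut F) -> Rewritten<ToyPlan>
    where
        F: FnMut(ToyPlan) -> Rewritten<ToyPlan>,
    {
        let children = self.map_children(&mut |child: ToyPlan| child.rewrite_up(&mut *f));
        let node = f(children.data);
        Rewritten { data: node.data, changed: children.changed || node.changed }
    }
}

/// Hypothetical rule: drop filters whose predicate is the literal `true`.
fn remove_true_filter(plan: ToyPlan) -> Rewritten<ToyPlan> {
    match plan {
        ToyPlan::Filter { predicate, input } if predicate == "true" => {
            Rewritten { data: *input, changed: true }
        }
        other => Rewritten { data: other, changed: false },
    }
}
```

Because every method takes `self` by value, untouched subtrees are moved into the result instead of being copied, and the `changed` flag lets a caller skip follow-up work when a rule was a no-op.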