Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache parsed expressions and binding analysis in more places. #14124

Merged
merged 10 commits into from
Jun 27, 2023

Conversation

gianm
Copy link
Contributor

@gianm gianm commented Apr 20, 2023

Main changes:

  1. Cache parsed and analyzed expressions within PlannerContext for a single SQL query.
  2. Cache parsed expressions together with input binding analysis using a new class AnalyzeExpr.

Edit: Scope reduced to only do (part of) the first item.

This speeds up SQL planning, because SQL planning involves parsing analyzing the same expression strings over and over again.

Main changes:

1) Cache parsed and analyzed expressions within PlannerContext for a
   single SQL query.

2) Cache parsed expressions together with input binding analysis using
   a new class AnalyzeExpr.

This speeds up SQL planning, because SQL planning involves parsing
analyzing the same expression strings over and over again.
@gianm
Copy link
Contributor Author

gianm commented Jun 26, 2023

In talking offline with @clintropolis, I learned he's planning on doing some work to make it faster to get required bindings in the future, minimizing the need to introduce AnalyzedExpr throughout the code base. So, I'm thinking of reworking this patch to be limited to Expr caching in PlannerContext during SQL planning. Marking it as a draft til then.

@gianm gianm marked this pull request as draft June 26, 2023 21:06
@gianm gianm marked this pull request as ready for review June 27, 2023 00:28
@gianm
Copy link
Contributor Author

gianm commented Jun 27, 2023

Pushed a simplifying commit and marked ready for review.

@@ -303,7 +303,7 @@ public static PostAggregator toPostAggregator(
postAggregatorVisitor.getOutputNamePrefix() + postAggregatorVisitor.getAndIncrementCounter(),
druidExpression.getExpression(),
null,
plannerContext.getExprMacroTable()
plannerContext.parse(druidExpression.getExpression())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: am conflicted, this reads funny like it isn't obvious at a glance it is parsing native expressions since like why would that be built-in to the plannerContext, otoh it would be hella tedious to always be plannerContext.getExpressionParser().parse(...) all the time..

Copy link
Contributor Author

@gianm gianm Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the idea is to avoid doing plannerContext.getExpressionParser().parse(...) all the time.

What do you think about naming it plannerContext.parseExpression(...)?

OTOH, we could also go with plannerContext.getExpressionParser().parse(...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about naming it plannerContext.parseExpression(...)?

that does seem nicer 👍

@gianm gianm merged commit c78d885 into apache:master Jun 27, 2023
73 checks passed
@gianm gianm deleted the expr-more-caching branch June 27, 2023 20:40
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023
sergioferragut pushed a commit to sergioferragut/druid that referenced this pull request Jul 21, 2023
…#14124)

* Cache parsed expressions and binding analysis in more places.

Main changes:

1) Cache parsed and analyzed expressions within PlannerContext for a
   single SQL query.

2) Cache parsed expressions together with input binding analysis using
   a new class AnalyzeExpr.

This speeds up SQL planning, because SQL planning involves parsing
analyzing the same expression strings over and over again.

* Fixes.

* Fix style.

* Fix test.

* Simplify: get rid of AnalyzedExpr, focus on caching.

* Rename parse -> parseExpression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants