-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix inconsistencies in AQE support for broadcast joins #1042
Conversation
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Andy Grove <andygrove@nvidia.com>
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsMeta.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Andy Grove <andygrove@nvidia.com>
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastJoinMeta.scala
Outdated
Show resolved
Hide resolved
also can we update the description or issue to have your findings on what the problem is |
Signed-off-by: Andy Grove <andygrove@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
build |
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuBroadcastJoinMeta.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Andy Grove <andygrove@nvidia.com>
build |
@@ -119,6 +119,13 @@ class Spark300Shims extends SparkShims { | |||
} | |||
} | |||
|
|||
override def isGpuBroadcastNestedLoopJoin(plan: SparkPlan): Boolean = { | |||
plan match { | |||
case _: GpuBroadcastNestedLoopJoinExecBase => true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for my late comment I was looking at something else and realized this doesn't need to be in the shim, you are using the Base class here so there is nothing shim specific about it.
* Fix inconsistencies with AQE support for broadcast joins Signed-off-by: Andy Grove <andygrove@nvidia.com> * code cleanup and change test behavior for Spark 3.0.0 Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix inconsistency Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix test failure with Spark 3.1.0 Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix inconsistency Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix imports Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix regression Signed-off-by: Andy Grove <andygrove@nvidia.com> * tighten up rules Signed-off-by: Andy Grove <andygrove@nvidia.com> * Move GpuBroadcastJoinMeta to com.nvidia.spark.rapids package Signed-off-by: Andy Grove <andygrove@nvidia.com> * Move GpuBroadcastJoinMeta to com.nvidia.spark.rapids package Signed-off-by: Andy Grove <andygrove@nvidia.com>
* Fix inconsistencies with AQE support for broadcast joins Signed-off-by: Andy Grove <andygrove@nvidia.com> * code cleanup and change test behavior for Spark 3.0.0 Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix inconsistency Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix test failure with Spark 3.1.0 Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix inconsistency Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix imports Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix regression Signed-off-by: Andy Grove <andygrove@nvidia.com> * tighten up rules Signed-off-by: Andy Grove <andygrove@nvidia.com> * Move GpuBroadcastJoinMeta to com.nvidia.spark.rapids package Signed-off-by: Andy Grove <andygrove@nvidia.com> * Move GpuBroadcastJoinMeta to com.nvidia.spark.rapids package Signed-off-by: Andy Grove <andygrove@nvidia.com>
* Fix inconsistencies with AQE support for broadcast joins Signed-off-by: Andy Grove <andygrove@nvidia.com> * code cleanup and change test behavior for Spark 3.0.0 Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix inconsistency Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix test failure with Spark 3.1.0 Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix inconsistency Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix imports Signed-off-by: Andy Grove <andygrove@nvidia.com> * fix regression Signed-off-by: Andy Grove <andygrove@nvidia.com> * tighten up rules Signed-off-by: Andy Grove <andygrove@nvidia.com> * Move GpuBroadcastJoinMeta to com.nvidia.spark.rapids package Signed-off-by: Andy Grove <andygrove@nvidia.com> * Move GpuBroadcastJoinMeta to com.nvidia.spark.rapids package Signed-off-by: Andy Grove <andygrove@nvidia.com>
…IDIA#1042) Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
This PR fixes some inconsistencies in the logic in broadcast joins for determining whether they can run on the GPU or not when AQE is enabled.
The existing logic for
BroadcastNestedLoopJoin
was expecting the build side to beGpuBroadcastExchangeExecBase
and would fail if the build side was aBroadcastQueryStageExec
as is the case when AQE is enabled and when the build side has already been materialized.Although the logic for
BroadcastHashJoin
was taking AQE into account, the code was inconsistent and incomplete so this PR addresses that too.Changes in this PR:
GpuBroadcastJoinMeta
base class introduced so that common logic for broadcast nested-loop and hash joins could be introduced for determining when these operators can run on GPU.GpuBroadcastNestedLoopJoin
with AQE on and off to confirm that the fixes work.TestUtils.operatorCount
tofindOperators
since the name was misleading (it returns a list of operators, not a count) and this change only impacted a few lines of existing code.I manually tested these changes with TPC-DS q90 and confirmed that the query now runs without error.
This closes #1035