-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose row count statistics in GpuShuffleExchangeExec #1855
Conversation
Signed-off-by: Andy Grove <andygrove@nvidia.com>
shims/spark301/src/main/scala/com/nvidia/spark/rapids/shims/spark301/Spark301Shims.scala
Outdated
Show resolved
Hide resolved
…ark301/Spark301Shims.scala Co-authored-by: Jason Lowe <jlowe@nvidia.com>
…Exec instead of SparkPlan Signed-off-by: Andy Grove <andygrove@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
val shuffleExchanges = ShimLoader.getSparkShims | ||
.findOperators(innerAdaptivePlan, _.isInstanceOf[ShuffleQueryStageExec]) | ||
.map(_.asInstanceOf[ShuffleQueryStageExec]) | ||
assert(shuffleExchanges.nonEmpty) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe add a message to the assert for a more descriptive exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We haven't generally been using assertion messages. I have updated the assertion to make it more specific though.
@@ -134,6 +135,10 @@ class Spark300Shims extends SparkShims { | |||
override def isShuffleExchangeLike(plan: SparkPlan): Boolean = | |||
plan.isInstanceOf[ShuffleExchangeExec] | |||
|
|||
override def getQueryStageRuntimeStatistics(plan: QueryStageExec): Statistics = { | |||
Statistics(0, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Statistics(0)
since None is the default for rowcount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Signed-off-by: Andy Grove <andygrove@nvidia.com>
build |
The new unit test failed against 3.1.x. I am investigating this. |
Signed-off-by: Andy Grove <andygrove@nvidia.com>
build |
Sorry, @abellina @gerashegalov @jlowe I had to push yet another change. CI is green now. Thanks. |
.../spark311/src/main/scala/com/nvidia/spark/rapids/shims/spark311/GpuShuffleExchangeExec.scala
Show resolved
Hide resolved
val shuffleExchanges = ShimLoader.getSparkShims | ||
.findOperators(innerAdaptivePlan, _.isInstanceOf[ShuffleQueryStageExec]) | ||
.map(_.asInstanceOf[ShuffleQueryStageExec]) | ||
assert(shuffleExchanges.length == 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: triple equals produces better diagnostics:
scala> assert(1 == 0)
org.scalatest.exceptions.TestFailedException
at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528)
at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:527)
at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1387)
at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501)
... 59 elided
scala> assert(1 === 0)
org.scalatest.exceptions.TestFailedException: 1 did not equal 0
at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528)
at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:527)
at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1387)
at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501)
... 59 elided
Signed-off-by: Andy Grove <andygrove@nvidia.com>
build |
Signed-off-by: Andy Grove <andygrove@nvidia.com>
build |
Expose row count statistics in GpuShuffleExchangeExec so that we can later use them as an input to the cost model