Fix divide-by-zero in GpuAverage with ansi mode #2130
…erage Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
LGTM
@@ -330,7 +333,8 @@ object GpuDivideUtil {
 }

 // This is for doubles and floats...
-case class GpuDivide(left: Expression, right: Expression) extends GpuDivModLike {
+case class GpuDivide(left: Expression, right: Expression,
+    override val failOnErrorOverride: Option[Boolean] = None) extends GpuDivModLike {
The change looks good, but I was curious why we are using a different pattern from Spark, which just has a plain Boolean argument with a default value rather than an Option. Was this necessary because of the way we're using the shim layer?
Yes, I tried other ways but couldn't think of something cleaner.
I don't think we need the three-valued logic of Option[Boolean]. I think we can do it almost like Spark.
Let us undo the change to GpuDivModLike, just make failOnError non-lazy, and define:
    case class GpuDivide(
        left: Expression, right: Expression,
        override val failOnError: Boolean = ShimLoader.getSparkShims.shouldFailDivByZero()
    ) extends GpuDivModLike {
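The suggestion above can be sketched outside the plugin as follows. This is only an illustration under stated assumptions: `ShimStub`, `DivModLike`, `Divide`, and `eval` are hypothetical stand-ins (the stub simply pretends ANSI mode is on), not the spark-rapids classes. It shows a strict `val` in a trait being overridden by a case class constructor parameter with a default value, so a caller such as an average expression could pass `failOnError = false` explicitly.

```scala
// Hypothetical stand-in for ShimLoader.getSparkShims.shouldFailDivByZero();
// it always reports that ANSI mode is on so the sketch is self-contained.
object ShimStub {
  def shouldFailDivByZero(): Boolean = true
}

// Hypothetical simplification of a div/mod expression trait.
trait DivModLike {
  def left: Double
  def right: Double

  // Strict (non-lazy) val, so a constructor parameter may override it.
  val failOnError: Boolean = ShimStub.shouldFailDivByZero()

  // None models a SQL null result; the exception models ANSI behaviour.
  def eval(): Option[Double] =
    if (right != 0.0) Some(left / right)
    else if (failOnError) throw new ArithmeticException("divide by zero")
    else None
}

// The override is just a parameter with a default: two states, no Option needed.
case class Divide(
    left: Double,
    right: Double,
    override val failOnError: Boolean = ShimStub.shouldFailDivByZero()
) extends DivModLike
```

With this shape, `Divide(1.0, 0.0)` follows the session default, while `Divide(1.0, 0.0, failOnError = false)` yields an empty result instead of throwing, which is the behaviour an average over zero rows would want.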
build
 override lazy val evaluateExpression: GpuExpression = GpuDivide(
   GpuCast(cudfSum, DoubleType),
-  GpuCast(cudfCount, DoubleType))
+  GpuCast(cudfCount, DoubleType), Some(false))
nit: best practice is Option(false), but I think we can get away with a simple Boolean.
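For context on the nit, here is a small sketch of how `Option(...)` and `Some(...)` differ in plain Scala. The object and value names are made up for illustration. For a Boolean literal both produce an equal value, so the preference is mostly about null-safety and the inferred static type.

```scala
object OptionVsSome {
  // A null reference, as might come back from a Java API.
  val raw: String = null

  // Option.apply maps null to None.
  val wrapped: Option[String] = Option(raw)

  // Some wraps whatever it is given, including null.
  val unsafe: Option[String] = Some(raw)

  // For a non-null literal the two are equal; only the inferred static
  // type differs (Option[Boolean] versus Some[Boolean]).
  val viaOption = Option(false)
  val viaSome = Some(false)
}
```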
@@ -269,7 +269,10 @@ object GpuDivModLike {
}

trait GpuDivModLike extends CudfBinaryArithmetic {
  lazy val failOnError: Boolean = ShimLoader.getSparkShims.shouldFailDivByZero()
  val failOnErrorOverride: Option[Boolean] = None

Let us try without failOnErrorOverride: just make failOnError non-lazy, so we can override it.
You can override a lazy val:
    override lazy val failOnError: Boolean = failOnErrorOverride.getOrElse(GpuDivModLike.failOnError)
But I am fine with keeping this as is.
Yes, I just meant you can't use lazy on a constructor parameter.
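The distinction discussed in this thread can be illustrated with a short sketch. `Base` and `LazyOverride` are hypothetical names, not plugin code. A `lazy val` can be overridden by another `lazy val` in a class body, but the `lazy` modifier is not permitted on a constructor parameter, and a strict `val` cannot override a concrete lazy one.

```scala
trait Base {
  // Evaluated on first access rather than at construction time.
  lazy val failOnError: Boolean = true
}

// Allowed: a lazy val overridden by another lazy val in the class body.
class LazyOverride(overrideValue: Option[Boolean]) extends Base {
  override lazy val failOnError: Boolean = overrideValue.getOrElse(true)
}

// Neither of the following compiles, which is why the parameter-based
// approach needs the trait member to be a strict val:
//   case class Bad(override lazy val failOnError: Boolean) extends Base
//   case class AlsoBad(override val failOnError: Boolean) extends Base
```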
@gerashegalov we don't have ANSI semantics in several of the operations. In the Doing a quick search in Spark, I am finding that:
So the only class that really needs it is That said, if we want to match Spark more, I can look into adding the
@abellina this is the change I am suggesting, in a nutshell: https://github.com/abellina/spark-rapids/compare/agg/fix_ansi_avg...gerashegalov:agg/fix_ansi_avg?expand=1
@revans2 if we made failOnError non-lazy, then we wouldn't need an override inside the GpuDivide case class body but could do it just as a param.
@gerashegalov updated PR to incorporate your suggestion.
LGTM
build
Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
Fixes: #2078