Change the explanation of why the operator will not work on GPU #4328

Merged
merged 6 commits into from
Dec 11, 2021
4 changes: 2 additions & 2 deletions docs/get-started/getting-started-workload-qualification.md
@@ -124,7 +124,7 @@ the driver logs with `spark.rapids.sql.explain=all`.
this version:

```
-!NOT_FOUND <RowDataSourceScanExec> cannot run on GPU because no GPU enabled version of operator class org.apache.spark.sql.execution.RowDataSourceScanExec could be found
+! <RowDataSourceScanExec> cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.RowDataSourceScanExec
```

This log can show you which operators (on what data type) can not run on GPU and the reason.
@@ -152,7 +152,7 @@ analysis.
For example, the log lines starting with `!` is the so-called not-supported messages:
```
!Exec <GenerateExec> cannot run on GPU because not all expressions can be replaced
-!NOT_FOUND <ReplicateRows> replicaterows(sum#99L, gender#76) cannot run on GPU because no GPU enabled version of expression class
+! <ReplicateRows> replicaterows(sum#99L, gender#76) cannot run on GPU because GPU does not currently support the operator ReplicateRows
```
The indentation indicates the parent and child relationship for those expressions.
If not all of the children expressions can run on GPU, the parent can not run on GPU either.
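As a rough illustration of how these not-supported lines could be post-processed, the following Python sketch pulls the node name and the reason out of each line starting with `!`. The regular expression, the function name `parse_not_supported`, and the two-space indentation in the sample are assumptions made for this example; they are not part of the plugin, and the real log layout may differ.

```python
import re

# Assumed line shape (not taken from the plugin source):
#   [indent]![operation name] <NodeName> [details] cannot run on GPU because <reason>
NOT_SUPPORTED = re.compile(
    r"^(?P<indent>\s*)!(?P<kind>\S*)\s*<(?P<node>[^>]+)>.*?"
    r"cannot run on GPU because (?P<reason>.*)$"
)


def parse_not_supported(log_text):
    """Return (indent width, node name, reason) for each not-supported line."""
    results = []
    for line in log_text.splitlines():
        match = NOT_SUPPORTED.match(line)
        if match:
            results.append(
                (
                    len(match.group("indent")),
                    match.group("node"),
                    match.group("reason").strip(),
                )
            )
    return results


# Sample input built from the messages above; the indentation is assumed.
sample = "\n".join([
    "!Exec <GenerateExec> cannot run on GPU because not all expressions can be replaced",
    "  ! <ReplicateRows> replicaterows(sum#99L, gender#76) cannot run on GPU because "
    "GPU does not currently support the operator ReplicateRows",
])
parsed = parse_not_supported(sample)
```

The indent width gives a crude depth, which is one way to relate a child expression to the parent line above it.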
2 changes: 1 addition & 1 deletion integration_tests/src/main/python/explain_test.py
@@ -83,7 +83,7 @@ def do_explain(spark):
df2 = df.select(slen("name").alias("slen(name)"), to_upper("name"), add_one("age"))
explain_str = spark.sparkContext._jvm.com.nvidia.spark.rapids.ExplainPlan.explainPotentialGpuPlan(df2._jdf, "ALL")
# udf shouldn't be on GPU
-udf_str_not = 'cannot run on GPU because no GPU enabled version of operator class org.apache.spark.sql.execution.python.BatchEvalPythonExec'
+udf_str_not = 'cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.python.BatchEvalPythonExec'
assert udf_str_not in explain_str
not_on_gpu_str = spark.sparkContext._jvm.com.nvidia.spark.rapids.ExplainPlan.explainPotentialGpuPlan(df2._jdf, "NOT")
assert udf_str_not in not_on_gpu_str
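If the same check has to pass against builds from before and after this wording change, one option is to accept either message. The sketch below is an assumption for illustration only: `mentions_unsupported` is a hypothetical helper, not something in `explain_test.py`, and it works on plain strings so it needs no Spark session.

```python
def mentions_unsupported(explain_str, class_name):
    """Return True if explain_str reports class_name as unsupported, in either wording."""
    prefix = "cannot run on GPU because "
    # Wording before this change (message text taken from the test in this diff).
    old = prefix + "no GPU enabled version of operator class " + class_name
    # Wording after this change.
    new = prefix + "GPU does not currently support the operator class " + class_name
    return old in explain_str or new in explain_str


udf_exec = "org.apache.spark.sql.execution.python.BatchEvalPythonExec"

# Made-up sample explain lines in the new and old styles, for illustration only.
new_style = (
    "! <BatchEvalPythonExec> cannot run on GPU because "
    "GPU does not currently support the operator class " + udf_exec
)
old_style = (
    "!NOT_FOUND <BatchEvalPythonExec> cannot run on GPU because "
    "no GPU enabled version of operator class " + udf_exec + " could be found"
)
```

Matching on the reason text rather than the `!NOT_FOUND` prefix keeps the check independent of the `operationName` change made elsewhere in this PR.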
@@ -46,7 +46,7 @@ trait DataFromReplacementRule {
* A version of DataFromReplacementRule that is used when no replacement rule can be found.
*/
final class NoRuleDataFromReplacementRule extends DataFromReplacementRule {
-override val operationName: String = "NOT_FOUND"
+override val operationName: String = ""

override def confKey = "NOT_FOUND"

@@ -463,7 +463,7 @@ final class RuleNotFoundPartMeta[INPUT <: Partitioning](
extends PartMeta[INPUT](part, conf, parent, new NoRuleDataFromReplacementRule) {

override def tagPartForGpu(): Unit = {
-willNotWorkOnGpu(s"no GPU enabled version of partitioning ${part.getClass} could be found")
+willNotWorkOnGpu(s"GPU does not currently support the operator ${part.getClass}")
}

override def convertToGpu(): GpuPartitioning =
@@ -498,7 +498,7 @@ final class RuleNotFoundScanMeta[INPUT <: Scan](
extends ScanMeta[INPUT](scan, conf, parent, new NoRuleDataFromReplacementRule) {

override def tagSelfForGpu(): Unit = {
-willNotWorkOnGpu(s"no GPU enabled version of scan ${scan.getClass} could be found")
+willNotWorkOnGpu(s"GPU does not currently support the operator ${scan.getClass}")
}

override def convertToGpu(): Scan =
@@ -534,7 +534,7 @@ final class RuleNotFoundDataWritingCommandMeta[INPUT <: DataWritingCommand](
extends DataWritingCommandMeta[INPUT](cmd, conf, parent, new NoRuleDataFromReplacementRule) {

override def tagSelfForGpu(): Unit = {
-willNotWorkOnGpu(s"no GPU accelerated version of command ${cmd.getClass} could be found")
+willNotWorkOnGpu(s"GPU does not currently support the operator ${cmd.getClass}")
}

override def convertToGpu(): GpuDataWritingCommand =
@@ -795,7 +795,7 @@ final class RuleNotFoundSparkPlanMeta[INPUT <: SparkPlan](
extends SparkPlanMeta[INPUT](plan, conf, parent, new NoRuleDataFromReplacementRule) {

override def tagPlanForGpu(): Unit =
-willNotWorkOnGpu(s"no GPU enabled version of operator ${plan.getClass} could be found")
+willNotWorkOnGpu(s"GPU does not currently support the operator ${plan.getClass}")

override def convertToGpu(): GpuExec =
throw new IllegalStateException("Cannot be converted to GPU")
@@ -1307,7 +1307,7 @@ final class RuleNotFoundExprMeta[INPUT <: Expression](
extends ExprMeta[INPUT](expr, conf, parent, new NoRuleDataFromReplacementRule) {

override def tagExprForGpu(): Unit =
-willNotWorkOnGpu(s"no GPU enabled version of expression ${expr.getClass} could be found")
+willNotWorkOnGpu(s"GPU does not currently support the operator ${expr.getClass}")
Collaborator

Need the "currently"?

Member

I think "currently" is appropriate, as without it the user may incorrectly believe the operation is impossible to support.

Member

Ah, sorry, I didn't see the context of the other changes. I agree that we should be consistent here. If we're not using "currently" in the other instances when we state something is unsupported then we shouldn't state it only here.

Collaborator Author

Sorry, I made some inconsistent changes. I will add "currently" to all changes.


override def convertToGpu(): GpuExpression =
throw new IllegalStateException("Cannot be converted to GPU")