Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change the explanation of why the operator will not work on GPU #4328

Merged
merged 6 commits into from
Dec 11, 2021

Conversation

HaoYang670
Copy link
Collaborator

@HaoYang670 HaoYang670 commented Dec 8, 2021

Signed-off-by: HaoYang670 13716567376yh@gmail.com
close #4088

  • Change explanation in log messages for unsupported operators. Instead of printing “NOT_FOUND”, we tell users that the GPU does not support the operator.

HaoYang670 and others added 2 commits December 8, 2021 16:53
Signed-off-by: remzi <13716567376yh@gmail.com>
@HaoYang670
Copy link
Collaborator Author

build

@@ -124,7 +124,7 @@ the driver logs with `spark.rapids.sql.explain=all`.
this version:

```
!NOT_FOUND <RowDataSourceScanExec> cannot run on GPU because no GPU enabled version of operator class org.apache.spark.sql.execution.RowDataSourceScanExec could be found
!Unsupported operator: <RowDataSourceScanExec> cannot run on GPU because no GPU enabled version of operator class org.apache.spark.sql.execution.RowDataSourceScanExec could be found
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still says "no GPU enabled version of operator class .... could be found". The original problem that triggered #4088 was that the "class not found" message led the user to believe this was a classloader/classpath issue rather than a normal situation where we know this Catalyst operation is not supported.

@HaoYang670
Copy link
Collaborator Author

build

@HaoYang670 HaoYang670 changed the title Replace 'NOT_FOUND' by 'Unsupported operator` Change the explanation of why the operator will not work on GPU Dec 9, 2021
Signed-off-by: remzi <13716567376yh@gmail.com>
@HaoYang670
Copy link
Collaborator Author

build

@res-life res-life self-requested a review December 9, 2021 10:49
@@ -1307,7 +1307,7 @@ final class RuleNotFoundExprMeta[INPUT <: Expression](
extends ExprMeta[INPUT](expr, conf, parent, new NoRuleDataFromReplacementRule) {

override def tagExprForGpu(): Unit =
willNotWorkOnGpu(s"no GPU enabled version of expression ${expr.getClass} could be found")
willNotWorkOnGpu(s"GPU does not currently support the operator ${expr.getClass}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need the "currently"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "currently" is appropriate, as without it the user may incorrectly believe the operation is impossible to support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, I didn't see the context of the other changes. I agree that we should be consistent here. If we're not using "currently" in the other instances when we state something is unsupported then we shouldn't state it only here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I made some inconsistent changes. I will add "currently" to all changes.

@@ -1307,7 +1307,7 @@ final class RuleNotFoundExprMeta[INPUT <: Expression](
extends ExprMeta[INPUT](expr, conf, parent, new NoRuleDataFromReplacementRule) {

override def tagExprForGpu(): Unit =
willNotWorkOnGpu(s"no GPU enabled version of expression ${expr.getClass} could be found")
willNotWorkOnGpu(s"GPU does not currently support the operator ${expr.getClass}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, I didn't see the context of the other changes. I agree that we should be consistent here. If we're not using "currently" in the other instances when we state something is unsupported then we shouldn't state it only here.

@@ -46,7 +46,7 @@ trait DataFromReplacementRule {
* A version of DataFromReplacementRule that is used when no replacement rule can be found.
*/
final class NoRuleDataFromReplacementRule extends DataFromReplacementRule {
override val operationName: String = "NOT_FOUND"
override val operationName: String = "Not supported by GPU: "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant, as it later explains on the same line that the operation is not supported. We don't put "Not supported by GPU: " in front of other lines that will not end up on the GPU, so it's weird to call out this specially. I think this should just be an empty string.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your advice. I will change it.

…ght be supported by GPU in the future

Signed-off-by: remzi <13716567376yh@gmail.com>
@HaoYang670
Copy link
Collaborator Author

build

1 similar comment
@pxLi
Copy link
Collaborator

pxLi commented Dec 10, 2021

build

@pxLi
Copy link
Collaborator

pxLi commented Dec 10, 2021

re-triggered due to weird #4316

@sameerz sameerz added the task Work required that improves the product but is not user facing label Dec 10, 2021
@HaoYang670 HaoYang670 merged commit bdfc6ce into NVIDIA:branch-22.02 Dec 11, 2021
@HaoYang670 HaoYang670 deleted the issue4088 branch December 11, 2021 01:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Work required that improves the product but is not user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Change Explain to not print NOT_FOUND and to have some known CPU only tasks
5 participants