Modify the default value of spark.rapids.sql.explain as NOT_ON_GPU #5819
Conversation
Force-pushed from de67451 to 225422b
build
So the only other thing here is that we may want to disable this for our tests so the logs don't get much bigger. Thoughts, @revans2?
I am not super concerned about the log size right now, but I have not tested it, so I'm not totally sure what the impact would be.
I would like to see the docs updated, though. The default getting started guide does not mention this config at all, and it might be confusing for users to see a WARNING message logged by default when it is mostly a performance issue.
Also, there are several places in the docs that mention setting this config, like https://github.com/NVIDIA/spark-rapids/blob/branch-22.08/docs/get-started/getting-started-on-prem.md where it says:
If you want to see why an operation did not run on the GPU you can turn on the configuration: --conf spark.rapids.sql.explain=NOT_ON_GPU. A log message will then be emitted to the driver log as to why a Spark operation is not able to run on the GPU.
I think we should go through the docs and make sure that anywhere sql.explain is mentioned is updated to match, and that the getting started guide and FAQ explain what these WARNING messages are all about and why they are logged at WARN.
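For context, the doc snippet quoted above boils down to passing the config at submit time. A minimal, hedged sketch of such an invocation follows; the jar filename, app script, and plugin class are illustrative placeholders, not taken from this thread:

```shell
# Sketch only: jar version and app script are placeholders.
spark-submit \
  --jars rapids-4-spark_2.12-22.08.0.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.explain=NOT_ON_GPU \
  my_app.py
```

With NOT_ON_GPU, the driver log would then carry messages explaining why individual operations stayed on the CPU.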
@tgravescs Do we need to update the same code in spark2-sql-plugin?
I simply searched for sql.explain in branch-22.08, and I think all the places look good. Most of them mention setting ALL to get more detailed explain logs.
Signed-off-by: sinkinben <sinkinben@outlook.com>
Force-pushed from 225422b to bb50fd8
@tgravescs I have checked the docs related to "sql.explain", and it seems the change has no effect on most of them. Shall I add some description about
I would like to see the getting started guide updated to talk about it, but that is minor and we can do it if we start to see questions about it.
There is a section on debugging in the getting started docs that references it: https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html#debugging
As far as adding more, it might be nice, like Bobby said, but I'm OK if we want to wait and see whether users complain or find it unclear.
Also, I just realized that this will break the explain API logic in explainCatalystSQLPlan: normally the explain API defaults to ALL unless spark.rapids.sql.explain is set to something other than NONE. So we need to update that to keep the default of ALL there.
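The default-preserving behavior described above could be sketched as follows. This is an illustrative helper, not the actual code in explainCatalystSQLPlan; the name effectiveExplainMode and the Option-based signature are assumptions:

```scala
// Hypothetical sketch: the explain API should keep ALL as its default unless
// the user set spark.rapids.sql.explain to something other than NONE.
def effectiveExplainMode(userSetting: Option[String]): String =
  userSetting match {
    case Some(mode) if mode != "NONE" => mode // honor an explicit, meaningful setting
    case _                            => "ALL" // otherwise keep the explain API default
  }
```

The point of the guard is that changing the config's global default to NOT_ON_GPU must not leak into the explain API, which documents ALL as its baseline.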
…alue of rapids.sql.explain Signed-off-by: sinkinben <sinkinben@outlook.com>
Signed-off-by: sinkinben <sinkinben@outlook.com>
logWarning("spark.rapids.sql.explain is set to the default value 'NOT_ON_GPU' to print " +
  "operations not executed on GPU. Set to 'ALL' to print the complete query plan and " +
  "display whether each operation is placed on the GPU. Set to 'NONE' to suppress the " +
  "diagnostics about the query placement on the GPU.")
I think we should move it inside the if branch if (conf.isSqlEnabled && conf.isSqlExecuteOnGPU) {
on L85, so it prints only if GPU execution is enabled (the default).
And it should only be printed when the value of spark.rapids.sql.explain implies some output (this is an evolution from my prior review), to make it shorter.
How about, around L85:
if (conf.isSqlEnabled && conf.isSqlExecuteOnGPU) {
  logWarning("RAPIDS Accelerator is enabled, to disable GPU " +
    s"support set `${RapidsConf.SQL_ENABLED}` to false.")
  if (conf.explain != "NONE") {
    logWarning(s"spark.rapids.sql.explain is set to ${conf.explain}. Set to 'NONE' to suppress " +
      "the diagnostics logging about the query placement on the GPU.")
  }
} else ...
I have updated the code here. Thanks.
Signed-off-by: sinkinben <sinkinben@outlook.com>
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala
* Fix typo in `logPluginMode`.
* Add a function `isSQLExplainExplicitlySet` to check whether rapids.sql.explain is explicitly set by the user, for use in `explainCatalystSQLPlan`.
Signed-off-by: sinkinben <sinkinben@outlook.com>
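A minimal sketch of what such a helper might look like. The name comes from the commit message above; the Map-based signature and body are assumptions for illustration, not the actual RapidsConf code:

```scala
// Assumed sketch: returns true when the user explicitly supplied the config.
// The real helper in RapidsConf.scala may consult the Spark SQLConf instead.
def isSQLExplainExplicitlySet(settings: Map[String, String]): Boolean =
  settings.contains("spark.rapids.sql.explain")
```

Distinguishing "explicitly set" from "defaulted" is what lets the explain API keep ALL as its own default after the global default changes.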
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: sinkinben <sinkinben@outlook.com>
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: sinkinben <sinkinben@outlook.com>
Force-pushed from 2df0e7d to 254555e
Please don't force-push your branch; it makes it hard to compare against the last set of comments to see what changed.
build
@tgravescs Is it fine to write commit messages like this?

Signed-off-by: sinkinben <sinkinben@outlook.com>
NOT_ON_GPU as the default value of "spark.rapids.sql.explain" in RapidsConf.scala.