Modify the default value of spark.rapids.sql.explain as NOT_ON_GPU #5819

Merged 7 commits on Jun 24, 2022
7 changes: 4 additions & 3 deletions docs/FAQ.md
@@ -147,9 +147,10 @@ An Apache Spark plan is transformed and optimized into a set of operators called
This plan is then run through a set of rules to translate it to a version that runs on the GPU.
If you want to know what will run on the GPU and what will not along with an explanation why you
can set [spark.rapids.sql.explain](configs.md#sql.explain) to `ALL`. If you just want to see the
-operators not on the GPU you may set it to `NOT_ON_GPU`. Be aware that some queries end up being
-broken down into multiple jobs, and in those cases a separate log message might be output for each
-job. These are logged each time a query is compiled into an `RDD`, not just when the job runs.
+operators not on the GPU you may set it to `NOT_ON_GPU` (which is the default value). Be
+aware that some queries end up being broken down into multiple jobs, and in those cases a separate
+log message might be output for each job. These are logged each time a query is compiled into an
+`RDD`, not just when the job runs.
Because of this, calling `explain` on a DataFrame will also trigger this to be logged.
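
As an illustration only (not part of this diff), here is a minimal Scala sketch of exercising the setting; the DataFrame and query are made up for the example, and toggling the option at runtime with `spark.conf.set` is an assumption here; passing it with `--conf` at startup is the documented route.

```scala
// Minimal sketch, assuming the RAPIDS Accelerator plugin is already loaded and
// `spark` is an active SparkSession.
spark.conf.set("spark.rapids.sql.explain", "ALL") // or "NOT_ON_GPU" / "NONE"

// Hypothetical query: calling `explain` compiles the plan, so the plugin logs its
// GPU placement diagnostics even though no job runs.
val df = spark.range(0, 1000000).selectExpr("id % 10 AS key", "id AS value")
df.groupBy("key").count().explain()
```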

The format of each line follows the pattern
2 changes: 1 addition & 1 deletion docs/configs.md
@@ -72,7 +72,7 @@ Name | Description | Default Value
<a name="sql.csv.read.float.enabled"></a>spark.rapids.sql.csv.read.float.enabled|CSV reading is not 100% compatible when reading floats.|true
<a name="sql.decimalOverflowGuarantees"></a>spark.rapids.sql.decimalOverflowGuarantees|FOR TESTING ONLY. DO NOT USE IN PRODUCTION. Please see the decimal section of the compatibility documents for more information on this config.|true
<a name="sql.enabled"></a>spark.rapids.sql.enabled|Enable (true) or disable (false) sql operations on the GPU|true
<a name="sql.explain"></a>spark.rapids.sql.explain|Explain why some parts of a query were not placed on a GPU or not. Possible values are ALL: print everything, NONE: print nothing, NOT_ON_GPU: print only parts of a query that did not go on the GPU|NONE
<a name="sql.explain"></a>spark.rapids.sql.explain|Explain why some parts of a query were not placed on a GPU or not. Possible values are ALL: print everything, NONE: print nothing, NOT_ON_GPU: print only parts of a query that did not go on the GPU|NOT_ON_GPU
<a name="sql.fast.sample"></a>spark.rapids.sql.fast.sample|Option to turn on fast sample. If enable it is inconsistent with CPU sample because of GPU sample algorithm is inconsistent with CPU.|false
<a name="sql.format.avro.enabled"></a>spark.rapids.sql.format.avro.enabled|When set to true enables all avro input and output acceleration. (only input is currently supported anyways)|false
<a name="sql.format.avro.multiThreadedRead.maxNumFilesParallel"></a>spark.rapids.sql.format.avro.multiThreadedRead.maxNumFilesParallel|A limit on the maximum number of files per task processed in parallel on the CPU side before the file is sent to the GPU. This affects the amount of host memory used when reading the files in parallel. Used with MULTITHREADED reader, see spark.rapids.sql.format.avro.reader.type|2147483647
5 changes: 5 additions & 0 deletions docs/get-started/getting-started.md
@@ -39,6 +39,11 @@ will vary depending on your cluster manager. Here are some example configs:
- `--conf spark.task.resource.gpu.amount=1`
- Specify a GPU discovery script (required on YARN and K8S):
- `--conf spark.executor.resource.gpu.discoveryScript=./getGpusResources.sh`
+- Explain why some operations of a query were or were not placed on the GPU:
+- `--conf spark.rapids.sql.explain=ALL` will log whether each operation in the query was placed on the GPU.
+- `--conf spark.rapids.sql.explain=NONE` will disable this logging.
+- `--conf spark.rapids.sql.explain=NOT_ON_GPU` (the default) will log only the parts of the query
+  that did not go on the GPU.
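
For illustration only (not part of this change), the following Scala sketch wires the same example configs programmatically when building a `SparkSession`; the plugin class name `com.nvidia.spark.SQLPlugin` and the resource amounts are assumed typical values, not recommendations.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the example configs above, set programmatically.
val spark = SparkSession.builder()
  .appName("rapids-getting-started-example")
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")              // assumed plugin class
  .config("spark.executor.resource.gpu.amount", "1")
  .config("spark.task.resource.gpu.amount", "1")
  .config("spark.executor.resource.gpu.discoveryScript", "./getGpusResources.sh")
  .config("spark.rapids.sql.explain", "NOT_ON_GPU")                   // the default; shown for clarity
  .getOrCreate()
```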

See the deployment specific sections for more details and restrictions. Note that
`spark.task.resource.gpu.amount` can be a decimal amount, so if you want multiple tasks to be run
@@ -1276,7 +1276,7 @@ object RapidsConf {
"values are ALL: print everything, NONE: print nothing, NOT_ON_GPU: print only parts of " +
"a query that did not go on the GPU")
.stringConf
-.createWithDefault("NONE")
+.createWithDefault("NOT_ON_GPU")

val SHIMS_PROVIDER_OVERRIDE = conf("spark.rapids.shims-provider-override")
.internal()
@@ -4142,7 +4142,10 @@ object GpuOverrides extends Logging {
* GPUs.
*/
private def explainCatalystSQLPlan(updatedPlan: SparkPlan, conf: RapidsConf): Unit = {
-val explainSetting = if (conf.shouldExplain) {
+// Since "NOT_ON_GPU" is now the default value of spark.rapids.sql.explain, keep "ALL" as the
+// default value of "explainSetting" here, unless spark.rapids.sql.explain was explicitly set
+// by the user.
+val explainSetting = if (conf.isConfExplicitlySet(RapidsConf.EXPLAIN.key)) {
conf.explain
} else {
"ALL"
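
As a hedged, self-contained sketch (not the plugin's actual code), the fallback in the comment above amounts to resolving the explain level from the user-supplied conf, defaulting to `"ALL"` for this explain API:

```scala
// Standalone illustration: use the user's explicit setting when present, otherwise
// report everything ("ALL"), even though the plugin-wide default is now "NOT_ON_GPU".
def resolveExplainSetting(userConf: Map[String, String]): String =
  userConf.getOrElse("spark.rapids.sql.explain", "ALL")

// resolveExplainSetting(Map.empty)                                 // "ALL"
// resolveExplainSetting(Map("spark.rapids.sql.explain" -> "NONE")) // "NONE"
```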
@@ -85,6 +85,12 @@ object RapidsPluginUtils extends Logging {
if (conf.isSqlEnabled && conf.isSqlExecuteOnGPU) {
logWarning("RAPIDS Accelerator is enabled, to disable GPU " +
s"support set `${RapidsConf.SQL_ENABLED}` to false.")

+if (conf.explain != "NONE") {
+logWarning(s"spark.rapids.sql.explain is set to `${conf.explain}`. Set it to 'NONE' to " +
+"suppress the diagnostic logging about query placement on the GPU.")
+}

} else if (conf.isSqlEnabled && conf.isSqlExplainOnlyEnabled) {
logWarning("RAPIDS Accelerator is in explain only mode, to disable " +
s"set `${RapidsConf.SQL_ENABLED}` to false. To change the mode, " +
@@ -1279,7 +1279,7 @@ object RapidsConf {
"values are ALL: print everything, NONE: print nothing, NOT_ON_GPU: print only parts of " +
"a query that did not go on the GPU")
.stringConf
-.createWithDefault("NONE")
+.createWithDefault("NOT_ON_GPU")

val SHIMS_PROVIDER_OVERRIDE = conf("spark.rapids.shims-provider-override")
.internal()
@@ -1965,4 +1965,11 @@ class RapidsConf(conf: Map[String, String]) extends Logging {
// user-provided value takes precedence, then look in defaults map
conf.get(key).orElse(optimizerDefaults.get(key)).map(toDouble(_, key))
}

+/**
+ * Whether explain output is enabled and `key` was explicitly set by the user.
+ */
+def isConfExplicitlySet(key: String): Boolean = {
+shouldExplain && conf.contains(key)
+}
}
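
For illustration only (not part of this PR), a rough sketch of how the new helper behaves with the new default; it assumes `shouldExplain` returns false only when the explain setting is "NONE", which is suggested by the surrounding changes but not shown in this diff.

```scala
// RapidsConf is constructed from a user-supplied conf map (see the class signature above).
val explicitlyAll = new RapidsConf(Map("spark.rapids.sql.explain" -> "ALL"))
val defaulted     = new RapidsConf(Map.empty[String, String])

// Set by the user, so explainCatalystSQLPlan uses the user's value.
explicitlyAll.isConfExplicitlySet("spark.rapids.sql.explain") // true

// Not set by the user, so explainCatalystSQLPlan falls back to "ALL".
defaulted.isConfExplicitlySet("spark.rapids.sql.explain")     // false
```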