Branch 0.4 FAQ reorg (NVIDIA#1725)
* Reorganization of FAQ, addition of .toc

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Fix links and capitalization in the FAQ

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>
sameerz authored Feb 17, 2021
1 parent 6bef3a0 commit 0dfc2a8
Showing 1 changed file with 97 additions and 93 deletions.
190 changes: 97 additions & 93 deletions docs/FAQ.md
@@ -5,17 +5,76 @@ nav_order: 11
---
# Frequently Asked Questions

### Why does `explain()` show that the GPU will be used even after setting `spark.rapids.sql.enabled` to `false`?
* TOC
{:toc}

Apache Spark caches what is used to build the output of the `explain()` function. That cache has no
knowledge about configs, so it may return results that are not up to date with the current config
settings. This is true of all configs in Spark. If you changed
`spark.sql.autoBroadcastJoinThreshold` after running `explain()` on a `DataFrame`, the resulting
query would not change to reflect that config and still show a `SortMergeJoin` even though the new
config might have changed to be a `BroadcastHashJoin` instead. When actually running something like
with `collect`, `show` or `write` a new `DataFrame` is constructed causing Spark to re-plan the
query. This is why `spark.rapids.sql.enabled` is still respected when running, even if explain shows
stale results.
### What versions of Apache Spark does the RAPIDS Accelerator for Apache Spark support?

The RAPIDS Accelerator for Apache Spark requires version 3.0.0 or 3.0.1 of Apache Spark. Because the
plugin replaces parts of the physical plan that Apache Spark considers to be internal, the code for
those plans can change even between bug-fix releases. As part of our process, we try to stay on
top of these changes and release updates as quickly as possible.

### Which distributions are supported?

The RAPIDS Accelerator for Apache Spark officially supports [Apache
Spark](get-started/getting-started-on-prem.md), [AWS EMR
6.2.0](get-started/getting-started-aws-emr.md), [Databricks Runtime
7.3](get-started/getting-started-databricks.md) and [Google Cloud
Dataproc](get-started/getting-started-gcp.md). Most distributions based on Apache Spark 3.0.0
should work, but because the plugin replaces parts of the physical plan that Apache Spark considers
to be internal, the code for those plans can change from one distribution to another. We are working
with most cloud service providers to set up testing and validation on their distributions.

### What CUDA versions are supported?

CUDA 10.1, 10.2 and 11.0 are currently supported, but you need to download the cudf jar that
corresponds to the version you are using. Please look [here](download.md) for download
links for the latest release.

### What is the right hardware setup to run GPU accelerated Spark?

Reference architectures should be available around Q1 2021.

### What parts of Apache Spark are accelerated?

Currently a limited set of SQL and DataFrame operations are supported; please see the
[configs](configs.md) and [supported operations](supported_ops.md) for a more complete list of what
is supported. Parts of structured streaming are likely to be accelerated, but structured streaming has
not been an area of focus so far. Other areas like MLlib, GraphX or RDDs are not accelerated.

### What is the road-map like?

Please look at the GitHub repository
[https://github.com/nvidia/spark-rapids](https://github.com/nvidia/spark-rapids). It contains issue
tracking and planning for sprints and releases.

### How much faster will my query run?

No single operator is faster by a fixed amount, so there is no simple algorithm to predict how much
faster a query will run. In addition, Apache Spark can store intermediate data to disk and send it
across the network, both of which we typically see as bottlenecks in real-world queries. Generally,
for complicated queries where all the processing can run on the GPU we see between 3x and 7x
speedup, with a 4x speedup typical. We have seen as high as 100x in some specific cases.

### What operators are best suited for the GPU?

* Group by operations with high cardinality
* Joins with a high cardinality
* Sorts with a high cardinality
* Window operations, especially for large windows
* Complicated processing
* Writing Parquet/ORC
* Reading CSV
* Transcoding (reading an input file and doing minimal processing before writing it out again,
possibly in a different format, like CSV to Parquet)
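
As an illustrative sketch of the list above (not a benchmark), here is the kind of query that
combines several of these traits: a high-cardinality aggregation and join followed by a Parquet
write. The input paths and column names are placeholders.

```scala
// Illustrative only: high-cardinality group-by, join, and Parquet output are the
// kinds of operations that tend to benefit most on the GPU.
val facts = spark.read.parquet("/data/facts")        // hypothetical input path
val dims  = spark.read.parquet("/data/dimensions")   // hypothetical input path

facts
  .groupBy("user_id")                                // high-cardinality group by
  .count()
  .join(dims, "user_id")                             // high-cardinality join
  .write
  .mode("overwrite")
  .parquet("/data/output")                           // Parquet writes do well on the GPU
```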

### Are there initialization costs?

From our tests the GPU typically takes about 2 to 3 seconds to initialize when an executor first
starts. If you are only going to run a single query that takes just a few seconds, this can
be problematic. In general, if you are going to do 30 seconds or more of processing within a single
session, the overhead can be amortized.

### How can I tell what will run on the GPU and what will not run on it?
<a name="explain"></a>
@@ -88,7 +147,7 @@ few places where this is not the case.
this we want to process larger batches of data when possible. These operators will try to combine
smaller batches of data into fewer, larger batches to process more efficiently.

* `SortMergeJoin` - The RAPIDS accelerator does not support sort merge joins yet. For now, we
* `SortMergeJoin` - The RAPIDS Accelerator does not support sort merge joins yet. For now, we
translate sort merge joins into shuffled hash joins. Because of this there are times when sorts
may be removed or other sorts added to meet the ordering requirements of the query.

@@ -98,89 +157,21 @@ few places where this is not the case.
these operations into a single stage the metrics were confusing to understand. Instead, we split
the single stage up into multiple smaller parts, so the metrics are clearer.

### What versions of Apache Spark does the RAPIDS Accelerator for Apache Spark support?

The RAPIDS Accelerator for Apache Spark requires version 3.0.0 or 3.0.1 of Apache Spark. Because the
plugin replaces parts of the physical plan that Apache Spark considers to be internal the code for
those plans can change even between bug fix releases. As a part of our process, we try to stay on
top of these changes and release updates as quickly as possible.

### Which distributions are supported?

The RAPIDS Accelerator for Apache Spark officially supports [Apache
Spark](get-started/getting-started-on-prem.md), [AWS EMR
6.2.0](get-started/getting-started-aws-emr.md), [Databricks Runtime
7.3](get-started/getting-started-databricks.md) and [Google Cloud
Dataproc](get-started/getting-started-gcp.md). Most distributions based off of Apache Spark 3.0.0
should work, but because the plugin replaces parts of the physical plan that Apache Spark considers
to be internal the code for those plans can change from one distribution to another. We are working
with most cloud service providers to set up testing and validation on their distributions.

### What is the right hardware setup to run GPU accelerated Spark?

Reference architectures should be available around Q1 2021.

### What CUDA versions are supported?

CUDA 10.1, 10.2 and 11.0 are currently supported, but you need to download the cudf jar that
corresponds to the version you are using. Please look [here][download.md] for download
links for the latest release.

### What parts of Apache Spark are accelerated?

Currently a limited set of SQL and DataFrame operations are supported, please see the
[configs](configs.md) for a more complete list of what is supported. Some of structured streaming
is likely to be accelerated, but it has not been an area of focus right now. Other areas like
MLLib, GraphX or RDDs are not accelerated.

### What is the road-map like?

Please look at the github repository https://github.com/nvidia/spark-rapids It contains
issue tracking and planning for sprints and releases.

### How much faster will my query run?

Any single operator isn’t a fixed amount faster. So there is no simple algorithm to see how much
faster a query will run. In addition, Apache Spark can store intermediate data to disk and send it
across the network, both of which we typically see as bottlenecks in real world queries. Generally
for complicated queries where all the processing can run on the GPU we see between 3x and 7x
speedup, with a 4x speedup typical. We have seen as high as 100x in some specific cases.

### What operators are best suited for the GPU?

* Group by operations with high cardinality
* Joins with a high cardinality
* Sorts with a high cardinality
* Window operations, especially for large windows
* Complicated processing
* Writing Parquet/ORC
* Reading CSV
* Transcoding (reading an input file and doing minimal processing before writing it out again,
possibly in a different format, like CSV to Parquet)

### Are there initialization costs?

From our tests the GPU typically takes about 2 to 3 seconds to initialize when an executor first
starts. If you are only going to run a single query that only takes a few seconds to run this can
be problematic. In general if you are going to do 30 seconds or more of processing within a single
session the overhead can be amortized.

### Why is the size of my output Parquet/ORC file different?
### Why does `explain()` show that the GPU will be used even after setting `spark.rapids.sql.enabled` to `false`?

This can come down to a number of factors. The GPU version often compresses data in smaller chunks
to get more parallelism and performance. This can result in larger files in some instances. We have
also seen instances where the ordering of the data can have a big impact on the output size of the
files. Spark tends to prefer sort based joins, and in some cases sort based aggregations, whereas
the GPU versions are all hash based. This means that the resulting data can come out in a different
order for the CPU and the GPU. This is not wrong, but can make the size of the output data
different because of compression. Users can turn on
(spark.rapids.sql.hashOptimizeSort.enabled)[configs.md#sql.hashOptimizeSort.enabled] to have
the GPU try to replicate more closely what the output ordering would have been if sort were used,
like on the CPU.
Apache Spark caches what is used to build the output of the `explain()` function. That cache has no
knowledge about configs, so it may return results that are not up to date with the current config
settings. This is true of all configs in Spark. If you changed
`spark.sql.autoBroadcastJoinThreshold` after running `explain()` on a `DataFrame`, the resulting
query would not change to reflect that config and still show a `SortMergeJoin` even though the new
config might have changed to be a `BroadcastHashJoin` instead. When an action such as `collect`,
`show` or `write` actually runs, a new `DataFrame` is constructed, causing Spark to re-plan the
query. This is why `spark.rapids.sql.enabled` is still respected when running, even if explain shows
stale results.
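
As a minimal sketch of this behavior (assuming an interactive `spark-shell` session with the
RAPIDS Accelerator on the classpath; the query itself is only illustrative):

```scala
// Build the same query twice so we can compare a cached plan with a fresh one.
def buildQuery() = spark.range(0, 1000000L)
  .selectExpr("id", "id % 100 AS key")
  .groupBy("key")
  .count()

val df = buildQuery()
df.explain()                                         // may show GPU operators

spark.conf.set("spark.rapids.sql.enabled", "false")
df.explain()                                         // can still show the stale, cached GPU plan

buildQuery().explain()                               // a freshly constructed query reflects the new setting
```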

### How are failures handled?

The RAPIDS accelerator does not change the way failures are normally handled by Apache Spark.
The RAPIDS Accelerator does not change the way failures are normally handled by Apache Spark.

### How does the Spark scheduler decide what to do on the GPU vs the CPU?

@@ -230,7 +221,7 @@ Yes
The GPU is not needed on the driver and there is no benefit to having one available on the driver
for the RAPIDS plugin.

### How does the performance compare to DataBricks' DeltaEngine?
### How does the performance compare to Databricks' DeltaEngine?

We have not evaluated the performance yet. DeltaEngine is not open source, so any analysis needs to
be done with Databricks in some form. When DeltaEngine is generally available and the terms of
@@ -266,7 +257,7 @@ There are multiple reasons why this is a problematic configuration:
- CUDA context switches between processes sharing a single GPU can be expensive
- Each executor would have a fraction of the GPU memory available for processing

### Is [Multi Instance GPU (MIG)](https://docs.nvidia.com/cuda/mig/index.html) supported?
### Is [Multi-Instance GPU (MIG)](https://docs.nvidia.com/cuda/mig/index.html) supported?

Yes, but it requires support from the underlying cluster manager to isolate the MIG GPU instance
for each executor (e.g.: by setting `CUDA_VISIBLE_DEVICES` or other means).
@@ -299,6 +290,19 @@ The Scala UDF byte-code analyzer is disabled by default and must be enabled by the
[`spark.rapids.sql.udfCompiler.enabled`](configs.md#sql.udfCompiler.enabled) configuration
setting.
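
As a rough sketch of enabling this setting and registering a simple Scala UDF of the kind the
analyzer may be able to translate (the UDF and query below are only illustrations; whether a given
UDF is actually translated depends on what the analyzer supports):

```scala
import org.apache.spark.sql.functions.udf
import spark.implicits._  // for the $"..." column syntax in spark-shell

// Opt in to the experimental Scala UDF byte-code analyzer (off by default).
spark.conf.set("spark.rapids.sql.udfCompiler.enabled", "true")

// A simple arithmetic UDF that could plausibly be turned into Catalyst expressions.
val plusOne = udf((x: Long) => x + 1L)
spark.range(0, 1000L).select(plusOne($"id").as("id_plus_one")).show()
```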

### Why is the size of my output Parquet/ORC file different?

This can come down to a number of factors. The GPU version often compresses data in smaller chunks
to get more parallelism and performance. This can result in larger files in some instances. We have
also seen instances where the ordering of the data can have a big impact on the output size of the
files. Spark tends to prefer sort based joins, and in some cases sort based aggregations, whereas
the GPU versions are all hash based. This means that the resulting data can come out in a different
order for the CPU and the GPU. This is not wrong, but can make the size of the output data
different because of compression. Users can turn on
[spark.rapids.sql.hashOptimizeSort.enabled](configs.md#sql.hashOptimizeSort.enabled) to have
the GPU try to replicate more closely what the output ordering would have been if sort were used,
like on the CPU.
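
A hedged example of turning this option on for a session before writing output; whether it helps
depends on the data and on how well the reordered output compresses (the table, column and path
below are placeholders):

```scala
// Ask the GPU to sort hash-based results so the output ordering, and therefore the
// compressed file size, is closer to what a CPU sort-based plan would produce.
spark.conf.set("spark.rapids.sql.hashOptimizeSort.enabled", "true")

spark.table("some_table")          // placeholder table
  .groupBy("some_key")             // placeholder column
  .count()
  .write
  .mode("overwrite")
  .parquet("/tmp/example_output")  // placeholder path
```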

### Why am I getting an error when trying to use pinned memory?

@@ -318,7 +322,7 @@ To fix it you can either disable the IOMMU, or you can disable using pinned memory
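
If disabling pinned memory is the route you take, one hedged sketch of doing so is below. It
assumes the pinned pool is controlled by `spark.rapids.memory.pinnedPool.size` and that the plugin
is enabled through `spark.plugins`, as in current plugin releases; setting the pool size to zero
disables pinned memory.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: start a session with the pinned memory pool disabled (size 0).
// Both config keys are assumptions based on current plugin documentation.
val spark = SparkSession.builder()
  .appName("no-pinned-memory-example")
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
  .config("spark.rapids.memory.pinnedPool.size", "0")
  .getOrCreate()
```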

### Is speculative execution supported?

Yes, speculative execution in Spark is fine with the RAPIDS accelerator plugin.
Yes, speculative execution in Spark is fine with the RAPIDS Accelerator plugin.

As with all speculative execution, it may or may not be beneficial depending on the nature of why a
particular task is slow and how easily speculation is triggered. You should monitor your Spark jobs
