[Doc]Update 22.06 documentation[skip ci] #5641

Merged: 22 commits, Jun 3, 2022
22 changes: 20 additions & 2 deletions docs/FAQ.md
@@ -307,11 +307,15 @@ Yes

### Are the R APIs for Spark supported?

Yes, but we don't actively test them.
Yes, but we don't actively test them, because the RAPIDS Accelerator does not hook into Spark at
the individual language APIs; it hooks in at the Catalyst level, after all of the language APIs
have converged into the DataFrame API.

### Are the Java APIs for Spark supported?

Yes, but we don't actively test them.
Yes, but we don't actively test them, because the RAPIDS Accelerator does not hook into Spark at
the individual language APIs; it hooks in at the Catalyst level, after all of the language APIs
have converged into the DataFrame API.

### Are the Scala APIs for Spark supported?

@@ -410,6 +414,14 @@ The Scala UDF byte-code analyzer is disabled by default and must be enabled by t
[`spark.rapids.sql.udfCompiler.enabled`](configs.md#sql.udfCompiler.enabled) configuration
setting.
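
For example, the analyzer can be turned on at submit time in the same `--conf` style used
elsewhere in this documentation (a minimal sketch; all other job options are omitted):

```shell
...
--conf spark.rapids.sql.udfCompiler.enabled=true \
```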

#### Optimize a row-based UDF in a GPU operation

If the UDF cannot be implemented as a RAPIDS Accelerated UDF or automatically translated to
Apache Spark operations, the RAPIDS Accelerator has an experimental feature that transfers only the
data it needs between the GPU and CPU inside a query operation, instead of falling the whole
operation back to the CPU. This feature can be enabled by setting `spark.rapids.sql.rowBasedUDF.enabled` to true.
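
For example, following the same `--conf` pattern used in the rest of these docs (only the
relevant option is shown; everything else about the job is assumed to be configured as usual):

```shell
...
--conf spark.rapids.sql.rowBasedUDF.enabled=true \
```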


### Why is the size of my output Parquet/ORC file different?

This can come down to a number of factors. The GPU version often compresses data in smaller chunks
@@ -501,6 +513,12 @@ Below are some troubleshooting tips on GPU query performance issue:
`spark.sql.files.maxPartitionBytes` and `spark.rapids.sql.concurrentGpuTasks`, as these configurations can significantly affect query performance.
Please refer to [Tuning Guide](./tuning-guide.md) for more details.
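
As an illustration only (the values below are placeholders, not recommendations; see the
[Tuning Guide](./tuning-guide.md) for how to size them for your workload and cluster):

```shell
...
--conf spark.sql.files.maxPartitionBytes=512m \
--conf spark.rapids.sql.concurrentGpuTasks=2 \
```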


### What is the default RMM pool allocator?

Starting with the 22.06 release, the default value of `spark.rapids.memory.gpu.pool` changed from
`ARENA` to `ASYNC` for CUDA 11.5 and later. For CUDA 11.4 and older, it falls back to `ARENA`.
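
If the previous behavior is needed, the allocator can still be selected explicitly. A minimal
sketch in the same `--conf` style used elsewhere in these docs:

```shell
...
--conf spark.rapids.memory.gpu.pool=ARENA \
```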

### I have more questions, where do I go?
We use github to track bugs, feature requests, and answer questions. File an
[issue](https://github.com/NVIDIA/spark-rapids/issues/new/choose) for a bug or feature request. Ask
4 changes: 2 additions & 2 deletions docs/additional-functionality/rapids-shuffle.md
@@ -298,7 +298,7 @@ In this section, we are using a docker container built using the sample dockerfi
--conf spark.shuffle.manager=com.nvidia.spark.rapids.[shim package].RapidsShuffleManager \
--conf spark.shuffle.service.enabled=false \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.executor.extraClassPath=${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.executorEnv.UCX_ERROR_SIGNALS= \
--conf spark.executorEnv.UCX_MEMTYPE_CACHE=n
```
@@ -310,7 +310,7 @@ In this section, we are using a docker container built using the sample dockerfi
--conf spark.shuffle.manager=com.nvidia.spark.rapids.[shim package].RapidsShuffleManager \
--conf spark.shuffle.service.enabled=false \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.executor.extraClassPath=${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.executorEnv.UCX_ERROR_SIGNALS= \
--conf spark.executorEnv.UCX_MEMTYPE_CACHE=n \
--conf spark.executorEnv.UCX_IB_RX_QUEUE_LEN=1024 \
2 changes: 1 addition & 1 deletion docs/additional-functionality/rapids-udfs.md
@@ -189,7 +189,7 @@ exclusive mode to assign GPUs under Spark. To disable exclusive mode, use

```shell
...
--conf spark.rapids.python.gpu.enabled=true \
--conf spark.rapids.sql.python.gpu.enabled=true \
```

Please note: every type of Pandas UDF on Spark is run by a specific Spark execution plan. RAPIDS
2 changes: 1 addition & 1 deletion docs/demo/GCP/mortgage-xgboost4j-gpu-scala.ipynb
@@ -62,7 +62,7 @@
{
"cell_type": "markdown",
"metadata": {},
"source": "## Create a new spark session and load data\n\nA new spark session should be created to continue all the following spark operations.\n\nNOTE: in this notebook, the dependency jars have been loaded when installing toree kernel. Alternatively the jars can be loaded into notebook by [%AddJar magic](https://toree.incubator.apache.org/docs/current/user/faq/). However, there\u0027s one restriction for `%AddJar`: the jar uploaded can only be available when `AddJar` is called just after a new spark session is created. Do it as below:\n\n```scala\nimport org.apache.spark.sql.SparkSession\nval spark \u003d SparkSession.builder().appName(\"mortgage-GPU\").getOrCreate\n%AddJar file:/data/libs/cudf-XXX-cuda10.jar\n%AddJar file:/data/libs/rapids-4-spark-XXX.jar\n%AddJar file:/data/libs/xgboost4j_3.0-XXX.jar\n%AddJar file:/data/libs/xgboost4j-spark_3.0-XXX.jar\n// ...\n```\n\n##### Please note the new jar \"rapids-4-spark-XXX.jar\" is only needed for GPU version, you can not add it to dependence list for CPU version."
"source": "## Create a new spark session and load data\n\nA new spark session should be created to continue all the following spark operations.\n\nNOTE: in this notebook, the dependency jars have been loaded when installing toree kernel. Alternatively the jars can be loaded into notebook by [%AddJar magic](https://toree.incubator.apache.org/docs/current/user/faq/). However, there\u0027s one restriction for `%AddJar`: the jar uploaded can only be available when `AddJar` is called just after a new spark session is created. Do it as below:\n\n```scala\nimport org.apache.spark.sql.SparkSession\nval spark \u003d SparkSession.builder().appName(\"mortgage-GPU\").getOrCreate\n%AddJar file:/data/libs/rapids-4-spark-XXX.jar\n%AddJar file:/data/libs/xgboost4j_3.0-XXX.jar\n%AddJar file:/data/libs/xgboost4j-spark_3.0-XXX.jar\n// ...\n```\n\n##### Please note the new jar \"rapids-4-spark-XXX.jar\" is only needed for GPU version, you can not add it to dependence list for CPU version."
},
{
"cell_type": "code",
4 changes: 2 additions & 2 deletions docs/demo/GCP/mortgage-xgboost4j-gpu-scala.zpln
@@ -250,7 +250,7 @@
"$$hashKey": "object:11091"
},
{
"text": "%md\n## Create a new spark session and load data\n\nA new spark session should be created to continue all the following spark operations.\n\nNOTE: in this notebook, the dependency jars have been loaded when installing toree kernel. Alternatively the jars can be loaded into notebook by [%AddJar magic](https://toree.incubator.apache.org/docs/current/user/faq/). However, there's one restriction for `%AddJar`: the jar uploaded can only be available when `AddJar` is called just after a new spark session is created. Do it as below:\n\n```scala\nimport org.apache.spark.sql.SparkSession\nval spark = SparkSession.builder().appName(\"mortgage-GPU\").getOrCreate\n%AddJar file:/data/libs/cudf-XXX-cuda10.jar\n%AddJar file:/data/libs/rapids-4-spark-XXX.jar\n%AddJar file:/data/libs/xgboost4j_3.0-XXX.jar\n%AddJar file:/data/libs/xgboost4j-spark_3.0-XXX.jar\n// ...\n```\n\n##### Please note the new jar \"rapids-4-spark-XXX.jar\" is only needed for GPU version, you can not add it to dependence list for CPU version.",
"text": "%md\n## Create a new spark session and load data\n\nA new spark session should be created to continue all the following spark operations.\n\nNOTE: in this notebook, the dependency jars have been loaded when installing toree kernel. Alternatively the jars can be loaded into notebook by [%AddJar magic](https://toree.incubator.apache.org/docs/current/user/faq/). However, there's one restriction for `%AddJar`: the jar uploaded can only be available when `AddJar` is called just after a new spark session is created. Do it as below:\n\n```scala\nimport org.apache.spark.sql.SparkSession\nval spark = SparkSession.builder().appName(\"mortgage-GPU\").getOrCreate\n%AddJar file:/data/libs/rapids-4-spark-XXX.jar\n%AddJar file:/data/libs/xgboost4j_3.0-XXX.jar\n%AddJar file:/data/libs/xgboost4j-spark_3.0-XXX.jar\n// ...\n```\n\n##### Please note the new jar \"rapids-4-spark-XXX.jar\" is only needed for GPU version, you can not add it to dependence list for CPU version.",
"user": "anonymous",
"dateUpdated": "2020-07-13T02:18:47+0000",
"config": {
@@ -274,7 +274,7 @@
"msg": [
{
"type": "HTML",
"data": "<div class=\"markdown-body\">\n<h2>Create a new spark session and load data</h2>\n<p>A new spark session should be created to continue all the following spark operations.</p>\n<p>NOTE: in this notebook, the dependency jars have been loaded when installing toree kernel. Alternatively the jars can be loaded into notebook by <a href=\"https://toree.incubator.apache.org/docs/current/user/faq/\">%AddJar magic</a>. However, there&rsquo;s one restriction for <code>%AddJar</code>: the jar uploaded can only be available when <code>AddJar</code> is called just after a new spark session is created. Do it as below:</p>\n<pre><code class=\"language-scala\">import org.apache.spark.sql.SparkSession\nval spark = SparkSession.builder().appName(&quot;mortgage-GPU&quot;).getOrCreate\n%AddJar file:/data/libs/cudf-XXX-cuda10.jar\n%AddJar file:/data/libs/rapids-4-spark-XXX.jar\n%AddJar file:/data/libs/xgboost4j_3.0-XXX.jar\n%AddJar file:/data/libs/xgboost4j-spark_3.0-XXX.jar\n// ...\n</code></pre>\n<h5>Please note the new jar &ldquo;rapids-4-spark-XXX.jar&rdquo; is only needed for GPU version, you can not add it to dependence list for CPU version.</h5>\n\n</div>"
"data": "<div class=\"markdown-body\">\n<h2>Create a new spark session and load data</h2>\n<p>A new spark session should be created to continue all the following spark operations.</p>\n<p>NOTE: in this notebook, the dependency jars have been loaded when installing toree kernel. Alternatively the jars can be loaded into notebook by <a href=\"https://toree.incubator.apache.org/docs/current/user/faq/\">%AddJar magic</a>. However, there&rsquo;s one restriction for <code>%AddJar</code>: the jar uploaded can only be available when <code>AddJar</code> is called just after a new spark session is created. Do it as below:</p>\n<pre><code class=\"language-scala\">import org.apache.spark.sql.SparkSession\nval spark = SparkSession.builder().appName(&quot;mortgage-GPU&quot;).getOrCreate\n%AddJar file:/data/libs/rapids-4-spark-XXX.jar\n%AddJar file:/data/libs/xgboost4j_3.0-XXX.jar\n%AddJar file:/data/libs/xgboost4j-spark_3.0-XXX.jar\n// ...\n</code></pre>\n<h5>Please note the new jar &ldquo;rapids-4-spark-XXX.jar&rdquo; is only needed for GPU version, you can not add it to dependence list for CPU version.</h5>\n\n</div>"
}
]
},
17 changes: 1 addition & 16 deletions docs/dev/nvtx_profiling.md
@@ -10,22 +10,7 @@ once captured can be visually analyzed using
[NVIDIA NSight Systems](https://developer.nvidia.com/nsight-systems).
This document is specific to the RAPIDS Spark Plugin profiling.

### STEP 1:

In order to get NVTX ranges to work you need to recompile your cuDF with NVTX flag enabled:

```
//from the cpp/build directory

cmake .. -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX -DCMAKE_CXX11_ABI=ON -DUSE_NVTX=1

make -j <num_threads>
```
If you are using the java cuDF layer, recompile your jar as usual using maven.
```
mvn clean package -DskipTests
```
### STEP 2:
### STEPS:

We need to pass a flag to the Spark executors and driver in order to enable NVTX collection.
For spark-shell, this can be done by adding the following configuration keys: