RAPIDS accelerated Spark Scala UDF support (#1636)
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
jlowe authored Feb 1, 2021
1 parent ef6d3a7 commit d6be108
Showing 25 changed files with 507 additions and 118 deletions.
6 changes: 3 additions & 3 deletions docs/FAQ.md
@@ -252,10 +252,10 @@ can throw at it.
 The RAPIDS Accelerator provides the following solutions for running
 user-defined functions on the GPU:
 
-#### RAPIDS-Accelerated UDFs
+#### RAPIDS Accelerated UDFs
 
-UDFs can provide a RAPIDS-accelerated implementation which allows the RAPIDS Accelerator to perform
-the operation on the GPU. See the [RAPIDS-accelerated UDF documentation](additional-functionality/rapids-udfs.md)
+UDFs can provide a RAPIDS accelerated implementation which allows the RAPIDS Accelerator to perform
+the operation on the GPU. See the [RAPIDS accelerated UDF documentation](additional-functionality/rapids-udfs.md)
 for details.
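The dual-path idea behind this FAQ item can be sketched without Spark or cudf: a UDF keeps its ordinary row-wise body and additionally exposes a columnar entry point the accelerator can call once per batch. In the sketch below, plain Java arrays stand in for cudf `ColumnVector`s and all class/method names are illustrative, not taken from the plugin.

```java
// Illustration only: a UDF with a row-wise (CPU) path and a columnar
// path. In the real plugin the columnar path receives cudf
// ColumnVectors; int[] is a stand-in so this compiles without any
// GPU libraries.
public class DualPathUdf {
    // CPU path: called once per row by Spark.
    public int apply(int x) {
        return x + 1;
    }

    // Columnar path: called once per batch with a whole column of input.
    public int[] evaluateColumnar(int[] col) {
        int[] out = new int[col.length];
        for (int i = 0; i < col.length; i++) {
            out[i] = apply(col[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        DualPathUdf udf = new DualPathUdf();
        // prints [2, 3, 4]
        System.out.println(java.util.Arrays.toString(
            udf.evaluateColumnar(new int[]{1, 2, 3})));
    }
}
```

Both paths must produce the same result row for row; the accelerator simply picks the columnar one when the query step runs on the GPU.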

#### Automatic Translation of Scala UDFs to Apache Spark Operations
59 changes: 35 additions & 24 deletions docs/additional-functionality/rapids-udfs.md
@@ -1,12 +1,12 @@
 ---
 layout: page
-title: RAPIDS-Accelerated User-Defined Functions
+title: RAPIDS Accelerated User-Defined Functions
 parent: Additional Functionality
 nav_order: 3
 ---
-# RAPIDS-Accelerated User-Defined Functions
+# RAPIDS Accelerated User-Defined Functions
 
-This document describes how UDFs can provide a RAPIDS-accelerated
+This document describes how UDFs can provide a RAPIDS accelerated
 implementation alongside the CPU implementation, enabling the
 RAPIDS Accelerator to perform the user-defined operation on the GPU.
@@ -28,26 +28,26 @@ with the RAPIDS Accelerator. This implementation can then be invoked by the
 RAPIDS Accelerator when a corresponding query step using the UDF executes
 on the GPU.
 
-## Limitations of RAPIDS-Accelerated UDFs
+## Limitations of RAPIDS Accelerated UDFs
 
-The RAPIDS Accelerator only supports RAPIDS-accelerated forms of regular
-Hive UDFs. Other forms of Spark UDFs are not supported, such as:
+The RAPIDS Accelerator only supports RAPIDS accelerated forms of the
+following UDF types:
+- Scala UDFs implementing a `Function` interface and registered via `SparkSession.udf.register`
+- [Simple](https://github.com/apache/hive/blob/cb213d88304034393d68cc31a95be24f5aac62b6/ql/src/java/org/apache/hadoop/hive/ql/exec/UDF.java)
+  or
+  [Generic](https://github.com/apache/hive/blob/cb213d88304034393d68cc31a95be24f5aac62b6/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java)
+  Hive UDFs
+
+Other forms of Spark UDFs are not supported, such as:
 - Spark Java UDFs (i.e.: derived from `org.apache.spark.sql.api.java.UDF`* interfaces)
 - Hive Aggregate Function (UDAF)
 - Hive Tabular Function (UDTF)
-- Lambda functions and others registered via `SparkSession.udf`
-- Functions created with `org.apache.spark.sql.functions.udf`
+- Lambda functions
 
-## Adding GPU Implementations to Hive UDFs
+## Adding GPU Implementations to UDFs
 
-As mentioned in the [Limitations](#limitations-of-rapids-accelerated-udfs)
-section, the RAPIDS Accelerator only detects GPU implementations for Hive
-regular UDFs. The Hive UDF can be either
-[simple](https://github.com/apache/hive/blob/cb213d88304034393d68cc31a95be24f5aac62b6/ql/src/java/org/apache/hadoop/hive/ql/exec/UDF.java)
-or
-[generic](https://github.com/apache/hive/blob/cb213d88304034393d68cc31a95be24f5aac62b6/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java).
-
-The RAPIDS Accelerator will detect a GPU implementation if the UDF class
-implements the
+For supported UDFs, the RAPIDS Accelerator will detect a GPU implementation
+if the UDF class implements the
 [RapidsUDF](../sql-plugin/src/main/java/com/nvidia/spark/RapidsUDF.java)
 interface. This interface requires implementing the following method:
@@ -71,7 +71,7 @@ must not make any assumptions on the number of input rows.
 
 #### Scalar Inputs
 
-Passing scalar inputs to a RAPIDS-accelerated UDF is supported with
+Passing scalar inputs to a RAPIDS accelerated UDF is supported with
 limitations. The scalar value will be replicated into a full column before
 being passed to `evaluateColumnar`. Therefore the UDF implementation cannot
 easily detect the difference between a scalar input and a columnar input.
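The replication behavior described in this hunk can be sketched without cudf: a scalar argument is expanded into a column with one copy per input row before the columnar call runs, which is exactly why the UDF body cannot distinguish it from a genuine column. Arrays stand in for cudf `ColumnVector`s here, and all names are illustrative.

```java
import java.util.Arrays;

// Illustration of scalar-input handling: the scalar is replicated into
// a full column before the columnar evaluation, so the UDF only ever
// sees columns. String[] is a stand-in for a cudf ColumnVector.
public class ScalarReplication {
    // Stand-in columnar UDF body: element-wise string concatenation.
    public static String[] evaluateColumnar(String[] lhs, String[] rhs) {
        String[] out = new String[lhs.length];
        for (int i = 0; i < lhs.length; i++) {
            out[i] = lhs[i] + rhs[i];
        }
        return out;
    }

    // A scalar argument is expanded to match the batch's row count.
    public static String[] replicate(String scalar, int numRows) {
        String[] col = new String[numRows];
        Arrays.fill(col, scalar);
        return col;
    }

    public static void main(String[] args) {
        String[] col = {"a", "b", "c"};
        // prints [a!, b!, c!]
        System.out.println(Arrays.toString(
            evaluateColumnar(col, replicate("!", col.length))));
    }
}
```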
@@ -92,19 +92,30 @@ cudf type to match the result type of the original UDF. For example, if the
 CPU UDF returns a `double` then `evaluateColumnar` must return a column of
 type `FLOAT64`.
 
-### RAPIDS-Accelerated Hive UDF Examples
+## RAPIDS Accelerated UDF Examples
 
-Source code for examples of RAPIDS-accelerated Hive UDFs is provided
+Source code for examples of RAPIDS accelerated Hive UDFs is provided
 in the [udf-examples](../udf-examples) project.
 
-- [URLDecode](../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/URLDecode.java)
+### Spark Scala UDF Examples
+
+- [URLDecode](../udf-examples/src/main/scala/com/nvidia/spark/rapids/udf/scala/URLDecode.scala)
+  decodes URL-encoded strings using the
+  [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
+- [URLEncode](../udf-examples/src/main/scala/com/nvidia/spark/rapids/udf/scala/URLEncode.scala)
+  URL-encodes strings using the
+  [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
+
+### Hive UDF Examples
+
+- [URLDecode](../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/URLDecode.java)
   implements a Hive simple UDF using the
   [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
   to decode URL-encoded strings
-- [URLEncode](../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/URLEncode.java)
+- [URLEncode](../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/URLEncode.java)
   implements a Hive generic UDF using the
   [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
   to URL-encode strings
-- [StringWordCount](../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/StringWordCount.java)
+- [StringWordCount](../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/StringWordCount.java)
   implements a Hive simple UDF using
   [native code](../udf-examples/src/main/cpp/src) to count words in strings
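The per-row logic behind these examples is ordinary URL coding and word counting; the GPU versions do the same work with cudf column operations. As a CPU-side reference, the core operations can be sketched with the JDK codecs. The method names below are illustrative (not from the udf-examples project), and the JDK's `URLEncoder` uses form encoding (`+` for space), which may differ in detail from the cudf implementations.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// CPU reference for the per-row behavior of the URLDecode, URLEncode,
// and StringWordCount examples. Requires Java 10+ for the Charset
// overloads (which avoid checked exceptions).
public class UrlCodecSketch {
    public static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    // Count whitespace-separated words, as StringWordCount does.
    public static int wordCount(String s) {
        String t = s.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(encode("a b"));   // a+b
        System.out.println(decode("a%20b")); // a b
        System.out.println(wordCount(" one two  three ")); // 3
    }
}
```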
5 changes: 3 additions & 2 deletions docs/configs.md
@@ -204,6 +204,7 @@ Name | SQL Function(s) | Description | Default Value | Notes
<a name="sql.expression.Rint"></a>spark.rapids.sql.expression.Rint|`rint`|Rounds up a double value to the nearest double equal to an integer|true|None|
<a name="sql.expression.Round"></a>spark.rapids.sql.expression.Round|`round`|Round an expression to d decimal places using HALF_UP rounding mode|true|None|
<a name="sql.expression.RowNumber"></a>spark.rapids.sql.expression.RowNumber|`row_number`|Window function that returns the index for the row within the aggregation window|true|None|
+<a name="sql.expression.ScalaUDF"></a>spark.rapids.sql.expression.ScalaUDF| |User Defined Function, support requires the UDF to implement a RAPIDS accelerated interface|true|None|
<a name="sql.expression.Second"></a>spark.rapids.sql.expression.Second|`second`|Returns the second component of the string/timestamp|true|None|
<a name="sql.expression.ShiftLeft"></a>spark.rapids.sql.expression.ShiftLeft|`shiftleft`|Bitwise shift left (<<)|true|None|
<a name="sql.expression.ShiftRight"></a>spark.rapids.sql.expression.ShiftRight|`shiftright`|Bitwise shift right (>>)|true|None|
@@ -254,8 +255,8 @@ Name | SQL Function(s) | Description | Default Value | Notes
<a name="sql.expression.Min"></a>spark.rapids.sql.expression.Min|`min`|Min aggregate operator|true|None|
<a name="sql.expression.Sum"></a>spark.rapids.sql.expression.Sum|`sum`|Sum aggregate operator|true|None|
<a name="sql.expression.NormalizeNaNAndZero"></a>spark.rapids.sql.expression.NormalizeNaNAndZero| |Normalize NaN and zero|true|None|
-<a name="sql.expression.HiveGenericUDF"></a>spark.rapids.sql.expression.HiveGenericUDF| |Hive Generic UDF, support requires the UDF to implement a RAPIDS-accelerated interface|true|None|
-<a name="sql.expression.HiveSimpleUDF"></a>spark.rapids.sql.expression.HiveSimpleUDF| |Hive UDF, support requires the UDF to implement a RAPIDS-accelerated interface|true|None|
+<a name="sql.expression.HiveGenericUDF"></a>spark.rapids.sql.expression.HiveGenericUDF| |Hive Generic UDF, support requires the UDF to implement a RAPIDS accelerated interface|true|None|
+<a name="sql.expression.HiveSimpleUDF"></a>spark.rapids.sql.expression.HiveSimpleUDF| |Hive UDF, support requires the UDF to implement a RAPIDS accelerated interface|true|None|
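Each expression flag in the table above can be toggled at submit time. For example, GPU handling of Scala UDFs could be disabled to force CPU fallback (the config key comes from the table above; the flag placement on a `spark-submit`/`spark-shell` command line is illustrative):

```
--conf spark.rapids.sql.expression.ScalaUDF=false
```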

### Execution

94 changes: 92 additions & 2 deletions docs/supported_ops.md
@@ -10973,6 +10973,96 @@ Accelerator support is described below.
<td> </td>
</tr>
<tr>
<td rowSpan="4">ScalaUDF</td>
<td rowSpan="4"> </td>
<td rowSpan="4">User Defined Function, support requires the UDF to implement a RAPIDS accelerated interface</td>
<td rowSpan="4">None</td>
<td rowSpan="2">project</td>
<td>param</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S*</td>
<td>S</td>
<td>S*</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td><em>PS* (missing nested UDT)</em></td>
<td><em>PS* (missing nested UDT)</em></td>
<td><em>PS* (missing nested UDT)</em></td>
<td><b>NS</b></td>
</tr>
<tr>
<td>result</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S*</td>
<td>S</td>
<td>S*</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td><em>PS* (missing nested UDT)</em></td>
<td><em>PS* (missing nested UDT)</em></td>
<td><em>PS* (missing nested UDT)</em></td>
<td><b>NS</b></td>
</tr>
<tr>
<td rowSpan="2">lambda</td>
<td>param</td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
</tr>
<tr>
<td>result</td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
<td><b>NS</b></td>
</tr>
<tr>
<td rowSpan="4">Second</td>
<td rowSpan="4">`second`</td>
<td rowSpan="4">Returns the second component of the string/timestamp</td>
@@ -16778,7 +16868,7 @@ Accelerator support is described below.
<tr>
<td rowSpan="4">HiveGenericUDF</td>
<td rowSpan="4"> </td>
-<td rowSpan="4">Hive Generic UDF, support requires the UDF to implement a RAPIDS-accelerated interface</td>
+<td rowSpan="4">Hive Generic UDF, support requires the UDF to implement a RAPIDS accelerated interface</td>
<td rowSpan="4">None</td>
<td rowSpan="2">project</td>
<td>param</td>
@@ -16868,7 +16958,7 @@ Accelerator support is described below.
<tr>
<td rowSpan="4">HiveSimpleUDF</td>
<td rowSpan="4"> </td>
-<td rowSpan="4">Hive UDF, support requires the UDF to implement a RAPIDS-accelerated interface</td>
+<td rowSpan="4">Hive UDF, support requires the UDF to implement a RAPIDS accelerated interface</td>
<td rowSpan="4">None</td>
<td rowSpan="2">project</td>
<td>param</td>
8 changes: 4 additions & 4 deletions integration_tests/README.md
@@ -107,7 +107,7 @@ individually, so you don't risk running unit tests along with the integration te
http://www.scalatest.org/user_guide/using_the_scalatest_shell

 ```shell
-spark-shell --jars rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT-tests.jar,rapids-4-spark-udf-examples-0.4.0-SNAPSHOT,rapids-4-spark-integration-tests_2.12-0.4.0-SNAPSHOT-tests.jar,scalatest_2.12-3.0.5.jar,scalactic_2.12-3.0.5.jar
+spark-shell --jars rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT-tests.jar,rapids-4-spark-udf-examples_2.12-0.4.0-SNAPSHOT,rapids-4-spark-integration-tests_2.12-0.4.0-SNAPSHOT-tests.jar,scalatest_2.12-3.0.5.jar,scalactic_2.12-3.0.5.jar
 ```

First you import the `scalatest_shell` and tell the tests where they can find the test files you
@@ -131,7 +131,7 @@ If you just want to verify the SQL replacement is working you will need to add t
example assumes CUDA 10.1 is being used.

 ```
-$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,rapids-4-spark-udf-examples-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar" ./runtests.py
+$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,rapids-4-spark-udf-examples_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar" ./runtests.py
 ```

You don't have to enable the plugin for this to work, the test framework will do that for you.
@@ -183,7 +183,7 @@ The TPCxBB, TPCH, TPCDS, and Mortgage tests in this framework can be enabled by
As an example, here is the `spark-submit` command with the TPCxBB parameters on CUDA 10.1:

 ```
-$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,rapids-4-spark-udf-examples-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" ./runtests.py --tpcxbb_format="csv" --tpcxbb_path="/path/to/tpcxbb/csv"
+$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,rapids-4-spark-udf-examples_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" ./runtests.py --tpcxbb_format="csv" --tpcxbb_path="/path/to/tpcxbb/csv"
 ```

Be aware that running these tests with read data requires at least an entire GPU, and preferable several GPUs/executors
@@ -212,7 +212,7 @@ To run cudf_udf tests, need following configuration changes:
As an example, here is the `spark-submit` command with the cudf_udf parameter on CUDA 10.1:

 ```
-$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,rapids-4-spark-udf-examples-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" --conf spark.rapids.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.concurrentPythonWorkers=2 --py-files "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar" --conf spark.executorEnv.PYTHONPATH="rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar" ./runtests.py --cudf_udf
+$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,rapids-4-spark-udf-examples_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" --conf spark.rapids.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.concurrentPythonWorkers=2 --py-files "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar" --conf spark.executorEnv.PYTHONPATH="rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar" ./runtests.py --cudf_udf
 ```

## Writing tests
2 changes: 1 addition & 1 deletion integration_tests/pom.xml
@@ -107,7 +107,7 @@
</dependency>
<dependency>
<groupId>com.nvidia</groupId>
-<artifactId>rapids-4-spark-udf-examples</artifactId>
+<artifactId>rapids-4-spark-udf-examples_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>
6 changes: 3 additions & 3 deletions integration_tests/src/main/python/rapids_udf_test.py
@@ -36,7 +36,7 @@ def test_hive_simple_udf():
with_spark_session(skip_if_no_hive)
data_gens = [["i", int_gen], ["s", StringGen('([^%]{0,1}(%[0-9A-F][0-9A-F]){0,1}){0,30}')]]
def evalfn(spark):
-        load_udf_or_skip_test(spark, "urldecode", "com.nvidia.spark.rapids.udf.URLDecode")
+        load_udf_or_skip_test(spark, "urldecode", "com.nvidia.spark.rapids.udf.hive.URLDecode")
return gen_df(spark, data_gens)
assert_gpu_and_cpu_are_equal_sql(
evalfn,
@@ -47,7 +47,7 @@ def test_hive_generic_udf():
with_spark_session(skip_if_no_hive)
data_gens = [["s", StringGen('.{0,30}')]]
def evalfn(spark):
-        load_udf_or_skip_test(spark, "urlencode", "com.nvidia.spark.rapids.udf.URLEncode")
+        load_udf_or_skip_test(spark, "urlencode", "com.nvidia.spark.rapids.udf.hive.URLEncode")
return gen_df(spark, data_gens)
assert_gpu_and_cpu_are_equal_sql(
evalfn,
@@ -59,7 +59,7 @@ def test_hive_simple_udf_native(enable_rapids_udf_example_native):
with_spark_session(skip_if_no_hive)
data_gens = [["s", StringGen('.{0,30}')]]
def evalfn(spark):
-        load_udf_or_skip_test(spark, "wordcount", "com.nvidia.spark.rapids.udf.StringWordCount")
+        load_udf_or_skip_test(spark, "wordcount", "com.nvidia.spark.rapids.udf.hive.StringWordCount")
return gen_df(spark, data_gens)
assert_gpu_and_cpu_are_equal_sql(
evalfn,
4 changes: 2 additions & 2 deletions jenkins/databricks/build.sh
@@ -1,6 +1,6 @@
#!/bin/bash
#
-# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -46,7 +46,7 @@ CUDA_VERSION=`mvn help:evaluate -q -pl dist -Dexpression=cuda.version -DforceStd
# the version of spark used when we install the databricks jars in .m2
SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS=$BASE_SPARK_VERSION-databricks
RAPIDS_BUILT_JAR=rapids-4-spark_$SCALA_VERSION-$SPARK_PLUGIN_JAR_VERSION.jar
-RAPIDS_UDF_JAR=rapids-4-spark-udf-examples-$SPARK_PLUGIN_JAR_VERSION.jar
+RAPIDS_UDF_JAR=rapids-4-spark-udf-examples_$SCALA_VERSION-$SPARK_PLUGIN_JAR_VERSION.jar

echo "Scala version is: $SCALA_VERSION"
mvn -B -P${BUILD_PROFILES} clean package -DskipTests || true
6 changes: 3 additions & 3 deletions jenkins/spark-tests.sh
@@ -1,6 +1,6 @@
#!/bin/bash
#
-# Copyright (c) 2019-2020, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -33,7 +33,7 @@ $MVN_GET_CMD -DremoteRepositories=$CUDF_REPO \
$MVN_GET_CMD -DremoteRepositories=$PROJECT_REPO \
-DgroupId=com.nvidia -DartifactId=rapids-4-spark_$SCALA_BINARY_VER -Dversion=$PROJECT_VER
$MVN_GET_CMD -DremoteRepositories=$PROJECT_TEST_REPO \
-    -DgroupId=com.nvidia -DartifactId=rapids-4-spark-udf-examples -Dversion=$PROJECT_TEST_VER
+    -DgroupId=com.nvidia -DartifactId=rapids-4-spark-udf-examples_$SCALA_BINARY_VER -Dversion=$PROJECT_TEST_VER
$MVN_GET_CMD -DremoteRepositories=$PROJECT_TEST_REPO \
-DgroupId=com.nvidia -DartifactId=rapids-4-spark-integration-tests_$SCALA_BINARY_VER -Dversion=$PROJECT_TEST_VER
if [ "$CUDA_CLASSIFIER"x == x ];then
Expand All @@ -42,7 +42,7 @@ else
CUDF_JAR="$ARTF_ROOT/cudf-$CUDF_VER-$CUDA_CLASSIFIER.jar"
fi
RAPIDS_PLUGIN_JAR="$ARTF_ROOT/rapids-4-spark_${SCALA_BINARY_VER}-$PROJECT_VER.jar"
-RAPIDS_UDF_JAR="$ARTF_ROOT/rapids-4-spark-udf-examples-$PROJECT_TEST_VER.jar"
+RAPIDS_UDF_JAR="$ARTF_ROOT/rapids-4-spark-udf-examples_${SCALA_BINARY_VER}-$PROJECT_TEST_VER.jar"
RAPIDS_TEST_JAR="$ARTF_ROOT/rapids-4-spark-integration-tests_${SCALA_BINARY_VER}-$PROJECT_TEST_VER.jar"

$MVN_GET_CMD -DremoteRepositories=$PROJECT_TEST_REPO \
4 changes: 2 additions & 2 deletions sql-plugin/src/main/java/com/nvidia/spark/RapidsUDF.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2020, NVIDIA CORPORATION.
+ * Copyright (c) 2020-2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -18,7 +18,7 @@

import ai.rapids.cudf.ColumnVector;

-/** A RAPIDS-accelerated version of a user-defined function (UDF). */
+/** A RAPIDS accelerated version of a user-defined function (UDF). */
public interface RapidsUDF {
/**
* Evaluate a user-defined function with RAPIDS cuDF columnar inputs
@@ -2192,11 +2192,13 @@ object GpuOverrides {
     (a, conf, p, r) => new UnaryExprMeta[MakeDecimal](a, conf, p, r) {
       override def convertToGpu(child: Expression): GpuExpression =
         GpuMakeDecimal(child, a.precision, a.scale, a.nullOnOverflow)
-      })
+      }),
+    GpuScalaUDF.exprMeta
   ).map(r => (r.getClassFor.asSubclass(classOf[Expression]), r)).toMap
 
+  // Shim expressions should be last to allow overrides with shim-specific versions
   val expressions: Map[Class[_ <: Expression], ExprRule[_ <: Expression]] =
-    commonExpressions ++ ShimLoader.getSparkShims.getExprs ++ GpuHiveOverrides.exprs
+    commonExpressions ++ GpuHiveOverrides.exprs ++ ShimLoader.getSparkShims.getExprs
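The reordering in this hunk matters because the expression-rule maps are merged left to right and later entries win on duplicate keys, so shim-specific rules placed last can override both the common and Hive rules. A small stand-in illustrating the merge semantics (Scala's `Map ++` behaves like the repeated `putAll` below; all key and value names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Merging rule maps left to right: entries added later overwrite
// earlier ones, mirroring commonExpressions ++ GpuHiveOverrides.exprs
// ++ ShimLoader.getSparkShims.getExprs in the hunk above.
public class OverridePrecedence {
    public static Map<String, String> merge(Map<String, String> common,
                                            Map<String, String> hive,
                                            Map<String, String> shim) {
        Map<String, String> merged = new HashMap<>(common);
        merged.putAll(hive);
        merged.putAll(shim); // shim entries win on key collisions
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> common = new HashMap<>();
        common.put("Abs", "common rule");
        Map<String, String> hive = new HashMap<>();
        hive.put("HiveSimpleUDF", "hive rule");
        Map<String, String> shim = new HashMap<>();
        shim.put("Abs", "shim rule"); // shim-specific replacement
        // prints: shim rule
        System.out.println(merge(common, hive, shim).get("Abs"));
    }
}
```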


def wrapScan[INPUT <: Scan](
Expand Down
