Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to cudf 0.18-SNAPSHOT #1368

Merged
merged 1 commit into from
Dec 10, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/configs.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ The following is the list of options that `rapids-plugin-4-spark` supports.
On startup use: `--conf [conf key]=[conf value]`. For example:

```
${SPARK_HOME}/bin/spark --jars 'rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.17-SNAPSHOT-cuda10-1.jar' \
${SPARK_HOME}/bin/spark --jars 'rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar' \
--conf spark.plugins=com.nvidia.spark.SQLPlugin \
--conf spark.rapids.sql.incompatibleOps.enabled=true
```
Expand Down
2 changes: 1 addition & 1 deletion docs/get-started/Dockerfile.cuda
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ COPY spark-3.0.1-bin-hadoop3.2/examples /opt/spark/examples
COPY spark-3.0.1-bin-hadoop3.2/kubernetes/tests /opt/spark/tests
COPY spark-3.0.1-bin-hadoop3.2/data /opt/spark/data

COPY cudf-0.17-SNAPSHOT-cuda10-1.jar /opt/sparkRapidsPlugin
COPY cudf-0.18-SNAPSHOT-cuda10-1.jar /opt/sparkRapidsPlugin
COPY rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar /opt/sparkRapidsPlugin
COPY getGpusResources.sh /opt/sparkRapidsPlugin

Expand Down
4 changes: 2 additions & 2 deletions docs/get-started/getting-started-on-prem.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,15 +55,15 @@ CUDA and will not run on other versions. The jars use a maven classifier to keep
- CUDA 11.0 => classifier cuda11

For example, here is a sample version of the jars and cudf with CUDA 10.1 support:
- cudf-0.17-SNAPSHOT-cuda10-1.jar
- cudf-0.18-SNAPSHOT-cuda10-1.jar
- rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar


For simplicity export the location to these jars. This example assumes the sample jars above have
been placed in the `/opt/sparkRapidsPlugin` directory:
```shell
export SPARK_RAPIDS_DIR=/opt/sparkRapidsPlugin
export SPARK_CUDF_JAR=${SPARK_RAPIDS_DIR}/cudf-0.17-SNAPSHOT-cuda10-1.jar
export SPARK_CUDF_JAR=${SPARK_RAPIDS_DIR}/cudf-0.18-SNAPSHOT-cuda10-1.jar
export SPARK_RAPIDS_PLUGIN_JAR=${SPARK_RAPIDS_DIR}/rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar
```

Expand Down
6 changes: 3 additions & 3 deletions integration_tests/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -128,7 +128,7 @@ Most clusters probably will not have the RAPIDS plugin installed in the cluster
If you just want to verify the SQL replacement is working you will need to add the `rapids-4-spark` and `cudf` jars to your `spark-submit` command.

```
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.17-SNAPSHOT.jar" ./runtests.py
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT.jar" ./runtests.py
```

You don't have to enable the plugin for this to work, the test framework will do that for you.
Expand Down Expand Up @@ -180,7 +180,7 @@ The TPCxBB, TPCH, TPCDS, and Mortgage tests in this framework can be enabled by
As an example, here is the `spark-submit` command with the TPCxBB parameters:

```
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.17-SNAPSHOT.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" ./runtests.py --tpcxbb_format="csv" --tpcxbb_path="/path/to/tpcxbb/csv"
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" ./runtests.py --tpcxbb_format="csv" --tpcxbb_path="/path/to/tpcxbb/csv"
```

Be aware that running these tests with read data requires at least an entire GPU, and preferable several GPUs/executors
Expand Down Expand Up @@ -209,7 +209,7 @@ To run cudf_udf tests, need following configuration changes:
As an example, here is the `spark-submit` command with the cudf_udf parameter:

```
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.17-SNAPSHOT.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" --conf spark.rapids.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.concurrentPythonWorkers=2 --py-files "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar" --conf spark.executorEnv.PYTHONPATH="rapids-4-spark_2.12-0.2.0-SNAPSHOT.jar" ./runtests.py --cudf_udf
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT.jar,rapids-4-spark-tests_2.12-0.4.0-SNAPSHOT.jar" --conf spark.rapids.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.concurrentPythonWorkers=2 --py-files "rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar" --conf spark.executorEnv.PYTHONPATH="rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar" ./runtests.py --cudf_udf
```

## Writing tests
Expand Down
2 changes: 1 addition & 1 deletion jenkins/Dockerfile-blossom.integration.centos7
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@
#
# Arguments:
# CUDA_VER=10.1 or 10.2
# CUDF_VER=0.16 or 0.17-SNAPSHOT
# CUDF_VER=0.16, 0.17-SNAPSHOT, or 0.18-SNAPSHOT
# URM_URL=<maven repo url>
###
ARG CUDA_VER=10.1
Expand Down
2 changes: 1 addition & 1 deletion jenkins/printJarVersion.sh
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ function print_ver(){
SERVER_ID=$5

# Collect snapshot dependency info only in Jenkins build
# In dev build, print 'SNAPSHOT' tag without time stamp, e.g.: cudf-0.17-SNAPSHOT.jar
# In dev build, print 'SNAPSHOT' tag without time stamp, e.g.: cudf-0.18-SNAPSHOT.jar
if [[ "$VERSION" == *"-SNAPSHOT" && -n "$JENKINS_URL" ]]; then
PREFIX=${VERSION%-SNAPSHOT}
# List the latest SNAPSHOT jar file in the maven repo
Expand Down
2 changes: 1 addition & 1 deletion jenkins/version-def.sh
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ for VAR in $OVERWRITE_PARAMS;do
done
IFS=$PRE_IFS

CUDF_VER=${CUDF_VER:-"0.17-SNAPSHOT"}
CUDF_VER=${CUDF_VER:-"0.18-SNAPSHOT"}
CUDA_CLASSIFIER=${CUDA_CLASSIFIER:-"cuda10-1"}
PROJECT_VER=${PROJECT_VER:-"0.4.0-SNAPSHOT"}
SPARK_VER=${SPARK_VER:-"3.0.0"}
Expand Down
2 changes: 1 addition & 1 deletion pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -143,7 +143,7 @@
<maven.compiler.target>1.8</maven.compiler.target>
<spark.version>3.0.0</spark.version>
<cuda.version>cuda10-1</cuda.version>
<cudf.version>0.17-SNAPSHOT</cudf.version>
<cudf.version>0.18-SNAPSHOT</cudf.version>
<scala.binary.version>2.12</scala.binary.version>
<scala.version>2.12.8</scala.version>
<orc.version>1.5.8</orc.version>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -824,7 +824,7 @@ object RapidsConf {
|On startup use: `--conf [conf key]=[conf value]`. For example:
|
|```
|${SPARK_HOME}/bin/spark --jars 'rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.17-SNAPSHOT-cuda10-1.jar' \
|${SPARK_HOME}/bin/spark --jars 'rapids-4-spark_2.12-0.4.0-SNAPSHOT.jar,cudf-0.18-SNAPSHOT-cuda10-1.jar' \
|--conf spark.plugins=com.nvidia.spark.SQLPlugin \
|--conf spark.rapids.sql.incompatibleOps.enabled=true
|```
Expand Down