Build against specified spark-rapids-jni snapshot jar [skip ci] (#5501)
* Build against specified spark-rapids-jni snapshot jar

To fix issue #5294

Build the plugin against the specified spark-rapids-jni snapshot jar, to avoid pulling in
different versions of the spark-rapids-jni dependency jars during compilation.

Signed-off-by: Tim Liu <timl@nvidia.com>

* Support Maven build options for the Databricks nightly build and the build/buildall script

Signed-off-by: Tim Liu <timl@nvidia.com>

* Update according to review

1. Add a test case matching an expected cudf timestamp-seq version name against an actual
    SNAPSHOT version, e.g. the expected cudf version "7.0.1-20220101.001122-123" is satisfied
    by the actual "7.0.1-SNAPSHOT".
2. Use $MVN instead of "mvn" for every Maven invocation, so that the Maven options apply to
    all mvn commands.

Signed-off-by: Tim Liu <timl@nvidia.com>

* Fix scalastyle check failure: line length exceeds 100 characters

Signed-off-by: Tim Liu <timl@nvidia.com>

* Allow M2DIR to be overridden from outside the script

Signed-off-by: Tim Liu <timl@nvidia.com>

* Copyright 2022

Signed-off-by: Tim Liu <timl@nvidia.com>
NvTimLiu authored May 19, 2022
1 parent 049bbd4 commit c1ee8dc
Showing 5 changed files with 37 additions and 20 deletions.
15 changes: 10 additions & 5 deletions build/buildall
@@ -47,6 +47,8 @@ function print_usage() {
echo " Build in parallel, N (4 by default) is passed via -P to xargs"
echo " --install"
echo " Install the resulting jar instead of just building it"
echo " -o=MVN_OPT, --option=MVN_OPT"
echo " use this option to build project with maven. E.g., --option='-Dcudf.version=cuda11'"
}

function bloopInstall() {
@@ -60,7 +62,7 @@ function bloopInstall() {
mkdir -p "$bloop_config_dir"
rm -f "$bloop_config_dir"/*

mvn install ch.epfl.scala:maven-bloop_${BLOOP_SCALA_VERSION}:${BLOOP_VERSION}:bloopInstall -pl dist -am \
$MVN install ch.epfl.scala:maven-bloop_${BLOOP_SCALA_VERSION}:${BLOOP_VERSION}:bloopInstall -pl dist -am \
-Dbloop.configDirectory="$bloop_config_dir" \
-DdownloadSources=true \
-Dbuildver="$bv" \
@@ -139,6 +141,9 @@ shift

done

# include user-supplied options in all mvn commands
export MVN="mvn ${MVN_OPT}"

DIST_PROFILE=${DIST_PROFILE:-"noSnapshots"}
[[ "$MODULE" != "" ]] && MODULE_OPT="--projects $MODULE --also-make" || MODULE_OPT=""

@@ -203,12 +208,12 @@ export BUILD_PARALLEL=${BUILD_PARALLEL:-4}

if [[ "$SKIP_CLEAN" != "1" ]]; then
echo Clean once across all modules
mvn -q clean
$MVN -q clean
fi

echo "Building a combined dist jar with Shims for ${SPARK_SHIM_VERSIONS[@]} ..."

export MVN_BASE_DIR=$(mvn help:evaluate -Dexpression=project.basedir -q -DforceStdout)
export MVN_BASE_DIR=$($MVN help:evaluate -Dexpression=project.basedir -q -DforceStdout)

function build_single_shim() {
set -x
@@ -234,7 +239,7 @@ function build_single_shim() {
fi

echo "#### REDIRECTING mvn output to $LOG_FILE ####"
mvn -U "$MVN_PHASE" \
$MVN -U "$MVN_PHASE" \
-DskipTests \
-Dbuildver="$BUILD_VER" \
-Drat.skip="$SKIP_CHECKS" \
@@ -270,7 +275,7 @@ time (
# a negligible increase of the build time by ~2 seconds.
joinShimBuildFrom="aggregator"
echo "Resuming from $joinShimBuildFrom build only using $BASE_VER"
mvn $FINAL_OP -rf $joinShimBuildFrom $MODULE_OPT $MVN_PROFILE_OPT $INCLUDED_BUILDVERS_OPT \
$MVN $FINAL_OP -rf $joinShimBuildFrom $MODULE_OPT $MVN_PROFILE_OPT $INCLUDED_BUILDVERS_OPT \
-Dbuildver="$BASE_VER" \
-DskipTests -Dskip -Dmaven.javadoc.skip
)
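The pattern the commit introduces in `build/buildall` is simple: capture the `-o=`/`--option=` value, export `MVN="mvn ${MVN_OPT}"`, and replace every bare `mvn` with `$MVN`. A minimal sketch of that composition (the parsing details and the `-Dcudf.version` value are illustrative, not the script's exact code):

```shell
#!/usr/bin/env bash
# Sketch: how a buildall-style script could consume -o/--option
# and compose the $MVN command used by all later Maven invocations.

parse_mvn_opt() {
  local opt=""
  for arg in "$@"; do
    case "$arg" in
      # accept -o=MVN_OPT or --option=MVN_OPT; keep everything after '='
      -o=*|--option=*) opt="${arg#*=}" ;;
    esac
  done
  printf '%s' "$opt"
}

MVN_OPT="$(parse_mvn_opt --option='-Dcudf.version=cuda11')"
MVN="mvn ${MVN_OPT}"   # later steps run $MVN ... instead of bare mvn
echo "$MVN"            # mvn -Dcudf.version=cuda11
```

Because `$MVN` is exported once and reused, a single `--option` (for example pinning a spark-rapids-jni snapshot version) reaches every `clean`, `install`, and `deploy` step without editing each command.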
8 changes: 4 additions & 4 deletions jenkins/databricks/build.sh
@@ -20,9 +20,9 @@ set -ex
SPARKSRCTGZ=$1
# version of Apache Spark we are building against
BASE_SPARK_VERSION=$2
BUILD_PROFILES=$3
MVN_OPT=$3
BASE_SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS=$4
BUILD_PROFILES=${BUILD_PROFILES:-'databricks312,!snapshot-shims'}
MVN_OPT=${MVN_OPT:-''}
BASE_SPARK_VERSION=${BASE_SPARK_VERSION:-'3.1.2'}
BUILDVER=$(echo ${BASE_SPARK_VERSION} | sed 's/\.//g')db
# the version of Spark used when we install the Databricks jars in .m2
@@ -34,7 +34,7 @@ SPARK_MAJOR_VERSION_STRING=spark_${SPARK_MAJOR_VERSION_NUM_STRING}

echo "tgz is $SPARKSRCTGZ"
echo "Base Spark version is $BASE_SPARK_VERSION"
echo "build profiles $BUILD_PROFILES"
echo "Maven options $MVN_OPT"
echo "BASE_SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS is $BASE_SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS"

sudo apt install -y maven rsync
@@ -442,7 +442,7 @@ mvn -B install:install-file \
-Dversion=$SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS \
-Dpackaging=jar

mvn -B -Ddatabricks -Dbuildver=$BUILDVER clean package -DskipTests
mvn -B -Ddatabricks -Dbuildver=$BUILDVER clean package -DskipTests $MVN_OPT

cd /home/ubuntu
tar -zcf spark-rapids-built.tgz spark-rapids
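In the Databricks script the Maven options arrive as the third positional argument rather than via a flag, defaulting to empty when the caller passes nothing. A short sketch of that handling (the tarball name and the property value are illustrative):

```shell
#!/usr/bin/env bash
# Sketch: positional-argument handling as in jenkins/databricks/build.sh.
# Argument values here are illustrative only.
set -- spark.tgz 3.1.2 '-Dspark-rapids-jni.version=x' 3.1.2

SPARKSRCTGZ=$1
BASE_SPARK_VERSION=$2
MVN_OPT=$3
MVN_OPT=${MVN_OPT:-''}   # empty string when the caller omits the argument

# the options are appended to the package command
echo "mvn -B -Ddatabricks clean package -DskipTests ${MVN_OPT}"
```

The `${MVN_OPT:-''}` default keeps existing Jenkins jobs working unchanged: callers that never passed a third argument get the same plain `mvn` invocation as before.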
21 changes: 12 additions & 9 deletions jenkins/spark-nightly-build.sh
@@ -20,12 +20,15 @@ set -ex
. jenkins/version-def.sh

## export 'M2DIR' so that shims can get the correct Spark dependency info
export M2DIR="$WORKSPACE/.m2"
export M2DIR=${M2DIR:-"$WORKSPACE/.m2"}

## MVN_OPT: extra Maven options, e.g. MVN_OPT='-Dspark-rapids-jni.version=xxx' to pin the spark-rapids-jni dependency version
MVN="mvn ${MVN_OPT}"

TOOL_PL=${TOOL_PL:-"tools"}
DIST_PL="dist"
function mvnEval {
mvn help:evaluate -q -pl $DIST_PL $MVN_URM_MIRROR -Prelease311 -Dmaven.repo.local=$M2DIR -Dcuda.version=$CUDA_CLASSIFIER -DforceStdout -Dexpression=$1
$MVN help:evaluate -q -pl $DIST_PL $MVN_URM_MIRROR -Prelease311 -Dmaven.repo.local=$M2DIR -Dcuda.version=$CUDA_CLASSIFIER -DforceStdout -Dexpression=$1
}

ART_ID=$(mvnEval project.artifactId)
@@ -64,7 +67,7 @@ function distWithReducedPom {
;;
esac

mvn -B $mvnCmd $MVN_URM_MIRROR \
$MVN -B $mvnCmd $MVN_URM_MIRROR \
-Dcuda.version=$CUDA_CLASSIFIER \
-Dmaven.repo.local=$M2DIR \
-Dfile="${DIST_FPATH}.jar" \
@@ -76,9 +79,9 @@
}

# build the Spark 2.x explain jar
mvn -B $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR -Dbuildver=24X clean install -DskipTests
$MVN -B $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR -Dbuildver=24X clean install -DskipTests
[[ $SKIP_DEPLOY != 'true' ]] && \
mvn -B deploy $MVN_URM_MIRROR \
$MVN -B deploy $MVN_URM_MIRROR \
-Dmaven.repo.local=$M2DIR \
-DskipTests \
-Dbuildver=24X
@@ -89,19 +92,19 @@ mvn -B $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR -Dbuildver=24X clean install -D
# Deploy jars unless SKIP_DEPLOY is 'true'

for buildver in "${SPARK_SHIM_VERSIONS[@]:1}"; do
mvn -U -B clean install -pl '!tools' $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR \
$MVN -U -B clean install -pl '!tools' $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR \
-Dcuda.version=$CUDA_CLASSIFIER \
-Dbuildver="${buildver}"
distWithReducedPom "install"
[[ $SKIP_DEPLOY != 'true' ]] && \
mvn -B deploy -pl '!tools,!dist' $MVN_URM_MIRROR \
$MVN -B deploy -pl '!tools,!dist' $MVN_URM_MIRROR \
-Dmaven.repo.local=$M2DIR \
-Dcuda.version=$CUDA_CLASSIFIER \
-DskipTests \
-Dbuildver="${buildver}"
done

mvn -B clean install -pl '!tools' \
$MVN -B clean install -pl '!tools' \
$DIST_PROFILE_OPT \
-Dbuildver=$SPARK_BASE_SHIM_VERSION \
$MVN_URM_MIRROR \
@@ -115,7 +118,7 @@ if [[ $SKIP_DEPLOY != 'true' ]]; then
distWithReducedPom "deploy"

# this deploy includes 'tools' that is unconditionally built with Spark 3.1.1
mvn -B deploy -pl '!dist' \
$MVN -B deploy -pl '!dist' \
-Dbuildver=$SPARK_BASE_SHIM_VERSION \
$MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR \
-Dcuda.version=$CUDA_CLASSIFIER \
@@ -313,11 +313,18 @@ object RapidsExecutorPlugin {
* patch version then the actual patch version must be greater than or equal.
* For example, version 7.1 is not satisfied by version 7.2, but version 7.1 is satisfied by
* version 7.1.1.
* If the expected cudf version is a timestamped snapshot ('timestamp-seq') version, then it
* is satisfied by the corresponding SNAPSHOT version.
* For example, version 7.1-yyyymmdd.hhmmss-seq is satisfied by version 7.1-SNAPSHOT.
*/
def cudfVersionSatisfied(expected: String, actual: String): Boolean = {
val expHyphen = if (expected.indexOf('-') >= 0) expected.indexOf('-') else expected.length
val actHyphen = if (actual.indexOf('-') >= 0) actual.indexOf('-') else actual.length
if (actual.substring(actHyphen) != expected.substring(expHyphen)) return false
if (actual.substring(actHyphen) != expected.substring(expHyphen) &&
!(actual.substring(actHyphen) == "-SNAPSHOT" &&
expected.substring(expHyphen).matches("-([0-9]{8})\\.([0-9]{6})-([1-9][0-9]*)"))) {
return false
}

val (expMajorMinor, expPatch) = expected.substring(0, expHyphen).split('.').splitAt(2)
val (actMajorMinor, actPatch) = actual.substring(0, actHyphen).split('.').splitAt(2)
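The suffix rule the Scala change adds is deliberately one-directional: an actual `-SNAPSHOT` build satisfies an expected timestamp-seq version, but not the reverse. A bash re-implementation of just that suffix check (a sketch for illustration, not the project's code) makes the asymmetry easy to see:

```shell
#!/usr/bin/env bash
# Sketch: re-implementation of the suffix rule from cudfVersionSatisfied.
# Returns 0 (success) when the actual suffix satisfies the expected one.
suffix_satisfied() {
  expected_suffix=$1
  actual_suffix=$2
  # identical suffixes always match
  [ "$actual_suffix" = "$expected_suffix" ] && return 0
  # an actual -SNAPSHOT satisfies an expected -yyyymmdd.hhmmss-seq suffix
  [ "$actual_suffix" = "-SNAPSHOT" ] &&
    printf '%s' "$expected_suffix" |
      grep -Eq '^-[0-9]{8}\.[0-9]{6}-[1-9][0-9]*$'
}

suffix_satisfied "-20220101.001122-123" "-SNAPSHOT" && echo "satisfied"
suffix_satisfied "-SNAPSHOT" "-20220101.001122-123" || echo "not satisfied"
```

The direction matters because the plugin declares a concrete timestamped dependency while a locally built cudf reports `-SNAPSHOT`; accepting the reverse would let any timestamped build claim to satisfy a generic snapshot expectation.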
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2021, NVIDIA CORPORATION.
* Copyright (c) 2021-2022, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -46,5 +46,7 @@ class RapidsExecutorPluginSuite extends FunSuite {
assert(RapidsExecutorPlugin.cudfVersionSatisfied("7.0.1-special", "7.0.2-special"))
assert(!RapidsExecutorPlugin.cudfVersionSatisfied("7.0.2.2.2", "7.0.2.2"))
assert(RapidsExecutorPlugin.cudfVersionSatisfied("7.0.2.2.2", "7.0.2.2.2"))
assert(RapidsExecutorPlugin.cudfVersionSatisfied("7.0.1-20220101.001122-12", "7.0.1-SNAPSHOT"))
assert(!RapidsExecutorPlugin.cudfVersionSatisfied("7.0.1-SNAPSHOT", "7.0.1-20220101.001122-12"))
}
}
