Spark2 module upmerge, deploy script, and updates for Jenkins (#4585)
* upmerge the spark2 code to latest sql-plugin

* update compare script to error out only if real diff

* Update deploy for spark2 explain jar

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* change aggregate diff to be a file to make it easier to call from Jenkins

* Add notes on how to deal with diffs

* Update scripts/rundiffspark2.sh

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
tgravescs and jlowe authored Jan 21, 2022
1 parent 2154180 commit 93d73e2
Showing 9 changed files with 1,639 additions and 110 deletions.
18 changes: 17 additions & 1 deletion jenkins/deploy.sh
@@ -1,6 +1,6 @@
#!/bin/bash
#
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2020-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -36,6 +36,8 @@ SIGN_FILE=$1
DATABRICKS=$2
VERSIONS_BUILT=$3

export M2DIR=${M2DIR:-"$WORKSPACE/.m2"}

###### Build the path of jar(s) to be deployed ######

cd $WORKSPACE
@@ -96,3 +98,17 @@ TOOL_DOC_JARS="-Dsources=${TOOL_FPATH}-sources.jar -Djavadoc=${TOOL_FPATH}-javad
$DEPLOY_CMD -Durl=$SERVER_URL -DrepositoryId=$SERVER_ID \
$TOOL_DOC_JARS \
-Dfile=$TOOL_FPATH.jar -DpomFile=${TOOL_PL}/pom.xml

###### Deploy Spark 2.x explain meta jar ######
SPARK2_PL=${SPARK2_PL:-"spark2-sql-plugin"}
SPARK2_ART_ID=`mvn help:evaluate -q -pl $SPARK2_PL -Dexpression=project.artifactId -DforceStdout -Dbuildver=24X`
SPARK2_ART_VER=`mvn help:evaluate -q -pl $SPARK2_PL -Dexpression=project.version -DforceStdout -Dbuildver=24X`
SPARK2_FPATH="$M2DIR/repository/com/nvidia/$SPARK2_ART_ID/$SPARK2_ART_VER/$SPARK2_ART_ID-$SPARK2_ART_VER"
SPARK2_DOC_JARS="-Dsources=${SPARK2_FPATH}-sources.jar -Djavadoc=${SPARK2_FPATH}-javadoc.jar"
# a bit ugly but just hardcode to spark24 for now since only version supported
SPARK2_CLASSIFIER='spark24'
SPARK2_CLASSIFIER_JAR="${SPARK2_FPATH}-${SPARK2_CLASSIFIER}.jar"
$DEPLOY_CMD -Durl=$SERVER_URL -DrepositoryId=$SERVER_ID \
$SPARK2_DOC_JARS \
-Dclassifier=$SPARK2_CLASSIFIER \
-Dfile=$SPARK2_CLASSIFIER_JAR -DpomFile=${SPARK2_PL}/pom.xml
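
The path composition in the Spark 2.x block can be tried in isolation. The sketch below is not part of the commit: it replaces the `mvn help:evaluate` calls with placeholder values (the artifact ID, version, and `WORKSPACE` are hypothetical) and wraps the logic in an illustrative function, `spark2_classifier_jar`, to show how the `${VAR:-default}` fallback and the classifier suffix combine.

```shell
#!/bin/bash
# Hypothetical helper: prints the classifier jar path that deploy.sh would build.
spark2_classifier_jar() {
  # Placeholder inputs; in deploy.sh these come from Jenkins and `mvn help:evaluate`.
  local WORKSPACE=/tmp/workspace
  local SPARK2_ART_ID="example-artifact_2.11"   # hypothetical artifact ID
  local SPARK2_ART_VER="0.0.0-SNAPSHOT"         # hypothetical version
  local M2DIR=""                                # simulate a caller that did not set M2DIR

  # Keep the caller's M2DIR if it is set and non-empty, else default under WORKSPACE.
  M2DIR=${M2DIR:-"$WORKSPACE/.m2"}

  local SPARK2_FPATH="$M2DIR/repository/com/nvidia/$SPARK2_ART_ID/$SPARK2_ART_VER/$SPARK2_ART_ID-$SPARK2_ART_VER"
  local SPARK2_CLASSIFIER='spark24'
  # Braces go after the dollar sign: ${VAR}. Writing {$VAR} would leave
  # literal braces in the resulting path.
  echo "${SPARK2_FPATH}-${SPARK2_CLASSIFIER}.jar"
}

spark2_classifier_jar
```

Printing the path before calling the deploy command is a cheap way to confirm that the jar exists where the script expects it.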
175 changes: 94 additions & 81 deletions scripts/rundiffspark2.sh

Large diffs are not rendered by default.
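
Since the script diff is not rendered, here is a minimal sketch of the idea the commit message describes: regenerate the diff between a Spark 2 file and its sql-plugin counterpart, compare it with the stored diff, and fail only when the two differ. The function name `check_diff` and its arguments are hypothetical and are not taken from `scripts/rundiffspark2.sh`.

```shell
#!/bin/bash
# check_diff SPARK2_FILE PLUGIN_FILE EXPECTED_DIFF   (hypothetical helper)
# Returns 0 when the freshly generated diff matches the stored one, 1 otherwise.
check_diff() {
  local spark2_file=$1 plugin_file=$2 expected=$3
  local fresh status=0
  fresh=$(mktemp)

  # diff exits 1 when the files differ, which is the normal case here,
  # so its exit status is deliberately ignored.
  diff "$spark2_file" "$plugin_file" > "$fresh" || true

  # Only a change relative to the stored diff counts as a real difference.
  if ! diff -u "$expected" "$fresh"; then
    echo "Unexpected diff for $spark2_file" >&2
    status=1
  fi

  rm -f "$fresh"
  return $status
}
```

In a Jenkins job, a non-zero return from a check like this would fail the build, while differences that are already recorded in the stored `.diff` file pass silently.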

8 changes: 2 additions & 6 deletions scripts/spark2diffs/GpuHashJoin.diff
@@ -6,13 +6,9 @@
< object GpuHashJoin {
---
> object GpuHashJoin extends Arm {
72c72
< meta: RapidsMeta[_, _],
---
> meta: RapidsMeta[_, _, _],
99a100
101a102
>
120c121
122c123
< }
---
>
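
The stored `.diff` files use the default ("normal") output format of `diff`: a header such as `72c72` or `99a100` gives the line ranges in the first and second file with `c` for change, `a` for add, and `d` for delete, followed by `<` lines from the first file and `>` lines from the second. The small sketch below reproduces that format on two throwaway files; the function name `normal_diff_demo` and the file contents are made up for illustration.

```shell
#!/bin/bash
# Hypothetical demo: print a normal-format diff of two tiny files.
normal_diff_demo() {
  local d
  d=$(mktemp -d)
  printf 'object GpuHashJoin {\n}\n' > "$d/spark2.scala"
  printf 'object GpuHashJoin extends Arm {\n}\n' > "$d/plugin.scala"
  # diff exits 1 because the files differ; ignore that status.
  diff "$d/spark2.scala" "$d/plugin.scala" || true
  rm -rf "$d"
}

normal_diff_demo
# prints:
# 1c1
# < object GpuHashJoin {
# ---
# > object GpuHashJoin extends Arm {
```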
16 changes: 11 additions & 5 deletions scripts/spark2diffs/GpuSortMergeJoinMeta.diff
@@ -13,22 +13,28 @@
< parent: Option[RapidsMeta[_, _]],
---
> parent: Option[RapidsMeta[_, _, _]],
76a74,91
76a74,97
> }
>
> override def convertToGpu(): GpuExec = {
> val condition = conditionMeta.map(_.convertToGpu())
> val (joinCondition, filterCondition) = if (conditionMeta.forall(_.canThisBeAst)) {
> (condition, None)
> } else {
> (None, condition)
> }
> val Seq(left, right) = childPlans.map(_.convertIfNeeded())
> val joinExec = GpuShuffledHashJoinExec(
> leftKeys.map(_.convertToGpu()),
> rightKeys.map(_.convertToGpu()),
> join.joinType,
> buildSide,
> None,
> joinCondition,
> left,
> right,
> join.isSkewJoin)(
> join.leftKeys,
> join.rightKeys)
> // The GPU does not yet support conditional joins, so conditions are implemented
> // as a filter after the join when possible.
> condition.map(c => GpuFilterExec(c.convertToGpu(), joinExec)).getOrElse(joinExec)
> // For inner joins we can apply a post-join condition for any conditions that cannot be
> // evaluated directly in a mixed join that leverages a cudf AST expression
> filterCondition.map(c => GpuFilterExec(c, joinExec)).getOrElse(joinExec)