Create a shim for Spark 3.2.0 development #1704
Conversation
Signed-off-by: Gera Shegalov <gera@apache.org>
We also need to update the docs and create a RapidsShuffleManager.scala for 3.2.0.
I've been creating a RapidsShuffleManager.scala per version so that it's easy for the user to set it to match the Spark version they are running, see:
https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html#enabling-rapidsshufflemanager
like:
Spark 3.1.0 (com.nvidia.spark.rapids.spark310.RapidsShuffleManager)
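Following the naming pattern above, a 3.2.0 build would presumably be enabled like this (a sketch only; the `spark320` package name is inferred from the 3.1.0 example, not confirmed in this thread):

```shell
# Hypothetical configuration, assuming the 3.2.0 shim follows the same
# package-naming pattern as com.nvidia.spark.rapids.spark310:
spark-shell \
  --conf spark.shuffle.manager=com.nvidia.spark.rapids.spark320.RapidsShuffleManager
```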
$ mvn verify -Pspark320tests -Dcuda.version=cuda11
$ find . -name scala-test-output.txt | xargs grep FAILED
./tests/target/surefire-reports/scala-test-output.txt:- IGNORE ORDER, WITH DECIMALS: short reduction aggs *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by string literal *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by float and string literal *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, NOT ON GPU[HashAggregateExec,AggregateExpression,AttributeReference,Alias,Literal,Min,Sum,Max,Average,Add,Multiply,Subtract,Cast,Count], WITH DECIMALS: partial on gpu: float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, NOT ON GPU[HashAggregateExec,AggregateExpression,AttributeReference,Alias,Literal,Min,Sum,Max,Average,Add,Multiply,Subtract,Cast,Count,KnownFloatingPointNormalized,NormalizeNaNAndZero], WITH DECIMALS: final on gpu: float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: nullable float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: sum(floats) group by more_floats 2 partitions *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: empty df: reduction aggs *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- Test all supported casts with in-range values *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- Join partitioned tables *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:*** 12 TESTS FAILED ***
build
So from the above output it looks like tests are failing; did you look into why? These pass for you with spark311, right?
shims/spark320/src/main/scala/com/nvidia/spark/rapids/shims/spark320/Spark320Shims.scala
shims/spark320/src/main/scala/com/nvidia/spark/rapids/spark320/RapidsShuffleManager.scala
Signed-off-by: Gera Shegalov <gera@apache.org>
@tgravescs yes.
Status: ✔️ FIXED, including `INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by more_floats`.
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
build
@@ -1857,7 +1857,7 @@ object GpuOverrides {
      }
    }

-    override def convertToGpu(child: Expression): GpuExpression = GpuSum(child)
+    override def convertToGpu(child: Expression): GpuExpression = GpuSum(child, a.dataType)
This kind of change will also be needed for avg for decimal support; the current tests are passing. The easiest way to identify expressions requiring this treatment in Spark:
➜ spark git:(master) ag 'def dataType: DataType = resultType'
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
47: override def dataType: DataType = resultType
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
51: override def dataType: DataType = resultType
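To illustrate why wiring the explicit output type through to `GpuSum` matters: Spark's `Sum` declares its `dataType` as a `resultType` that is wider than the input type, precisely so the aggregation buffer does not overflow. A minimal, Spark-free Scala sketch of the failure mode (names and values here are illustrative, not taken from the plugin):

```scala
// Summing Int values in an Int-width accumulator wraps around, while a
// wider Long accumulator (analogous to Sum's resultType for integral
// inputs) produces the correct total. Plain Scala, no Spark required.
object SumOverflowDemo extends App {
  val values = Seq(Int.MaxValue, 1)

  val intBuffer  = values.foldLeft(0)(_ + _)   // wraps to Int.MinValue
  val longBuffer = values.foldLeft(0L)(_ + _)  // 2147483648L

  println(s"int buffer: $intBuffer, long buffer: $longBuffer")
}
```

This is the overflow the commit message below describes fixing "in aggregate buffer for GpuSum by wiring the explicit output column type".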
sql-plugin/src/main/java/com/nvidia/spark/rapids/GpuColumnVector.java
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
...park311/src/main/scala/com/nvidia/spark/rapids/shims/spark311/SparkShimServiceProvider.scala
Signed-off-by: Gera Shegalov <gera@apache.org>
shims/spark320/src/main/scala/com/nvidia/spark/rapids/shims/spark320/Spark320Shims.scala
Signed-off-by: Gera Shegalov <gera@apache.org>
build
Signed-off-by: Gera Shegalov <gera@apache.org>
Add a shim provider for Spark 3.2.0 development branch. Closes NVIDIA#1490
- fix overflows in aggregate buffer for GpuSum by wiring the explicit output column type
- unit tests for the new shim
- consolidate version profiles in the parent pom
Signed-off-by: Gera Shegalov <gera@apache.org>
Add a shim provider for Spark 3.2 development branch.
Splitting this from #1688 to reduce its scope
Closes #1490
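For context, the version-matching role of a shim service provider like the one this PR adds can be sketched as follows. The object and method names here are assumptions for illustration, not the plugin's actual API; the one grounded detail is that a development branch publishes SNAPSHOT versions, so both forms need to match:

```scala
// Hypothetical sketch of a shim provider that claims responsibility for
// the Spark 3.2.0 development branch by matching version strings.
// Names are illustrative only.
object Spark320ShimServiceProviderSketch {
  // the dev branch builds as 3.2.0-SNAPSHOT until release
  val VERSIONNAMES: Seq[String] = Seq("3.2.0", "3.2.0-SNAPSHOT")

  def matchesVersion(version: String): Boolean =
    VERSIONNAMES.contains(version)
}

object ShimDemo extends App {
  println(Spark320ShimServiceProviderSketch.matchesVersion("3.2.0-SNAPSHOT"))
  println(Spark320ShimServiceProviderSketch.matchesVersion("3.1.1"))
}
```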