
Create a shim for Spark 3.2.0 development #1704

Merged: 25 commits into NVIDIA:branch-0.5 from gerashegalov:issue-1522-shims on Mar 2, 2021

Conversation

@gerashegalov (Collaborator) commented on Feb 10, 2021:

Signed-off-by: Gera Shegalov <gera@apache.org>

Add a shim provider for the Spark 3.2.0 development branch.

Splitting this out from #1688 to reduce its scope.

Closes #1490
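
For context, a hedged sketch of what the shim provider entry point for the 3.2.0 development line amounts to. The trait, package, and version-matching logic below are simplified, hypothetical stand-ins for the plugin's actual provider interface; they only illustrate the idea of selecting a shim by Spark version at runtime.

```scala
// Hypothetical sketch only: a simplified stand-in for the plugin's shim-provider
// contract, illustrating what a "shim for Spark 3.2.0 development" selects on.
package com.nvidia.spark.rapids.shims.spark320

// stand-in for the real service-provider interface discovered at runtime
trait ShimProviderSketch {
  def matchesVersion(sparkVersion: String): Boolean
}

class SparkShimServiceProvider extends ShimProviderSketch {
  // a Spark 3.2.0 development build reports versions such as "3.2.0-SNAPSHOT"
  private val targetPrefix = "3.2.0"

  override def matchesVersion(sparkVersion: String): Boolean =
    sparkVersion.startsWith(targetPrefix)
}
```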

Signed-off-by: Gera Shegalov <gera@apache.org>
@sameerz added the build (Related to CI / CD or cleanly building) label on Feb 10, 2021
Review comments on integration_tests/run_pyspark_from_build.sh and integration_tests/pom.xml (outdated, resolved)
@tgravescs (Collaborator) left a comment:

We also need to update the docs and create a RapidsShuffleManager.scala for 3.2.0.

I've been creating a per-version RapidsShuffleManager.scala so that it's easy for the user to set the shuffle manager class to match their Spark version; see:
https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html#enabling-rapidsshufflemanager
For example, Spark 3.1.0 uses com.nvidia.spark.rapids.spark310.RapidsShuffleManager.
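
For illustration, a minimal sketch of that per-version wrapper pattern for 3.2.0, assuming a shim-internal shuffle manager implementation (stubbed out below). The package layout and base-class name are illustrative assumptions, not the plugin's actual code.

```scala
// Hypothetical sketch: a version-named RapidsShuffleManager entry point for 3.2.0.
// The base class here is a stub standing in for the shim-internal implementation.
package com.nvidia.spark.rapids.spark320

import org.apache.spark.SparkConf

// stub for the shim's internal shuffle manager (illustrative only)
class ShuffleInternalManagerStub(conf: SparkConf, isDriver: Boolean)

// Thin wrapper whose fully qualified name encodes the Spark version, so users can set
//   --conf spark.shuffle.manager=com.nvidia.spark.rapids.spark320.RapidsShuffleManager
// to match their Spark build, analogous to the 3.1.0 example above.
class RapidsShuffleManager(conf: SparkConf, isDriver: Boolean)
  extends ShuffleInternalManagerStub(conf, isDriver)
```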

Review comment on pom.xml (resolved)
@gerashegalov (Collaborator, Author) commented:

$ mvn verify -Pspark320tests -Dcuda.version=cuda11
$ find . -name scala-test-output.txt | xargs grep FAILED
./tests/target/surefire-reports/scala-test-output.txt:- IGNORE ORDER, WITH DECIMALS: short reduction aggs *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by string literal *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by float and string literal *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, NOT ON GPU[HashAggregateExec,AggregateExpression,AttributeReference,Alias,Literal,Min,Sum,Max,Average,Add,Multiply,Subtract,Cast,Count], WITH DECIMALS: partial on gpu: float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, NOT ON GPU[HashAggregateExec,AggregateExpression,AttributeReference,Alias,Literal,Min,Sum,Max,Average,Add,Multiply,Subtract,Cast,Count,KnownFloatingPointNormalized,NormalizeNaNAndZero], WITH DECIMALS: final on gpu: float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: nullable float basic aggregates group by more_floats *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: sum(floats) group by more_floats 2 partitions *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- INCOMPAT, IGNORE ORDER, WITH DECIMALS: empty df: reduction aggs *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- Test all supported casts with in-range values *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:- Join partitioned tables *** FAILED ***
./tests/target/surefire-reports/scala-test-output.txt:*** 12 TESTS FAILED ***

@gerashegalov (Collaborator, Author) commented:

build

@tgravescs (Collaborator) commented:

So I assume from the above output that tests are failing; did you look into why? These pass for you with spark311, right?

Signed-off-by: Gera Shegalov <gera@apache.org>
@gerashegalov (Collaborator, Author) commented on Feb 11, 2021:

So I assume from the above output that tests are failing; did you look into why? These pass for you with spark311, right?

@tgravescs yes, the -Pspark311tests results are green ✔️. I am going to look into the -Pspark320tests failures.

@gerashegalov gerashegalov changed the base branch from branch-0.4 to branch-0.5 February 12, 2021 20:47
@gerashegalov (Collaborator, Author) commented on Feb 13, 2021:

Status

✔️ FIXED: ClassNotFoundException: org.apache.spark.sql.catalyst.errors.package$

✔️ FIXED: IGNORE ORDER, WITH DECIMALS: short reduction aggs

  Running on the GPU and on the CPU did not match 
  CPU: WrappedArray(WrappedArray(163835, 114683, 8, 14335.375, 15.0))
  
  GPU: WrappedArray(WrappedArray(163835, -16389, 8, 14335.375, 15.0)) (SparkQueryCompareTestSuite.scala:708)

✔️ FIXED: INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by string literal

Cause: java.lang.AssertionError: Type conversion is not allowed from Table{12 columns (rows=1): INT32, FLOAT32, FLOAT64, FLOAT32, FLOAT32, FLOAT32, FLOAT32, INT64, INT64, FLOAT64, INT64, INT64} to [IntegerType, FloatType, DoubleType, FloatType, FloatType, FloatType, FloatType, DoubleType, DoubleType, DoubleType, LongType, LongType]
  at com.nvidia.spark.rapids.GpuColumnVectorFromBuffer.from(GpuColumnVectorFromBuffer.java:69)

✔️ FIXED: INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by float and string literal

Cause: java.lang.AssertionError: Type conversion is not allowed from Table{13 columns (rows=7): FLOAT32, STRING, FLOAT32, FLOAT64, FLOAT32, FLOAT32, FLOAT32, FLOAT32, INT64, INT64, FLOAT64, INT64, INT64} to [FloatType, StringType, FloatType, DoubleType, FloatType, FloatType, FloatType, FloatType, DoubleType, DoubleType, DoubleType, LongType, LongType]

✔️ FIXED: INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by floats

Cause: java.lang.AssertionError: Type conversion is not allowed from Table{12 columns (rows=6): FLOAT32, FLOAT32, FLOAT64, FLOAT32, FLOAT32, FLOAT32, FLOAT32, INT64, INT64, FLOAT64, INT64, INT64} to [FloatType, FloatType, DoubleType, FloatType, FloatType, FloatType, FloatType, DoubleType, DoubleType, DoubleType, LongType, LongType]

✔️ FIXED: INCOMPAT, IGNORE ORDER, WITH DECIMALS: float basic aggregates group by more_floats

Cause: java.lang.AssertionError: Type conversion is not allowed from Table{12 columns (rows=6): FLOAT32, FLOAT32, FLOAT64, FLOAT32, FLOAT32, FLOAT32, FLOAT32, INT64, INT64, FLOAT64, INT64, INT64} to [FloatType, FloatType, DoubleType, FloatType, FloatType, FloatType, FloatType, DoubleType, DoubleType, DoubleType, LongType, LongType]

✔️ FIXED: INCOMPAT, IGNORE ORDER, NOT ON GPU[HashAggregateExec,AggregateExpression,AttributeReference,Alias,Literal,Min,Sum,Max,Average,Add,Multiply,Subtract,Cast,Count], WITH DECIMALS: partial on gpu: float basic aggregates group by more_floats

Cause: java.lang.AssertionError: Type conversion is not allowed from Table{12 columns (rows=1): FLOAT32, FLOAT32, FLOAT64, FLOAT32, FLOAT32, FLOAT32, FLOAT32, INT64, INT64, FLOAT64, INT64, INT64} to [FloatType, FloatType, DoubleType, FloatType, FloatType, FloatType, FloatType, DoubleType, DoubleType, DoubleType, LongType, LongType]

The remaining failures are analogous; see the attached scala-test-output.txt.
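
For illustration, the repeated "Type conversion is not allowed" assertions above boil down to a per-column compatibility check: each cudf column type must line up with the Spark type the plan expects, and in these failures an INT64 aggregation column arrives where the plan expects DoubleType. Below is a hedged, self-contained sketch of such a check; the type names and method are illustrative, not the plugin's API.

```scala
// Hedged illustration of a cudf-to-Spark column type compatibility check.
// The CudfType enum below is a simplified stand-in for cudf's real DType.
import org.apache.spark.sql.types._

object TypeCompatSketch {
  sealed trait CudfType
  case object INT32   extends CudfType
  case object INT64   extends CudfType
  case object FLOAT32 extends CudfType
  case object FLOAT64 extends CudfType

  // a column can only be wrapped as the expected Spark type if the physical types agree
  def compatible(cudf: CudfType, spark: DataType): Boolean = (cudf, spark) match {
    case (INT32, IntegerType) | (INT64, LongType)
       | (FLOAT32, FloatType) | (FLOAT64, DoubleType) => true
    case _ => false
  }

  def main(args: Array[String]): Unit = {
    // the failures above: an INT64 column arriving where the plan expected DoubleType
    assert(!compatible(INT64, DoubleType))
  }
}
```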

Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
@gerashegalov (Collaborator, Author) commented:

build

@@ -1857,7 +1857,7 @@ object GpuOverrides {
}
}

-      override def convertToGpu(child: Expression): GpuExpression = GpuSum(child)
+      override def convertToGpu(child: Expression): GpuExpression = GpuSum(child, a.dataType)
@gerashegalov (Collaborator, Author) commented on the diff:

This kind of change will also be needed for Average for decimal support; the current tests are passing. The easiest way to identify expressions requiring this treatment in Spark:

➜  spark git:(master) ag 'def dataType: DataType = resultType'
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
47:  override def dataType: DataType = resultType

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
51:  override def dataType: DataType = resultType
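
To make the motivation concrete: Spark's Sum (and Average) report a dataType that is wider than the child type, so the GPU expression has to be handed that result type explicitly, which is what passing a.dataType into GpuSum above does. The sketch below paraphrases that widening rule and is illustrative rather than a quote of Spark's implementation; it also explains the earlier short reduction aggs mismatch (CPU 114683 vs GPU -16389, a 16-bit wrap-around).

```scala
// Illustrative paraphrase of how Spark widens Sum's result type relative to the
// child type; this is why the output column type must be wired through explicitly.
import org.apache.spark.sql.types._

object SumResultTypeSketch {
  def sumResultType(childType: DataType): DataType = childType match {
    // decimal sums get extra precision so the aggregation buffer does not overflow
    case DecimalType.Fixed(precision, scale) =>
      DecimalType(math.min(precision + 10, DecimalType.MAX_PRECISION), scale)
    // integral inputs (byte/short/int/long) accumulate in a 64-bit buffer
    case ByteType | ShortType | IntegerType | LongType => LongType
    // floating point and other numeric inputs accumulate as double
    case _ => DoubleType
  }

  def main(args: Array[String]): Unit = {
    // summing ShortType values in a LongType buffer avoids the wrap-around seen in
    // the "short reduction aggs" failure: 114683 - 2 * 65536 == -16389
    assert(sumResultType(ShortType) == LongType)
  }
}
```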

Review comments on shims/spark320/pom.xml (outdated, resolved)
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
@gerashegalov gerashegalov requested a review from jlowe March 1, 2021 18:40
Signed-off-by: Gera Shegalov <gera@apache.org>
@gerashegalov gerashegalov requested a review from jlowe March 1, 2021 21:00
Review comment on pom.xml (resolved)
@gerashegalov (Collaborator, Author) commented:

build

@gerashegalov gerashegalov merged commit 51049a6 into NVIDIA:branch-0.5 Mar 2, 2021
@gerashegalov gerashegalov deleted the issue-1522-shims branch March 2, 2021 02:17
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Signed-off-by: Gera Shegalov <gera@apache.org>

Add a shim provider for Spark 3.2.0 development branch. Closes NVIDIA#1490
- fix overflows in aggregate buffer for GpuSum by wiring the explicit output column type
- unit tests for the new shim
- consolidate version profiles in the parent pom
Labels
build Related to CI / CD or cleanly building
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add Apache Spark 3.2.0 shim layer
4 participants