Fix Failing Test test_multi_table_hash_join for Databricks 13.3 #9491

Closed
razajafri opened this issue Oct 19, 2023 · 0 comments · Fixed by #9637
Assignees: razajafri
Labels: task (Work required that improves the product but is not user facing)

razajafri (Collaborator) commented Oct 19, 2023

Running test_multi_table_hash_join on Databricks 13.3 fails with the following exception:

E                   py4j.protocol.Py4JJavaError: An error occurred while calling o567.collectToPython.
E                   : java.lang.IllegalStateException: the broadcast must be on the GPU too
E                       at com.nvidia.spark.rapids.shims.GpuBroadcastJoinMeta.verifyBuildSideWasReplaced(GpuBroadcastJoinMeta.scala:69)
E                       at org.apache.spark.sql.rapids.execution.GpuBroadcastHashJoinMeta.convertToGpu(GpuBroadcastHashJoinExec.scala:58)
E                       at org.apache.spark.sql.rapids.execution.GpuBroadcastHashJoinMeta.convertToGpu(GpuBroadcastHashJoinExec.scala:39)
E                       at com.nvidia.spark.rapids.SparkPlanMeta.convertIfNeeded(RapidsMeta.scala:799)
E                       at com.nvidia.spark.rapids.GpuOverrides$.com$nvidia$spark$rapids$GpuOverrides$$doConvertPlan(GpuOverrides.scala:4278)
E                       at com.nvidia.spark.rapids.GpuOverrides.applyOverrides(GpuOverrides.scala:4623)
E                       at com.nvidia.spark.rapids.GpuOverrides.$anonfun$applyWithContext$3(GpuOverrides.scala:4483)
E                       at com.nvidia.spark.rapids.GpuOverrides$.logDuration(GpuOverrides.scala:452)
E                       at com.nvidia.spark.rapids.GpuOverrides.$anonfun$applyWithContext$1(GpuOverrides.scala:4480)
E                       at com.nvidia.spark.rapids.GpuOverrideUtil$.$anonfun$tryOverride$1(GpuOverrides.scala:4446)
E                       at com.nvidia.spark.rapids.GpuOverrides.applyWithContext(GpuOverrides.scala:4500)
E                       at com.nvidia.spark.rapids.GpuQueryStagePrepOverrides.$anonfun$apply$1(GpuOverrides.scala:4463)
E                       at com.nvidia.spark.rapids.GpuOverrideUtil$.$anonfun$tryOverride$1(GpuOverrides.scala:4446)
E                       at com.nvidia.spark.rapids.GpuQueryStagePrepOverrides.apply(GpuOverrides.scala:4466)
E                       at com.nvidia.spark.rapids.GpuQueryStagePrepOverrides.apply(GpuOverrides.scala:4459)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec$.$anonfun$executePhysicalRules$2(AdaptiveSparkPlanExec.scala:1545)
E                       at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec$.$anonfun$executePhysicalRules$1(AdaptiveSparkPlanExec.scala:1544)
E                       at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
E                       at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
E                       at scala.collection.immutable.List.foldLeft(List.scala:91)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec$.executePhysicalRules(AdaptiveSparkPlanExec.scala:1542)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec$.$anonfun$applyPhysicalRules$2(AdaptiveSparkPlanExec.scala:1530)
E                       at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
E                       at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:396)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec$.executePhase(AdaptiveSparkPlanExec.scala:1510)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec$.applyPhysicalRules(AdaptiveSparkPlanExec.scala:1530)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.reOptimize(AdaptiveSparkPlanExec.scala:1285)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$withFinalPlanUpdate$3(AdaptiveSparkPlanExec.scala:651)
E                       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
E                       at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:166)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$withFinalPlanUpdate$2(AdaptiveSparkPlanExec.scala:565)
E                       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
E                       at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1113)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$withFinalPlanUpdate$1(AdaptiveSparkPlanExec.scala:563)
E                       at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
E                       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.withFinalPlanUpdate(AdaptiveSparkPlanExec.scala:558)
E                       at org.apache.spark.sql.execution.qrc.ResultCacheManager.computeResult(ResultCacheManager.scala:563)
E                       at org.apache.spark.sql.execution.qrc.ResultCacheManager.$anonfun$getOrComputeResultInternal$1(ResultCacheManager.scala:426)
E                       at scala.Option.getOrElse(Option.scala:189)
E                       at org.apache.spark.sql.execution.qrc.ResultCacheManager.getOrComputeResultInternal(ResultCacheManager.scala:419)
E                       at org.apache.spark.sql.execution.qrc.ResultCacheManager.getOrComputeResult(ResultCacheManager.scala:313)
E                       at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeCollectResult$1(SparkPlan.scala:519)
E                       at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
E                       at org.apache.spark.sql.execution.SparkPlan.executeCollectResult(SparkPlan.scala:516)
E                       at org.apache.spark.sql.Dataset.$anonfun$collectToPython$1(Dataset.scala:4271)
E                       at org.apache.spark.sql.Dataset.$anonfun$withAction$3(Dataset.scala:4544)
E                       at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:935)
E                       at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:4542)
E                       at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:274)
E                       at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:498)
E                       at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:201)
E                       at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1113)
E                       at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:151)
E                       at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:447)
E                       at org.apache.spark.sql.Dataset.withAction(Dataset.scala:4542)
E                       at org.apache.spark.sql.Dataset.collectToPython(Dataset.scala:4269)
E                       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
E                       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
E                       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
E                       at java.lang.reflect.Method.invoke(Method.java:498)
E                       at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
E                       at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
E                       at py4j.Gateway.invoke(Gateway.java:306)
E                       at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
E                       at py4j.commands.CallCommand.execute(CallCommand.java:79)
E                       at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
E                       at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
E                       at java.lang.Thread.run(Thread.java:750)
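
For context, the exception comes from a sanity guard: when GpuBroadcastHashJoinMeta converts a join to the GPU, it verifies that the broadcast exchange feeding the build side was itself replaced with the GPU version. Below is a minimal, hypothetical Scala sketch of that kind of invariant check; the plan types are illustrative stand-ins, not the actual spark-rapids classes.

object BroadcastGuardSketch {
  // Stand-in plan nodes; in reality these are Spark physical plan classes.
  sealed trait Plan
  case object GpuBroadcastExchange extends Plan // broadcast replaced for the GPU
  case object CpuBroadcastExchange extends Plan // broadcast left on the CPU

  // Models the shape of GpuBroadcastJoinMeta.verifyBuildSideWasReplaced: if the
  // join is going to the GPU, its broadcast build side must be GPU-backed too.
  def verifyBuildSideWasReplaced(buildSide: Plan): Unit = buildSide match {
    case GpuBroadcastExchange => () // consistent plan, nothing to do
    case _ =>
      throw new IllegalStateException("the broadcast must be on the GPU too")
  }

  def main(args: Array[String]): Unit =
    verifyBuildSideWasReplaced(CpuBroadcastExchange) // triggers the same message
}

The reOptimize frame in the trace suggests the inconsistency arises during AQE plan re-optimization on Databricks 13.3: the join is tagged for GPU conversion while its broadcast build side is not.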

To reproduce, apply this patch and run jenkins/databricks/test.sh:

diff --git a/jenkins/databricks/test.sh b/jenkins/databricks/test.sh
index 6c96e45ff..75130fae2 100755
--- a/jenkins/databricks/test.sh
+++ b/jenkins/databricks/test.sh
@@ -84,28 +84,9 @@ rapids_shuffle_smoke_test() {
 }
 
 ## limit parallelism to avoid OOM kill
-export TEST_PARALLEL=${TEST_PARALLEL:-4}
+export TEST_PARALLEL=${TEST_PARALLEL:-1}
 
 if [[ $TEST_MODE == "DEFAULT" ]]; then
-    bash integration_tests/run_pyspark_from_build.sh --runtime_env="databricks" --test_type=$TEST_TYPE
+    bash integration_tests/run_pyspark_from_build.sh --runtime_env="databricks" --test_type=$TEST_TYPE -k test_multi_table_hash_join 
 
-    ## Run cache tests
-    if [[ "$IS_SPARK_321_OR_LATER" -eq "1" ]]; then
-        PYSP_TEST_spark_sql_cache_serializer=${PCBS_CONF} \
-            bash integration_tests/run_pyspark_from_build.sh --runtime_env="databricks" --test_type=$TEST_TYPE -k cache_test
-    fi
-fi
-
-## Run tests with jars building from the spark-rapids source code
-if [ "$(pwd)" == "$SOURCE_PATH" ]; then
-    if [[ "$TEST_MODE" == "DEFAULT" || "$TEST_MODE" == "DELTA_LAKE_ONLY" ]]; then
-        ## Run Delta Lake tests
-        SPARK_SUBMIT_FLAGS="$SPARK_CONF $DELTA_LAKE_CONFS" TEST_PARALLEL=1 \
-            bash integration_tests/run_pyspark_from_build.sh --runtime_env="databricks"  -m "delta_lake" --delta_lake --test_type=$TEST_TYPE
-    fi
-
-    if [[ "$TEST_MODE" == "DEFAULT" || "$TEST_MODE" == "MULTITHREADED_SHUFFLE" ]]; then
-        ## Mutithreaded Shuffle test
-        rapids_shuffle_smoke_test
-    fi
 fi
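
For reference, here is a hypothetical sketch of the query shape a multi-table hash join test exercises: one larger table joined against several small tables, so the plan contains multiple broadcast hash joins. This is not the actual test_multi_table_hash_join source; it assumes a SparkSession named spark running with the RAPIDS Accelerator enabled.

// Hypothetical multi-table broadcast join shape, not the real integration test.
val fact = spark.range(1000000L).withColumnRenamed("id", "key")
val dim1 = spark.range(100L).withColumnRenamed("id", "k1")
val dim2 = spark.range(100L).withColumnRenamed("id", "k2")
val joined = fact
  .join(dim1, fact("key") === dim1("k1")) // small side eligible for broadcast
  .join(dim2, fact("key") === dim2("k2")) // a second broadcast hash join
joined.collect() // forces execution, where AQE plan conversion (and the guard) runs

Because Spark evaluates lazily, the guard only fires at action time during AQE re-optimization (per the stack trace), not when the query is defined.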
razajafri changed the title from "Fix Integration Test Failures related to join_test.py" to "Fix Integration Test Failures related to join_test.py for Databricks 13.3" on Oct 19, 2023
razajafri changed the title from "Fix Integration Test Failures related to join_test.py for Databricks 13.3" to "Fix Failing Test test_multi_table_hash_join for Databricks 13.3" on Oct 19, 2023
sameerz added the task label (Work required that improves the product but is not user facing) on Oct 24, 2023
razajafri self-assigned this on Oct 27, 2023