Set spark.executor.cores for integration tests. #9177

Closed
8 changes: 8 additions & 0 deletions integration_tests/run_pyspark_from_build.sh
@@ -104,6 +104,11 @@ else
then
TEST_TAGS="-m $TEST_TAGS"
fi

# Set per-executor cores, if unspecified.
# This prevents per-thread allocations (like Parquet read buffers) from overwhelming the heap.
export PYSP_TEST_spark_executor_cores=${PYSP_TEST_spark_executor_cores:-'10'}
Collaborator
Why 10? We already have a few other places where we try to configure things for local mode, so why is the number of executor cores out of sync with LOCAL_PARALLEL or NUM_LOCAL_EXECS?

LOCAL_PARALLEL=$(( $CPU_CORES > 4 ? 4 : $CPU_CORES ))

On a side note, are the Databricks tests being run in local mode and configured badly? In a regular Databricks cluster will we also run into this type of problem? If so, this workaround feels very much like it is going in the wrong direction; we need to really fix the underlying problem ASAP instead of trying to work around it.
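
For illustration only, and not part of this change: one way to keep the default in sync with the existing local-mode cap might be to derive it from CPU_CORES the same way LOCAL_PARALLEL is derived (assuming CPU_CORES comes from nproc; the actual script may compute it differently):

# Hypothetical sketch, not from this PR: align executor cores with the
# existing local-mode parallelism cap instead of a separate hard-coded value.
CPU_CORES=$(nproc)                                    # assumed source of the core count
LOCAL_PARALLEL=$(( CPU_CORES > 4 ? 4 : CPU_CORES ))   # same cap quoted above
export PYSP_TEST_spark_executor_cores=${PYSP_TEST_spark_executor_cores:-$LOCAL_PARALLEL}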

Collaborator Author

I did see that. I figured we might consider lower parallelism for local mode than for cluster mode, and that a more appropriate number might be suggested in review. I have verified that this works with 4.


if [[ "${TEST_PARALLEL}" == "" ]];
then
# For integration tests we want to have at least
@@ -334,6 +339,7 @@ EOF

driverJavaOpts="$PYSP_TEST_spark_driver_extraJavaOptions"
gpuAllocSize="$PYSP_TEST_spark_rapids_memory_gpu_allocSize"
executorCores="$PYSP_TEST_spark_executor_cores"

# avoid double processing of variables passed to spark in
# spark_conf_init
@@ -343,11 +349,13 @@ EOF
unset PYSP_TEST_spark_jars_packages
unset PYSP_TEST_spark_jars_repositories
unset PYSP_TEST_spark_rapids_memory_gpu_allocSize
unset PYSP_TEST_spark_executor_cores

exec "$SPARK_HOME"/bin/spark-submit "${jarOpts[@]}" \
--driver-java-options "$driverJavaOpts" \
$SPARK_SUBMIT_FLAGS \
--conf 'spark.rapids.memory.gpu.allocSize='"$gpuAllocSize" \
--conf 'spark.executor.cores='"$executorCores" \
"${RUN_TESTS_COMMAND[@]}" "${TEST_COMMON_OPTS[@]}"
fi
fi
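
Usage sketch (paths and invocation assumed): with this change, the script defaults spark.executor.cores to 10 unless PYSP_TEST_spark_executor_cores is already set in the environment, and the spark-submit path forwards the value via --conf.

# Default: spark-submit is launched with --conf spark.executor.cores=10
./integration_tests/run_pyspark_from_build.sh

# Hypothetical override for a smaller machine, e.g. the value verified in review
PYSP_TEST_spark_executor_cores=4 ./integration_tests/run_pyspark_from_build.sh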