On YARN clusters, when a Spark cluster is spun up without explicitly setting `spark.executor.cores`, the `MultiFileReaderThreadPool` is initialized to use all cores on the executor host. The resulting per-thread allocations overwhelm the executor's memory allocation, and queries fail. (One example was observed in #9135.) Part of the per-thread allocation problem will be tackled in #9269, but the core issue appears to be in `RapidsPluginUtils.estimateCoresOnExec()`:

On YARN setups (and in local mode), this code falls back to `Runtime.getRuntime.availableProcessors` instead of using the default value for `EXECUTOR_CORES_KEY`. It appears that `conf.getOption(RapidsPluginUtils.EXECUTOR_CORES_KEY)` does not return the default when the key is unset.

The right fix might be to fetch the option via a `ConfigEntry` object instead of the raw conf key string.
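A minimal, self-contained sketch of the distinction being described. `Conf` and `ConfigEntry` below are stand-ins for Spark's `SparkConf` and internal `ConfigEntry` types, not the plugin's actual classes, and the default of 1 core is an assumption for illustration: a raw-key `getOption` lookup yields `None` when the key is unset, while reading through an entry that carries its default supplies that default.

```scala
// Stand-in for SparkConf: holds only the keys that were explicitly set.
final case class Conf(settings: Map[String, String]) {
  def getOption(key: String): Option[String] = settings.get(key)
}

// Stand-in for Spark's ConfigEntry: pairs a key with its default value
// and a parser, so reads always produce a usable typed value.
final case class ConfigEntry[T](key: String, default: T, parse: String => T) {
  def readFrom(conf: Conf): T =
    conf.getOption(key).map(parse).getOrElse(default)
}

object EstimateCores {
  // Hypothetical entry; the default of 1 mirrors spark.executor.cores on YARN.
  val EXECUTOR_CORES = ConfigEntry("spark.executor.cores", 1, (_: String).toInt)

  def estimateCoresOnExec(conf: Conf): Int =
    // A raw conf.getOption(key) here would return None when unset, forcing a
    // fallback such as Runtime.getRuntime.availableProcessors. Reading through
    // the entry applies the configured default instead.
    EXECUTOR_CORES.readFrom(conf)
}
```

With this pattern, an unset `spark.executor.cores` resolves to the entry's default rather than the host's core count, which is the behavior the issue asks for.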