-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use number of threads on executor instead of driver to set core count #9051
Conversation
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
build |
Co-authored-by: Thomas Graves <tgraves@apache.org>
build |
I just updated the APIs to use cores instead of tasks, because it is more accurate on what we are returning. |
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor wording nit, lgtm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My previous nit still applies, but new API names look OK.
Co-authored-by: Jason Lowe <jlowe@nvidia.com>
799fbc8
@jlowe @jbrennan333 @tgravescs please look again I committed the suggestion from @jlowe |
build |
val numThreads = Math.max(numThreadsFromConf, GpuDeviceManager.getNumCores) | ||
|
||
if (numThreadsFromConf != numThreads) { | ||
logWarning(s"Using $numThreads as the number of threads for the thread pool.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This setting is orthogonal to cores, IMO. But either way: what do we expect the users to do upon seeing this message in the log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This setting is orthogonal to cores
The reason for this originally was to avoid bad performance if a user set the thread pool size too small. The idea was that we should not have a thread pool that is smaller than the number of tasks, or else we would lose parallelism by enabling the multi-threaded reader. In this case the number of cores is being used as a stand-in for tasks, even though it is not necessarily a one to one mapping.
what do we expect the users to do upon seeing this message in the log?
I don't expect a user to do much of anything with this message right now. It is in the executor log and it is unlikely that a user will ever see it. It is more here so we have something that is equivalent to the log message on the driver, which is the one most likely to catch this. I also thought that it would be nice to know that this is happening, but you are right it is unlikely to ever be acted on. If you want me to remove it I will.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for elaborating @revans2.
It is in the executor log and it is unlikely that a user will ever see it.
I would not count on it: Devops look at logs by querying log search engines using app id using the likes of Splunk, Sumo, and Kibana rather than raw individual logs. And they could track metrics for different log categories.
I also thought that it would be nice to know that this is happening, but you are right it is unlikely to ever be acted on. If you want me to remove it I will.
I think this message can be more useful if you add details to make it more self-contained.
"Configuring the file reader thread pool size with the max of ${CONF1}=$VAL1 and ${CONF2}=${VAL2}"
@gerashegalov please take another look |
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There are a few cases where the user does not have to set the executor cores config. This is most typically in local mode, but can also happen in standalone mode too. Before this patch if it was not set, then fixup code that runs on the driver would use the number of CPU cores that the java runtime is aware of as a stand-in for it. But because this ran on the driver it would only fix cases when the driver and executor ran on the same host and in the same container. This updates the code to wait until the number of cores is known, on the executor, before making the final decision on the thread pool size.