Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use number of threads on executor instead of driver to set core count #9051

Merged
merged 7 commits into from
Aug 16, 2023

Conversation

revans2
Copy link
Collaborator

@revans2 revans2 commented Aug 15, 2023

There are a few cases where the user does not have to set the executor cores config. This is most typically in local mode, but can also happen in standalone mode too. Before this patch if it was not set, then fixup code that runs on the driver would use the number of CPU cores that the java runtime is aware of as a stand-in for it. But because this ran on the driver it would only fix cases when the driver and executor ran on the same host and in the same container. This updates the code to wait until the number of cores is known, on the executor, before making the final decision on the thread pool size.

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

build

Co-authored-by: Thomas Graves <tgraves@apache.org>
@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

build

@revans2 revans2 changed the title Use number of threads on executor instead of driver to set task count Use number of threads on executor instead of driver to set core count Aug 15, 2023
@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

I just updated the APIs to use cores instead of tasks, because it is more accurate on what we are returning.

@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

build

tgravescs
tgravescs previously approved these changes Aug 15, 2023
jbrennan333
jbrennan333 previously approved these changes Aug 15, 2023
Copy link
Collaborator

@jbrennan333 jbrennan333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 lgtm

Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor wording nit, lgtm.

jlowe
jlowe previously approved these changes Aug 15, 2023
Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous nit still applies, but new API names look OK.

@revans2 revans2 dismissed stale reviews from jlowe, jbrennan333, and tgravescs via 799fbc8 August 15, 2023 15:35
@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

@jlowe @jbrennan333 @tgravescs please look again I committed the suggestion from @jlowe

@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

build

jlowe
jlowe previously approved these changes Aug 15, 2023
val numThreads = Math.max(numThreadsFromConf, GpuDeviceManager.getNumCores)

if (numThreadsFromConf != numThreads) {
logWarning(s"Using $numThreads as the number of threads for the thread pool.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setting is orthogonal to cores, IMO. But either way: what do we expect the users to do upon seeing this message in the log?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gerashegalov

This setting is orthogonal to cores

The reason for this originally was to avoid bad performance if a user set the thread pool size too small. The idea was that we should not have a thread pool that is smaller than the number of tasks, or else we would lose parallelism by enabling the multi-threaded reader. In this case the number of cores is being used as a stand-in for tasks, even though it is not necessarily a one to one mapping.

what do we expect the users to do upon seeing this message in the log?

I don't expect a user to do much of anything with this message right now. It is in the executor log and it is unlikely that a user will ever see it. It is more here so we have something that is equivalent to the log message on the driver, which is the one most likely to catch this. I also thought that it would be nice to know that this is happening, but you are right it is unlikely to ever be acted on. If you want me to remove it I will.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for elaborating @revans2.

It is in the executor log and it is unlikely that a user will ever see it.

I would not count on it: Devops look at logs by querying log search engines using app id using the likes of Splunk, Sumo, and Kibana rather than raw individual logs. And they could track metrics for different log categories.

I also thought that it would be nice to know that this is happening, but you are right it is unlikely to ever be acted on. If you want me to remove it I will.

I think this message can be more useful if you add details to make it more self-contained.

"Configuring the file reader thread pool size with the max of ${CONF1}=$VAL1 and ${CONF2}=${VAL2}"

@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

@gerashegalov please take another look

@revans2
Copy link
Collaborator Author

revans2 commented Aug 15, 2023

build

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sameerz sameerz added the task Work required that improves the product but is not user facing label Aug 16, 2023
@revans2 revans2 merged commit c9e23a8 into NVIDIA:branch-23.10 Aug 16, 2023
26 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Work required that improves the product but is not user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants