Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup shutdown logging for UCX shuffle #1546

Merged
merged 2 commits into from
Jan 20, 2021

Conversation

abellina
Copy link
Collaborator

@abellina abellina commented Jan 18, 2021

In this PR we clean up the logging in UCX scala such that we don't log a backtrace in the benign case that the management socket, which is constantly in accept(), is closed due to a normal tear down.

Java will throw a SocketException in this case, to unblock the accept() call, and the application must distinguish this exception from other potential exceptions during the call.

This PR also cleans up code around an extra lock we had, and makes the initialized volatile.

Signed-off-by: Alessandro Bellina abellina@nvidia.com

Closes #1155

@abellina abellina added the shuffle things that impact the shuffle plugin label Jan 18, 2021
@abellina abellina added this to the Jan 18 - Jan 29 milestone Jan 18, 2021
@jlowe
Copy link
Member

jlowe commented Jan 19, 2021

@abellina could you elaborate more on what's being fixed here in the PR description?

@jlowe
Copy link
Member

jlowe commented Jan 19, 2021

Would be good to update the PR headline as it's used as-is in the changelog. As it is now, users aren't going to know what it's talking about.

@abellina abellina changed the title On close, do not log SocketException in the good case. Clean up use o… Cleanup logging when close called on an accepting socket, and minor cleanup in ucx.scala Jan 19, 2021
@abellina
Copy link
Collaborator Author

@jlowe let me know if this looks ok, I'll rebase it if so (and at least sign off the last commit, or squash). It looks like I confused the sign-off bot.

@jlowe jlowe changed the title Cleanup logging when close called on an accepting socket, and minor cleanup in ucx.scala Cleanup shutdown logging for UCX shuffle Jan 19, 2021
jlowe
jlowe previously approved these changes Jan 19, 2021
Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
@abellina
Copy link
Collaborator Author

build

1 similar comment
@pxLi
Copy link
Collaborator

pxLi commented Jan 20, 2021

build

@pxLi
Copy link
Collaborator

pxLi commented Jan 20, 2021

build

1 similar comment
@pxLi
Copy link
Collaborator

pxLi commented Jan 20, 2021

build

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
@abellina
Copy link
Collaborator Author

build

@abellina
Copy link
Collaborator Author

@gerashegalov let me know if this looks OK to you. Thanks!

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@abellina abellina merged commit 2b18d10 into NVIDIA:branch-0.4 Jan 20, 2021
@abellina abellina deleted the shuffle/init_close_cleanup branch January 20, 2021 19:24
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Cleanup shutdown logging for UCX shuffle

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

* Address code review comments

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Cleanup shutdown logging for UCX shuffle

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

* Address code review comments

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
…IDIA#1546)

Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
shuffle things that impact the shuffle plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] on shutdown don't print Socket closed exception when shutting down UCX.scala
4 participants