Simplified handling of GPU core dumps #9238

Merged
jlowe merged 9 commits into NVIDIA:branch-23.10 from gpucoredump on Oct 4, 2023

Conversation

jlowe
Member

@jlowe jlowe commented Sep 13, 2023

Adds configs for simplified GPU core dump handling. This approach creates a named pipe and starts a background thread that listens on the pipe and copies the GPU core file to a path built from the configured core dump URI prefix. Core dumps can be configured as lightweight or full and, by default, are compressed with zstd before being uploaded to the (potentially remote) filesystem.
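
For readers unfamiliar with the pattern, here is a minimal sketch of the pipe-plus-background-thread idea described above. It is not the code from this PR: the function and file names are hypothetical, the destination is a local path rather than a URI on a remote filesystem, and gzip from the Python standard library stands in for zstd. The writer in `demo()` plays the role of whatever process produces the GPU core dump. It assumes a POSIX system, since it relies on `os.mkfifo`.

```python
import gzip
import os
import shutil
import tempfile
import threading


def start_pipe_copier(pipe_path, dest_path):
    """Create a named pipe and stream everything written to it into dest_path.

    Hypothetical helper: gzip is used here only as a stand-in for zstd.
    """
    os.mkfifo(pipe_path)

    def copy_loop():
        # Opening the pipe for reading blocks until a writer opens it.
        with open(pipe_path, "rb") as src, gzip.open(dest_path, "wb") as dst:
            # Copy in chunks until the writer closes its end (EOF).
            shutil.copyfileobj(src, dst)

    worker = threading.Thread(target=copy_loop, name="coredump-copier", daemon=True)
    worker.start()
    return worker


def demo():
    """Write fake dump bytes into the pipe and check the compressed copy."""
    workdir = tempfile.mkdtemp()
    pipe_path = os.path.join(workdir, "gpucore.pipe")  # hypothetical name
    dest_path = os.path.join(workdir, "gpucore.gz")    # hypothetical name
    worker = start_pipe_copier(pipe_path, dest_path)

    payload = b"fake core dump bytes" * 1000
    with open(pipe_path, "wb") as pipe:  # stands in for the dump producer
        pipe.write(payload)

    worker.join(timeout=10)
    with gzip.open(dest_path, "rb") as copied:
        return copied.read() == payload
```

Because the copy streams straight from the pipe into the compressor, the uncompressed dump never has to land on local disk, which is presumably part of the appeal of the approach.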

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
@jlowe jlowe self-assigned this Sep 13, 2023
@sameerz sameerz added the task Work required that improves the product but is not user facing label Sep 18, 2023
revans2
revans2 previously approved these changes Sep 19, 2023
@jlowe jlowe marked this pull request as ready for review October 2, 2023 19:59
@jlowe
Member Author

jlowe commented Oct 2, 2023

build

revans2
revans2 previously approved these changes Oct 2, 2023
@jlowe
Member Author

jlowe commented Oct 3, 2023

build

revans2
revans2 previously approved these changes Oct 3, 2023
gerashegalov
gerashegalov previously approved these changes Oct 3, 2023
Collaborator

@gerashegalov gerashegalov left a comment

LGTM

tgravescs
tgravescs previously approved these changes Oct 3, 2023
@jlowe jlowe dismissed stale reviews from tgravescs, gerashegalov, and revans2 via b4aeffd October 3, 2023 19:17
@jlowe
Member Author

jlowe commented Oct 3, 2023

build

@jlowe jlowe merged commit 54f5073 into NVIDIA:branch-23.10 Oct 4, 2023
29 checks passed
@jlowe jlowe deleted the gpucoredump branch October 4, 2023 13:14
Labels
task Work required that improves the product but is not user facing
5 participants