[FEA] Add retry for Host Memory Usage in Parquet, ORC, and AVRO reads #8890

Open
revans2 opened this issue Jul 31, 2023 · 0 comments
Labels
reliability: Features to improve reliability or bugs that severely impact the reliability of the plugin
task: Work required that improves the product but is not user facing


revans2 commented Jul 31, 2023

Is your feature request related to a problem? Please describe.

#9862 should give us a limit on the amount of host memory being used, but when a host memory allocation fails we want to be able to retry it.

For Parquet, ORC, and Avro reads we have a number of different reader options that can use a thread pool, and the options that do not use the pool often share code with a version that does. Adding the retry code should not need to change this, but we need to make sure that we:

  1. Add retry as needed around the host memory allocations, or around blocks of code that allocate host memory.
  2. Update the thread pools, and any interactions we have with the thread pools, so RmmSpark can keep track of potentially stuck threads.
  3. Test all of the different threading options. That is probably already being done, but more stress testing would be good for this in particular.
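
As a rough illustration of item 1, a retry wrapper around a block that allocates host memory might look like the sketch below. Everything in it is an assumption made for illustration: `HostBudget`, `HostAllocFailed`, and `withHostRetry` are hypothetical names, not part of RmmSpark or the plugin, and the toy budget only stands in for whatever limit #9862 ends up providing.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names exist in spark-rapids or RmmSpark.
final class HostAllocRetrySketch {

    // Hypothetical signal that a host allocation did not fit under the limit.
    static final class HostAllocFailed extends RuntimeException {
        HostAllocFailed(String message) { super(message); }
    }

    // Toy stand-in for a host memory limit such as the one #9862 describes.
    static final class HostBudget {
        private final long limitBytes;
        private final AtomicLong usedBytes = new AtomicLong();

        HostBudget(long limitBytes) { this.limitBytes = limitBytes; }

        // Reserve bytes atomically, or throw if the limit would be exceeded.
        void reserve(long bytes) {
            while (true) {
                long current = usedBytes.get();
                if (current + bytes > limitBytes) {
                    throw new HostAllocFailed(
                        "need " + bytes + " bytes, " + (limitBytes - current) + " free");
                }
                if (usedBytes.compareAndSet(current, current + bytes)) {
                    return;
                }
            }
        }

        void release(long bytes) { usedBytes.addAndGet(-bytes); }

        long used() { return usedBytes.get(); }
    }

    // Run body, retrying when it fails to allocate. beforeRetry is the hook
    // where a caller could free or spill memory, or wait for other tasks.
    static <T> T withHostRetry(int maxAttempts, Runnable beforeRetry, Supplier<T> body) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return body.get();
            } catch (HostAllocFailed e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the last failure
                }
                beforeRetry.run();
            }
        }
    }

    public static void main(String[] args) {
        HostBudget budget = new HostBudget(100);
        budget.reserve(80); // pretend another reader already holds 80 bytes
        long reserved = withHostRetry(3,
            () -> budget.release(80), // pretend that reader finished
            () -> { budget.reserve(50); return 50L; });
        System.out.println("reserved " + reserved + ", used " + budget.used());
    }
}
```

In this sketch the first attempt fails because 80 + 50 exceeds the limit of 100, the hook frees the 80 bytes, and the second attempt succeeds. The body passed to the wrapper has to be safe to run more than once, which is the main constraint when wrapping existing reader code that is shared between the pooled and non-pooled paths.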
revans2 added the "? - Needs Triage" (Need team to review and classify), "task", and "reliability" labels Jul 31, 2023
mattahrens removed the "? - Needs Triage" label Aug 8, 2023
revans2 changed the title from "[FEA] Limit Host Memory Usage for Parquet, ORC, and AVRO reads" to "[FEA] Add retry for Host Memory Usage in Parquet, ORC, and AVRO reads" Nov 27, 2023