
[BUG] Reconsider 8MB copy buffer size in GpuParquetScan #9269

Open
mythrocks opened this issue Sep 19, 2023 · 3 comments
Labels
performance: A performance related task/issue · reliability: Features to improve reliability or bugs that severely impact the reliability of the plugin

Comments

mythrocks (Collaborator) commented Sep 19, 2023

The GpuParquetScan has an 8MB copy buffer allocation per reader instance.

val copyBufferSize = conf.getInt("parquet.read.allocation.size", 8 * 1024 * 1024)

On setups with a high core count and a low heap allocation, this might account for a significant portion of the executor's available memory.

For instance, on machines with 128 cores, if the executor is set up with only 1GB of memory, then enabling the multi-file reader for Parquet could result in the entire 1GB of memory being consumed by the copy buffers across all threads. This leads to failures such as the one in #9135.
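
As a rough illustration of that accounting (the 128-core / 1GB numbers are the hypothetical setup above, not measurements), the aggregate footprint is simply the buffer size multiplied by the number of reader threads:

```scala
// Back-of-the-envelope accounting for the copy buffers. The thread count and
// buffer size mirror the hypothetical 128-core / 8MB setup described above.
object CopyBufferFootprint extends App {
  val copyBufferSizeBytes = 8L * 1024 * 1024 // default for parquet.read.allocation.size
  val readerThreads = 128                    // e.g. one reader thread per core
  val totalBytes = copyBufferSizeBytes * readerThreads
  println(f"Aggregate copy-buffer footprint: ${totalBytes / (1024.0 * 1024 * 1024)}%.1f GiB")
  // Prints 1.0 GiB -- the entire heap of a 1GB executor
}
```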

It would be good to evaluate whether this 8MB allocation can be lowered, or at least accounted for.

@mythrocks mythrocks added bug Something isn't working ? - Needs Triage Need team to review and classify labels Sep 19, 2023
@mattahrens mattahrens added performance A performance related task/issue and removed ? - Needs Triage Need team to review and classify bug Something isn't working labels Nov 7, 2023
@sameerz sameerz added the reliability Features to improve reliability or bugs that severely impact the reliability of the plugin label Jul 23, 2024
sameerz (Collaborator) commented Jul 23, 2024

The recommendation from within the team is to reduce the 8MB size to see whether it will address #11227.

There are alternative ideas here: https://aws.amazon.com/blogs/storage/optimizing-performance-of-apache-spark-workloads-on-amazon-s3/

The suggestion is to make a code change in copyDataRange to right-size the intermediate buffer for data transfer (see https://github.com/NVIDIA/spark-rapids/blob/6d65548e5716f02ae7fee564d5091e05a0ac219a/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuParquetScan.scala#L1596C57-L1596C71).
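
A minimal sketch of that idea, assuming a stream-to-stream copy loop; the method signature and parameter names below are placeholders, not the actual code in GpuParquetScan.scala:

```scala
import java.io.{EOFException, InputStream, OutputStream}

// Hypothetical right-sized copy: never allocate a scratch buffer larger than the
// number of bytes actually being transferred, capped at the configured maximum.
def copyDataRange(in: InputStream, out: OutputStream,
                  bytesToCopy: Long, maxBufferSize: Int): Unit = {
  val bufferSize = math.max(1L, math.min(maxBufferSize.toLong, bytesToCopy)).toInt
  val buffer = new Array[Byte](bufferSize)
  var remaining = bytesToCopy
  while (remaining > 0) {
    val read = in.read(buffer, 0, math.min(buffer.length.toLong, remaining).toInt)
    if (read < 0) {
      throw new EOFException(s"Expected $remaining more bytes in the copy range")
    }
    out.write(buffer, 0, read)
    remaining -= read
  }
}
```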

Any changes will require rerunning all performance tests.

gerashegalov (Collaborator) commented:
The default of 8 MiB might have been inspired by Performance Design Patterns for S3 (https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance-design-patterns.html#optimizing-performance-parallelization), which recommends it as a granularity for byte-range requests.
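
For context, the 8 MiB figure in that guidance is the size of an individual byte-range (positioned) read against S3. A hedged illustration using Hadoop's FileSystem API; the bucket, key, and offset are placeholders:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Illustrative only: one 8 MiB positioned read, i.e. the request granularity the
// S3 performance guidelines suggest. The path below is a placeholder.
val path = new Path("s3a://example-bucket/example.parquet")
val fs = path.getFileSystem(new Configuration())
val rangeSize = 8 * 1024 * 1024
val buffer = new Array[Byte](rangeSize)
val in = fs.open(path)
try {
  in.readFully(0L, buffer, 0, buffer.length) // read bytes [0, 8 MiB) in one ranged request
} finally {
  in.close()
}
```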

mythrocks (Collaborator, Author) commented Aug 4, 2024

> Recommendation from within the team is to reduce the 8MB size to see if it will address #11227.

Looking at #11227, one is reminded of a very similar failure (#9135) with Parquet-reader tests on CDH (in local mode). At the time, we had discussed either reducing the -Xmx or the thread-pool size.

That failure tended to happen in local mode. We had a PR to address the failure (#9177) by setting the thread-pool size. I suspect it might address this OOM as well, depending on how many threads are running.
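
If the same mitigation applies here, it would likely be a configuration change rather than a code change. A hedged sketch follows; the spark.rapids.sql.multiThreadedRead.numThreads key and both values are assumptions that should be checked against the plugin's configuration docs for the version in use:

```scala
import org.apache.spark.sql.SparkSession

// Hedged example: cap the multithreaded-reader thread pool and shrink the per-reader
// copy buffer. Keys and values here are illustrative, not verified recommendations.
val spark = SparkSession.builder()
  .appName("parquet-reader-memory-tuning")
  .config("spark.rapids.sql.multiThreadedRead.numThreads", "16")                   // assumed config key
  .config("spark.hadoop.parquet.read.allocation.size", (1 * 1024 * 1024).toString) // 1 MiB instead of 8 MiB
  .getOrCreate()
```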

At the time, adjusting the core-count or memory was deemed a workaround, instead of fixing the reader. We might consider this again.

cc @razajafri, just in case this might be pertinent to #11227.
