Extend the Parquet writer's dictionary encoding benchmark. #16591

Merged — 14 commits merged into rapidsai:branch-24.10 from mhaseeb123:pq-writer-dict-benchmark on Sep 10, 2024

Conversation

@mhaseeb123 (Member) commented on Aug 17, 2024:

Description

This PR extends the data cardinality and run-length ranges for the existing Parquet writer's encoding benchmark.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions bot added the libcudf label on Aug 17, 2024
@mhaseeb123 added the 2 - In Progress, non-breaking, cuIO, and improvement labels on Aug 17, 2024
copy-pr-bot commented on Aug 19, 2024:

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

```diff
@@ -202,8 +202,8 @@ NVBENCH_BENCH_TYPES(BM_parq_write_encode, NVBENCH_TYPE_AXES(d_type_list))
   .set_name("parquet_write_encode")
   .set_type_axes_names({"data_type"})
   .set_min_samples(4)
-  .add_int64_axis("cardinality", {0, 1000})
-  .add_int64_axis("run_length", {1, 32});
+  .add_int64_axis("cardinality", {1, 1000, 10'000, 100'000, 1'000'000})
```
@mhaseeb123 (Member, Author) commented:

Are there any resource/infrastructure constraints that we need to consider here? We can perhaps remove one or two entries from here if needed.

Contributor replied:

I think we can drop cardinality 1; given the table size, it's effectively the same as 1000.
Also, should 1M just be 0? Not sure if the goal is to have unique elements (AFAIK the row count is typically lower than 1M).
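
For readers unfamiliar with nvbench's axis mechanics, here is a minimal standalone sketch of how the `add_int64_axis` calls above expand into a matrix of cases. This is not the cuDF benchmark itself (its body is elided here), and it assumes a build linked against nvbench's provided `main`:

```cpp
#include <nvbench/nvbench.cuh>

// Toy stand-in for BM_parq_write_encode: nvbench runs the body once per
// point in the cartesian product of all axes, so 5 cardinalities x 2 run
// lengths yields 10 cases per data type.
void axis_demo(nvbench::state& state)
{
  auto const cardinality = state.get_int64("cardinality");
  auto const run_length  = state.get_int64("run_length");
  state.exec([=](nvbench::launch&) {
    // A real benchmark would generate a table with `cardinality` distinct
    // values in runs of `run_length` rows and time the Parquet write here.
    (void)cardinality;
    (void)run_length;
  });
}

NVBENCH_BENCH(axis_demo)
  .add_int64_axis("cardinality", {1, 1000, 10'000, 100'000, 1'000'000})
  .add_int64_axis("run_length", {1, 32});  // original run_length axis, for illustration
```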

@mhaseeb123 marked this pull request as ready for review on August 19, 2024
@mhaseeb123 requested a review from a team as a code owner on August 19, 2024
@mhaseeb123 (Member, Author) commented:

/ok to test

@mhaseeb123 changed the title from "Add a benchmark for Parquet writer's dictionary encoding." to "Extend the Parquet writer's dictionary encoding benchmark." on Aug 19, 2024
@mhaseeb123 self-assigned this on Aug 19, 2024
@mhaseeb123 added the 3 - Ready for Review label and removed the 2 - In Progress and cuIO labels on Aug 19, 2024
@mhaseeb123 changed the title to "[Minor] Extend the Parquet writer's dictionary encoding benchmark." on Aug 20, 2024
@vuule (Contributor) commented on Aug 20, 2024:

what's the reason for this change?

@mhaseeb123 (Member, Author) commented on Aug 20, 2024:

> what's the reason for this change?

First of all, welcome back. Greg wanted me to push any updates I made to the benchmark for #16541, though my local changes (even wider extended ranges) need not be pushed upstream if they aren't needed.

rapids-bot (bot) pushed a commit that referenced this pull request on Aug 30, 2024:
…::static_map` (#16541)

Part of #12261. This PR refactors dictionary encoding in the Parquet writer, migrating from `cuco::legacy::static_map` to `cuco::static_map` for building the dictionaries.

### Performance Results
The changes result in a +0.08% average speed improvement and a +16.22% average memory footprint increase (stemming from the sizes adjusted by `cuco::make_window_extent` due to the [prime gap](https://en.wikipedia.org/wiki/Prime_gap)) across the benchmark cases extended in #16591.

Currently, we see a roughly 8% speed improvement in the map insert and find kernels, which is counteracted by the map init and collect kernels having to process 16.22% more slots. With a cuco version bump, the average speed improvement should increase from +0.08% to +3%, and the memory footprint change should drop from +16.22% back to +0%.

### Hardware used for benchmarking
```
 `NVIDIA RTX 5880 Ada Generation`
* SM Version: 890 (PTX Version: 860)
* Number of SMs: 110
* SM Default Clock Rate: 18446744071874 MHz
* Global Memory: 23879 MiB Free / 48632 MiB Total
* Global Memory Bus Peak: 960 GB/sec (384-bit DDR @10001MHz)
* Max Shared Memory: 100 KiB/SM, 48 KiB/Block
* L2 Cache Size: 98304 KiB
* Maximum Active Blocks: 24/SM
* Maximum Active Threads: 1536/SM, 1024/Block
* Available Registers: 65536/SM, 65536/Block
* ECC Enabled: No
```

Authors:
  - Muhammad Haseeb (https://github.com/mhaseeb123)

Approvers:
  - Yunsong Wang (https://github.com/PointKernel)
  - David Wendt (https://github.com/davidwendt)

URL: #16541
@mhaseeb123 (Member, Author) commented:

/ok to test

@mhaseeb123 (Member, Author) commented:

/ok to test

.add_int64_axis("cardinality", {0, 1000})
.add_int64_axis("run_length", {1, 32});
.add_int64_axis("cardinality", {0, 1000, 10'000, 100'000})
.add_int64_axis("run_length", {1, 32, 64});
Contributor commented:

Sorry, missed this the first time around. To me, 32 is already a very high run length. I think we should instead add 4 or 8. I'm fine with 64 if you do see a significant difference in file size and/or performance compared to 32.

@mhaseeb123 (Member, Author) replied:

No significant differences with 64 for the benchmarks we ran either.

Contributor replied:

I vote for 4 or 8, then

@mhaseeb123 (Member, Author) replied:

All done!
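
For intuition on what the two axes control in the input data, here is a hypothetical host-side helper, not the actual benchmark code (which uses cuDF's random table generator): `cardinality` bounds the number of distinct values and `run_length` is how many consecutive rows repeat each value. It assumes `cardinality >= 1`; per the discussion above, 0 appears to lift the cardinality cap entirely in the real benchmark.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only: produce `num_rows` values drawn from
// `cardinality` distinct keys, repeated in runs of `run_length` rows.
std::vector<int64_t> make_runs(int64_t num_rows, int64_t cardinality, int64_t run_length)
{
  std::vector<int64_t> out;
  out.reserve(static_cast<std::size_t>(num_rows));
  for (int64_t row = 0; row < num_rows; ++row) {
    int64_t const run = row / run_length;  // index of the current run
    out.push_back(run % cardinality);      // cycle through the distinct keys
  }
  return out;
}
```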

@mhaseeb123 (Member, Author) commented:

/ok to test

@mhaseeb123 added the 4 - Needs Review label and removed the 3 - Ready for Review label on Sep 9, 2024
@mhaseeb123 changed the title back to "Extend the Parquet writer's dictionary encoding benchmark." on Sep 9, 2024
@karthikeyann (Contributor) left a comment:

It would be nice to know how much time this adds to benchmark runs. If that isn't available now, follow up with Randy on benchmark runs.

@mhaseeb123 (Member, Author) commented:

/merge

@mhaseeb123 (Member, Author) commented on Sep 9, 2024:

> It would be nice to know how much time this adds to benchmark runs. If that isn't available now, follow up with Randy on benchmark runs.

Results are in #16541 (here), for which we are extending this benchmark. Each new case in the matrix takes roughly 0.5 s to run on my workstation (AMD Threadripper + RTX 5880 Ada), so it should be roughly a 4 s increase in total time (8 new benchmark cases).
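
As a quick sanity check on that estimate, assuming the final axes are the four cardinality values from the diff above and a three-value run_length axis per the review discussion (the exact final run_length values are an assumption here):

```cpp
#include <cstdio>

int main()
{
  // Old matrix: cardinality {0, 1000} x run_length {1, 32}
  int const old_cases = 2 * 2;
  // New matrix (assumed final): 4 cardinalities x 3 run lengths
  int const new_cases = 4 * 3;
  int const added     = new_cases - old_cases;  // 8 new cases
  std::printf("%d new cases x ~0.5 s each = ~%.0f s extra per data type\n",
              added, added * 0.5);
  return 0;
}
```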

@mhaseeb123 added the 5 - Ready to Merge label and removed the 4 - Needs Review label on Sep 9, 2024
@rapids-bot (bot) merged commit f21979e into rapidsai:branch-24.10 on Sep 10, 2024
104 checks passed
@mhaseeb123 deleted the pq-writer-dict-benchmark branch on September 10, 2024
Labels: 5 - Ready to Merge · improvement · libcudf · non-breaking
Projects: Status: Landed
3 participants