Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use page statistics in Parquet reader #14973

Merged
merged 29 commits into from
Mar 7, 2024

Conversation

etseidl
Copy link
Contributor

@etseidl etseidl commented Feb 6, 2024

Description

#14000 added the ability to write new page statistics to the Parquet writer. This PR uses these new statistics to avoid some string size computations. Benchmarks show an improvement in read times of up to 20%.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@etseidl etseidl requested a review from a team as a code owner February 6, 2024 01:38
Copy link

copy-pr-bot bot commented Feb 6, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Feb 6, 2024
@etseidl
Copy link
Contributor Author

etseidl commented Feb 6, 2024

parquet_read_decode benchmark

## [0] NVIDIA RTX A6000

|  data_type  |    io_type    |  cardinality  |  run_length  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |         Diff |   %Diff |
|-------------|---------------|---------------|--------------|------------|-------------|------------|-------------|--------------|---------|
|  INTEGRAL   | DEVICE_BUFFER |       0       |      1       |  14.942 ms |       0.74% |  14.989 ms |       0.76% |    46.778 us |   0.31% |
|  INTEGRAL   | DEVICE_BUFFER |     1000      |      1       |  14.115 ms |       0.50% |  14.201 ms |       0.46% |    85.853 us |   0.61% |
|  INTEGRAL   | DEVICE_BUFFER |       0       |      32      |  12.908 ms |       0.95% |  13.012 ms |       0.41% |   103.184 us |   0.80% |
|  INTEGRAL   | DEVICE_BUFFER |     1000      |      32      |  12.272 ms |       1.04% |  12.323 ms |       0.49% |    51.350 us |   0.42% |
|   STRING    | DEVICE_BUFFER |       0       |      1       |  46.516 ms |       0.45% |  45.672 ms |       0.50% |  -844.320 us |  -1.82% |
|   STRING    | DEVICE_BUFFER |     1000      |      1       |  13.762 ms |       0.68% |  13.015 ms |       0.71% |  -746.030 us |  -5.42% |
|   STRING    | DEVICE_BUFFER |       0       |      32      |  46.612 ms |       0.50% |  45.692 ms |       0.35% |  -920.004 us |  -1.97% |
|   STRING    | DEVICE_BUFFER |     1000      |      32      |  12.815 ms |       1.11% |  10.469 ms |       1.45% | -2345.521 us | -18.30% |
|    LIST     | DEVICE_BUFFER |       0       |      1       |  76.023 ms |       0.11% |  75.662 ms |       0.05% |  -361.155 us |  -0.48% |
|    LIST     | DEVICE_BUFFER |     1000      |      1       |  64.130 ms |       0.20% |  63.906 ms |       0.22% |  -224.179 us |  -0.35% |
|    LIST     | DEVICE_BUFFER |       0       |      32      |  55.348 ms |       0.13% |  55.170 ms |       0.09% |  -177.805 us |  -0.32% |
|    LIST     | DEVICE_BUFFER |     1000      |      32      |  56.574 ms |       0.21% |  56.434 ms |       0.15% |  -139.456 us |  -0.25% |
|   STRUCT    | DEVICE_BUFFER |       0       |      1       |  48.973 ms |       2.36% |  49.407 ms |       2.21% |   433.937 us |   0.89% |
|   STRUCT    | DEVICE_BUFFER |     1000      |      1       |  30.172 ms |       0.78% |  30.140 ms |       0.73% |   -32.744 us |  -0.11% |
|   STRUCT    | DEVICE_BUFFER |       0       |      32      |  49.314 ms |       2.83% |  50.072 ms |       2.74% |   758.959 us |   1.54% |
|   STRUCT    | DEVICE_BUFFER |     1000      |      32      |  28.299 ms |       0.37% |  27.150 ms |       0.37% | -1148.298 us |  -4.06% |

@etseidl
Copy link
Contributor Author

etseidl commented Feb 6, 2024

parquet_read_chunks benchmark

## [0] NVIDIA RTX A6000

|   T    |    io_type    |  cardinality  |  run_length  |  byte_limit  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |          Diff |   %Diff |
|--------|---------------|---------------|--------------|--------------|------------|-------------|------------|-------------|---------------|---------|
| STRING | DEVICE_BUFFER |       0       |      1       |      0       |  45.542 ms |       0.13% |  45.354 ms |       0.52% |   -187.811 us |  -0.41% |
| STRING | DEVICE_BUFFER |     1000      |      1       |      0       |  13.455 ms |       0.43% |  13.059 ms |       0.47% |   -396.648 us |  -2.95% |
| STRING | DEVICE_BUFFER |       0       |      32      |      0       |  45.564 ms |       0.17% |  45.433 ms |       0.31% |   -131.213 us |  -0.29% |
| STRING | DEVICE_BUFFER |     1000      |      32      |      0       |  12.537 ms |       1.16% |  10.671 ms |       4.28% |  -1865.298 us | -14.88% |
| STRING | DEVICE_BUFFER |       0       |      1       |    500000    | 225.866 ms |       0.04% | 184.424 ms |       0.09% | -41442.361 us | -18.35% |
| STRING | DEVICE_BUFFER |     1000      |      1       |    500000    |  77.381 ms |       0.35% |  74.072 ms |       0.27% |  -3309.740 us |  -4.28% |
| STRING | DEVICE_BUFFER |       0       |      32      |    500000    | 225.814 ms |       0.05% | 183.758 ms |       0.18% | -42056.079 us | -18.62% |
| STRING | DEVICE_BUFFER |     1000      |      32      |    500000    |  61.544 ms |       0.51% |  52.393 ms |       0.20% |  -9151.281 us | -14.87% |

and the same but forcing PLAIN encoding

|   T    |    io_type    |  cardinality  |  run_length  |  byte_limit  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |          Diff |   %Diff |
|--------|---------------|---------------|--------------|--------------|------------|-------------|------------|-------------|---------------|---------|
| STRING | DEVICE_BUFFER |       0       |      1       |      0       |  45.449 ms |       0.32% |  45.168 ms |       0.56% |   -281.006 us |  -0.62% |
| STRING | DEVICE_BUFFER |     1000      |      1       |      0       |  46.893 ms |       0.69% |  46.812 ms |       0.79% |    -80.940 us |  -0.17% |
| STRING | DEVICE_BUFFER |       0       |      32      |      0       |  45.666 ms |       0.47% |  45.352 ms |       0.25% |   -314.615 us |  -0.69% |
| STRING | DEVICE_BUFFER |     1000      |      32      |      0       |  24.542 ms |       0.16% |  24.602 ms |       0.38% |     59.229 us |   0.24% |
| STRING | DEVICE_BUFFER |       0       |      1       |    500000    | 227.223 ms |       0.04% | 183.026 ms |       0.13% | -44196.857 us | -19.45% |
| STRING | DEVICE_BUFFER |     1000      |      1       |    500000    | 230.145 ms |       0.11% | 185.712 ms |       0.13% | -44432.935 us | -19.31% |
| STRING | DEVICE_BUFFER |       0       |      32      |    500000    | 227.210 ms |       0.07% | 183.010 ms |       0.07% | -44200.238 us | -19.45% |
| STRING | DEVICE_BUFFER |     1000      |      32      |    500000    | 208.225 ms |       0.13% | 164.224 ms |       0.11% | -44001.523 us | -21.13% |

@GregoryKimball GregoryKimball added the cuIO cuIO issue label Feb 6, 2024
@davidwendt davidwendt added 2 - In Progress Currently a work in progress improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Feb 6, 2024
@davidwendt
Copy link
Contributor

/ok to test

@davidwendt davidwendt requested a review from vuule February 6, 2024 18:51
Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already review the previous version of this feature, but going over it again since I didn't quite approve it back then (I almost did, IIRC).
Few nitpicks so far.

cpp/src/io/parquet/reader_impl_preprocess.cu Outdated Show resolved Hide resolved
cpp/src/io/parquet/reader_impl_preprocess.cu Outdated Show resolved Hide resolved
cpp/src/io/parquet/reader_impl_preprocess.cu Outdated Show resolved Hide resolved
Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one comment that reads a bit odd to me.


// calculate num_nulls if not available from column index
if (not pg_info.num_nulls.has_value()) {
pg_info.num_nulls = std::reduce(h, h + max_def_level);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the least verbose use of a standard algorithm I've ever seen :D

// info is missing or insufficient, then just return without modifying the row_group_info.
if (not pg_info.num_nulls.has_value() or not pg_info.num_valid.has_value()) { return; }

// if using older page indexes that lack size info, don't use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// if using older page indexes that lack size info, don't use
// if using older page indexes that lack size info, don't use `pg_info`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaned this up and added a more detailed TODO...will push once CI finishes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at a few other PRs, this might be an "if CI finishes" situation 😬

@vuule
Copy link
Contributor

vuule commented Feb 27, 2024

/ok to test

@vuule
Copy link
Contributor

vuule commented Mar 1, 2024

/ok to test

@etseidl
Copy link
Contributor Author

etseidl commented Mar 1, 2024

Looks like #15020 broke this (db 3 ets 0 😮 😭 🤣). Regrouping...

etseidl and others added 2 commits March 6, 2024 12:53
@PointKernel
Copy link
Member

/ok to test

@vuule vuule added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 2 - In Progress Currently a work in progress labels Mar 7, 2024
@PointKernel
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit efae666 into rapidsai:branch-24.04 Mar 7, 2024
74 checks passed
@etseidl etseidl deleted the page_stats branch March 7, 2024 05:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge cuIO cuIO issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants