Add microkernels for fixed-width and fixed-width dictionary in Parquet decode #15159

Merged

Commits (39); the file changes shown below are from 2 of these commits.

d66835d
rle_stream with dictionary support + micro kernels for fixed and fixed
abellina Feb 20, 2024
05600dd
Merge branch 'branch-24.04' into fixed_ukernel_rlestream_24.04_rebase
abellina Mar 4, 2024
1c4e0bf
namespace fixes, use cudf::detail::warp_size, comment removal, and re…
abellina Mar 4, 2024
d2344f4
Remove unused headers
abellina Mar 4, 2024
8e07a70
Add missing header
abellina Mar 4, 2024
65edb90
Move fixed-width kernel declarations to parquet_gpu.hpp
abellina Mar 5, 2024
851ea8e
Fix constness
abellina Mar 5, 2024
37a17af
more constness and remove old comments
abellina Mar 5, 2024
28d2011
valid_count, processed_count
abellina Mar 5, 2024
8779720
add block comments
abellina Mar 5, 2024
3da8f70
Ran code linter
abellina Mar 5, 2024
983df13
Add nullable optimization and remove string decoding code path
abellina Mar 5, 2024
6be16c4
East constness, comment nit and header include
abellina Mar 7, 2024
cc8c197
Ran code linter
abellina Mar 7, 2024
0171bb5
Avoid implicit !t !warp_id and others
abellina Mar 7, 2024
8e9a981
Remove USE_FIXED_OP and clean up code in page_hdr.cu
abellina Mar 7, 2024
dc0447f
Remove BOOLEAN and BYTE_ARRAY page decode as those are skipped by fix…
abellina Mar 7, 2024
48b1363
Take batch_size into account for thread validity
abellina Mar 8, 2024
f415107
has_nulls instead of has_no_nulls and invert logic
abellina Mar 9, 2024
f0211dd
Make gpuOutputGeneric inline
abellina Mar 9, 2024
d75f73c
Pass error_code and exit early if s->error
abellina Mar 10, 2024
6846ea6
Mark Fixed kernels as CUDF_KERNEL
abellina Mar 10, 2024
45000c2
Applied code linter
abellina Mar 10, 2024
af2693b
ensure both kernels have a function header
abellina Mar 10, 2024
0470a7d
Parenthesize arithmetic
abellina Mar 10, 2024
dbc4682
Introduce capped_target_value_count to not confuse with target_value_…
abellina Mar 10, 2024
a17cdf9
Introduce is_repeated_run and is_literal_run
abellina Mar 10, 2024
bb39e9f
Add function header for decode
abellina Mar 10, 2024
a748677
Add block comment at the top of fill_run_batch
abellina Mar 10, 2024
d7eb9fd
Merge branch 'branch-24.04' into fixed_ukernel_rlestream_24.04_rebase
abellina Mar 11, 2024
2f0eb0d
fix style issues
abellina Mar 11, 2024
b4e778a
Fix typo
abellina Mar 11, 2024
a578143
Merge branch 'fixed_ukernel_rlestream_24.04_rebase' of https://github…
nvdbaranec Mar 11, 2024
6005530
Merge branch 'branch-24.04' into ab_rle_stream
nvdbaranec Mar 11, 2024
d930865
Small comment update about loop pipelining in decode_next.
nvdbaranec Mar 11, 2024
09c83a1
Add another use of is_literal_run.
nvdbaranec Mar 11, 2024
57b14d7
Add is_last_decode_warp() for clarity. Removed extraneous if constexp…
nvdbaranec Mar 12, 2024
3049d98
Merge branch 'branch-24.04' into ab_rle_stream
nvdbaranec Mar 12, 2024
404c043
Add a __syncwarp() to handle a potential memory race in rle_stream.
nvdbaranec Mar 13, 2024
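
The final commit above adds a __syncwarp() to close a potential intra-warp shared-memory race in rle_stream. The snippet below is a minimal illustration of that class of race, not the actual rle_stream code; the kernel and variable names are invented for the example.

// Illustration only: one warp exchanges data through shared memory.
// Assumes a single-warp (32-thread) block.
__global__ void warp_shared_exchange(int* out)
{
  __shared__ int buf[32];
  int const lane = threadIdx.x % 32;

  // each lane publishes a value to its own slot
  buf[lane] = lane * 2;

  // with independent thread scheduling, a lane could otherwise read its
  // neighbour's slot before the neighbour has finished writing it
  __syncwarp();

  // each lane consumes a value written by a different lane
  out[threadIdx.x] = buf[(lane + 1) % 32];
}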
Files changed
1 change: 1 addition & 0 deletions cpp/CMakeLists.txt
@@ -411,6 +411,7 @@ add_library(
   src/io/parquet/reader_impl_helpers.cpp
   src/io/parquet/reader_impl_preprocess.cu
   src/io/parquet/writer_impl.cu
+  src/io/parquet/decode_fixed.cu
   src/io/statistics/orc_column_statistics.cu
   src/io/statistics/parquet_column_statistics.cu
   src/io/text/byte_range_info.cpp
492 changes: 492 additions & 0 deletions cpp/src/io/parquet/decode_fixed.cu

Large diffs are not rendered by default.
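
The new 492-line decode_fixed.cu is not rendered above. As rough orientation only, the sketch below shows the basic idea behind a fixed-width decode microkernel: every value in the page has the same byte width, so the inner copy loop needs no per-type branching. This is an assumption-laden simplification (no nulls, no dictionary, plain encoding only), and none of the names below come from the actual file.

#include <cstdint>
#include <cstring>

// Simplified sketch; real kernels also decode definition levels, handle
// nulls, and honor page/row bounds.
template <int block_size>
__global__ void copy_fixed_width_sketch(uint8_t const* page_data,
                                        uint8_t* out,
                                        int value_size,  // e.g. 4 or 8 bytes
                                        int num_values)
{
  for (int i = blockIdx.x * block_size + threadIdx.x; i < num_values;
       i += gridDim.x * block_size) {
    // plain-encoded fixed-width values are contiguous: value i starts at i * value_size
    memcpy(out + static_cast<size_t>(i) * value_size,
           page_data + static_cast<size_t>(i) * value_size,
           value_size);
  }
}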

42 changes: 42 additions & 0 deletions cpp/src/io/parquet/decode_fixed.hpp
@@ -0,0 +1,42 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

#include "parquet_gpu.hpp"

namespace cudf {
namespace io {
namespace parquet {
namespace detail {
void DecodePageDataFixed(cudf::detail::hostdevice_vector<PageInfo>& pages,
cudf::detail::hostdevice_vector<ColumnChunkDesc> const& chunks,
std::size_t num_rows,
size_t min_row,
int level_type_size,
rmm::cuda_stream_view stream);

void DecodePageDataFixedDict(cudf::detail::hostdevice_vector<PageInfo>& pages,
cudf::detail::hostdevice_vector<ColumnChunkDesc> const& chunks,
std::size_t num_rows,
size_t min_row,
int level_type_size,
rmm::cuda_stream_view stream);

} // namespace detail
} // namespace parquet
} // namespace io
} // namespace cudf
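
For orientation, here is a hypothetical caller-side sketch (not part of this PR's diff) of how a reader might route work to the two entry points declared above; the wrapper function and its selection flag are assumptions made for the example.

#include "decode_fixed.hpp"

namespace cudf::io::parquet::detail {

// Hypothetical dispatch helper; the real reader's selection logic may differ.
void decode_fixed_width_pages(cudf::detail::hostdevice_vector<PageInfo>& pages,
                              cudf::detail::hostdevice_vector<ColumnChunkDesc> const& chunks,
                              std::size_t num_rows,
                              size_t min_row,
                              int level_type_size,
                              bool uses_dictionary,  // assumed flag
                              rmm::cuda_stream_view stream)
{
  if (uses_dictionary) {
    // fixed-width values looked up through a dictionary page
    DecodePageDataFixedDict(pages, chunks, num_rows, min_row, level_type_size, stream);
  } else {
    // plain fixed-width values
    DecodePageDataFixed(pages, chunks, num_rows, min_row, level_type_size, stream);
  }
}

}  // namespace cudf::io::parquet::detail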
11 changes: 4 additions & 7 deletions cpp/src/io/parquet/decode_preprocess.cu
@@ -342,8 +342,8 @@ CUDF_KERNEL void __launch_bounds__(preprocess_block_size)
   // the level stream decoders
   __shared__ rle_run<level_t> def_runs[rle_run_buffer_size];
   __shared__ rle_run<level_t> rep_runs[rle_run_buffer_size];
-  rle_stream<level_t, preprocess_block_size> decoders[level_type::NUM_LEVEL_TYPES] = {{def_runs},
-                                                                                      {rep_runs}};
+  rle_stream<level_t, preprocess_block_size, rolling_buf_size>
+    decoders[level_type::NUM_LEVEL_TYPES] = {{def_runs}, {rep_runs}};

   // setup page info
   if (!setupLocalPageInfo(
@@ -353,20 +353,17 @@

   // initialize the stream decoders (requires values computed in setupLocalPageInfo)
   // the size of the rolling batch buffer
-  int const max_batch_size = rolling_buf_size;
-  level_t* rep             = reinterpret_cast<level_t*>(pp->lvl_decode_buf[level_type::REPETITION]);
-  level_t* def             = reinterpret_cast<level_t*>(pp->lvl_decode_buf[level_type::DEFINITION]);
+  level_t* rep = reinterpret_cast<level_t*>(pp->lvl_decode_buf[level_type::REPETITION]);
+  level_t* def = reinterpret_cast<level_t*>(pp->lvl_decode_buf[level_type::DEFINITION]);
   decoders[level_type::DEFINITION].init(s->col.level_bits[level_type::DEFINITION],
                                         s->abs_lvl_start[level_type::DEFINITION],
                                         s->abs_lvl_end[level_type::DEFINITION],
-                                        max_batch_size,
                                         def,
                                         s->page.num_input_values);
   if (has_repetition) {
     decoders[level_type::REPETITION].init(s->col.level_bits[level_type::REPETITION],
                                           s->abs_lvl_start[level_type::REPETITION],
                                           s->abs_lvl_end[level_type::REPETITION],
-                                          max_batch_size,
                                           rep,
                                           s->page.num_input_values);
   }
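
The diff above also reflects an rle_stream API change: the rolling buffer size moves from a runtime argument of init() into a compile-time template parameter. A simplified before/after, written as comments since the full class lives in rle_stream.cuh:

// before: batch size passed at runtime
//   rle_stream<level_t, preprocess_block_size> dec = {runs};
//   dec.init(level_bits, start, end, max_batch_size, output, num_input_values);
//
// after: batch size is part of the type, so init() drops the argument
//   rle_stream<level_t, preprocess_block_size, rolling_buf_size> dec = {runs};
//   dec.init(level_bits, start, end, output, num_input_values);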