Test updates of CCCL (thrust, cub, libcudacxx) to 2.1.0. #13222

Closed · wants to merge 107 commits
Changes from 83 commits
6630fd7
Test rapids-cmake branch with CCCL updates to 2.1.0.
bdice Apr 26, 2023
4f90cff
Update Thrust patches for 2.1.0.
bdice Apr 26, 2023
c557680
Remove outdated Thrust/CUB patches.
bdice Apr 26, 2023
205e4a9
Set variables even if CUDF_RAPIDS.cmake already exists.
bdice May 4, 2023
980f854
Add cuda::proclaim_return_type to tdigest_aggregation.cu.
bdice May 5, 2023
e532690
Revert "Add cuda::proclaim_return_type to tdigest_aggregation.cu."
bdice May 5, 2023
f6ee13d
Got tdigest_aggregation.cu to compile with explicit functors.
bdice May 5, 2023
1c5c2cf
Add proclaim for cluster_size lambda.
bdice May 8, 2023
e120422
Add return type proclamation for offsets device lambda.
bdice May 8, 2023
e3c4f07
Add return type proclamation workaround for is_stub_weight and is_stu…
bdice May 8, 2023
9e4c832
Use cuda::proclaim_return_type with a workaround in libcudacxx.
bdice May 9, 2023
2131aea
Merge remote-tracking branch 'upstream/branch-23.06' into cccl-update…
bdice May 9, 2023
3ab41be
Proclaim return types needed in binary_ops.cu.
bdice May 9, 2023
6d69e74
Proclaim return types needed in null_mask.cu.
bdice May 9, 2023
2fe4483
Proclaim return types needed in contiguous_split.cu.
bdice May 9, 2023
103cd88
Proclaim return types needed in copy.cu.
bdice May 9, 2023
1c291b6
Proclaim return types needed in gather.cu.
bdice May 10, 2023
57ef356
Proclaim return types needed in reverse.cu.
bdice May 10, 2023
41ed0bd
Proclaim return types needed in sample.cu.
bdice May 10, 2023
ab0135c
Proclaim return types needed in scatter.cu.
bdice May 10, 2023
1de0bf0
Proclaim return types needed in concatenate.cu.
bdice May 10, 2023
45bca2e
Proclaim return types needed in repeat.cu.
bdice May 10, 2023
3b9933d
Proclaim return types needed in hash/groupby.cu.
bdice May 10, 2023
30a7035
Proclaim return types needed in group_count.cu.
bdice May 10, 2023
9e48099
Proclaim return types needed in group_nth_element.cu.
bdice May 10, 2023
773e320
Proclaim return types needed in sort_helper.cu.
bdice May 10, 2023
c921d3d
Proclaim return types needed in json_column.cu.
bdice May 11, 2023
41cfdc8
Proclaim return types needed in json_tree.cu.
bdice May 11, 2023
0bee4dc
Intermediate work fixing more JSON TUs.
bdice May 11, 2023
c37bd6a
Intermediate work.
bdice May 12, 2023
86345b8
Merge branch 'branch-23.10' into cccl-update-2.1.0
bdice Jul 21, 2023
83afb72
Reapply contiguous_split.cu changes.
bdice Jul 21, 2023
3c9039d
Fix style.
bdice Aug 1, 2023
f6aaa1e
Merge branch 'branch-23.10' into cccl-update-2.1.0
bdice Aug 3, 2023
a587486
Proclaim return types needed in contiguous_split.cu.
bdice Aug 3, 2023
b2073ca
Proclaim return types needed in strings/detail/gather.cuh.
bdice Aug 3, 2023
edb5eaa
Proclaim return types needed in json_tree.cu.
bdice Aug 3, 2023
dec4ac2
Proclaim return types needed in write_json.cu.
bdice Aug 3, 2023
f474771
Proclaim return types needed in parquet/reader_impl_preprocess.cu.
bdice Aug 3, 2023
e35acf6
Proclaim return types needed in multibyte_split.cu and strings/detail…
bdice Aug 3, 2023
abc298a
Proclaim return types needed in lists/contains.cu.
bdice Aug 3, 2023
bf3f7d9
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Aug 4, 2023
153a5fc
Proclaim return types needed in lists/contains.cu.
bdice Aug 5, 2023
d956025
Proclaim return types needed in lists/combine/concatenate_rows.cu.
bdice Aug 5, 2023
8470c56
Proclaim return types needed in lists/copying/segmented_gather.cu.
bdice Aug 5, 2023
79d061b
Proclaim return types needed in lists/copying/scatter_helper.cu.
bdice Aug 5, 2023
c7a4543
Proclaim return types needed in lists/dremel.cu, lists/explode.cu.
bdice Aug 5, 2023
c6f08b4
Proclaim return types needed in lists/interleave_columns.cu.
bdice Aug 5, 2023
6a61aab
Proclaim return types needed in merge.cu.
bdice Aug 5, 2023
deed129
Proclaim return types needed in merge.cu.
bdice Aug 5, 2023
a5e97f0
Proclaim return types needed in round_robin.cu.
bdice Aug 5, 2023
f55cf94
Proclaim return types needed in tdigest.cu.
bdice Aug 5, 2023
dd06562
Merge branch 'branch-23.10' into cccl-update-2.1.0
bdice Aug 7, 2023
834754e
Partial fix for quantile.cu. This won't compile when other fixes are …
bdice Aug 8, 2023
a2bb0d8
Proclaim return types needed in quantiles.cu.
bdice Aug 8, 2023
c4641e4
Proclaim return types needed in reductions/nth_element.cu.
bdice Aug 8, 2023
1899218
Proclaim return types needed in reductions/product.cu.
bdice Aug 8, 2023
f4da935
Proclaim return types needed in reductions/any.cu.
bdice Aug 8, 2023
d0d6782
Add casting step for binary operators, needed by CUB 2.0.0.
bdice Aug 9, 2023
7e0c5a0
Apply casting functor to scan_exclusive.cu and scan_inclusive.cu.
bdice Aug 9, 2023
88860a3
Fixed segmented simple reductions.
bdice Aug 9, 2023
d97f71b
Proclaim return types needed in clamp.cu.
bdice Aug 9, 2023
9c38f9f
Proclaim return types needed in interleave_columns.cu.
bdice Aug 9, 2023
7b4a2e8
Proclaim return types needed in rolling_fixed_window.cu.
bdice Aug 9, 2023
b5be872
Proclaim return types needed in grouped_rolling.cu.
bdice Aug 9, 2023
0349b60
Proclaim return types needed in contains_table.cu.
bdice Aug 9, 2023
fdbd273
Proclaim return types needed in rank.cu.
bdice Aug 9, 2023
2f75146
Proclaim return types needed in distinct.cu.
bdice Aug 9, 2023
bff946a
Proclaim return types needed in strings/attributes.cu.
bdice Aug 9, 2023
66ee76d
Proclaim return types needed in strings/extract/extract.cu.
bdice Aug 9, 2023
fdbfe38
Proclaim return types needed in strings/filling/fill.cu.
bdice Aug 9, 2023
9e34637
Proclaim return types needed in strings/replace/multi.cu.
bdice Aug 9, 2023
118a3e2
Proclaim return types needed in strings/replace/replace.cu.
bdice Aug 9, 2023
0fcf2cd
Proclaim return types needed in strings/split/split.cu.
bdice Aug 9, 2023
8267420
Proclaim return types needed in strings/split/split_record.cu.
bdice Aug 9, 2023
53eef61
Proclaim return types needed in text/generate_ngrams.cu.
bdice Aug 9, 2023
c47605d
Proclaim return types needed in text/ngrams_tokenize.cu.
bdice Aug 9, 2023
e9c40b9
Proclaim return types needed in tests/utilities/column_utilities.cu.
bdice Aug 9, 2023
37d6ae0
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Aug 10, 2023
247800c
Proclaim return types needed in benchmarks/common/generate_input.cu.
bdice Aug 10, 2023
ebf46c1
Remove cuda::proclaim_return_type<void> because thrust::for_each does…
bdice Aug 10, 2023
e8bbd3d
Ensure rapids-cmake patches for thrust are also applied
robertmaynard Aug 11, 2023
f5bf35e
Don't use librmm nightlies as they come with an older version of thrust
robertmaynard Aug 11, 2023
9853476
Remove proclaimed return types wherever they are not strictly required.
bdice Aug 11, 2023
8e09b06
Apply casting function to reductions/min.cu and reductions/max.cu
robertmaynard Aug 14, 2023
32ca7dd
Apply casting function to segmented/min.cu and segmented/max.cu
robertmaynard Aug 14, 2023
1877b35
Correct compilation errors in quantile.cu
robertmaynard Aug 14, 2023
728fa02
Update the docs for segmented_reduction
robertmaynard Aug 14, 2023
7fa8f0c
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Aug 14, 2023
1f94000
Proclaim missing return types from recent PRs.
bdice Aug 14, 2023
7cebee1
Merge branch 'branch-23.10' into cccl-update-2.1.0
bdice Aug 14, 2023
a4b359f
Move cast_functor to its own header.
bdice Aug 14, 2023
c9ed45c
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Aug 16, 2023
c802331
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Aug 17, 2023
df6811b
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Aug 18, 2023
18a29f8
Proclaim return types needed in load_merges_file.cu.
bdice Aug 18, 2023
9f6591c
Add dependencies to test_java that are needed when rmm is built from …
robertmaynard Aug 21, 2023
f60da3f
Add dependencies that are needed when rmm is built from source
robertmaynard Aug 21, 2023
90c0795
Declare size_type return type.
bdice Aug 26, 2023
68afa1d
Use std::decay_t.
bdice Aug 26, 2023
5115c91
Inline functor creation.
bdice Aug 26, 2023
989b958
Fix off-by-one error in adjacent_difference.
bdice Aug 28, 2023
f060833
Proclaim return types needed in JNI code.
bdice Aug 28, 2023
325fc70
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Aug 29, 2023
dcfd5db
Merge remote-tracking branch 'upstream/branch-23.10' into cccl-update…
bdice Sep 13, 2023
e8f6f08
Merge branch 'branch-23.10' into cccl-update-2.1.0
bdice Dec 5, 2023
84755f9
Merge remote-tracking branch 'upstream/branch-24.02' into cccl-update…
bdice Dec 5, 2023
4 changes: 2 additions & 2 deletions conda/recipes/libcudf/meta.yaml
@@ -46,7 +46,7 @@ requirements:
- ninja
- sysroot_{{ target_platform }} {{ sysroot_version }}
host:
- librmm ={{ minor_version }}
# - librmm ={{ minor_version }}
- libkvikio ={{ minor_version }}
{% if cuda_major == "11" %}
- cudatoolkit
@@ -100,7 +100,7 @@ outputs:
{% endif %}
- cuda-version {{ cuda_spec }}
- nvcomp {{ nvcomp_version }}
- librmm ={{ minor_version }}
# - librmm ={{ minor_version }}
- libkvikio ={{ minor_version }}
- libarrow {{ libarrow_version }}
- dlpack {{ dlpack_version }}
21 changes: 12 additions & 9 deletions cpp/benchmarks/common/generate_input.cu
@@ -53,6 +53,8 @@
#include <thrust/transform.h>
#include <thrust/tuple.h>

#include <cuda/functional>

#include <algorithm>
#include <cstdint>
#include <memory>
@@ -247,12 +249,12 @@ struct random_value_fn<T, std::enable_if_t<cudf::is_chrono<T>()>> {
sec.end(),
ns.begin(),
result.begin(),
[] __device__(int64_t sec_value, int64_t nanoseconds_value) {
cuda::proclaim_return_type<T>([] __device__(int64_t sec_value, int64_t nanoseconds_value) {
auto const timestamp_ns =
cudf::duration_s{sec_value} + cudf::duration_ns{nanoseconds_value};
// Return value in the type's precision
return T(cuda::std::chrono::duration_cast<typename T::duration>(timestamp_ns));
});
}));
return result;
}
};
@@ -367,12 +369,13 @@ rmm::device_uvector<cudf::size_type> sample_indices_with_run_length(cudf::size_t
// This is gather.
auto avg_repeated_sample_indices_iterator = thrust::make_transform_iterator(
thrust::make_counting_iterator(0),
[rb = run_lens.begin(),
re = run_lens.end(),
samples_indices = samples_indices.begin()] __device__(cudf::size_type i) {
auto sample_idx = thrust::upper_bound(thrust::seq, rb, re, i) - rb;
return samples_indices[sample_idx];
});
cuda::proclaim_return_type<cudf::size_type>(
[rb = run_lens.begin(),
re = run_lens.end(),
samples_indices = samples_indices.begin()] __device__(cudf::size_type i) {
auto sample_idx = thrust::upper_bound(thrust::seq, rb, re, i) - rb;
return samples_indices[sample_idx];
}));
rmm::device_uvector<cudf::size_type> repeated_sample_indices(num_rows,
cudf::get_default_stream());
thrust::copy(thrust::device,
@@ -513,7 +516,7 @@ std::unique_ptr<cudf::column> create_random_utf8_string_column(data_profile cons
lengths.end(),
null_mask.begin(),
lengths.begin(),
[] __device__(auto) { return 0; },
cuda::proclaim_return_type<cudf::size_type>([] __device__(auto) { return 0; }),
thrust::logical_not<bool>{});
auto valid_lengths = thrust::make_transform_iterator(
thrust::make_zip_iterator(thrust::make_tuple(lengths.begin(), null_mask.begin())),

This file was deleted.

@@ -1,29 +1,25 @@
diff --git a/thrust/system/cuda/detail/dispatch.h b/thrust/system/cuda/detail/dispatch.h
index d0e3f94..76774b0 100644
index d0e3f94e..5c32a9c6 100644
--- a/thrust/system/cuda/detail/dispatch.h
+++ b/thrust/system/cuda/detail/dispatch.h
@@ -32,9 +32,8 @@
@@ -32,8 +32,7 @@
status = call arguments; \
} \
else { \
- auto THRUST_PP_CAT2(count, _fixed) = static_cast<thrust::detail::int64_t>(count); \
- status = call arguments; \
- }
+ throw std::runtime_error("THRUST_INDEX_TYPE_DISPATCH 64-bit count is unsupported in libcudf"); \
+ }
+ throw std::runtime_error("THRUST_INDEX_TYPE_DISPATCH 64-bit count is unsupported in libcudf"); \
}

/**
* Dispatch between 32-bit and 64-bit index based versions of the same algorithm
@@ -52,10 +51,8 @@
@@ -52,9 +51,7 @@
status = call arguments; \
} \
else { \
- auto THRUST_PP_CAT2(count1, _fixed) = static_cast<thrust::detail::int64_t>(count1); \
- auto THRUST_PP_CAT2(count2, _fixed) = static_cast<thrust::detail::int64_t>(count2); \
- status = call arguments; \
- }
+ throw std::runtime_error("THRUST_DOUBLE_INDEX_TYPE_DISPATCH 64-bit count is unsupported in libcudf"); \
+ }
+ throw std::runtime_error("THRUST_DOUBLE_INDEX_TYPE_DISPATCH 64-bit count is unsupported in libcudf"); \
}
/**
* Dispatch between 32-bit and 64-bit index based versions of the same algorithm
* implementation. This version allows using different token sequences for callables
@@ -1,8 +1,8 @@
diff --git a/dependencies/cub/cub/device/dispatch/dispatch_radix_sort.cuh b/dependencies/cub/cub/device/dispatch/dispatch_radix_sort.cuh
index b188c75f..3f36656f 100644
index 953b24b0..3a895323 100644
--- a/dependencies/cub/cub/device/dispatch/dispatch_radix_sort.cuh
+++ b/dependencies/cub/cub/device/dispatch/dispatch_radix_sort.cuh
@@ -736,7 +736,7 @@ struct DeviceRadixSortPolicy
@@ -738,7 +738,7 @@ struct DeviceRadixSortPolicy


/// SM60 (GP100)
@@ -12,28 +12,28 @@ index b188c75f..3f36656f 100644
enum {
PRIMARY_RADIX_BITS = (sizeof(KeyT) > 1) ? 7 : 5, // 6.9B 32b keys/s (Quadro P100)
diff --git a/dependencies/cub/cub/device/dispatch/dispatch_reduce.cuh b/dependencies/cub/cub/device/dispatch/dispatch_reduce.cuh
index e0470ccb..6a0c2ed6 100644
index de182a93..e2ba1f4b 100644
--- a/dependencies/cub/cub/device/dispatch/dispatch_reduce.cuh
+++ b/dependencies/cub/cub/device/dispatch/dispatch_reduce.cuh
@@ -280,7 +280,7 @@ struct DeviceReducePolicy
};
@@ -423,7 +423,7 @@ struct DeviceReducePolicy
};

/// SM60
- struct Policy600 : ChainedPolicy<600, Policy600, Policy350>
+ struct Policy600 : ChainedPolicy<600, Policy600, Policy600>
{
// ReducePolicy (P100: 591 GB/s @ 64M 4B items; 583 GB/s @ 256M 1B items)
typedef AgentReducePolicy<
/// SM60
- struct Policy600 : ChainedPolicy<600, Policy600, Policy350>
+ struct Policy600 : ChainedPolicy<600, Policy600, Policy600>
{
static constexpr int threads_per_block = 256;
static constexpr int items_per_thread = 16;
diff --git a/dependencies/cub/cub/device/dispatch/dispatch_scan.cuh b/dependencies/cub/cub/device/dispatch/dispatch_scan.cuh
index c2d04588..ac2d10e0 100644
index 0df89b7c..6c499d87 100644
--- a/dependencies/cub/cub/device/dispatch/dispatch_scan.cuh
+++ b/dependencies/cub/cub/device/dispatch/dispatch_scan.cuh
@@ -177,7 +177,7 @@ struct DeviceScanPolicy
};
@@ -240,7 +240,7 @@ struct DeviceScanPolicy
};

/// SM600
- struct Policy600 : ChainedPolicy<600, Policy600, Policy520>
+ struct Policy600 : ChainedPolicy<600, Policy600, Policy600>
{
typedef AgentScanPolicy<
128, 15, ///< Threads per block, items per thread
/// SM600
- struct Policy600 : ChainedPolicy<600, Policy600, Policy520>
+ struct Policy600 : ChainedPolicy<600, Policy600, Policy600>
{
typedef AgentScanPolicy<128,
15, ///< Threads per block, items per thread
17 changes: 6 additions & 11 deletions cpp/cmake/thirdparty/patches/thrust_override.json
@@ -4,14 +4,14 @@
"Thrust" : {
"patches" : [
{
"file" : "Thrust/install_rules.diff",
"issue" : "Thrust 1.X installs incorrect files [https://github.com/NVIDIA/thrust/issues/1790]",
"fixed_in" : "2.0.0"
"file" : "Thrust/reroot_support.diff",
"issue" : "Support conda-forge usage of CMake rerooting [https://github.com/NVIDIA/thrust/pull/1969]",
"fixed_in" : "2.2"
},
{
"file" : "${current_json_dir}/thrust_transform_iter_with_reduce_by_key.diff",
"issue" : "Support transform_output_iterator as output of reduce by key [https://github.com/NVIDIA/thrust/pull/1805]",
"fixed_in" : "2.1"
"file" : "Thrust/thrust_libcudacxx_2.1.0.diff",
"issue" : "Use libcudacxx 2.1.0 in Thrust config.",
"fixed_in" : "2.2"
},
{
"file" : "${current_json_dir}/thrust_disable_64bit_dispatching.diff",
@@ -27,11 +27,6 @@
"file" : "${current_json_dir}/thrust_faster_scan_compile_times.diff",
"issue" : "Improve Thrust scan compile times by reducing the number of kernels generated [https://github.com/rapidsai/cudf/pull/8183]",
"fixed_in" : ""
},
{
"file" : "${current_json_dir}/cub_segmented_sort_with_bool_key.diff",
"issue" : "Fix an error in CUB DeviceSegmentedSort when the keys are bool type [https://github.com/NVIDIA/cub/issues/594]",
"fixed_in" : "2.1"
}
]
}

This file was deleted.

40 changes: 22 additions & 18 deletions cpp/include/cudf/detail/null_mask.cuh
@@ -37,6 +37,8 @@
#include <thrust/transform.h>
#include <thrust/tuple.h>

#include <cuda/functional>

#include <algorithm>
#include <iterator>
#include <optional>
@@ -330,20 +332,21 @@ rmm::device_uvector<size_type> segmented_count_bits(bitmask_type const* bitmask,
// set bits from the length of the segment.
auto segments_begin =
thrust::make_zip_iterator(first_bit_indices_begin, last_bit_indices_begin);
auto segment_length_iterator =
thrust::transform_iterator(segments_begin, [] __device__(auto const& segment) {
auto segment_length_iterator = thrust::transform_iterator(
segments_begin, cuda::proclaim_return_type<size_type>([] __device__(auto const& segment) {
auto const begin = thrust::get<0>(segment);
auto const end = thrust::get<1>(segment);
return end - begin;
});
}));
thrust::transform(rmm::exec_policy(stream),
segment_length_iterator,
segment_length_iterator + num_ranges,
d_bit_counts.data(),
d_bit_counts.data(),
[] __device__(auto segment_size, auto segment_bit_count) {
return segment_size - segment_bit_count;
});
cuda::proclaim_return_type<size_type>(
[] __device__(auto segment_size, auto segment_bit_count) {
return segment_size - segment_bit_count;
}));
}

CUDF_CHECK_CUDA(stream.value());
@@ -541,12 +544,12 @@ std::pair<rmm::device_buffer, size_type> segmented_null_mask_reduction(
{
auto const segments_begin =
thrust::make_zip_iterator(first_bit_indices_begin, last_bit_indices_begin);
auto const segment_length_iterator =
thrust::make_transform_iterator(segments_begin, [] __device__(auto const& segment) {
auto const segment_length_iterator = thrust::make_transform_iterator(
segments_begin, cuda::proclaim_return_type<size_type>([] __device__(auto const& segment) {
auto const begin = thrust::get<0>(segment);
auto const end = thrust::get<1>(segment);
return end - begin;
});
}));

auto const num_segments =
static_cast<size_type>(std::distance(first_bit_indices_begin, first_bit_indices_end));
@@ -555,9 +558,9 @@
return cudf::detail::valid_if(
segment_length_iterator,
segment_length_iterator + num_segments,
[valid_initial_value] __device__(auto const& length) {
cuda::proclaim_return_type<bool>([valid_initial_value] __device__(auto const& length) {
return valid_initial_value.value_or(length > 0);
},
}),
stream,
mr);
}
@@ -575,13 +578,14 @@
return cudf::detail::valid_if(
length_and_valid_count,
length_and_valid_count + num_segments,
[null_handling, valid_initial_value] __device__(auto const& length_and_valid_count) {
auto const length = thrust::get<0>(length_and_valid_count);
auto const valid_count = thrust::get<1>(length_and_valid_count);
return (null_handling == null_policy::EXCLUDE)
? (valid_initial_value.value_or(false) || valid_count > 0)
: (valid_initial_value.value_or(length > 0) && valid_count == length);
},
cuda::proclaim_return_type<bool>(
[null_handling, valid_initial_value] __device__(auto const& length_and_valid_count) {
auto const length = thrust::get<0>(length_and_valid_count);
auto const valid_count = thrust::get<1>(length_and_valid_count);
return (null_handling == null_policy::EXCLUDE)
? (valid_initial_value.value_or(false) || valid_count > 0)
: (valid_initial_value.value_or(length > 0) && valid_count == length);
}),
stream,
mr);
}
9 changes: 6 additions & 3 deletions cpp/include/cudf/detail/sizes_to_offsets_iterator.cuh
@@ -27,6 +27,8 @@
#include <thrust/distance.h>
#include <thrust/scan.h>

#include <cuda/functional>

#include <stdexcept>

namespace cudf {
@@ -311,9 +313,10 @@ std::pair<std::unique_ptr<column>, size_type> make_offsets_child_column(
// using exclusive-scan technically requires count+1 input values even though
// the final input value is never used.
// The input iterator is wrapped here to allow the last value to be safely read.
auto map_fn = [begin, count] __device__(size_type idx) -> size_type {
return idx < count ? static_cast<size_type>(begin[idx]) : size_type{0};
};
auto map_fn =
cuda::proclaim_return_type<size_type>([begin, count] __device__(size_type idx) -> size_type {
return idx < count ? static_cast<size_type>(begin[idx]) : size_type{0};
});
auto input_itr = cudf::detail::make_counting_transform_iterator(0, map_fn);
// Use the sizes-to-offsets iterator to compute the total number of elements
auto const total_elements = sizes_to_offsets(input_itr, input_itr + count + 1, d_offsets, stream);
28 changes: 28 additions & 0 deletions cpp/include/cudf/detail/utilities/device_operators.cuh
@@ -55,6 +55,34 @@ CUDF_HOST_DEVICE inline auto max(LHS const& lhs, RHS const& rhs)
{
return std::max(lhs, rhs);
}

/**
* @brief Functor that casts another functor's result to a specified type.
*
* CUB 2.0.0 reductions require that the binary operator returns the same type
* as the initial value type, so we wrap binary operators with this when used
* by CUB.
*/
template <typename ResultType, typename F>
struct cast_functor_fn {
F f;

template <typename... Ts>
CUDF_HOST_DEVICE inline ResultType operator()(Ts&&... args)
{
return static_cast<ResultType>(f(std::forward<Ts>(args)...));
}
};

/**
* @brief Function creating a casting functor.
*/
template <typename ResultType, typename F>
cast_functor_fn<ResultType, F> cast_functor(F&& f)
{
return {std::forward<F>(f)};
}

} // namespace detail

/**