Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use cuda::proclaim_return_type on device lambdas #1662

Merged

Conversation

hyperbolic2346
Copy link
Collaborator

@hyperbolic2346 hyperbolic2346 commented Dec 19, 2023

This PR makes spark-rapids-jni compatible with Thrust 2 by adding cuda::proclaim_return_type. This is following suit with cudf, who recently did this work with rapidsai/cudf#14577

closes #1639

Signed-off-by: Mike Wilson <knobby@burntsheep.com>
Signed-off-by: Mike Wilson <knobby@burntsheep.com>
@hyperbolic2346
Copy link
Collaborator Author

build

@@ -164,11 +166,11 @@ bool __device__ validate_ipv6(string_view s)
int address_char_count{0};
bool address_has_hex{false};

auto const leading_double_colon = [&]() {
auto const leading_double_colon = cuda::proclaim_return_type<bool>([&]() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not device lambda so don't need to change here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I just realized that this lambda is unused. The only place that refers to it is

// if (colon_count == max_colons && !leading_double_colon) { return false; }
which is commented out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I wrote it all to match the RFC, but Spark/Java don't follow it exactly and I need to match them. I documented the differences by commenting out the original path. I don't know the take on this and if I should remove it or not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you commented out the caller then you should also comment out the function too, since it is now unused.

Signed-off-by: Mike Wilson <knobby@burntsheep.com>
@ttnghia
Copy link
Collaborator

ttnghia commented Dec 19, 2023

FYI, CCCL 2 has been merged into cudf (rapidsai/cudf#14576). You can test locally this PR with the latest cudf commit to make sure this compiles successfully.

@mythrocks
Copy link
Collaborator

For the record, we'll also need to change CMakeLists.txt, to fetch CCCL 2:

diff --git a/src/main/cpp/CMakeLists.txt b/src/main/cpp/CMakeLists.txt
index 18c0cd1..a905297 100644
--- a/src/main/cpp/CMakeLists.txt
+++ b/src/main/cpp/CMakeLists.txt
@@ -94,11 +94,8 @@ include(cmake/Modules/ConfigureCUDA.cmake) # set other CUDA compilation flags
 # ##################################################################################################
 # * dependencies ----------------------------------------------------------------------------------
 
-# find libcu++
-include(${rapids-cmake-dir}/cpm/libcudacxx.cmake)
-
-# find thrust/cub
-include(${CUDF_DIR}/cpp/cmake/thirdparty/get_thrust.cmake)
+# find CCCL
+include(${CUDF_DIR}/cpp/cmake/thirdparty/get_cccl.cmake)
 
 # JNI
 find_package(JNI REQUIRED)

Signed-off-by: Mike Wilson <knobby@burntsheep.com>
@mythrocks
Copy link
Collaborator

mythrocks commented Dec 19, 2023

I think the changes in this PR should be good to go.

They now include the changes I had independently. (i.e. changes to CMakeLists.txt, map_utils.cu, and row_conversion.cu.)
(minimal.diff.txt, attached for reference.)

The changes in this PR are more comprehensive, if not yet strictly required.

I do see that we have trouble running tests/ROW_CONVERSION successfully with CCCL2. But that might well be a pre-existing problem.

@ttnghia
Copy link
Collaborator

ttnghia commented Dec 19, 2023

Yes the issue I was trying to solve (#1567) now shows up---my previous fix doesn't work anymore. We now need to fix it some other way.

@mythrocks
Copy link
Collaborator

My conservative vote would be not to change the thirdparty/cudf submodule commit hash to go to CCCL-2, for now.

But we need to, there might be value in commenting out RowToColumnTests.ManyStrings to allow the build to run through with CCCL-2. We can then chase down the inclusive_scan problem that row_conversion.cu runs into (#1579).

@abellina
Copy link
Collaborator

My conservative vote would be not to change the thirdparty/cudf submodule commit hash to go to CCCL-2, for now.

I am confused @mythrocks, why do you want to hold?

@mythrocks
Copy link
Collaborator

I'm 👍 on this change. We might need to rebase it for the latest in parse_uri.cu.

@mythrocks
Copy link
Collaborator

I am confused @mythrocks, why do you want to hold?

@abellina: Don't mind me. I'm irrationally apprehensive about disruptive changes near the holidays.

@hyperbolic2346
Copy link
Collaborator Author

I feel the same way. I don't see the value in checking in something and then going on holiday. So many are out until the new year already. If we run across some major issue, we may not be around to fix or triage it. I'd prefer to leave things running until January than check this in and have unknown issues crop up during that reduced productivity phase. I'm all for fast-following cudf, but I don't see that making this change to CCCL 2.2 now vs in 2 weeks makes a huge difference and it comes with a potential for extended disruptions.

This ship may have already sailed though since even with a pinned cudf, we are pulling in the latest rmm and our builds are broken.

@hyperbolic2346
Copy link
Collaborator Author

I'm 👍 on this change. We might need to rebase it for the latest in parse_uri.cu.

Done

@hyperbolic2346
Copy link
Collaborator Author

build

Signed-off-by: Mike Wilson <knobby@burntsheep.com>
@ttnghia
Copy link
Collaborator

ttnghia commented Dec 20, 2023

We need to wait until tomorrow so the changes in cudf can propagate here to build.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the changes thus far look good to me. One comment on size_type vs. int32_t.

src/main/cpp/src/row_conversion.cu Outdated Show resolved Hide resolved
mythrocks
mythrocks previously approved these changes Dec 20, 2023
Copy link
Collaborator

@mythrocks mythrocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, save for what @bdice suggested regarding size_type vs int32_t.

Signed-off-by: Mike Wilson <knobby@burntsheep.com>
Co-authored-by: Bradley Dice <bdice@bradleydice.com>
@hyperbolic2346
Copy link
Collaborator Author

build

@ttnghia
Copy link
Collaborator

ttnghia commented Dec 21, 2023

Depends on #1668.

ttnghia and others added 5 commits December 20, 2023 16:39
Signed-off-by: Nghia Truong <nghiat@nvidia.com>
Signed-off-by: Nghia Truong <nghiat@nvidia.com>
Signed-off-by: Mike Wilson <knobby@burntsheep.com>
Signed-off-by: Nghia Truong <nghiat@nvidia.com>
Signed-off-by: Nghia Truong <nghiat@nvidia.com>
# Conflicts:
#	src/main/cpp/src/row_conversion.cu
@ttnghia
Copy link
Collaborator

ttnghia commented Dec 21, 2023

build

@ttnghia ttnghia self-assigned this Dec 21, 2023
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CCCL 2.2.0 C++ and CMake changes look good to me.

@ttnghia ttnghia merged commit 763406c into NVIDIA:branch-24.02 Dec 21, 2023
3 checks passed
@hyperbolic2346 hyperbolic2346 deleted the mwilson/proclaim_return_type branch December 22, 2023 04:26
ttnghia added a commit that referenced this pull request Mar 5, 2024
This reverts commit 763406c.

# Conflicts:
#	src/main/cpp/benchmarks/row_conversion.cpp
#	src/main/cpp/src/bloom_filter.cu
#	src/main/cpp/src/map_utils.cu
#	src/main/cpp/src/xxhash64.cu
#	thirdparty/cudf

Signed-off-by: Nghia Truong <nghiat@nvidia.com>
ttnghia added a commit that referenced this pull request Mar 6, 2024
* Revert "Use cuda::proclaim_return_type on device lambdas (#1662)"

This reverts commit 763406c.

# Conflicts:
#	src/main/cpp/benchmarks/row_conversion.cpp
#	src/main/cpp/src/bloom_filter.cu
#	src/main/cpp/src/map_utils.cu
#	src/main/cpp/src/xxhash64.cu
#	thirdparty/cudf

Signed-off-by: Nghia Truong <nghiat@nvidia.com>

* Remove redundant header

Signed-off-by: Nghia Truong <nghiat@nvidia.com>

* Update copyright year

Signed-off-by: Nghia Truong <nghiat@nvidia.com>

---------

Signed-off-by: Nghia Truong <nghiat@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Audit code for proclaim return type changes
5 participants