Fully support nested types in lists::drop_list_duplicates #11224

Closed


ttnghia
Contributor

@ttnghia ttnghia commented Jul 8, 2022

This reimplements lists::drop_list_duplicates, adding full support for nested types. By building on the existing label_segments and stable_distinct utility APIs, the new implementation is significantly shorter than the old one. It also avoids sorting entirely, which should improve performance significantly.
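To illustrate the label-then-distinct idea described above, here is a minimal host-side sketch. It is not the libcudf implementation: it runs on the CPU, handles only `int` elements rather than nested types, and assumes that the first occurrence of each duplicate is the one kept. `ListsColumn`, `PairHash`, and this `label_segments` helper are hypothetical stand-ins written for the example; only the overall approach (label each element with its list index, then take an order-preserving distinct over the (label, value) pairs, with no sorting) comes from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical flattened lists layout: offsets plus concatenated child values.
struct ListsColumn {
  std::vector<int> offsets;  // size = number of lists + 1, starts at 0
  std::vector<int> values;   // all list elements, back to back
};

// Host-side stand-in for a segment-labeling step:
// labels[i] is the index of the list that owns values[i].
std::vector<int> label_segments(const std::vector<int>& offsets, std::size_t num_values) {
  std::vector<int> labels(num_values, 0);
  for (std::size_t list = 0; list + 1 < offsets.size(); ++list) {
    for (int i = offsets[list]; i < offsets[list + 1]; ++i) {
      labels[static_cast<std::size_t>(i)] = static_cast<int>(list);
    }
  }
  return labels;
}

// Hash for (label, value) pairs so they can be stored in an unordered_set.
struct PairHash {
  std::size_t operator()(const std::pair<int, int>& p) const noexcept {
    const long long key =
      (static_cast<long long>(p.first) << 32) ^ static_cast<unsigned int>(p.second);
    return std::hash<long long>{}(key);
  }
};

// Drop duplicates inside each list, keeping the first occurrence and the
// original element order. No sorting is involved.
ListsColumn drop_list_duplicates(const ListsColumn& input) {
  assert(!input.offsets.empty());
  assert(input.offsets.front() == 0);
  assert(input.offsets.back() == static_cast<int>(input.values.size()));

  const std::vector<int> labels = label_segments(input.offsets, input.values.size());

  std::unordered_set<std::pair<int, int>, PairHash> seen;
  std::vector<int> out_values;
  std::vector<int> counts(input.offsets.size() - 1, 0);

  // Order-preserving distinct over (label, value): equal values in different
  // lists have different labels, so they are never merged.
  for (std::size_t i = 0; i < input.values.size(); ++i) {
    if (seen.emplace(labels[i], input.values[i]).second) {
      out_values.push_back(input.values[i]);
      ++counts[static_cast<std::size_t>(labels[i])];
    }
  }

  // Rebuild the offsets from the per-list counts of surviving elements.
  std::vector<int> out_offsets(input.offsets.size(), 0);
  for (std::size_t list = 0; list < counts.size(); ++list) {
    out_offsets[list + 1] = out_offsets[list] + counts[list];
  }
  return ListsColumn{std::move(out_offsets), std::move(out_values)};
}
```

Because the list label is part of the key, a single pass over the flattened child column deduplicates every list at once, which is what makes a sort unnecessary in this sketch.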

Closes #11093, closes #11053, and closes #9257.

Depends on:

Follow up work:


Progress:

  • Implementation.
  • Fix unit tests.

@ttnghia ttnghia added the following labels Jul 8, 2022: feature request (New feature or request), 2 - In Progress (Currently a work in progress), libcudf (Affects libcudf C++/CUDA code), Spark (Functionality that helps Spark RAPIDS), non-breaking (Non-breaking change)
@ttnghia ttnghia self-assigned this Jul 8, 2022
@github-actions github-actions bot added the CMake (CMake build issue) label Jul 8, 2022
@ttnghia ttnghia closed this Jul 8, 2022
@ttnghia ttnghia deleted the reimplement_drop_list_duplicates branch July 14, 2022 17:24