Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove lists::drop_list_duplicates #11236

Merged

Conversation

ttnghia
Copy link
Contributor

@ttnghia ttnghia commented Jul 11, 2022

This PR completely removes cudf::lists::drop_list_duplicates. It is replaced by the new API cudf::list::distinct which has a simpler implementation but better performance. The replacements for internal cudf usage have all been merged before thus there is no side effect or breaking for the existing APIs in this work.

Closes #11114, #11093, #11053, #11034, and closes #9257.

Depends on:

@github-actions github-actions bot removed the Python Affects Python cuDF API. label Jul 19, 2022
@ttnghia ttnghia added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress 0 - Blocked Cannot progress due to external reasons labels Jul 19, 2022
@ttnghia ttnghia marked this pull request as ready for review July 19, 2022 02:47
@ttnghia ttnghia requested review from a team as code owners July 19, 2022 02:47
@codecov
Copy link

codecov bot commented Jul 19, 2022

Codecov Report

Merging #11236 (c1d05d9) into branch-22.08 (b2dd1bf) will increase coverage by 0.03%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##           branch-22.08   #11236      +/-   ##
================================================
+ Coverage         86.34%   86.37%   +0.03%     
================================================
  Files               144      144              
  Lines             22826    22826              
================================================
+ Hits              19708    19715       +7     
+ Misses             3118     3111       -7     
Impacted Files Coverage Δ
python/cudf/cudf/core/dataframe.py 93.57% <0.00%> (+0.04%) ⬆️
python/cudf/cudf/core/column/string.py 88.80% <0.00%> (+0.12%) ⬆️
python/cudf/cudf/core/groupby/groupby.py 91.02% <0.00%> (+0.21%) ⬆️
python/cudf/cudf/core/column/numerical.py 96.19% <0.00%> (+0.29%) ⬆️
python/cudf/cudf/core/tools/datetimes.py 84.49% <0.00%> (+0.30%) ⬆️
python/cudf/cudf/core/column/lists.py 91.70% <0.00%> (+0.97%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b2dd1bf...c1d05d9. Read the comment docs.

Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake changes approved

@vuule
Copy link
Contributor

vuule commented Jul 21, 2022

Should the PR be named "Remove ..."? Deprecation means that the API is still available.

@bdice bdice changed the title Deprecate lists::drop_list_duplicates Remove lists::drop_list_duplicates Jul 22, 2022
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. -1,741 lines of code is a great feeling.

@bdice
Copy link
Contributor

bdice commented Jul 22, 2022

@ttnghia I think this can be merged when you're ready. You're more familiar with the state of all the linked issues than I am -- please close or comment on all the related issues once this is merged.

edit: Also, I fixed the labels -- PRs removing public APIs are always breaking.

@bdice bdice added breaking Breaking change and removed non-breaking Non-breaking change labels Jul 22, 2022
@ttnghia
Copy link
Contributor Author

ttnghia commented Jul 22, 2022

Thanks all for helping on this. I'm merging.

@ttnghia
Copy link
Contributor Author

ttnghia commented Jul 22, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit a541ffb into rapidsai:branch-22.08 Jul 22, 2022
@ttnghia ttnghia deleted the deprecate_drop_list_duplicates branch July 25, 2022 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team breaking Breaking change CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
5 participants