Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add detail interface for split and slice(table_view), refactors both function with host_span #9226

Merged
merged 15 commits into from
Oct 7, 2021

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Sep 14, 2021

cudf::detail::slice performs a segmented_count_unset_bits that requires stream ordering. The depending split interface does not have an internal version that accepts a stream argument. Similarly for slice(table_view). This PR fixes that.

Besides, slice/split interface is refactored to accept host_span to specify indices/splits, and is overloaded with std::initializer_list. This allows specifying the argument with both a container and a brace-init-list.

@isVoid isVoid requested a review from a team as a code owner September 14, 2021 03:59
@isVoid isVoid self-assigned this Sep 14, 2021
@isVoid isVoid added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Sep 14, 2021
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Sep 14, 2021
@codecov
Copy link

codecov bot commented Sep 14, 2021

Codecov Report

Merging #9226 (aa67638) into branch-21.12 (ab4bfaa) will decrease coverage by 0.03%.
The diff coverage is 0.00%.

❗ Current head aa67638 differs from pull request most recent head d21b646. Consider uploading reports for the commit d21b646 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.12    #9226      +/-   ##
================================================
- Coverage         10.79%   10.75%   -0.04%     
================================================
  Files               116      116              
  Lines             18869    19481     +612     
================================================
+ Hits               2036     2096      +60     
- Misses            16833    17385     +552     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/_lib/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/core/_base_index.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/categorical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/datetime.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/lists.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/numerical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/timedelta.py 0.00% <0.00%> (ø)
... and 76 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4d7e69a...d21b646. Read the comment docs.

@ttnghia
Copy link
Contributor

ttnghia commented Sep 14, 2021

I think slice is used in more places than those currently modified in this PR. Did you try to search all?

Copy link
Contributor

@thomcom thomcom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the tests passed it would lgtm.

@vyasr
Copy link
Contributor

vyasr commented Sep 15, 2021

rerun tests

@isVoid
Copy link
Contributor Author

isVoid commented Sep 15, 2021

I think slice is used in more places than those currently modified in this PR. Did you try to search all?

Changed a couple more places with use of slice. What confused me at one point is that there are two slice interfaces:

This does not count nulls, so no need to use stream.

ColumnView slice(ColumnView const& input, cudf::size_type begin, cudf::size_type end)

This one does.

std::vector<column_view> slice(column_view const& input,
std::vector<size_type> const& indices,
rmm::cuda_stream_view stream = rmm::cuda_stream_default);

And I only refactor usages of the latter.

@jrhemstad
Copy link
Contributor

Since you're already touching it, can you change split/slice input parameters from std::vector to host_span?

@isVoid isVoid changed the title Add detail interface for split and slice(table_view) Add detail interface for split and slice(table_view), refactors both function with host_span Sep 17, 2021
@isVoid
Copy link
Contributor Author

isVoid commented Sep 17, 2021

The PR became a bit larger because of refactoring the indices specification with host_span. But most are just trivially replacing the brace initialization for vector with an explicit std::vector construction.

Vscode regex find and replace helped.

@isVoid isVoid changed the base branch from branch-21.10 to branch-21.12 September 28, 2021 16:13
@isVoid isVoid requested a review from thomcom September 29, 2021 21:48
@isVoid isVoid added the 3 - Ready for Review Ready for review by team label Sep 29, 2021
@isVoid
Copy link
Contributor Author

isVoid commented Sep 30, 2021

Added an overload of split/slice interface with initializer_list for the indices/splits argument. This allows no change in existing usage of the API.

@isVoid isVoid requested a review from davidwendt October 5, 2021 19:57
@isVoid isVoid requested a review from davidwendt October 5, 2021 20:55
@isVoid isVoid added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Oct 6, 2021
@isVoid
Copy link
Contributor Author

isVoid commented Oct 7, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit c6bc111 into rapidsai:branch-21.12 Oct 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants