Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement scatter for struct columns #7752

Merged
merged 33 commits into from
Apr 1, 2021

Conversation

ttnghia
Copy link
Contributor

@ttnghia ttnghia commented Mar 29, 2021

This PR implements scattering for struct columns. It also fixes #7638 when calling cudf::partitioning on struct data due to wrongly scattering data inside the partitioning kernels.

@ttnghia ttnghia added bug Something isn't working feature request New feature or request 3 - Ready for Review Ready for review by team tests Unit testing for project libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels Mar 29, 2021
@ttnghia
Copy link
Contributor Author

ttnghia commented Mar 29, 2021

Currently I'm working on unit tests for struct scattering, which should be done shortly.

@ttnghia ttnghia removed the feature request New feature or request label Mar 29, 2021
Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the java side this fixes the issues I was seeing before.

@codecov
Copy link

codecov bot commented Mar 29, 2021

Codecov Report

Merging #7752 (22660bf) into branch-0.19 (7871e7a) will increase coverage by 0.82%.
The diff coverage is n/a.

❗ Current head 22660bf differs from pull request most recent head 871be28. Consider uploading reports for the commit 871be28 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.19    #7752      +/-   ##
===============================================
+ Coverage        81.86%   82.69%   +0.82%     
===============================================
  Files              101      101              
  Lines            16884    17437     +553     
===============================================
+ Hits             13822    14419     +597     
+ Misses            3062     3018      -44     
Impacted Files Coverage Δ
python/cudf/cudf/core/column/lists.py 87.32% <0.00%> (-4.08%) ⬇️
python/dask_cudf/dask_cudf/backends.py 87.50% <0.00%> (-2.13%) ⬇️
python/cudf/cudf/core/column/decimal.py 93.84% <0.00%> (-1.03%) ⬇️
python/cudf/cudf/core/column/column.py 87.53% <0.00%> (-0.23%) ⬇️
python/cudf/cudf/utils/utils.py 85.36% <0.00%> (-0.07%) ⬇️
python/cudf/cudf/io/feather.py 100.00% <0.00%> (ø)
python/cudf/cudf/utils/ioutils.py 78.71% <0.00%> (ø)
python/cudf/cudf/comm/serialize.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/struct.py 100.00% <0.00%> (ø)
... and 44 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 635dc9c...871be28. Read the comment docs.

@ttnghia ttnghia marked this pull request as ready for review March 30, 2021 23:58
@ttnghia ttnghia requested review from a team as code owners March 30, 2021 23:58
@ttnghia
Copy link
Contributor Author

ttnghia commented Mar 31, 2021

Rerun tests.

Copy link
Contributor

@codereport codereport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great overal, couple small things

cpp/include/cudf/detail/scatter.cuh Show resolved Hide resolved
Copy link
Contributor

@codereport codereport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM then 👍

@ttnghia ttnghia requested a review from jrhemstad March 31, 2021 22:31
Copy link
Collaborator

@kkraus14 kkraus14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake lgtm

@ttnghia
Copy link
Contributor Author

ttnghia commented Apr 1, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit e7153bb into rapidsai:branch-0.19 Apr 1, 2021
@ttnghia ttnghia self-assigned this Apr 25, 2021
@ttnghia ttnghia deleted the struct_scatter branch May 3, 2021 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change tests Unit testing for project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] cudf::partition causing data corruption on structs
5 participants