Statistics cleanup #7439

Merged
merged 18 commits into from
Mar 6, 2021

@kaatish kaatish commented Feb 24, 2021

Addresses #7347

@kaatish kaatish added the 2 - In Progress, code quality, libcudf, cuIO, and non-breaking labels Feb 24, 2021
@kaatish kaatish self-assigned this Feb 24, 2021
@kaatish kaatish added the improvement label Feb 24, 2021
@vuule vuule changed the title from "[FEA] Statistics cleanup" to "Statistics cleanup" Feb 24, 2021

codecov bot commented Feb 24, 2021

Codecov Report

Merging #7439 (39fdbd5) into branch-0.19 (7871e7a) will increase coverage by 0.41%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.19    #7439      +/-   ##
===============================================
+ Coverage        81.86%   82.27%   +0.41%     
===============================================
  Files              101      101              
  Lines            16884    17261     +377     
===============================================
+ Hits             13822    14202     +380     
+ Misses            3062     3059       -3     
Impacted Files Coverage Δ
python/cudf/cudf/utils/gpu_utils.py 53.65% <0.00%> (-4.88%) ⬇️
python/cudf/cudf/core/abc.py 87.23% <0.00%> (-1.14%) ⬇️
python/cudf/cudf/io/feather.py 100.00% <0.00%> (ø)
python/cudf/cudf/comm/serialize.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/struct.py 100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/_version.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/fuzzer.py 0.00% <0.00%> (ø)
python/cudf/cudf/utils/hash_vocab_utils.py 100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_csv.py 100.00% <0.00%> (ø)
... and 41 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 85c1f8f...39fdbd5.

@vuule vuule left a comment

Partial review

cpp/include/cudf/strings/string.cuh (outdated, resolved)
cpp/include/cudf/strings/string.cuh (outdated, resolved)
cpp/src/io/orc/writer_impl.cu (outdated, resolved)
cpp/src/io/utilities/column_utils.cuh (outdated, resolved)

@devavret devavret left a comment

Only requesting changes for the warp and block reduce comments. The first one is just a thought.

cpp/include/cudf/strings/string.cuh (outdated, resolved)
cpp/include/cudf/strings/string.cuh (outdated, resolved)
cpp/src/io/statistics/column_stats.cu (outdated, resolved)
@kaatish kaatish marked this pull request as ready for review March 3, 2021 14:49
@kaatish kaatish requested a review from a team as a code owner March 3, 2021 14:49
cpp/src/io/statistics/column_stats.cu (outdated, resolved)
cpp/src/io/parquet/writer_impl.cu (outdated, resolved)
cpp/src/io/utilities/column_utils.cuh (outdated, resolved)
cpp/src/io/parquet/writer_impl.cu (outdated, resolved)
cpp/src/io/orc/writer_impl.cu (outdated, resolved)

@davidwendt davidwendt left a comment

Approving string_view changes.

@kaatish kaatish requested a review from vuule March 4, 2021 20:43

@vuule vuule left a comment

Looks good. Are there any changes in performance?

kaatish commented Mar 4, 2021

> Looks good. Are there any changes in performance?

I can run the benchmarks and post it here.

vuule commented Mar 4, 2021

rerun tests

kaatish commented Mar 5, 2021

> Looks good. Are there any changes in performance?

@vuule Performance impact is captured in this gist.

kaatish commented Mar 5, 2021

rerun tests

vuule commented Mar 5, 2021

> > Looks good. Are there any changes in performance?
>
> @vuule Performance impact is captured in this gist.

At a glance it seems that performance did not change. Do you have the average time?

kaatish commented Mar 6, 2021

> > > Looks good. Are there any changes in performance?
> >
> > @vuule Performance impact is captured in this gist.
>
> At a glance it seems that performance did not change. Do you have the average time?

Average of Slowdowns (as a ratio):

orc_writer =  0.0072650
parquet_writer =  0.0058364
parquet_chunked_writer = -0.014050
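
The formula behind these averages is not shown in this thread. One plausible reading, assumed here, is the mean over all benchmark cases of the relative change in run time, (new - old) / old, where a positive value means the new code is slower. Under that assumption, 0.0072650 would be roughly a 0.7% slowdown and -0.014050 roughly a 1.4% speedup. A minimal sketch of that calculation follows; the function name `average_slowdown` and the timings are hypothetical and are not taken from the gist:

```python
from statistics import mean


def average_slowdown(baseline_times, new_times):
    """Mean of per-benchmark relative slowdowns, (new - old) / old.

    Assumed definition: positive means slower, negative means faster.
    """
    if len(baseline_times) != len(new_times):
        raise ValueError("expected one new timing per baseline timing")
    return mean((new - old) / old for old, new in zip(baseline_times, new_times))


# Hypothetical timings in milliseconds, for illustration only.
baseline = [100.0, 200.0, 400.0]
after = [101.0, 198.0, 404.0]

ratio = average_slowdown(baseline, after)
print(f"average slowdown = {ratio:+.4f}")
```

Averaging ratios rather than raw times keeps long-running benchmark cases from dominating the result, which may be why a ratio was reported here.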

vuule commented Mar 6, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit ab7fe05 into rapidsai:branch-0.19 Mar 6, 2021
hyperbolic2346 pushed a commit to hyperbolic2346/cudf that referenced this pull request Mar 25, 2021
Addresses rapidsai#7347

Authors:
  - Kumar Aatish (@kaatish)

Approvers:
  - David (@davidwendt)
  - Devavret Makkar (@devavret)
  - Vukasin Milovanovic (@vuule)

URL: rapidsai#7439