device input generation in join bench #10277

Merged

karthikeyann
Contributor

Use device functions to generate the join benchmark's input data on the device instead of the host.

Split out of PR #10109 to make review easier.
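
The idea behind device-side input generation can be sketched on the host. The block below is not the PR's code: `mix`, `key_at`, and `generate_keys` are hypothetical names, and the SplitMix64-style mixing function is an assumption chosen for illustration. What it shows is the property that makes generation parallel-friendly: each element is computed from `(seed, index)` alone, so no element depends on a previously generated one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): SplitMix64-style mixing step.
inline std::uint64_t mix(std::uint64_t x)
{
  x += 0x9E3779B97F4A7C15ULL;
  x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
  x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
  return x ^ (x >> 31);
}

// Key for row `idx`, in [0, num_distinct). Depends only on (seed, idx),
// so every row could be computed by an independent thread.
// The modulo introduces a slight bias, which is acceptable for a sketch.
inline std::int32_t key_at(std::uint64_t seed, std::size_t idx, std::int32_t num_distinct)
{
  assert(num_distinct > 0);
  auto const h = mix(seed ^ mix(static_cast<std::uint64_t>(idx)));
  return static_cast<std::int32_t>(h % static_cast<std::uint64_t>(num_distinct));
}

// Host stand-in for a device-wide fill: a sequential loop over indices.
std::vector<std::int32_t> generate_keys(std::size_t num_rows,
                                        std::uint64_t seed,
                                        std::int32_t num_distinct)
{
  std::vector<std::int32_t> keys(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) { keys[i] = key_at(seed, i, num_distinct); }
  return keys;
}
```

Calling `generate_keys(1000, 42, 100)` twice yields identical vectors, because nothing is carried over between elements. On a GPU, the loop body could presumably be expressed as a functor passed to an index-based algorithm such as `thrust::tabulate`, which avoids building the data on the host and copying it over.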

@karthikeyann added the labels feature request, 3 - Ready for Review, tests, libcudf, 4 - Needs Review, Performance, improvement, and non-breaking on Feb 14, 2022
@karthikeyann requested a review from a team as a code owner on February 14, 2022 14:43
@karthikeyann self-assigned this on Feb 14, 2022
@karthikeyann removed the improvement label on Feb 14, 2022
codecov bot commented Feb 14, 2022

Codecov Report

Merging #10277 (9f4e4e1) into branch-22.04 (7f2a16f) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##           branch-22.04   #10277      +/-   ##
================================================
- Coverage         10.67%   10.67%   -0.01%     
================================================
  Files               122      122              
  Lines             20873    20878       +5     
================================================
  Hits               2228     2228              
- Misses            18645    18650       +5     
Impacted Files                            Coverage Δ
python/cudf/cudf/core/frame.py            0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py            0.00% <0.00%> (ø)
python/cudf/cudf/core/series.py           0.00% <0.00%> (ø)
python/cudf/cudf/utils/utils.py           0.00% <0.00%> (ø)
python/cudf/cudf/core/dataframe.py        0.00% <0.00%> (ø)
python/cudf/cudf/testing/_utils.py        0.00% <0.00%> (ø)
python/cudf/cudf/core/multiindex.py       0.00% <0.00%> (ø)
python/cudf/cudf/core/_base_index.py      0.00% <0.00%> (ø)
python/cudf/cudf/core/indexed_frame.py    0.00% <0.00%> (ø)
python/cudf/cudf/core/column_accessor.py  0.00% <0.00%> (ø)
... and 1 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -20,9 +20,14 @@
#include <nvbench/nvbench.cuh>
Member

According to https://github.com/rapidsai/cudf/blob/branch-22.04/cpp/docs/DEVELOPER_GUIDE.md#includes, headers should be included in order from nearest to farthest, that is, local project headers first and standard library headers last.

Contributor

@vyasr vyasr left a comment

One minor suggestion for you to consider.

cpp/benchmarks/join/join_common.hpp (resolved)
Member

@PointKernel PointKernel left a comment

LGTM 👍

@karthikeyann
Contributor Author

rerun tests

@karthikeyann added the 5 - Ready to Merge label and removed the 3 - Ready for Review and 4 - Needs Review labels on Feb 16, 2022
@karthikeyann
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit dffed18 into rapidsai:branch-22.04 Feb 16, 2022
rapids-bot bot pushed a commit that referenced this pull request Mar 22, 2022
To speed up benchmark input generation, move all data generation to the device.
To address #5773 (comment)
This PR moves the random input generation to the device.

The rest of the original work in this PR was split into multiple PRs, which have been merged:
#10277
#10278
#10279
#10280
#10281
#10300

With all of these changes, a single iteration of all benchmarks runs in under 1000 seconds (down from 3067s to 964s).
Running more iterations should show a larger benefit, because the benchmark is restarted several times during a run, and each restart calls the benchmark input generation code again.
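
For reference, the two timings quoted above can be turned into a speedup factor and a fraction of time saved. This is a small sketch of the arithmetic only; the function names are hypothetical and not part of the benchmark code.

```cpp
#include <cassert>

// Speedup factor: how many times faster the new run is than the old one.
inline double speedup(double before_s, double after_s)
{
  assert(after_s > 0.0);
  return before_s / after_s;
}

// Fraction of the original wall time that was removed.
inline double time_saved_fraction(double before_s, double after_s)
{
  assert(before_s > 0.0);
  return (before_s - after_s) / before_s;
}
```

With the reported numbers, `speedup(3067.0, 964.0)` is roughly 3.18 and `time_saved_fraction(3067.0, 964.0)` is roughly 0.686, so a single iteration takes a little under a third of the original time.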

closes #9857

Authors:
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Vukasin Milovanovic (https://github.com/vuule)
  - David Wendt (https://github.com/davidwendt)

URL: #10109
Labels
5 - Ready to Merge (Testing and reviews complete, ready to merge); feature request (New feature or request); libcudf (Affects libcudf (C++/CUDA) code.); non-breaking (Non-breaking change); Performance (Performance related issue); tests (Unit testing for project)
4 participants