Add gbenchmarks for strings filter functions #7438

Merged: 5 commits into rapidsai:branch-0.19 from benchmark-strings-strip on Feb 26, 2021

davidwendt
Contributor

Reference #5698
This creates a gbenchmark for the cudf::strings::filter_characters, cudf::strings::filter_characters_of_type, and cudf::strings::strip functions.

This PR also changes strip.cu and filter_chars to use the more efficient make_strings_children utility, which made these functions about 2x faster on average.
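
For readers unfamiliar with the pattern, here is a minimal CPU-only sketch of the size-then-fill approach that, as I understand it, make_strings_children follows: one pass computes each output string's size, a scan turns the sizes into offsets, and a second pass writes the characters into a single buffer. This is not the libcudf implementation. `StringsColumn`, `strip_bounds`, and `strip_all` are hypothetical names used only for illustration, and the sketch strips spaces only.

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical CPU stand-in for a strings column: one chars buffer plus offsets.
struct StringsColumn {
  std::vector<std::size_t> offsets;  // size() == number of strings + 1
  std::string chars;                 // all string bytes, back to back
};

// Returns [begin, end) of `s` with leading and trailing spaces removed.
inline std::pair<std::size_t, std::size_t> strip_bounds(const std::string& s) {
  std::size_t begin = 0, end = s.size();
  while (begin < end && s[begin] == ' ') ++begin;
  while (end > begin && s[end - 1] == ' ') --end;
  return {begin, end};
}

// Size-then-fill: pass 1 records each output size, a scan turns the sizes
// into offsets, and pass 2 writes every string directly at its offset.
inline StringsColumn strip_all(const std::vector<std::string>& input) {
  StringsColumn out;
  out.offsets.assign(input.size() + 1, 0);

  // Pass 1: compute sizes only; no character data is written yet.
  for (std::size_t i = 0; i < input.size(); ++i) {
    auto [b, e] = strip_bounds(input[i]);
    out.offsets[i + 1] = e - b;
  }

  // Scan: offsets[i] becomes the start of string i; the last entry is the total.
  std::partial_sum(out.offsets.begin(), out.offsets.end(), out.offsets.begin());

  // Pass 2: a single allocation, then each string is copied to its slot.
  out.chars.resize(out.offsets.back());
  for (std::size_t i = 0; i < input.size(); ++i) {
    auto [b, e] = strip_bounds(input[i]);
    input[i].copy(out.chars.data() + out.offsets[i], e - b, b);
  }
  return out;
}
```

The sketch requires C++17. The point of the pattern is that the output buffer is allocated once at its exact final size, rather than building each string separately and concatenating afterwards.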

@davidwendt added the 3 - Ready for Review, libcudf, Performance, strings, improvement, and non-breaking labels on Feb 24, 2021
@davidwendt self-assigned this on Feb 24, 2021
@davidwendt requested review from a team as code owners on February 24, 2021 18:11
@github-actions bot added the CMake label on Feb 24, 2021

@kkraus14 (Collaborator) left a comment

cmake lgtm

codecov bot commented Feb 24, 2021

Codecov Report

Merging #7438 (a44098e) into branch-0.19 (43b44e1) will increase coverage by 0.03%.
The diff coverage is 96.66%.

@@               Coverage Diff               @@
##           branch-0.19    #7438      +/-   ##
===============================================
+ Coverage        81.80%   81.84%   +0.03%     
===============================================
  Files              101      101              
  Lines            16695    16707      +12     
===============================================
+ Hits             13658    13674      +16     
+ Misses            3037     3033       -4     

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cudf/cudf/core/frame.py | 89.25% <ø> (ø) | |
| python/cudf/cudf/core/column_accessor.py | 95.31% <95.65%> (+2.37%) | ⬆️ |
| python/cudf/cudf/core/dataframe.py | 90.46% <100.00%> (+<0.01%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b0e5aef...a44098e.

@codereport (Contributor) left a comment

lgtm, other than a couple small things

cpp/benchmarks/string/filter_benchmark.cpp (outdated, resolved)
cpp/src/strings/strip.cu (outdated, resolved)

@codereport (Contributor) left a comment

👍

@kkraus14
Collaborator

@gpucibot merge

@rapids-bot merged commit 862559f into rapidsai:branch-0.19 on Feb 26, 2021
@davidwendt deleted the benchmark-strings-strip branch on March 2, 2021 16:05
Labels
3 - Ready for Review, CMake, improvement, libcudf, non-breaking, Performance, strings
5 participants