[REVIEW] Fix contains check in string column #8834

Merged
merged 2 commits into rapidsai:branch-21.08
Jul 23, 2021

galipremsagar
Contributor

Fixes: #8832

This PR fixes the contains check in StringColumn. We were building a regex with f"^{item}$" and calling contains_re to test for an exact match of item in the StringColumn. That approach breaks when item itself contains regex special characters, so these checks are replaced with libcudf.search.contains, which performs an exact check for item in the StringColumn.
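To illustrate the failure mode, here is a minimal sketch using only Python's standard `re` module and a plain list standing in for the column. It is not cuDF code: `contains_via_regex` and `contains_exact` are hypothetical helpers that only mimic the old regex-based check and the new exact-match check.

```python
import re


def contains_via_regex(values, item):
    """Mimic the old check: interpolate the raw item into an anchored regex."""
    pattern = re.compile(f"^{item}$")  # special characters in item are NOT escaped
    return any(pattern.search(v) is not None for v in values)


def contains_exact(values, item):
    """Mimic the new check: plain equality, no regex involved."""
    return any(v == item for v in values)


values = ["a$b", "x(y)", "plain"]

# Works for items without special characters.
print(contains_via_regex(values, "plain"))  # True

# "$" is read as an end-of-string anchor, so the literal value is missed.
print(contains_via_regex(values, "a$b"))    # False
print(contains_exact(values, "a$b"))        # True

# "(y)" is read as a group, so the pattern matches "xy" instead of "x(y)".
print(contains_via_regex(["xy"], "x(y)"))   # True (false positive)
print(contains_exact(["xy"], "x(y)"))       # False

# An unbalanced "(" is not even a valid pattern.
try:
    contains_via_regex(values, "(")
except re.error as exc:
    print("regex compilation failed:", exc)
```

Escaping the item (for example with `re.escape`) would be another way to avoid these cases in plain Python; the PR instead drops the regex and uses an exact search.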

@galipremsagar galipremsagar added bug Something isn't working 3 - Ready for Review Ready for review by team Python Affects Python cuDF API. 4 - Needs cuDF (Python) Reviewer strings strings issues (C++ and Python) non-breaking Non-breaking change labels Jul 22, 2021
@galipremsagar galipremsagar self-assigned this Jul 22, 2021
@galipremsagar galipremsagar requested a review from a team as a code owner July 22, 2021 21:53
@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team 4 - Needs cuDF (Python) Reviewer labels Jul 22, 2021
@galipremsagar
Contributor Author

Thanks @rgsl888prabhu & @charlesbluca for a quick review!

@galipremsagar
Contributor Author

@gpucibot merge

@quasiben
Member

rerun tests

@codecov

codecov bot commented Jul 23, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@eddb2f8).
The diff coverage is n/a.

❗ Current head 5c34e2b differs from pull request most recent head 065e171. Consider uploading reports for the commit 065e171 to get more accurate results

@@               Coverage Diff               @@
##             branch-21.08    #8834   +/-   ##
===============================================
  Coverage                ?   10.58%           
===============================================
  Files                   ?      116           
  Lines                   ?    18650           
  Branches                ?        0           
===============================================
  Hits                    ?     1974           
  Misses                  ?    16676           
  Partials                ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update eddb2f8...065e171.

@rapids-bot rapids-bot bot merged commit fc95992 into rapidsai:branch-21.08 Jul 23, 2021
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API. strings strings issues (C++ and Python)
Development

Successfully merging this pull request may close these issues.

[BUG]cudf.get_dummies fails if symbols ( $,( ) are present in data
4 participants