Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit about bitwidth difference between cudf boolean and arrow boolean #9192

Merged
merged 4 commits into from
Sep 22, 2021

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Sep 8, 2021

Currently, we map boolean type to pa.int8 because the bitwidth of cudf boolean mismatches that in arrow. However the implication of this mapping is subtle and may cause unwanted result such as:

>>> cudf.StructDtype({
    "a": np.bool_,
    "b": np.int8,
})
StructDtype({'a': dtype('int8'), 'b': dtype('int8')})

This PR changes the mapping back to pa.bool_, and use explicit type handling when we are dealing with type conversion to arrow.

@isVoid isVoid requested a review from a team as a code owner September 8, 2021 03:00
@isVoid isVoid self-assigned this Sep 8, 2021
@isVoid isVoid added the 3 - Ready for Review Ready for review by team label Sep 8, 2021
@github-actions github-actions bot added the Python Affects Python cuDF API. label Sep 8, 2021
@isVoid isVoid added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function tech debt labels Sep 8, 2021
@isVoid
Copy link
Contributor Author

isVoid commented Sep 8, 2021

rerun tests

@codecov
Copy link

codecov bot commented Sep 9, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.10@be858a7). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head b071b00 differs from pull request most recent head 6146ae5. Consider uploading reports for the commit 6146ae5 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.10    #9192   +/-   ##
===============================================
  Coverage                ?   10.83%           
===============================================
  Files                   ?      116           
  Lines                   ?    18779           
  Branches                ?        0           
===============================================
  Hits                    ?     2035           
  Misses                  ?    16744           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update be858a7...6146ae5. Read the comment docs.

Copy link
Contributor

@thomcom thomcom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you create a test that publicly exposes this change? This kind of bit managing is easy to overlook later and a test would reveal if it is missed.

@isVoid isVoid requested a review from thomcom September 17, 2021 17:27
@isVoid isVoid added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Sep 17, 2021
@isVoid
Copy link
Contributor Author

isVoid commented Sep 17, 2021

rerun tests

@galipremsagar
Copy link
Contributor

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 20713df into rapidsai:branch-21.10 Sep 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants