Standardize join internals around DataFrame #11184

Merged on Jul 5, 2022 (3 commits)


vyasr (Contributor) commented Jul 1, 2022

This PR un-deprecates the BaseIndex.join method, which was erroneously deprecated, and reimplements it by converting the index objects to DataFrames under the hood. The reimplementation simplifies much of the merge internals by providing a single main code path that is implemented only for DataFrames. The behavior of cudf.merge(Series, DataFrame) is likewise preserved via an explicit upcast. These changes pave the way for further simplification of the (currently extremely complex) merge internals, which will hopefully allow us eventually to extract a fast, simple merge function for internal use from the complexities of the public DataFrame.merge API. This change also removes the remaining vestigial accesses to Frame._index so that the attribute itself can be removed.
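To illustrate the design described above, here is a minimal pure-Python sketch of routing every join through one frame-only code path. It is not cuDF code: `ToyFrame`, `ToyIndex`, `ToySeries`, `merge_frames`, and `merge` are hypothetical names invented for this example, and the hash join shown is a simple stand-in, not the algorithm cuDF uses.

```python
from dataclasses import dataclass


@dataclass
class ToyFrame:
    """Hypothetical stand-in for a DataFrame: column name -> list of values."""
    columns: dict


def merge_frames(left, right, on, how="inner"):
    """The single merge code path, defined only for ToyFrame.

    Assumes the non-key column names of `left` and `right` do not overlap.
    """
    if how not in ("inner", "left"):
        raise ValueError(f"unsupported join type: {how!r}")
    # Map each key in the right frame to the row positions where it occurs.
    right_rows = {}
    for pos, key in enumerate(right.columns[on]):
        right_rows.setdefault(key, []).append(pos)
    left_names = list(left.columns)
    right_names = [name for name in right.columns if name != on]
    out = {name: [] for name in left_names + right_names}
    for lpos, key in enumerate(left.columns[on]):
        matches = right_rows.get(key, [])
        if not matches and how == "left":
            matches = [None]  # keep the unmatched left row, fill with None
        for rpos in matches:
            for name in left_names:
                out[name].append(left.columns[name][lpos])
            for name in right_names:
                value = None if rpos is None else right.columns[name][rpos]
                out[name].append(value)
    return ToyFrame(out)


@dataclass
class ToyIndex:
    """Hypothetical stand-in for an index: a single list of labels."""
    values: list

    def to_frame(self):
        return ToyFrame({"key": list(self.values)})

    def join(self, other, how="inner"):
        # Convert both indexes to frames, reuse the frame merge, convert back.
        merged = merge_frames(self.to_frame(), other.to_frame(), on="key", how=how)
        return ToyIndex(merged.columns["key"])


@dataclass
class ToySeries:
    """Hypothetical stand-in for a Series: one named column."""
    name: str
    values: list

    def to_frame(self):
        return ToyFrame({self.name: list(self.values)})


def merge(left, right, on, how="inner"):
    """Public entry point: upcast any ToySeries, then delegate."""
    if isinstance(left, ToySeries):
        left = left.to_frame()
    if isinstance(right, ToySeries):
        right = right.to_frame()
    return merge_frames(left, right, on=on, how=how)
```

In this sketch, `ToyIndex([1, 2, 3]).join(ToyIndex([2, 3, 4]))` and `merge(ToySeries(...), ToyFrame(...), on=...)` both end up in `merge_frames`, so join logic lives in exactly one place and the other types only need a `to_frame` conversion.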

@vyasr added the 3 - Ready for Review, code quality, Python, improvement, and non-breaking labels Jul 1, 2022
@vyasr added this to the CuDF Python Refactoring milestone Jul 1, 2022
@vyasr self-assigned this Jul 1, 2022
@vyasr requested a review from a team as a code owner July 1, 2022 00:02
vyasr (Contributor Author) commented Jul 1, 2022

rerun tests

vyasr (Contributor Author) commented Jul 4, 2022

rerun tests

codecov bot commented Jul 4, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.08@ff63c0a).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.08   #11184   +/-   ##
===============================================
  Coverage                ?   86.31%           
===============================================
  Files                   ?      144           
  Lines                   ?    22714           
  Branches                ?        0           
===============================================
  Hits                    ?    19606           
  Misses                  ?     3108           
  Partials                ?        0           

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ff63c0a...ed2e1ab.

vyasr (Contributor Author) commented Jul 5, 2022

@gpucibot merge

@rapids-bot merged commit 36ec9a7 into rapidsai:branch-22.08 Jul 5, 2022
@vyasr deleted the refactor/join_frame_usage branch July 13, 2022 22:05