Deprecate index merging #10689

Merged

vyasr
Contributor

@vyasr vyasr commented Apr 19, 2022

This PR deprecates support for merging Index objects. pandas supports merging only DataFrames, so we should move in the same direction. The main internal implication of this change is that BaseIndex.union and BaseIndex.difference now convert their operands to DataFrames internally and then convert the result back to the appropriate index type. Because the intermediate objects are not modified and involve no additional memory allocations, this change adds only a small amount of Python overhead to index merging (10-50 µs). Once the deprecated code is fully removed, we should be able to recover that time by simplifying the join internals, which currently contain logic for handling Series and Index objects.
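For readers unfamiliar with the pattern, here is a minimal, library-free sketch of the round trip described above: index to frame, merge, back to index. It is not the cuDF implementation. `ToyFrame`, `ToyIndex`, and `_from_frame` are hypothetical names invented for illustration, the toy merge collapses duplicate keys (a real join would not), and the use of `FutureWarning` for the deprecation is an assumption.

```python
import warnings


class ToyFrame:
    """Hypothetical stand-in for a DataFrame holding a single key column."""

    def __init__(self, columns):
        self.columns = dict(columns)

    def merge(self, other, on, how="inner"):
        left, right = self.columns[on], other.columns[on]
        if how == "outer":
            # Every distinct key from either side, left-side order first.
            keys = list(dict.fromkeys(list(left) + list(right)))
        elif how == "inner":
            right_keys = set(right)
            # Distinct left keys that also appear on the right.
            keys = list(dict.fromkeys(k for k in left if k in right_keys))
        else:
            raise ValueError(f"unsupported how={how!r}")
        return ToyFrame({on: keys})


class ToyIndex:
    """Hypothetical index type; not the cuDF BaseIndex."""

    def __init__(self, values, name="idx"):
        self.values = list(values)
        self.name = name

    def to_frame(self):
        return ToyFrame({self.name: self.values})

    @classmethod
    def _from_frame(cls, frame, name):
        return cls(frame.columns[name], name=name)

    def merge(self, other, how="inner"):
        # Deprecated entry point: warn, then delegate to the frame merge.
        warnings.warn(
            "ToyIndex.merge is deprecated; convert to a frame and merge that.",
            FutureWarning,
            stacklevel=2,
        )
        rhs = ToyFrame({self.name: other.values})
        merged = self.to_frame().merge(rhs, on=self.name, how=how)
        return ToyIndex._from_frame(merged, self.name)

    def union(self, other):
        # Index -> frame -> outer merge -> index, with no warning emitted.
        rhs = ToyFrame({self.name: other.values})
        merged = self.to_frame().merge(rhs, on=self.name, how="outer")
        return ToyIndex._from_frame(merged, self.name)
```

In this sketch `union` never goes through the deprecated `merge`, so internal callers stay warning-free while user code calling `merge` directly is told to migrate.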

@vyasr added the following labels on Apr 19, 2022: 3 - Ready for Review (Ready for review by team), Python (Affects Python cuDF API.), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change)
@vyasr added this to the CuDF Python Refactoring milestone Apr 19, 2022
@vyasr self-assigned this Apr 19, 2022
@vyasr requested a review from a team as a code owner April 19, 2022 22:29
@codecov

codecov bot commented Apr 19, 2022

Codecov Report

Merging #10689 (2c83f3e) into branch-22.06 (94a5d41) will decrease coverage by 0.00%.
The diff coverage is 90.12%.

@@               Coverage Diff                @@
##           branch-22.06   #10689      +/-   ##
================================================
- Coverage         86.38%   86.38%   -0.01%     
================================================
  Files               142      142              
  Lines             22334    22341       +7     
================================================
+ Hits              19294    19300       +6     
- Misses             3040     3041       +1     

| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/_fuzz_testing/json.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/core/column/string.py | 89.22% <ø> (+0.12%) ⬆️ |
| python/cudf/cudf/core/index.py | 92.31% <ø> (ø) |
| python/cudf/cudf/utils/gpu_utils.py | 50.00% <0.00%> (-4.29%) ⬇️ |
| python/cudf/cudf/core/df_protocol.py | 88.48% <33.33%> (ø) |
| python/cudf/cudf/core/algorithms.py | 90.47% <50.00%> (ø) |
| python/cudf/cudf/core/subword_tokenizer.py | 75.00% <50.00%> (ø) |
| python/cudf/cudf/core/reshape.py | 89.82% <80.00%> (-0.27%) ⬇️ |
| python/cudf/cudf/testing/_utils.py | 93.85% <80.00%> (ø) |
| python/cudf/cudf/core/frame.py | 93.41% <89.47%> (-0.26%) ⬇️ |

... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 31a5f44...2c83f3e.

@vyasr
Contributor Author

vyasr commented Apr 26, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 75a675b into rapidsai:branch-22.06 Apr 26, 2022
@vyasr vyasr deleted the refactor/deprecate_index_merge branch June 27, 2022 23:45