Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frame copy to use __class__ instead of type() #9397

Merged
merged 1 commit into from
Oct 8, 2021

Conversation

madsbk
Copy link
Member

@madsbk madsbk commented Oct 7, 2021

This PR address the asymmetry in the following line in Frame.copy():

new_frame = self.__class__.__new__(type(self))

The problem is that self.__class__ might not equal type(self), which is the case for proxy objects like the one used in Dask-CUDA.
Related Dask-CUDA PR: rapidsai/dask-cuda#751

cc. @randerzander

@madsbk madsbk requested a review from a team as a code owner October 7, 2021 08:05
@github-actions github-actions bot added the Python Affects Python cuDF API. label Oct 7, 2021
@madsbk madsbk added 3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 7, 2021
@codecov
Copy link

codecov bot commented Oct 7, 2021

Codecov Report

Merging #9397 (1766997) into branch-21.12 (ab4bfaa) will decrease coverage by 0.04%.
The diff coverage is 0.00%.

❗ Current head 1766997 differs from pull request most recent head 678e54b. Consider uploading reports for the commit 678e54b to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.12    #9397      +/-   ##
================================================
- Coverage         10.79%   10.74%   -0.05%     
================================================
  Files               116      116              
  Lines             18869    19081     +212     
================================================
+ Hits               2036     2051      +15     
- Misses            16833    17030     +197     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/_lib/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/core/_base_index.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/categorical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/datetime.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/lists.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/numerical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/struct.py 0.00% <0.00%> (ø)
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c6bc111...678e54b. Read the comment docs.

@galipremsagar
Copy link
Contributor

rerun tests

@madsbk madsbk added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Oct 8, 2021
@galipremsagar
Copy link
Contributor

@gpucibot merge

@rapids-bot rapids-bot bot merged commit dda5210 into rapidsai:branch-21.12 Oct 8, 2021
rapids-bot bot pushed a commit to rapidsai/dask-cuda that referenced this pull request Oct 11, 2021
Adding testing of cudf dataframe copy, which fails prior to rapidsai/cudf#9397

#### UPDATE: This PR now handles the jit-unspill failures we have been seeing with cudf v21.12.

Some background, when we introduced `ProxyObject`, cudf was using `cudf._lib.table.Table` as the base class for all frame-like objects (dataframes, series, etc). `Table` was implemented in cython so our proxy object had to derive from `Table` in order to support cython functions. The result was to use the class `FrameProxyObject` as a workaround:
```python
    # In order to support the cuDF API implemented in Cython, we inherit from
    # `cudf._lib.table.Table`, which is the base class of Index, Series, and
    # Dataframes in cuDF.
    # Notice, the order of base classes matters. Since ProxyObject is the first
    # base class, ProxyObject.__init__() is called on creation, which doesn't
    # define the Table._data and Table._index attributes. Thus, accessing
    # FrameProxyObject._data and FrameProxyObject._index is pass-through to
    # ProxyObejct.__getattr__(), which is what we want.
    class FrameProxyObject(ProxyObject, cudf._lib.table.Table):
        pass
```
Now that cudf uses `Frame`, a pure Python class, we don't need this workaround anymore. In fact, using `FrameProxyObject` triggers some infinite recursion errors when used together with v21.12+

cc. @randerzander

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #751
@madsbk madsbk deleted the frame_copy_use_class branch April 5, 2022 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants