[FEA] Add list len support #7157
albert17 added the Needs Triage and feature request labels on Jan 15, 2021
kkraus14 added the Python and libcudf labels and removed the Needs Triage label on Jan 15, 2021
I believe we'd need a new libcudf function of something like |
+1, we could definitely use this for Spark too.
rapids-bot bot pushed a commit that referenced this issue on Jan 25, 2021
This adds the libcudf part of #7157

```
std::unique_ptr<column> cudf::lists::count_elements(
  lists_column_view const& input,
  rmm::mr::device_memory_resource* mr);
```

Returns the size of each element in the input lists column. The PR also includes gtests for this new API.

Authors:
- David (@davidwendt)

Approvers:
- @nvdbaranec
- AJ Schmidt (@ajschmidt8)
- Karthikeyan (@karthikeyann)
- Mark Harris (@harrism)

URL: #7173
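To make "the size of each element" concrete, here is a minimal pure-Python sketch of the idea, not the libcudf implementation. It assumes the Arrow-style lists layout, where a lists column stores one flat child buffer plus an offsets array and row `i` spans `offsets[i]` to `offsets[i + 1]`. The `count_elements` function below and its `validity` parameter are hypothetical stand-ins for illustration only.

```python
from typing import List, Optional, Sequence


def count_elements(
    offsets: Sequence[int],
    validity: Optional[Sequence[bool]] = None,
) -> List[Optional[int]]:
    """Return the length of each list row given Arrow-style offsets.

    `validity[i]` is False for a null row; null rows yield None.
    This is an illustrative host-side sketch, not the GPU kernel.
    """
    num_rows = len(offsets) - 1
    lengths: List[Optional[int]] = []
    for i in range(num_rows):
        if validity is not None and not validity[i]:
            # A null row has no length; propagate the null.
            lengths.append(None)
        else:
            # Row i covers child elements offsets[i] .. offsets[i + 1] - 1.
            lengths.append(offsets[i + 1] - offsets[i])
    return lengths


# The column [[1, 2], None, [3]] has child values [1, 2, 3],
# offsets [0, 2, 2, 3], and validity [True, False, True].
print(count_elements([0, 2, 2, 3], [True, False, True]))  # [2, None, 1]
```

Under this layout no list contents need to be read: each length is a difference of two adjacent offsets.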
@shwina and I will pair on introducing the Python API.
kkraus14 removed the Spark and libcudf labels on Jan 29, 2021
rapids-bot bot pushed a commit that referenced this issue on Feb 4, 2021
Closes #7157

This PR adds `ListMethods.len()` API that returns an integer column that contains the length for each element in a `ListColumn`.

Example:

```python
>>> s = cudf.Series([[1,2], None, [3]])
>>> s
0    [1, 2]
1      None
2       [3]
dtype: list
>>> s.list.len()
0       2
1    <NA>
2       1
dtype: int32
```

Authors:
- Michael Wang (@isVoid)
- Ashwin Srinath (@shwina)

Approvers:
- Keith Kraus (@kkraus14)
- @brandon-b-miller

URL: #7283
Fixed by #7283
Is your feature request related to a problem? Please describe.
I need to get the length of each list in a column whose dtype is list. @kkraus14 told me this is not implemented yet.
It works for string:
ddf[col] = ddf[col].map_partitions(lambda x: x.str.len())
But not for list:
ddf[col] = ddf[col].map_partitions(lambda x: x.list.len(), meta=(col, ddf_dtypes[col].dtype))
I get the error:
Exception: AttributeError("'ListMethods' object has no attribute 'len'")
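For code that has to run on cuDF versions both with and without `list.len()`, one option is to feature-detect the method and fall back to a host-side computation. The sketch below is only an illustration under assumptions: `list_lengths` is a hypothetical helper name, and the fallback assumes the series exposes `to_arrow()` returning an object with `to_pylist()`, as `cudf.Series` and `pyarrow` arrays do.

```python
def list_lengths(series):
    """Per-row list lengths, preferring the native accessor when present.

    `list_lengths` is a hypothetical helper, not part of cuDF.
    """
    accessor = getattr(series, "list", None)
    if accessor is not None and hasattr(accessor, "len"):
        # Newer versions: the length is computed by the library itself.
        return accessor.len()
    # Older versions raise AttributeError on `.list.len()`, so copy the
    # data to the host and measure each row in Python (slow for big data).
    rows = series.to_arrow().to_pylist()
    return [None if row is None else len(row) for row in rows]
```

The fallback returns a plain Python list rather than a series, so it fits a quick check on a single `cudf.Series` better than a function passed to `map_partitions`.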