[FEA] Convert two columns into a new column as a dictionary #8129

Closed
MikeChenfu opened this issue Apr 30, 2021 · 5 comments
Labels
feature request New feature or request Python Affects Python cuDF API.

Comments

@MikeChenfu

Is your feature request related to a problem? Please describe.
Hello, I see that cudf supports dict values when a DataFrame is instantiated, e.g. df = cudf.DataFrame({'a': [{'a':1},{'b':1}]}). Is there a method to convert two existing columns into a new column of dictionaries?

Describe the solution you'd like

df = cudf.DataFrame({'a': [1, 1, 1, 2, 2], 'b': [1, 1, 2, 2, 3], 'c': [1, 2, 3, 4, 5]})
df['d'] = {'a':df.a, 'b':df.b}

@MikeChenfu added the "Needs Triage" and "feature request" labels Apr 30, 2021
@harrism added the "Python" label May 4, 2021
@kkraus14 removed the "Needs Triage" label May 4, 2021
@kkraus14
Collaborator

kkraus14 commented May 4, 2021

@MikeChenfu we're looking into this, but just to clarify: we don't support map types yet, which would be the equivalent of dict and allow arbitrary key-value pairs. We support struct types, which require the same keys for every row of data.
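
To make the struct/map distinction above concrete, here is a minimal plain-Python sketch. It is not cuDF code, and `is_struct_like` is a hypothetical helper introduced only for illustration; it checks whether a list of row dictionaries shares one fixed set of keys, which is the property a struct type relies on.

```python
# Plain-Python illustration only; this does not use or describe cuDF internals.

def is_struct_like(rows):
    """Return True when every row has exactly the same set of keys."""
    if not rows:
        return True
    expected = set(rows[0])
    return all(set(row) == expected for row in rows)


# Every row has the keys "a" and "b": this fits a struct type.
struct_rows = [{"a": 1, "b": 1}, {"a": 1, "b": 2}]

# Keys vary from row to row: arbitrary key-value pairs are what a map type models.
map_rows = [{"a": 1}, {"b": 1}]

print(is_struct_like(struct_rows))  # True
print(is_struct_like(map_rows))     # False
```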

@MikeChenfu
Author

Thanks @kkraus14 for the information. I am currently implementing Hive's named_struct using cudf, so I am very interested in the struct type. Is cudf able to write it to an ORC file? I would appreciate any docs or examples.

@kkraus14
Collaborator

kkraus14 commented May 5, 2021

> Thanks @kkraus14 for the information. Currently I am implementing hive named_struct using cudf. I am very interested in the struct type. Is cudf able to write it as orc file? Appreciate it if some docs or examples are provided.

We don't support writing struct columns to ORC quite yet, but it's currently being worked on. Hopefully it will land in the next release or two. Parquet writing is supported, though.

@MikeChenfu
Author

Good to know. Parquet works for me too. :)

@beckernick
Member

beckernick commented Jul 26, 2021

The core request here has been implemented in #8728 (df[["a","b"]].to_struct()), and struct writing to ORC is covered by #7830. As a result, I'm going to close this issue to consolidate discussion in #7830.
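
For readers without a GPU environment, the row-wise result of combining two columns into a struct column can be sketched in plain Python, using the data from the original request. This is only an illustration of the shape of the output, not cuDF's implementation; `columns_to_struct_rows` is a hypothetical helper, and the columns are modelled as a dict of lists.

```python
# Plain-Python illustration of the row-wise result; with cuDF the call
# quoted in the comment above is df[["a", "b"]].to_struct().

def columns_to_struct_rows(columns, keys):
    """Zip the selected columns into one record per row, all with the same keys."""
    selected = [columns[key] for key in keys]
    return [dict(zip(keys, values)) for values in zip(*selected)]


columns = {
    "a": [1, 1, 1, 2, 2],
    "b": [1, 1, 2, 2, 3],
    "c": [1, 2, 3, 4, 5],
}

# The new "d" column: one {"a": ..., "b": ...} record per row.
d = columns_to_struct_rows(columns, ["a", "b"])

print(d[0])   # {'a': 1, 'b': 1}
print(d[-1])  # {'a': 2, 'b': 3}
```

Every record carries the same keys, which is why the result fits a struct type rather than requiring a map type.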
