[FEA] Support struct types in ORC writer #7830

Closed
1 of 2 tasks
vuule opened this issue Apr 2, 2021 · 2 comments · Fixed by #9025
Labels: cuIO, feature request, libcudf, Python

Comments

@vuule
Contributor

vuule commented Apr 2, 2021

Struct support:

@vuule added the feature request, libcudf, Python, and cuIO labels Apr 2, 2021
@github-actions

github-actions bot commented May 2, 2021

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@beckernick
Member

beckernick commented Jul 19, 2021

As struct support in the ORC reader was merged in #8599 and not linked to this issue, I'm going to update this issue to explicitly be about struct support in the writer.

@beckernick changed the title from "[FEA] Support struct types in ORC reader and writer" to "[FEA] Support struct types in ORC writer" Jul 19, 2021
@vuule self-assigned this Jul 22, 2021
rapids-bot bot pushed a commit that referenced this issue Sep 22, 2021
Fixes #7830, #8443

Features:
- Use the new table metadata type that matches the table hierarchy, `table_input_metadata`.
- Support struct columns in the writer.

Changes:
- Null masks are encoded as aligned rowgroups to avoid invalid bits when the number of encoded rows is not divisible by 8 (except for the last rowgroup in each stripe). This also affects list columns. The issue is equivalent to #6763 (boolean columns only).
- Added pushdown masks that are used to determine which child elements should not be encoded, including null mask bits.
- Use pushdown masks for rowgroup alignment, null mask encoding and value encoding.
- Separated null mask encoding from value encoding; the null mask encoding can later be moved to a separate kernel call.
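
The pushdown masks and rowgroup alignment listed above can be illustrated with a small host-side sketch. This is not the libcudf implementation (which runs in CUDA kernels); every function name below is hypothetical, and the sketch assumes that validity bits are packed eight per byte, most significant bit first, and that alignment is achieved by moving each rowgroup boundary forward to the next multiple of 8 encoded rows.

```python
from typing import List, Optional, Sequence


def pushdown_mask(parent_valid: Sequence[bool],
                  parent_pushdown: Optional[Sequence[bool]] = None) -> List[bool]:
    """Rows of a child column that should be encoded (hypothetical helper).

    A child row is encoded only if its parent row is valid and the parent
    row was itself encoded (i.e. not masked out by an ancestor).
    """
    if parent_pushdown is None:
        return list(parent_valid)
    return [v and p for v, p in zip(parent_valid, parent_pushdown)]


def encode_present_bits(valid: Sequence[bool],
                        pushdown: Sequence[bool]) -> List[bool]:
    """Keep a column's validity bits only for rows selected by the pushdown mask."""
    return [v for v, keep in zip(valid, pushdown) if keep]


def pack_bits(bits: Sequence[bool]) -> bytes:
    """Pack bits into bytes, MSB first; the final byte is zero-padded.

    The padding is why a rowgroup whose bit count is not a multiple of 8
    would leave invalid bits in the middle of a stream.
    """
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for j, bit in enumerate(bits[i:i + 8]):
            if bit:
                byte |= 0x80 >> j
        out.append(byte)
    return bytes(out)


def align_rowgroups(encoded_rows: Sequence[int]) -> List[int]:
    """Move rowgroup boundaries so each rowgroup except the last in the
    stripe encodes a multiple of 8 rows (assumed strategy: round each
    cumulative boundary up to the next multiple of 8).
    """
    total = sum(encoded_rows)
    aligned = []
    end = 0
    prev_aligned_end = 0
    for i, n in enumerate(encoded_rows):
        end += n
        is_last = i == len(encoded_rows) - 1
        aligned_end = total if is_last else min(total, -(-end // 8) * 8)
        aligned.append(aligned_end - prev_aligned_end)
        prev_aligned_end = aligned_end
    return aligned
```

For a struct column with validity `[True, False, True, True]`, the child of the null struct row is skipped entirely, so only three of the child's validity bits are encoded. For a stripe whose rowgroups encode 10, 10, and 10 rows, `align_rowgroups` returns `[16, 8, 6]`: only the last rowgroup ends on a partial byte.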

Breaking because the table metadata type has changed.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Jason Lowe (https://github.com/jlowe)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Robert (Bobby) Evans (https://github.com/revans2)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Devavret Makkar (https://github.com/devavret)
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)

URL: #9025
3 participants