-
Notifications
You must be signed in to change notification settings - Fork 891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix logical type issues in the Parquet writer #14322
Fix logical type issues in the Parquet writer #14322
Conversation
CC @galipremsagar @rjzamora for impact on the Python side |
…bug-write_parquet-adjusted-to-utc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Built a new libcudf with this change and verified the failing Spark Parquet predicate pushdown tests for timestamps are passing again.
…bug-write_parquet-adjusted-to-utc
…m/vuule/cudf into bug-write_parquet-adjusted-to-utc
Thank you @etseidl for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vuule Can address the todo here(https://github.com/rapidsai/cudf/pull/14327/files#diff-ba3ea8b1c9c17ff8301a86d6a844b075f4f49feaa782f4a8f190e631123b007eR484) since this PR is fixing #14326
…bug-write_parquet-adjusted-to-utc
…m/vuule/cudf into bug-write_parquet-adjusted-to-utc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/merge |
Description
Closes #14315
Closes #14326
Parquet writer writes time and timestamp types with logical type with
isAdjustedToUTC
asfalse
. However, timestamps in libcudf tables are implicitly in UTC and don't need to be adjusted.This PR changes the
isAdjustedToUTC
to true.Also added a writer option to write timestamps as local, as this is the expected behavior on the Python side.
Also changed the way logical type is handled for UNKNOWN type columns in
merge_row_group_metadata
- the logical type is excluded from merged metadata because of issues with type inference.Checklist