Fix ORC reader for empty DataFrame/Table #7624

Merged 4 commits into rapidsai:branch-0.19 on Mar 22, 2021

rgsl888prabhu commented Mar 17, 2021

By default, `ff.types` has a root type of kind struct, under which all other columns originate. That root entry is not a column, so we need to skip index 0 and start enumerating columns at index 1.
(See "Type Information" in the ORC Specification.)
We also need to handle the case where the user requests a specific column name that does not exist because the DataFrame/table is empty.

Added test cases to validate both scenarios.
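
For illustration only, here is a minimal pure-Python sketch of the type layout described above. It does not use the libcudf or cuDF code touched by this PR: `OrcType`, `top_level_columns`, and `select_columns` are hypothetical names, and silently ignoring a requested name that is not present is an assumption of this sketch, not necessarily what the fix does.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class OrcType:
    """Hypothetical stand-in for one entry of the ORC footer's type list."""
    kind: str
    subtypes: List[int] = field(default_factory=list)     # indices into the type list
    field_names: List[str] = field(default_factory=list)  # names of the children


def top_level_columns(types: List[OrcType]) -> List[Tuple[int, str]]:
    """Return (type_index, name) for each child of the root struct.

    types[0] is the root struct and is not a column itself, so the
    columns are its children, which start at index 1.
    """
    if not types:
        return []
    root = types[0]
    return list(zip(root.subtypes, root.field_names))


def select_columns(
    types: List[OrcType], requested: Optional[List[str]] = None
) -> List[Tuple[int, str]]:
    """Pick the requested columns; names that do not exist are skipped.

    Skipping missing names is an assumption made for this sketch.
    """
    available = top_level_columns(types)
    if requested is None:
        return available
    by_name = {name: idx for idx, name in available}
    return [(by_name[name], name) for name in requested if name in by_name]


# A file with two columns: root struct at index 0, columns at 1 and 2.
two_cols = [OrcType("STRUCT", [1, 2], ["a", "b"]), OrcType("INT"), OrcType("STRING")]

# An empty DataFrame: only the root struct, with no children.
empty = [OrcType("STRUCT")]
```

With these definitions, `select_columns(empty, ["a"])` yields an empty selection instead of indexing past the root entry, which is the kind of situation the second scenario is about.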

closes #7356

rgsl888prabhu added the bug, 3 - Ready for Review, 4 - Needs cuDF (Python) Reviewer, and non-breaking labels on Mar 17, 2021
rgsl888prabhu self-assigned this on Mar 17, 2021
rgsl888prabhu requested review from a team as code owners on March 17, 2021 11:09
github-actions bot added the Python and libcudf labels on Mar 17, 2021
rgsl888prabhu requested review from vuule, nvdbaranec and devavret, and removed the request for trxcllnt and jrhemstad, on March 17, 2021 11:10

vuule left a comment

🔥
Just a minor suggestion and a question.

cpp/src/io/orc/orc.cpp (resolved)
python/cudf/cudf/tests/test_orc.py (resolved)

codecov bot commented Mar 18, 2021

Codecov Report

Merging #7624 (9377ec1) into branch-0.19 (7871e7a) will increase coverage by 0.62%.
The diff coverage is n/a.

❗ Current head 9377ec1 differs from pull request most recent head 28e8eb2. Consider uploading reports for the commit 28e8eb2 to get more accurate results

@@               Coverage Diff               @@
##           branch-0.19    #7624      +/-   ##
===============================================
+ Coverage        81.86%   82.49%   +0.62%     
===============================================
  Files              101      101              
  Lines            16884    17400     +516     
===============================================
+ Hits             13822    14354     +532     
+ Misses            3062     3046      -16     

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cudf/cudf/core/column/categorical.py | 91.97% <ø> (+0.58%) | ⬆️ |
| python/cudf/cudf/core/column/column.py | 87.86% <ø> (+0.10%) | ⬆️ |
| python/cudf/cudf/core/column/datetime.py | 89.63% <ø> (+0.54%) | ⬆️ |
| python/cudf/cudf/core/column/decimal.py | 92.75% <ø> (-2.12%) | ⬇️ |
| python/cudf/cudf/core/column/lists.py | 92.17% <ø> (+0.77%) | ⬆️ |
| python/cudf/cudf/core/column/numerical.py | 94.83% <ø> (-0.20%) | ⬇️ |
| python/cudf/cudf/core/column/string.py | 86.79% <ø> (+0.30%) | ⬆️ |
| python/cudf/cudf/core/column/timedelta.py | 88.57% <ø> (+0.33%) | ⬆️ |
| python/cudf/cudf/core/column_accessor.py | 95.45% <ø> (+0.14%) | ⬆️ |
| python/cudf/cudf/core/dataframe.py | 90.90% <ø> (+0.44%) | ⬆️ |

... and 61 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d7767e...28e8eb2. Read the comment docs.


devavret left a comment

One concern, but I approve otherwise.

python/cudf/cudf/tests/test_orc.py (resolved)

vuule commented Mar 19, 2021

@gpucibot merge


isVoid left a comment

Python looks ✔️

rapids-bot bot merged commit 8632ca0 into rapidsai:branch-0.19 on Mar 22, 2021
vyasr added the 4 - Needs Review label and removed the 4 - Needs cuDF (Python) Reviewer label on Feb 23, 2024
Labels
3 - Ready for Review, 4 - Needs Review, bug, libcudf, non-breaking, Python
Development

Successfully merging this pull request may close these issues.

[BUG] RuntimeError when there is an empty dataframe written to orc file
6 participants