Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small clean up to simplify column selection code in ORC reader #9444

Merged
merged 4 commits into from
Oct 18, 2021

Conversation

vuule
Copy link
Contributor

@vuule vuule commented Oct 14, 2021

Remove has_nested_column from column selection and the impl class.
Deduce the existence of nested columns from the depth of the schema instead.

@vuule vuule added cuIO cuIO issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 14, 2021
@vuule vuule self-assigned this Oct 14, 2021
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Oct 14, 2021
@codecov
Copy link

codecov bot commented Oct 15, 2021

Codecov Report

Merging #9444 (732dab7) into branch-21.12 (ab4bfaa) will increase coverage by 0.03%.
The diff coverage is n/a.

❗ Current head 732dab7 differs from pull request most recent head 4d625c8. Consider uploading reports for the commit 4d625c8 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.12    #9444      +/-   ##
================================================
+ Coverage         10.79%   10.82%   +0.03%     
================================================
  Files               116      117       +1     
  Lines             18869    19043     +174     
================================================
+ Hits               2036     2061      +25     
- Misses            16833    16982     +149     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/_lib/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/io/csv.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/orc.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/series.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/reshape.py 0.00% <0.00%> (ø)
... and 31 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6caaf5...4d625c8. Read the comment docs.

@vuule vuule marked this pull request as ready for review October 15, 2021 19:06
@vuule vuule requested a review from a team as a code owner October 15, 2021 19:06
@vuule vuule requested review from vyasr and jrhemstad October 15, 2021 19:06
@vuule
Copy link
Contributor Author

vuule commented Oct 18, 2021

rerun tests

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

*/
void add_column(std::vector<std::vector<orc_column_meta>>& selection,
std::vector<SchemaType> const& types,
const size_t level,
const uint32_t id,
bool& has_timestamp_column,
bool& has_nested_column)
bool& has_timestamp_column)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you ever add more flags like this it might be a good place to use a set of bit flags instead of multiple bools. Not actionable now, just a thought.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to return pair/tuple of bools instead of having output argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping to remove the remaining flag at some point.

@vuule
Copy link
Contributor Author

vuule commented Oct 18, 2021

rerun tests

@vuule
Copy link
Contributor Author

vuule commented Oct 18, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 399d5b5 into rapidsai:branch-21.12 Oct 18, 2021
@vuule vuule deleted the orc-reader-remove-has-nested branch October 18, 2021 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuIO cuIO issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants