Disallow picking output columns from nested columns. #7248

Merged

Conversation

devavret
Contributor

Only top level columns can be selected by name

Fixes #7229
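
A minimal sketch of the rule described above, written in plain Python rather than against the actual reader code. `SchemaNode`, `select_top_level`, and the example schema are hypothetical names invented for illustration, and raising `KeyError` for a nested-only name is an assumption of this sketch, not necessarily what the reader does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaNode:
    """Hypothetical stand-in for one node of a nested file schema."""
    name: str
    children: List["SchemaNode"] = field(default_factory=list)


def select_top_level(root: SchemaNode, requested: List[str]) -> List[SchemaNode]:
    """Match requested names against the root's direct children only."""
    by_name = {child.name: child for child in root.children}
    missing = [name for name in requested if name not in by_name]
    if missing:
        # Assumption for this sketch: names that exist only below the top
        # level (or not at all) are rejected instead of silently matched.
        raise KeyError(f"not a top-level column: {missing}")
    return [by_name[name] for name in requested]


# Example schema: "name" is a struct with two nested fields.
schema = SchemaNode("root", [
    SchemaNode("id"),
    SchemaNode("name", [SchemaNode("first"), SchemaNode("last")]),
])
```

Under this sketch, `select_top_level(schema, ["name"])` returns the whole struct column, while `select_top_level(schema, ["first"])` is rejected because `first` only exists inside `name`.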

@devavret devavret requested a review from a team as a code owner January 28, 2021 22:10
@devavret devavret added the "bug" (Something isn't working) and "non-breaking" (Non-breaking change) labels Jan 28, 2021
@vuule
Contributor

vuule commented Jan 28, 2021

As we discussed offline, you can use the file included in the issue for testing.

@jlowe
Member

jlowe commented Jan 28, 2021

Only top level columns can be selected by name

To implement efficient queries on nested types, we'll need the ability to specify a non-top-level column to load (e.g., loading only one of the three fields in a top-level struct column). Will we be able to specify a fully qualified path to that column from the root?

@devavret
Contributor Author

Will we be able to specify a fully-qualified path to that column from the root?

That's exactly what we discussed offline: we should probably enable that after this fix, and the way to do it should be to specify not just the name but the path in the schema, e.g. "name.first".
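
The path-based selection proposed here could be sketched as follows. This is only an illustration of resolving a dotted path such as "name.first" from the schema root; it is not the reader's implementation, and `SchemaNode`, `resolve_column_path`, and the example schema are hypothetical. Using "." as the separator follows the example in the comment above; how field names that themselves contain dots would be handled is left open.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaNode:
    """Hypothetical stand-in for one node of a nested file schema."""
    name: str
    children: List["SchemaNode"] = field(default_factory=list)


def resolve_column_path(root: SchemaNode, path: str) -> SchemaNode:
    """Walk from the root one path component at a time, e.g. "name.first"."""
    node = root
    for part in path.split("."):
        for child in node.children:
            if child.name == part:
                node = child
                break
        else:
            # No child of the current node has this name, so the path is invalid.
            raise KeyError(f"no column {part!r} under {node.name!r} (path {path!r})")
    return node


# Example schema: "name" is a struct with two nested fields.
schema = SchemaNode("root", [
    SchemaNode("id"),
    SchemaNode("name", [SchemaNode("first"), SchemaNode("last")]),
])
```

With this sketch, `resolve_column_path(schema, "name.first")` reaches the nested field, while a bare `"first"` still fails because the walk always starts at the root.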

@codecov

codecov bot commented Jan 29, 2021

Codecov Report

Merging #7248 (12120c0) into branch-0.18 (8860baf) will increase coverage by 0.10%.
The diff coverage is n/a.


@@               Coverage Diff               @@
##           branch-0.18    #7248      +/-   ##
===============================================
+ Coverage        82.09%   82.19%   +0.10%     
===============================================
  Files               97       99       +2     
  Lines            16474    16841     +367     
===============================================
+ Hits             13524    13843     +319     
- Misses            2950     2998      +48     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/parquet.py 0.00% <ø> (ø)
python/cudf/cudf/_lib/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/_typing.py 92.30% <ø> (ø)
python/cudf/cudf/core/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/core/abc.py 87.23% <ø> (ø)
python/cudf/cudf/core/buffer.py 80.00% <ø> (+0.95%) ⬆️
python/cudf/cudf/core/column/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/core/column/categorical.py 92.73% <ø> (-0.62%) ⬇️
python/cudf/cudf/core/column/column.py 87.75% <ø> (-0.39%) ⬇️
... and 68 more


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9672e3d...29a7d95. Read the comment docs.

@devavret devavret requested a review from a team as a code owner January 29, 2021 17:05
@vuule vuule added the "4 - Needs cuDF (Python) Reviewer", "cuIO" (cuIO issue), and "libcudf" (Affects libcudf (C++/CUDA) code.) labels Jan 29, 2021
@vuule
Contributor

vuule commented Jan 29, 2021

rerun tests

@vuule
Contributor

vuule commented Jan 29, 2021

Perhaps the title should have 'disallow', rather than 'disallowing'.

@devavret devavret changed the title Disallowing picking output columns from nested columns. Disallow picking output columns from nested columns. Jan 29, 2021
@devavret
Contributor Author

devavret commented Feb 1, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 0ee8004 into rapidsai:branch-0.18 Feb 1, 2021
Labels
"4 - Needs Review" (Waiting for reviewer to review or respond), "bug" (Something isn't working), "cuIO" (cuIO issue), "libcudf" (Affects libcudf (C++/CUDA) code.), "non-breaking" (Non-breaking change)
Development

Successfully merging this pull request may close these issues.

[BUG] Parquet reader segfaults loading file with nested type as map key
7 participants