[FEA] Do not read the real data when readDataSchema is empty in Avro multi-threaded reading. #6219

Closed
firestarman opened this issue Aug 4, 2022 · 0 comments · Fixed by #6241
Labels
performance A performance related task/issue

Comments

Collaborator

firestarman commented Aug 4, 2022

This is an improvement.

Currently, the Avro multi-threaded reader in GpuAvroScan still reads the block data even when the read schema is empty (e.g. for a count operation). In that case, reading only the block metadata is enough, so reading the data can be skipped.

This change may also need to account for push-down filters.
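
To illustrate the idea, here is a minimal standalone sketch in Python, not the GpuAvroScan implementation. It assumes the standard Avro object container file layout: a header ending in a 16-byte sync marker, then blocks that each start with a row count and a data size. The function names (`read_long`, `skip_header`, `count_rows_from_block_meta`) are hypothetical. The sketch ignores push-down filters entirely; if a filter has to be evaluated per row, the block row counts alone would not be sufficient.

```python
from typing import BinaryIO

MAGIC = b"Obj\x01"   # Avro object container file magic
SYNC_SIZE = 16       # length of the sync marker in bytes


def read_long(stream: BinaryIO) -> int:
    """Decode one Avro long (zigzag-encoded, variable-length)."""
    shift = 0
    accum = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of stream while reading a long")
        byte = raw[0]
        accum |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def skip_header(stream: BinaryIO) -> bytes:
    """Skip the file header and return the file's sync marker."""
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("not an Avro object container file")
    # The file metadata is an Avro map<string, bytes>, written in chunks
    # and terminated by a chunk with a count of zero.
    while True:
        count = read_long(stream)
        if count == 0:
            break
        if count < 0:
            # A negative count is followed by the chunk size in bytes.
            count = -count
            read_long(stream)
        for _ in range(count):
            stream.seek(read_long(stream), 1)  # skip the key
            stream.seek(read_long(stream), 1)  # skip the value
    return stream.read(SYNC_SIZE)


def count_rows_from_block_meta(stream: BinaryIO) -> int:
    """Sum the per-block row counts without reading any block data."""
    sync = skip_header(stream)
    total = 0
    while True:
        if not stream.read(1):      # end of file: no more blocks
            break
        stream.seek(-1, 1)          # undo the one-byte peek
        num_rows = read_long(stream)
        data_size = read_long(stream)
        stream.seek(data_size, 1)   # skip the serialized rows
        if stream.read(SYNC_SIZE) != sync:
            raise ValueError("sync marker mismatch")
        total += num_rows
    return total
```

Calling `count_rows_from_block_meta(open(path, "rb"))` only touches the header and the few bytes of metadata at the start of each block, which is the saving the issue asks for when the read schema is empty.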

@firestarman added the "feature request" (New feature or request) and "? - Needs Triage" (Need team to review and classify) labels on Aug 4, 2022
@firestarman changed the title from "[FEA] Do not read the real data when readDataSchema is empty in multi-threaded reading." to "[FEA] Do not read the real data when readDataSchema is empty in Avro multi-threaded reading." on Aug 4, 2022
@sameerz added the "performance" (A performance related task/issue) label and removed the "feature request" and "? - Needs Triage" labels on Aug 9, 2022
2 participants