[BUG] In ParqueCachedBatchSerializer, serializing parquet buffers might blow up in certain cases #685

Closed
razajafri opened this issue Sep 8, 2020 · 1 comment
Labels
bug Something isn't working P1 Nice to have for release

Comments

@razajafri
Collaborator

Is your feature request related to a problem? Please describe.
While serializing the Parquet buffers, we concatenate them into a single array. This will blow up when the cumulative buffer length exceeds Int.MaxValue, because a JVM array cannot hold more elements than that.

Describe the solution you'd like
We could write the ParquetBufferConsumer to return an array of byte arrays instead of a single byte array.
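
A rough sketch of the idea, not the actual plugin code: `ChunkedBufferConsumer`, `consume`, `toChunks`, `toSingleArray`, and `fitsInSingleArray` are hypothetical names made up for illustration, and the real ParquetBufferConsumer may look quite different. The point is only that the total length is tracked as a `Long` and the buffers are handed back as separate chunks, so nothing ever needs a single array longer than Int.MaxValue.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the consumer discussed in this issue.
// It is NOT the real spark-rapids class; names and shape are assumptions.
final class ChunkedBufferConsumer {
  private val chunks = ArrayBuffer.empty[Array[Byte]]
  // Tracked as Long so the running total cannot overflow like an Int would.
  private var total: Long = 0L

  // Record one buffer without copying it into a larger array.
  def consume(buf: Array[Byte]): Unit = {
    chunks += buf
    total += buf.length
  }

  def totalLength: Long = total

  // Proposed shape: return the buffers as an array of byte arrays.
  def toChunks: Array[Array[Byte]] = chunks.toArray

  // Concatenating approach, shown for contrast. It can only work while
  // the cumulative length still fits in an Int.
  def toSingleArray: Array[Byte] = {
    require(ChunkedBufferConsumer.fitsInSingleArray(chunks.map(_.length).toSeq),
      s"cumulative length $total exceeds Int.MaxValue")
    val out = new Array[Byte](total.toInt)
    var offset = 0
    chunks.foreach { c =>
      System.arraycopy(c, 0, out, offset, c.length)
      offset += c.length
    }
    out
  }
}

object ChunkedBufferConsumer {
  // Sum the lengths as Long; summing as Int would silently wrap around.
  def fitsInSingleArray(lengths: Seq[Int]): Boolean =
    lengths.map(_.toLong).sum <= Int.MaxValue
}
```

With this shape the caller would write each chunk out in turn rather than one large array, and `toSingleArray` would only remain as a fallback for small batches.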

@razajafri razajafri added feature request New feature or request ? - Needs Triage Need team to review and classify labels Sep 8, 2020
@sameerz sameerz removed the ? - Needs Triage Need team to review and classify label Sep 8, 2020
@sameerz sameerz added bug Something isn't working and removed feature request New feature or request labels Sep 8, 2020
@sameerz sameerz changed the title [FEA] In ParqueCachedBatchSerializer, serializing parquet buffers might blow up in certain cases [BUG] In ParqueCachedBatchSerializer, serializing parquet buffers might blow up in certain cases Sep 8, 2020
@sameerz sameerz added the P1 Nice to have for release label Sep 8, 2020
@razajafri razajafri added this to the Nov 23 - Dec 4 milestone Nov 23, 2020
@razajafri
Collaborator Author

This is being worked on as a part of #1265

@razajafri razajafri mentioned this issue Dec 4, 2020
12 tasks
@sameerz sameerz removed this from the Nov 23 - Dec 4 milestone Dec 5, 2020
@sameerz sameerz modified the milestones: Nov 23 - Dec 4, Dec 7 - Dec 18 Dec 5, 2020
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023
…IDIA#685)

Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>