Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove batches if they are received after the iterator detects that t… #1180

Merged

Conversation

abellina
Copy link
Collaborator

Closes #1177

@abellina abellina added bug Something isn't working shuffle things that impact the shuffle plugin labels Nov 20, 2020
…he task has completed

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
@abellina abellina force-pushed the shuffle/limit_leak_short_term_solution branch from e7712f6 to 56cc008 Compare November 20, 2020 20:18
@sameerz sameerz added this to the Nov 23 - Dec 4 milestone Nov 24, 2020
Copy link
Collaborator Author

@abellina abellina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the suggested changes, but I am having some environment issues to reproduce this in the wild. Leaving this in Draft mode until I can double check everything still works.

@abellina abellina marked this pull request as ready for review November 25, 2020 23:11
@abellina
Copy link
Collaborator Author

@jlowe Note that I no longer believe this is a limit issue. I updated #1177. Sorry for the confusion. This change triggers in error conditions for a fact, but I am now doubting whether it is useful in other conditions.

@abellina
Copy link
Collaborator Author

Updated #1177 again. Was able to reproduce, main thing that tricked me was the coalesce batches. Not sure if that is expected, where you have a shuffle -> coalesce -> local limit X. Should coalesce only coalesce to X? Either way, it's definitely seen if coalesce is adjusted to produce small batches, such that local limit finds what it needs early.

@abellina
Copy link
Collaborator Author

build

@jlowe
Copy link
Member

jlowe commented Nov 30, 2020

build

@abellina abellina merged commit d3d3456 into NVIDIA:branch-0.3 Dec 1, 2020
@abellina abellina deleted the shuffle/limit_leak_short_term_solution branch December 1, 2020 14:18
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
NVIDIA#1180)

* Remove batches if they are received after the iterator detects that the task has completed

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
NVIDIA#1180)

* Remove batches if they are received after the iterator detects that the task has completed

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
…IDIA#1180)

Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working shuffle things that impact the shuffle plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] leaks possible in the rapids shuffle if batches are received after the task completes
3 participants