[BUG] very large shuffles can fail #45

Closed · revans2 opened this issue May 29, 2020 · 0 comments · Fixed by #8935
Assignees
Labels
bug: Something isn't working
P1: Nice to have for release
reliability: Features to improve reliability or bugs that severely impact the reliability of the plugin
SQL: part of the SQL/Dataframe plugin

Comments

revans2 (Collaborator) commented May 29, 2020

Describe the bug
Spark has a 2 GB (2^31 byte) limit on the size of a single shuffle element. In some cases we can exceed that limit, so we need to make sure that the largest batch we serialize during a shuffle is under 2 GB. We cannot do this in the serializer, because it is too late at that point; we need to do it in the shuffle executor.
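A minimal sketch of the idea (assumed names, not the plugin's actual code): before handing a batch to the shuffle, estimate its serialized size and, if it would exceed the 2 GB element limit, split it into row ranges small enough to serialize safely. The object name, the `planSplits` helper, and the assumption that serialized size grows roughly in proportion to row count are all illustrative.

```scala
// Illustrative sketch only: splitting a batch into row ranges so each piece
// serializes below Spark's 2 GB (2^31 - 1 byte) single-element limit.
// Names and the proportional-size assumption are hypothetical, not plugin APIs.
object ShuffleBatchSplitter {
  // Stay comfortably below Int.MaxValue to leave headroom for serializer overhead.
  private val TargetBytes: Long = Int.MaxValue.toLong - (64L * 1024 * 1024)

  /** Plan (startRow, rowCount) slices that should each serialize under TargetBytes. */
  def planSplits(numRows: Int, estimatedSerializedBytes: Long): Seq[(Int, Int)] = {
    if (estimatedSerializedBytes <= TargetBytes || numRows <= 1) {
      Seq((0, numRows))
    } else {
      // Assume serialized size is roughly proportional to row count.
      val pieces = math.min(
        numRows.toLong,
        (estimatedSerializedBytes + TargetBytes - 1) / TargetBytes).toInt
      val rowsPerPiece = (numRows + pieces - 1) / pieces
      (0 until numRows by rowsPerPiece).map { start =>
        (start, math.min(rowsPerPiece, numRows - start))
      }
    }
  }
}

// Example: a batch estimated at 5 GiB with 1,000,000 rows is planned into 3 slices.
// ShuffleBatchSplitter.planSplits(1000000, 5L * 1024 * 1024 * 1024)
```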

@revans2 revans2 added the bug (Something isn't working), ? - Needs Triage (Need team to review and classify), and SQL (part of the SQL/Dataframe plugin) labels May 29, 2020
@sameerz sameerz added the P1 (Nice to have for release) label and removed the ? - Needs Triage (Need team to review and classify) label Aug 18, 2020
wjxiz1992 pushed a commit to wjxiz1992/spark-rapids that referenced this issue Oct 29, 2020
* add initialization notebooks for databricks examples

* Remove spark.stop() from example notebook
@revans2 revans2 mentioned this issue Mar 4, 2021
@revans2 revans2 added the reliability (Features to improve reliability or bugs that severely impact the reliability of the plugin) label Apr 12, 2022
@revans2 revans2 self-assigned this Aug 2, 2023
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023
Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
liurenjie1024 pushed a commit to liurenjie1024/spark-rapids that referenced this issue Jul 8, 2024
* workable version without tests
* doc
* fix scala 2.13
* fix compile
* fix it
* enable it
* metric name
* minor
* change seed
* fix comments
* minor

Signed-off-by: Hongbin Ma (Mahone) <mahongbin@apache.org>
Co-authored-by: Hongbin Ma (Mahone) <mahongbin@apache.org>