Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to stop inheriting from HashJoin #1321

Merged
merged 2 commits into from
Dec 9, 2020

Conversation

andygrove
Copy link
Contributor

@andygrove andygrove commented Dec 8, 2020

Signed-off-by: Andy Grove andygrove@nvidia.com

Move GpuHashJoin out of the shims and do not extend from Spark's HashJoin.

@andygrove andygrove added this to the Dec 7 - Dec 18 milestone Dec 8, 2020
@andygrove andygrove self-assigned this Dec 8, 2020
@andygrove
Copy link
Contributor Author

build

@@ -1,5 +1,5 @@
/*
* Copyright (c) 2019-2020, NVIDIA CORPORATION.
* Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github is seeing this as a file move but I actually manually created this new file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git looks for diffs and tries to guess what was a move and what was not.

@revans2
Copy link
Collaborator

revans2 commented Dec 8, 2020

Overall it looks good and I love that we deleted 700 lines of uneeded code.

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove andygrove changed the title [WIP] Refactor to stop inheriting from HashJoin Refactor to stop inheriting from HashJoin Dec 8, 2020
@andygrove
Copy link
Contributor Author

build

revans2
revans2 previously approved these changes Dec 9, 2020
Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one nit. I am OK if we merge this as is.

@@ -27,8 +27,9 @@ import org.apache.spark.sql.catalyst.optimizer.{BuildLeft, BuildRight, BuildSide
import org.apache.spark.sql.catalyst.plans.JoinType
import org.apache.spark.sql.catalyst.plans.physical.{Distribution, HashClusteredDistribution}
import org.apache.spark.sql.execution.{BinaryExecNode, SparkPlan}
import org.apache.spark.sql.execution.joins.ShuffledHashJoinExec
import org.apache.spark.sql.execution.joins.{ShuffledHashJoinExec}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit don't need {} for a single import

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. There are also a couple unused imports after this refactor so I will push a commit to clean this up.

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove
Copy link
Contributor Author

build

@andygrove andygrove merged commit d0c120d into NVIDIA:branch-0.4 Dec 9, 2020
@andygrove andygrove deleted the no-extend-hash-join branch December 9, 2020 19:34
@sameerz sameerz added the feature request New feature or request label Dec 10, 2020
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Move GpuHashJoin out of shims and stop extending HashJoin

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* clean up imports

Signed-off-by: Andy Grove <andygrove@nvidia.com>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Move GpuHashJoin out of shims and stop extending HashJoin

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* clean up imports

Signed-off-by: Andy Grove <andygrove@nvidia.com>
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
compute-saniziter is a tool can detect some GPU memory relevant issues, this PR is to enable it in memory check mode for the JNI unit tests.

Some explanation for the parameters of the Compute Sanitizer.
--log-file should be used to avoid a corrupting output issue from the surefire plugin.
--error-exitcode is used to fail the build process if any error is caught by the Compute Sanitizer.
--launch-timeout is set to 10 minutes, and it should be enough since we monitor only the forked test processes.

---------

Signed-off-by: Liangcai Li <firestarmanllc@gmail.com>
Co-authored-by: Jason Lowe <jlowe@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants