Apply BroadcastMode key projections before interpreting key expressions in subqueries [databricks] #6268

Merged (1 commit) on Aug 10, 2022

@jlowe jlowe commented Aug 9, 2022

Fixes #6232. When the GPU broadcasts for a hash join, it does not broadcast a hash table as the CPU does; it broadcasts the original build table instead. This is normally fine, but it can cause problems in DPP (dynamic partition pruning)/subquery usage, where planning assumes that the HashedRelation key expressions have already been applied to the broadcast data. On the GPU they have not been applied, so those expressions can be critical to interpreting the broadcast data correctly. #6232 shows a query where the HashedRelation keys rearrange the original data, and failing to evaluate those expressions results in the wrong column being used in the subquery exec.

To fix this, we collect any key expressions found on the BroadcastMode and evaluate them in the GpuSubqueryExec before applying the normal BoundReference projection that extracts the desired key.
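As a rough illustration of the ordering problem (a toy model in plain Python, not the plugin's actual Scala code), the sketch below treats the broadcast build table as a list of tuples and the BroadcastMode key expressions as functions over a row. The names `project_keys`, `subquery_values`, and `apply_mode_keys` are hypothetical and exist only for this example. It assumes the subquery's key index is meant to address the projected key row rather than the raw build row.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[object, ...]
# Hypothetical stand-in for a key expression bound to the build table's columns.
KeyExpr = Callable[[Row], object]


def project_keys(rows: Sequence[Row], key_exprs: Sequence[KeyExpr]) -> List[Row]:
    """Evaluate every key expression against every row, producing key rows."""
    return [tuple(expr(row) for expr in key_exprs) for row in rows]


def subquery_values(
    rows: Sequence[Row],
    key_exprs: Sequence[KeyExpr],
    index: int,
    apply_mode_keys: bool,
) -> List[object]:
    """Extract the key at `index`, as a BoundReference-style lookup would.

    With apply_mode_keys=False the index is read against the raw build
    table (the behavior described as broken). With apply_mode_keys=True
    the mode's key expressions are evaluated first (the described fix).
    """
    keyed = project_keys(rows, key_exprs) if apply_mode_keys else list(rows)
    return [row[index] for row in keyed]


# Build table columns are (id, name); the key expressions reorder them to (name, id).
build_table = [(1, "a"), (2, "b")]
mode_keys = [lambda r: r[1], lambda r: r[0]]

# Planning expects index 0 to refer to the first *key*, i.e. `name`.
wrong = subquery_values(build_table, mode_keys, 0, apply_mode_keys=False)
right = subquery_values(build_table, mode_keys, 0, apply_mode_keys=True)
```

Here `wrong` reads the `id` column because the index is applied to the unprojected table, while `right` reads `name` because the key expressions run first. In this toy model, when the key expressions do not reorder or transform columns both paths agree, which is consistent with the description that broadcasting the raw build table is normally fine.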

…ns in subqueries

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
@jlowe jlowe added this to the Aug 8 - Aug 19 milestone Aug 9, 2022
@jlowe jlowe self-assigned this Aug 9, 2022
@jlowe jlowe changed the title from "Apply BroadcastMode key projections before interpreting key expressions in subqueries" to "Apply BroadcastMode key projections before interpreting key expressions in subqueries [databricks]" Aug 9, 2022
jlowe commented Aug 9, 2022

build

@jlowe jlowe merged commit 4a84c94 into NVIDIA:branch-22.08 Aug 10, 2022
@jlowe jlowe deleted the fix-dpp-hashed-relation-project branch August 10, 2022 13:29
@sameerz sameerz added the bug Something isn't working label Aug 10, 2022
Successfully merging this pull request may close these issues.

[BUG] Query failed with java.lang.NullPointerException when doing GpuSubqueryBroadcastExec