[BUG] Inconsistent CDH dependency overrides across submodules #9552

Closed
gerashegalov opened this issue Oct 26, 2023 · 0 comments · Fixed by #9508
Labels
bug Something isn't working build Related to CI / CD or cleanly building test Only impacts tests


gerashegalov commented Oct 26, 2023

| Submodule | CDH overrides |
|---|---|
| api_validation | 321cdh |
| datagen | 321cdh, 330cdh |
| integration_tests | 321cdh, 330cdh, 332cdh |
| shuffle-plugin | 321cdh |
| sql-plugin | 321cdh, 330cdh, 332cdh |
| sql-plugin-api | N/A |
| tests | 321cdh, 330cdh, 332cdh |
| udf-compiler | 321cdh, 330cdh, 332cdh |
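A table like the one above could be regenerated with a rough scan of each submodule's pom.xml. This is only a heuristic sketch: it assumes an override shows up in the pom as a literal token such as `321cdh`, which may not match how the profiles are actually declared. The function name `cdh_overrides` is hypothetical.

```shell
#!/usr/bin/env bash
# Print "<submodule><TAB><CDH versions or N/A>" for every immediate
# subdirectory of ROOT that contains a pom.xml.
# Assumption: an override is visible as a literal token like "321cdh".
cdh_overrides() {
  local root="${1:-.}" pom module versions
  for pom in "$root"/*/pom.xml; do
    [ -f "$pom" ] || continue
    module="$(basename "$(dirname "$pom")")"
    # Collect every distinct token of the form NNNcdh found in the pom.
    versions="$( { grep -oE '[0-9]{3}cdh' "$pom" || true; } | sort -u | tr '\n' ' ')"
    versions="${versions% }"
    printf '%s\t%s\n' "$module" "${versions:-N/A}"
  done
}
```

Running `cdh_overrides .` from the repository root would print one line per submodule, which makes gaps such as a missing 332cdh entry easier to spot.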

Proof of the difference:

Execute on branch-23.12:

mvn package dependency:list -Dbuildver=321cdh -DskipTests -DoutputFile=target/deps-321cdh.txt
mvn package dependency:list -Dbuildver=332cdh -DskipTests -DoutputFile=target/deps-332cdh.txt
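To repeat the same invocation for several build versions, the two commands above could be wrapped in a small loop. This is a sketch, not part of the original report; `list_deps` and the `MVN` override are hypothetical names, and the Maven arguments are copied from the commands above.

```shell
#!/usr/bin/env bash
# Generate one dependency list per build version passed as an argument.
# MVN can be overridden (for example MVN=echo) to preview the commands
# without actually running Maven.
MVN="${MVN:-mvn}"

list_deps() {
  local buildver
  for buildver in "$@"; do
    "$MVN" package dependency:list \
      -Dbuildver="$buildver" \
      -DskipTests \
      -DoutputFile="target/deps-$buildver.txt"
  done
}
```

For example, `list_deps 321cdh 332cdh` reproduces the two runs shown above.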

Then execute the same commands in a work tree for #9508.

Next, check the differences between the dependency lists. For 321cdh, the only difference is in sql-plugin-api, which is expected because the CDH overrides are completely missing there:

$ find . -name deps-321cdh.txt | xargs -n 1 bash -c 'git diff --numstat -w $1 ../spark-rapids.worktrees/shimDepsSwitcherParent/$1' _
16      14      {. => ../spark-rapids.worktrees/shimDepsSwitcherParent/.}/sql-plugin-api/target/deps-321cdh.txt

But for 332cdh, significant differences can also be seen for shuffle-plugin and datagen:

$ find . -name deps-332cdh.txt | xargs -n 1 bash -c 'git diff --numstat -w $1 ../spark-rapids.worktrees/shimDepsSwitcherParent/$1' _
15      14      {. => ../spark-rapids.worktrees/shimDepsSwitcherParent/.}/sql-plugin-api/target/deps-332cdh.txt
99      12      {. => ../spark-rapids.worktrees/shimDepsSwitcherParent/.}/shuffle-plugin/target/deps-332cdh.txt
86      4       {. => ../spark-rapids.worktrees/shimDepsSwitcherParent/.}/datagen/target/deps-332cdh.txt
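In the numstat output above, the first column is the number of added lines, the second is the number of removed lines, and the third is the file. The same comparison could be approximated without git using plain `diff`, as in the sketch below. `compare_deps` is a hypothetical helper, and its counts may differ slightly from `git diff --numstat` because the two tools can align changed lines differently.

```shell
#!/usr/bin/env bash
# For each deps-<buildver>.txt under BASE, compare it with the file at the
# same relative path under OTHER and print "<added> <removed> <path>",
# following the column order of `git diff --numstat`.
compare_deps() {
  local base="$1" other="$2" buildver="$3" rel added removed
  ( cd "$base" && find . -name "deps-$buildver.txt" | sort ) |
  while IFS= read -r rel; do
    [ -f "$other/$rel" ] || { printf 'missing\t%s\n' "$rel"; continue; }
    # diff -w ignores whitespace changes, like the -w passed to git diff.
    # Lines prefixed with ">" exist only in OTHER, "<" only in BASE.
    added="$(diff -w "$base/$rel" "$other/$rel" | grep -c '^>' || true)"
    removed="$(diff -w "$base/$rel" "$other/$rel" | grep -c '^<' || true)"
    # Skip files whose dependency lists are identical.
    [ "$added" -eq 0 ] && [ "$removed" -eq 0 ] && continue
    printf '%s\t%s\t%s\n' "$added" "$removed" "$rel"
  done
}
```

With the layout used in this report, the call would look like `compare_deps . ../spark-rapids.worktrees/shimDepsSwitcherParent 332cdh`.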

Compiling with incorrect dependencies may cause false bitwise-identity results and/or the wrong classes being pulled into the aggregator.

Originally posted by @gerashegalov in #9541 (comment)

@gerashegalov gerashegalov changed the title Inconsistent CDH dependency overrides across-submodules [BUG] Inconsistent CDH dependency overrides across-submodules Oct 26, 2023
@gerashegalov gerashegalov self-assigned this Oct 26, 2023
@gerashegalov gerashegalov added bug Something isn't working build Related to CI / CD or cleanly building test Only impacts tests labels Oct 26, 2023
@gerashegalov gerashegalov changed the title [BUG] Inconsistent CDH dependency overrides across-submodules [BUG] Inconsistent CDH dependency overrides across submodules Oct 26, 2023
gerashegalov added a commit that referenced this issue Oct 31, 2023
Factor out dependency switching profiles for different Spark builds into a single intermediate parent pom

Fixes #9552 

Signed-off-by: Gera Shegalov <gera@apache.org>