Consolidate Spark vendor shim dependency management [databricks] #9182
Conversation
Signed-off-by: Gera Shegalov <gera@apache.org>
  <version>23.10.0-SNAPSHOT</version>
  <relativePath>../../pom.xml</relativePath>
</parent>
<artifactId>rapids-4-spark-cdh-bom</artifactId>
What does "bom" stand for?
Usually it is bill of materials.
I can rename if needed. My original intention was to use a pure BOM pattern I found recommended somewhere on Stack Overflow. However, dependencyManagement does not support the exclusions used in the CDH context, and even if it did, we would still need some consolidation to avoid repeating <dependencies>. I found that the latter makes the former redundant. So the definition of a BOM in our project is not exactly what the Maven doc describes, but it is the same in spirit.
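To make that concrete, here is a minimal sketch of the aggregated-dependencies pattern, modeled on the rapids-4-spark-cdh-bom fragment in the diff above. The parent artifactId, the Spark coordinates, the version property, and the exclusion are illustrative assumptions, not the actual contents of this PR:

```xml
<!-- rapids-4-spark-cdh-bom/pom.xml: illustrative sketch, not the real file -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.nvidia</groupId>
    <artifactId>rapids-4-spark-parent</artifactId> <!-- assumed parent id -->
    <version>23.10.0-SNAPSHOT</version>
    <relativePath>../../pom.xml</relativePath>
  </parent>
  <artifactId>rapids-4-spark-cdh-bom</artifactId>
  <packaging>pom</packaging>

  <!-- Unlike a pure dependencyManagement BOM, this module declares concrete
       <dependencies>, including the exclusions needed in the CDH context.
       Modules that use this POM as their <parent> inherit the whole set
       instead of repeating it. -->
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>${spark.cdh.version}</version> <!-- assumed property name -->
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <!-- hypothetical exclusion of a vendor-provided transitive dep -->
          <groupId>org.apache.curator</groupId>
          <artifactId>curator-recipes</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</project>
```

The practical difference from a pure BOM is that these dependencies land on the consumer's classpath directly via parent inheritance, rather than being opt-in through an imported dependencyManagement section.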
It's fine to leave it as is.
I assume that nothing needs to change in how we package and publish the resulting poms/jars?
At least that is the goal.
I ran another round of verification, revealing a difference in the presence of Arrow classes in the before-change integration_tests assembly jar, which is attributable to the scope change. It looks to me that the provided scope is correct, but it looks like it was intentionally set to compile in https://github.com/NVIDIA/spark-rapids/pull/3335/files#diff-4b412ae5512cbf72928bdd8bfe6aa562a04f709fcbb7d97b04b6781e338ab987R213-R230
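For illustration, a hedged sketch of the kind of profile-gated scope override being discussed; the profile id, coordinates, and property name are assumptions, not the actual build files:

```xml
<!-- Hypothetical profile: forcing Arrow to compile scope pulls its classes
     into the integration_tests assembly jar, while the consolidated default
     of provided scope keeps them out, which would explain the jar diff. -->
<profile>
  <id>dbdeps</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.arrow</groupId>
      <artifactId>arrow-vector</artifactId>
      <version>${arrow.version}</version> <!-- assumed property name -->
      <scope>compile</scope> <!-- overrides the provided default -->
    </dependency>
  </dependencies>
</profile>
```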
I do not know; we have some Arrow test classes for DataSource V2 testing in the integration tests, so it may be related to that.
Thanks, merging as is; I will put in an override if it breaks tests.
- Replace dbdeps not enabled using profiles; this case was missed in #9182
- Remove outdated comments

Signed-off-by: Gera Shegalov <gera@apache.org>
Replaces numerous instances of duplicate dependency definitions for the Cloudera and Databricks shims with aggregated definitions.
Verification along these lines:

- buildall
- unjar all jars in separate dirs and diff

The diff is because of the previous special-case compile scope for Arrow in integration_tests, applied just in the Databricks profile. I think it may no longer be necessary. If a post-merge test breaks, I will fix it in a follow-up PR.
Signed-off-by: Gera Shegalov <gera@apache.org>