Use a bundled spark-rapids-jni dependency instead of external cudf dependency #5249
Conversation
…pendency Signed-off-by: Jason Lowe <jlowe@nvidia.com>
Overall looks good. I'm assuming we will have IT builds to update to stop fetching the cudf jar as follow-ups.
@@ -792,7 +792,7 @@
     <spark.test.version>${spark.version}</spark.test.version>
     <spark.version.classifier>spark${buildver}</spark.version.classifier>
     <cuda.version>cuda11</cuda.version>
-    <cudf.version>22.06.0-SNAPSHOT</cudf.version>
+    <spark-rapids-jni.version>22.06.0-SNAPSHOT</spark-rapids-jni.version>
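For context, a consumer depending on the new bundled artifact with a CUDA classifier might declare something like the following in its pom.xml. This is a hypothetical sketch based on the property names in the diff above; the groupId shown is an assumption, not a verified coordinate.

```xml
<!-- Hypothetical sketch: depending on the spark-rapids-jni artifact
     with a CUDA classifier. The groupId is an assumption. -->
<dependency>
  <groupId>com.nvidia</groupId>
  <artifactId>spark-rapids-jni</artifactId>
  <version>${spark-rapids-jni.version}</version>
  <classifier>${cuda.version}</classifier>
</dependency>
```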
We need to update the deploy script to push the new jar with the classifier. I assume we have to push both the classifier and non-classifier versions. @pxLi perhaps has more insight.
I will take care of that after this one gets merged, for both the GitHub and internal repos: #5258
Hi @jlowe, do you think we need to link the empty jar to a CUDA classifier one (like cuda11) by default? Or should we just remove the empty jar after the build to avoid potential confusion for developers? Thanks! We can use the jar with a classifier to deploy both to the artifacts repo, though.
Hmm, I'm not sure how the empty jar is being built. I don't see it when building locally with either mvn or buildall.
We should do the same thing we do for cudf, i.e., post the cuda11 classifier build as the no-classifier build. IIRC we have to have a no-classifier build to publish based on the repository policy; @GaryShen2008 may know the details there. If we don't have to publish a no-classifier version, that's preferable, but if we do, then we should have it match the base version classifier (i.e., cuda11 for now).
Correct, we have to have a no-classifier build to publish; otherwise, OSS package validation would fail.
Fixes #3196.
This changes the RAPIDS Accelerator to be a self-contained jar, bundling the spark-rapids-jni and cudf code within it, so a separate RAPIDS cudf jar no longer needs to be deployed. This enables the project to use Spark-specific logic in custom kernels implemented in the spark-rapids-jni project, which is in turn built on top of the RAPIDS cudf project.
Note that because the RAPIDS Accelerator jar now contains native CUDA code, a CUDA version classifier has been added to the jar artifact.
Documentation has been updated to reflect the new model for using the RAPIDS Accelerator.
I left the check for cudf versions in place. Arguably it's no longer needed because cudf is now bundled via the spark-rapids-jni jar, but it could prove useful if the user places an older cudf jar ahead of the RAPIDS Accelerator jar on the classpath.
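A version check like the one described above could, in principle, be a simple major.minor comparison between the version the plugin was built against and the version found on the classpath. The class, method, and version values below are hypothetical illustrations for this sketch, not the plugin's actual API.

```java
// Hypothetical sketch of a runtime cudf version compatibility check.
// Names and the expected version are illustrative assumptions.
public class CudfVersionCheck {
    // Version this build is assumed to expect (illustrative value).
    static final String EXPECTED_CUDF_VERSION = "22.06.0";

    // Returns true when the actual version matches the expected
    // major.minor release, ignoring patch and -SNAPSHOT suffixes.
    static boolean isCompatible(String actualVersion) {
        if (actualVersion == null) {
            return false;
        }
        String[] expected = EXPECTED_CUDF_VERSION.split("\\.");
        String[] actual = actualVersion.split("\\.");
        return actual.length >= 2
            && expected[0].equals(actual[0])
            && expected[1].equals(actual[1]);
    }

    public static void main(String[] args) {
        System.out.println(isCompatible("22.06.0-SNAPSHOT")); // true
        System.out.println(isCompatible("22.04.0"));          // false
    }
}
```

Comparing only major.minor mirrors the common practice of treating patch releases as ABI-compatible; a stale jar earlier on the classpath would fail this check and could be reported to the user.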
This also fixes the multiple copies of the jucx.so library that were included in the dist jar, one copy per supported Spark version.