
Use a bundled spark-rapids-jni dependency instead of external cudf dependency #5249

Merged (1 commit, Apr 18, 2022)

Conversation


@jlowe jlowe commented Apr 13, 2022

Fixes #3196.

This changes the RAPIDS Accelerator to be a self-contained jar, bundling the spark-rapids-jni and cudf code within the jar so that a separate RAPIDS cudf jar no longer needs to be deployed. This enables the project to use Spark-specific logic in custom kernels implemented in the spark-rapids-jni project, which is in turn built on top of the RAPIDS cudf project.

Note that because the RAPIDS Accelerator jar now contains native CUDA code, a CUDA version classifier has been added to the jar artifact.
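For illustration, with the classifier in place a user's Maven dependency on the plugin would look roughly like the following sketch. The coordinates are inferred from the versions in this PR and the validation message later in the thread; the exact artifact id depends on the Scala version, and `22.06.0` stands in for whatever release is current:

```xml
<!-- Illustrative only: coordinates inferred from this PR, not copied from docs -->
<dependency>
  <groupId>com.nvidia</groupId>
  <artifactId>rapids-4-spark_2.12</artifactId>
  <version>22.06.0</version>
  <classifier>cuda11</classifier>
</dependency>
```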

Documentation has been updated to reflect the new model for using the RAPIDS Accelerator.

I left the check for cudf versions in place. Arguably it's no longer needed now that cudf ships inside the bundled spark-rapids-jni jar, but it could prove useful if a user places an older cudf jar ahead of the RAPIDS Accelerator jar on the classpath.
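A guard of this kind typically compares the version the plugin expects against the one actually loaded from the classpath, and fails loudly on a mismatch. The sketch below is hypothetical (class and method names are illustrative, not the plugin's actual API); it only demonstrates the shape of the check described above:

```java
// Hypothetical sketch of a classpath version guard; names are illustrative.
public class VersionGuard {
    // Fail fast if the version found on the classpath is not the one we expect.
    static void checkVersion(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new IllegalStateException(
                "Found cudf version " + actual + ", expected " + expected
                + "; check for a stale cudf jar earlier on the classpath");
        }
    }

    public static void main(String[] args) {
        checkVersion("22.06.0", "22.06.0"); // matching versions pass silently
        try {
            // simulate a stale cudf jar ahead of the plugin on the classpath
            checkVersion("22.06.0", "22.04.0");
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```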

This also fixes the multiple copies of the jucx.so library that were included in the dist jar, one copy per supported Spark version.

…pendency

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
@jlowe jlowe added this to the Apr 4 - Apr 15 milestone Apr 13, 2022
@jlowe jlowe self-assigned this Apr 13, 2022

jlowe commented Apr 13, 2022

build

@jlowe jlowe added the build Related to CI / CD or cleanly building label Apr 13, 2022

@tgravescs tgravescs left a comment


overall looks good. I'm assuming we will have IT builds to update to stop fetching the cudf jar as followups.

@@ -792,7 +792,7 @@
     <spark.test.version>${spark.version}</spark.test.version>
     <spark.version.classifier>spark${buildver}</spark.version.classifier>
     <cuda.version>cuda11</cuda.version>
-    <cudf.version>22.06.0-SNAPSHOT</cudf.version>
+    <spark-rapids-jni.version>22.06.0-SNAPSHOT</spark-rapids-jni.version>

We need to update the deploy script to push the new jar with the classifier. I assume we have to push both the classifier and non-classifier versions. @pxLi perhaps has more insight.


@pxLi pxLi Apr 15, 2022


Sorry, I misunderstood the part above. Yes, we will need to add a cuda classifier to the plugin deployment (mostly for the release script). Also cc @NvTimLiu #5259


pxLi commented Apr 15, 2022

> overall looks good. I'm assuming we will have IT builds to update to stop fetching the cudf jar as followups.

I will take care of that for both the GitHub and internal repos after this one is merged. #5258


pxLi commented Apr 15, 2022

hi @jlowe

Do you think we need to link the empty jar to a cuda-classifier one (like cuda11) by default? Or should we just remove the empty jar after the build to avoid potential confusion for developers? Thanks!

We can use the jar with a classifier for both deployments to the artifacts repo, though.



jlowe commented Apr 15, 2022

Hmm, I'm not sure how the empty jar is being built. I don't see it when building locally with either mvn or buildall.

> Do you think if we need to link the empty jar to some cuda classifier one (like cuda11) by default? Or just remove the empty jar after build to avoid potential confusion for developers?

We should do the same thing we do for cudf, i.e. publish the cuda11-classifier build as the no-classifier build. IIRC we have to publish a no-classifier build per the repository policy; @GaryShen2008 may know the details there. If we don't have to publish a no-classifier version, that's preferable, but if we do, it should match the base version classifier (i.e. cuda11 for now).
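Publishing the same jar both ways could be sketched with the Maven deploy plugin's `deploy-file` goal, roughly as below. This is only an illustration of the approach discussed here, not the project's actual deploy script; the repository URL, id, and paths are placeholders:

```shell
# Illustrative sketch: publish the cuda11 build as both the main
# (no-classifier) artifact and under its cuda11 classifier.
# Repo URL/id and the jar path are placeholders, not real endpoints.
JAR=rapids-4-spark_2.12-22.06.0-cuda11.jar
mvn deploy:deploy-file \
  -Durl=https://repo.example.com/releases \
  -DrepositoryId=example-releases \
  -DgroupId=com.nvidia \
  -DartifactId=rapids-4-spark_2.12 \
  -Dversion=22.06.0 \
  -Dpackaging=jar \
  -Dfile="$JAR" \
  -Dfiles="$JAR" -Dclassifiers=cuda11 -Dtypes=jar
```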

@NvTimLiu

> IIRC I believe we have to have a no-classifier build to publish based on the repository policy

Correct, we have to publish a no-classifier build; otherwise, OSS package validation would fail with:

Event: Failed: Javadoc Validation

failureMessage: Missing: no main jar artifact found in folder '/com/nvidia/rapids-4-spark_2.12/22.06.0'

Labels
build Related to CI / CD or cleanly building
Development

Successfully merging this pull request may close these issues.

Custom native code support for plugin
5 participants