Build against specified spark-rapids-jni snapshot jar [skip ci] #5501
Conversation
Allowing the spark-rapids-jni version to be manually anchored is nice, but this does not solve the issue described in #5294. Even after this change, each Spark version will unpack and attempt to dedup the spark-rapids-jni jar and require manual sync of the unshim text files with the contents of that jar (and any other non-Spark-specific jar, like jucx).
Also, shouldn't the buildall script be updated with similar changes? The nightly build should not be the only build that has the fix.
> Still dedup the spark-rapids-jni jar in aggregator spark-shims

@jlowe Sorry, I may not have understood this issue clearly before. The aggregator depends on sql_plugin, and I assumed sql_plugin compiles spark-rapids-jni into its jar. I will update the buildall script, and follow a similar approach for the other non-Spark-specific jars, like jucx.
sql_plugin needs spark-rapids-jni to compile, but it does not pull spark-rapids-jni into the sql_plugin jar artifact. The spark-rapids-jni jar gets pulled into the dist jar by the custom dist jar build process (see dist/pom.xml and the supporting build-parallel-worlds.xml and binary-dedupe.sh for details).

The problem is with the way non-Spark-specific dependencies are handled. They're essentially assumed to be Spark-specific dependencies, which requires them to go through the parallel-world, binary-dedupe, and unshim processing. Ideally we should only do this for dependencies that are Spark-specific (i.e.: the aggregator jars but not all their dependencies). One potential solution is to manually exclude the dependencies we know are not Spark-specific (like spark-rapids-jni, jucx, possibly others) and treat them just like a normal dependency that is included into the dist artifact.
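A minimal sketch of that suggestion, assuming hypothetical coordinates and version/classifier properties (this is not the actual dist/pom.xml): the known non-Spark-specific artifacts would be declared as ordinary dependencies of the dist module, so they bypass the parallel-world, binary-dedupe, and unshim processing entirely.

```xml
<!-- Hypothetical sketch, not the real dist/pom.xml: declare the known
     non-Spark-specific artifacts as plain dependencies of dist instead
     of routing them through parallel-world/binary-dedupe/unshim.
     The version/classifier properties here are assumed, not actual. -->
<dependency>
  <groupId>com.nvidia</groupId>
  <artifactId>spark-rapids-jni</artifactId>
  <version>${spark-rapids-jni.version}</version>
  <classifier>${jni.classifier}</classifier>
</dependency>
<dependency>
  <groupId>org.openucx</groupId>
  <artifactId>jucx</artifactId>
  <version>${jucx.version}</version>
</dependency>
```

The aggregator jars themselves would still go through the dedupe pipeline; only the excluded artifacts would be packaged as-is.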
To fix issue NVIDIA#5294: build RAPIDS against the specified spark-rapids-jni snapshot jar, to avoid pulling different versions of the spark-rapids-jni dependency jars during compilation. Signed-off-by: Tim Liu <timl@nvidia.com>
Force-pushed from c9f660e to 9cc0976
@jlowe Thanks a lot for your explanation. Yes, the duplicated class files of non-Spark-specific dependencies are mainly related to spark-rapids-jni and jucx. We cannot shade them into the Spark-specific aggregator jars, to avoid running binary-dedupe multiple times. Since we build sql_plugin multiple times when building against different Spark shim versions, the spark-rapids-jni dependency jar may change during the build, e.g. when new spark-rapids-jni jars are pushed to the Maven repo. Could we first land this patch to pull the specified spark-rapids-jni dependency jar? I will follow up on removing non-Spark-specific dependencies from the aggregator jars in other pull requests.
https://github.com/NVIDIA/spark-rapids/blob/branch-22.06/tests/pom.xml#L56
https://github.com/NVIDIA/spark-rapids/blob/branch-22.06/api_validation/pom.xml#L148
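To illustrate the anchoring idea in this patch, a hedged sketch (the property name and versions are assumptions, not the plugin's actual build configuration): a version property with a SNAPSHOT default can be overridden on the command line with an exact timestamped snapshot, so every shim build resolves the same jar even if a newer SNAPSHOT lands mid-build.

```xml
<!-- Hypothetical sketch: a property anchoring the spark-rapids-jni
     version. The default tracks the latest SNAPSHOT; CI can pin an
     exact timestamped snapshot via -D on the mvn command line. -->
<properties>
  <spark-rapids-jni.version>22.06.0-SNAPSHOT</spark-rapids-jni.version>
</properties>
```

A nightly build could then pin the resolved snapshot once and reuse it for every shim, e.g. `mvn package -Dspark-rapids-jni.version=22.06.0-20220101.001122-123` (illustrative version string).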
…ildall script Signed-off-by: Tim Liu <timl@nvidia.com>
Force-pushed from 9cc0976 to 4f26f8e
Absolutely. I just wanted to call out that this PR should not state that it resolves the issue, since it alone does not.
Yes, that is expected. We cannot simply remove these dependencies; they are needed at compile time and runtime. We need to process these dependencies differently when constructing the jars, not remove them entirely.
1. Add a test case for matching the expected cudf timestamped snapshot version name against the actual SNAPSHOT version, e.g. cudf version "7.0.1-20220101.001122-123" is satisfied by "7.0.1-SNAPSHOT".
2. Use $MVN instead of "mvn" for every invocation of Maven, so the configured Maven options apply to all mvn commands.
Signed-off-by: Tim Liu <timl@nvidia.com>
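The matching rule described above can be sketched in shell (a hypothetical helper, not the plugin's actual version-check code): a requested base version like `7.0.1-SNAPSHOT` is satisfied either by the literal SNAPSHOT string or by a Maven timestamped deployment of that base version.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the actual buildall/test code): check whether
# an installed timestamped snapshot, e.g. 7.0.1-20220101.001122-123,
# satisfies a requested base version, e.g. 7.0.1-SNAPSHOT.
snapshot_matches() {
  local expected="$1"   # e.g. 7.0.1-SNAPSHOT
  local actual="$2"     # e.g. 7.0.1-20220101.001122-123
  local base="${expected%-SNAPSHOT}"
  # Exact match, or base version followed by Maven's
  # yyyyMMdd.HHmmss-buildNumber deployment suffix.
  [[ "$actual" == "$expected" ]] || \
    [[ "$actual" =~ ^"$base"-[0-9]{8}\.[0-9]{6}-[0-9]+$ ]]
}

snapshot_matches 7.0.1-SNAPSHOT 7.0.1-20220101.001122-123 && echo match
```

The timestamp suffix format (`yyyyMMdd.HHmmss-buildNumber`) is the convention Maven uses when deploying unique snapshots to a remote repository.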
Yes, test cases should be added for the new behavior for SNAPSHOT matching.
Signed-off-by: Tim Liu <timl@nvidia.com>
Signed-off-by: Tim Liu <timl@nvidia.com>
tests/src/test/scala/com/nvidia/spark/rapids/RapidsExecutorPluginSuite.scala
Signed-off-by: Tim Liu <timl@nvidia.com>
build
part of #5294
Build RAPIDS against the specified spark-rapids-jni snapshot jar, to avoid pulling different versions of the spark-rapids-jni dependency jars during compilation.
Signed-off-by: Tim Liu <timl@nvidia.com>