Make new build default and combine into dist package #3411
Conversation
…ument Signed-off-by: Jason Lowe <jlowe@nvidia.com>
Signed-off-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Thomas Graves <tgraves@nvidia.com>
…apids into builddistnewwithclass2
Signed-off-by: Thomas Graves <tgraves@nvidia.com>
build
```xml
<artifactId>apache-rat-plugin</artifactId>
  <configuration>
    <excludes>
      <exclude>com.nvidia.spark.rapids.SparkShimServiceProvider.*</exclude>
```
The shade plugin outputs `dependency-reduced-pom.xml` under the pom's basedir by default, since we were not using the `dependencyReducedPomLocation` config. As a consequence, many developers who used the old build in their local repo prior to this PR will have a stale file under `dist` that is not cleaned up by Maven.

I suggest adding it to the exclude list in this PR to minimize reported build failures.

In a follow-up we can add `dependencyReducedPomLocation` to the shade plugin configs.
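A rough sketch of what that follow-up could look like — moving the dependency-reduced POM into the build directory so it is cleaned by `mvn clean`; the exact target path is an assumption, not something decided in this PR:

```xml
<!-- Sketch only: redirect the shade plugin's dependency-reduced POM out of
     the module basedir; the chosen location is illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <dependencyReducedPomLocation>
      ${project.build.directory}/dependency-reduced-pom.xml
    </dependencyReducedPomLocation>
  </configuration>
</plugin>
```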
There are other shuffle-related classes I need to handle to make those non-shimmed; working through them.

The shuffle stuff got very ugly: it started requiring a lot of things to be unshimmed, which then eventually broke using it without RAPIDS shuffle. Need to investigate a better way to handle shuffle.
```xml
<id>create-parallel-world</id>
<configuration>
  <target>
```
TODO: explore ant-contrib for loops and try to have a single execution configured where a missing input for unzip is skipped
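One possible shape for that TODO, sketched with ant-contrib's `for`/`if` tasks. This is an assumption about how it could be wired up — the version list, file names, and directory layout are all illustrative, and ant-contrib would need to be on the antrun classpath:

```xml
<!-- Hypothetical sketch: loop over shim versions and skip any whose
     input archive was not built locally. Paths are illustrative. -->
<taskdef resource="net/sf/antcontrib/antlib.xml"/>
<for list="301,302,303,311" param="buildver">
  <sequential>
    <if>
      <available file="${project.build.directory}/deps/spark-@{buildver}.zip"/>
      <then>
        <unzip src="${project.build.directory}/deps/spark-@{buildver}.zip"
               dest="${project.build.directory}/parallel-world/spark@{buildver}"/>
      </then>
      <!-- no <else>: a missing input is silently skipped -->
    </if>
  </sequential>
</for>
```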
- created profile representative shims for each FEATURE version and a vendor shim
- use VisibleShuffleManager mostly for tagging, so stop inheriting ShuffleManager
- prevent ClassCastExceptions when casting to VisibleShuffleManager by not using it in the ShimLoader method signatures

Signed-off-by: Gera Shegalov <gera@apache.org>
Shuffle should be fixed if you merge tgravescs#3 into your branch.
Fix classloading via ProxyRapidsShuffleManager
build
```xml
<goal>shade</goal>
</goals>
<configuration>
  <artifactSet>
```
nit: The indentation under `configuration` appears to be off; only `filters` appears to be indented correctly.
```xml
<goal>shade</goal>
</goals>
<configuration>
  <artifactSet>
```
nit: Indentation appears to be off here.
```xml
<goals>
  <goal>shade</goal>
</goals>
<configuration>
```
Have you tried putting the common configuration under the plugin, with only the differences in the execution itself? That should hopefully make this much more readable, with less duplication.
will fix in #3440
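The deduplication suggested above could look roughly like this — Maven merges a plugin-level `<configuration>` into each execution, so only per-execution differences need repeating. The execution id and artifact coordinates below are illustrative, not taken from this PR's actual pom:

```xml
<!-- Sketch only: hoist settings shared by every shade execution to the
     plugin level; executions then declare just what differs. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <!-- common bits inherited by all executions -->
    <artifactSet>
      <includes>
        <include>com.nvidia:rapids-4-spark-aggregator_2.12</include>
      </includes>
    </artifactSet>
  </configuration>
  <executions>
    <execution>
      <id>shade-spark301</id>
      <goals><goal>shade</goal></goals>
      <!-- only the per-version differences go here -->
    </execution>
  </executions>
</plugin>
```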
```shell
set -ex

# Install all the versions we support
mvn -U -Dbuildver=302 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am
```
I'm not sure if this was discussed before, but if we don't want rat, tests, javadocs, etc., why do we have them on by default in the Maven build at all? I would much rather see a way to pass parameters from the shell script on to the Maven build so I can decide what I want and what I don't.
Yeah, the build script is very basic right now, just to get started; lots of improvements to it still need to be done.
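A minimal sketch of the pass-through idea discussed above, assuming a hypothetical `build_cmd` helper inside the build script; the helper name and example flags are not from the actual repo:

```shell
# Hypothetical helper: compose the mvn command line for one Spark version,
# forwarding whatever extra flags the user passed to the script via "$@".
build_cmd() {
  local buildver="$1"; shift
  echo mvn -U -Dbuildver="$buildver" clean install "$@"
}

# A caller would forward the script's own arguments,
# e.g. ./build/buildall -DskipTests -Drat.skip=true
build_cmd 302 -DskipTests -Drat.skip=true
```

This keeps the defaults in the pom and lets each developer opt out per invocation instead of hard-coding the skips in the script.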
```shell
#

set -ex
```
Should we do something to make sure that the directory is the correct one? This assumes that you are in the root directory calling build/buildall
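One common way to remove that assumption, as a sketch — resolve the script's own location and change to the repo root before invoking Maven (the `..` layout is assumed from the `build/buildall` path mentioned above):

```shell
# Resolve the directory containing this script, then run from the repo root
# so the build works no matter which directory it was invoked from.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
cd "$SCRIPT_DIR/.."
```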
```diff
@@ -29,11 +29,13 @@ else
 if [ -d "$LOCAL_JAR_PATH" ]; then
     CUDF_JARS=$(echo "$LOCAL_JAR_PATH"/cudf-*.jar)
     PLUGIN_JARS=$(echo "$LOCAL_JAR_PATH"/rapids-4-spark_*.jar)
     # TODO - this should really only pick up the specific version being tested
```
Is there a follow-on issue for this?
It's a subitem in #3223.
```shell
# Disabled until Spark 3.2 source incompatibility fixed, see https://github.com/NVIDIA/spark-rapids/issues/2052
#mvn -U -B -Pspark320tests,snapshot-shims test $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR -Dcuda.version=$CUDA_CLASSIFIER
# Install all the versions we support
mvn -U -B -Dbuildver=302 clean install $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR -Dcuda.version=$CUDA_CLASSIFIER
```
If we made `buildall` configurable, then we could reuse it here instead of copying everything.
@revans2 thanks for the review. If you are OK with it, I'd like to address the nits and other items in a follow-up, since CI passed here and I would like to get this in as a base.
All of the issues I pointed out before can be done as follow on work.
Contributes to #3223.

This PR is based off of #3381 and includes:

The jar is much larger right now until we can dedup the code; it was at 115M.

The default build with no targets builds spark301. To build another version you specify -Dbuildver=spark3XX.

I tested the Databricks builds and one of the Apache Spark version integration tests. All versions ran unit tests.

I can't test on Cloudera right now because the override shim in PR #3381 isn't working, and it needs these changes to test, so I think we can fix this after they are merged.

There is more to do, but I think the release jar is working for everything I've tested, so it would be good to switch to it and get a base.

I have not tested UCX or the UDF examples.

I am in the process of writing a nightly build script to see if it works.