Make new build default and combine into dist package #3411
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,212 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!-- | ||
Copyright (c) 2021, NVIDIA CORPORATION. | ||
|
||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
|
||
<parent> | ||
<groupId>com.nvidia</groupId> | ||
<artifactId>rapids-4-spark-parent</artifactId> | ||
<version>21.10.0-SNAPSHOT</version> | ||
</parent> | ||
<artifactId>rapids-4-spark-aggregator_2.12</artifactId> | ||
<name>RAPIDS Accelerator for Apache Spark Aggregator</name> | ||
<description>Creates an aggregated shaded package of the RAPIDS plugin for Apache Spark</description> | ||
<version>21.10.0-SNAPSHOT</version> | ||
|
||
<properties> | ||
<rapids.shade.package>com.nvidia.shaded.${spark.version.classifier}.spark</rapids.shade.package> | ||
</properties> | ||
<dependencies> | ||
<dependency> | ||
<groupId>com.nvidia</groupId> | ||
<artifactId>rapids-4-spark-sql_${scala.binary.version}</artifactId> | ||
<version>${project.version}</version> | ||
<classifier>${spark.version.classifier}</classifier> | ||
</dependency> | ||
<dependency> | ||
<groupId>com.nvidia</groupId> | ||
<artifactId>rapids-4-spark-shuffle_${scala.binary.version}</artifactId> | ||
<version>${project.version}</version> | ||
<classifier>${spark.version.classifier}</classifier> | ||
</dependency> | ||
<dependency> | ||
<groupId>com.nvidia</groupId> | ||
<artifactId>rapids-4-spark-udf_${scala.binary.version}</artifactId> | ||
<version>${project.version}</version> | ||
<classifier>${spark.version.classifier}</classifier> | ||
</dependency> | ||
<dependency> | ||
<groupId>com.nvidia</groupId> | ||
<artifactId>rapids-4-spark-shims-${spark.version.classifier}_${scala.binary.version}</artifactId> | ||
<version>${project.version}</version> | ||
</dependency> | ||
</dependencies> | ||
<build> | ||
<plugins> | ||
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-shade-plugin</artifactId> | ||
<executions> | ||
<!-- Unfortunately have to have 2 executions here to get dependency reduced pom. | ||
The shade plugin won't generate it when using the classifier and shadedArtifactAttached=true. | ||
--> | ||
<execution> | ||
<id>main</id> | ||
<phase>package</phase> | ||
<goals> | ||
<goal>shade</goal> | ||
</goals> | ||
<configuration> | ||
Have you tried to put the common configuration under the plugin, and only the differences in the execution itself? That should hopefully make this much more readable, with less duplication. |
will fix in #3440 |
||
<artifactSet> | ||
nit: The indentation under configuration appears to be off; only filters appears to be indented correctly. |
||
<excludes>org.slf4j:*</excludes> | ||
</artifactSet> | ||
<transformers> | ||
<transformer | ||
implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/> | ||
</transformers> | ||
<createDependencyReducedPom>true</createDependencyReducedPom> | ||
<shadedArtifactAttached>false</shadedArtifactAttached> | ||
<relocations> | ||
<relocation> | ||
<pattern>org.apache.orc.</pattern> | ||
<shadedPattern>${rapids.shade.package}.orc.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.hadoop.hive.</pattern> | ||
<shadedPattern>${rapids.shade.package}.hadoop.hive.</shadedPattern> | ||
<excludes> | ||
<exclude>org.apache.hadoop.hive.conf.HiveConf</exclude> | ||
<exclude>org.apache.hadoop.hive.ql.exec.UDF</exclude> | ||
<exclude>org.apache.hadoop.hive.ql.udf.generic.GenericUDF</exclude> | ||
</excludes> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.hive.</pattern> | ||
<shadedPattern>${rapids.shade.package}.hive.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>io.airlift.compress.</pattern> | ||
<shadedPattern>${rapids.shade.package}.io.airlift.compress.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.commons.codec.</pattern> | ||
<shadedPattern>${rapids.shade.package}.org.apache.commons.codec.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.commons.lang.</pattern> | ||
<shadedPattern>${rapids.shade.package}.org.apache.commons.lang.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>com.google</pattern> | ||
<shadedPattern>${rapids.shade.package}.com.google</shadedPattern> | ||
</relocation> | ||
</relocations> | ||
<filters> | ||
<filter> | ||
<artifact>com.nvidia:rapids-4-spark-aggregator_2.12</artifact> | ||
<includes> | ||
<include>META-INF/**</include> | ||
</includes> | ||
<excludes> | ||
<exclude>META-INF/services/**</exclude> | ||
</excludes> | ||
</filter> | ||
</filters> | ||
</configuration> | ||
</execution> | ||
<execution> | ||
<id>classifierversion</id> | ||
<phase>package</phase> | ||
<goals> | ||
<goal>shade</goal> | ||
</goals> | ||
<configuration> | ||
<artifactSet> | ||
nit: Indentation appears to be off here. |
||
<excludes>org.slf4j:*</excludes> | ||
</artifactSet> | ||
<transformers> | ||
<transformer | ||
implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/> | ||
</transformers> | ||
<!-- note that the classifier version won't generate dependency reduce pom due to shadedArtifactAttached=true --> | ||
<createDependencyReducedPom>true</createDependencyReducedPom> | ||
<shadedArtifactAttached>true</shadedArtifactAttached> | ||
<shadedClassifierName>${spark.version.classifier}</shadedClassifierName> | ||
<relocations> | ||
<relocation> | ||
<pattern>org.apache.orc.</pattern> | ||
<shadedPattern>${rapids.shade.package}.orc.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.hadoop.hive.</pattern> | ||
<shadedPattern>${rapids.shade.package}.hadoop.hive.</shadedPattern> | ||
<excludes> | ||
<exclude>org.apache.hadoop.hive.conf.HiveConf</exclude> | ||
<exclude>org.apache.hadoop.hive.ql.exec.UDF</exclude> | ||
<exclude>org.apache.hadoop.hive.ql.udf.generic.GenericUDF</exclude> | ||
</excludes> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.hive.</pattern> | ||
<shadedPattern>${rapids.shade.package}.hive.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>io.airlift.compress.</pattern> | ||
<shadedPattern>${rapids.shade.package}.io.airlift.compress.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.commons.codec.</pattern> | ||
<shadedPattern>${rapids.shade.package}.org.apache.commons.codec.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>org.apache.commons.lang.</pattern> | ||
<shadedPattern>${rapids.shade.package}.org.apache.commons.lang.</shadedPattern> | ||
</relocation> | ||
<relocation> | ||
<pattern>com.google</pattern> | ||
<shadedPattern>${rapids.shade.package}.com.google</shadedPattern> | ||
</relocation> | ||
</relocations> | ||
<filters> | ||
<filter> | ||
<artifact>com.nvidia:rapids-4-spark-aggregator_2.12</artifact> | ||
<includes> | ||
<include>META-INF/**</include> | ||
</includes> | ||
<excludes> | ||
<exclude>META-INF/services/**</exclude> | ||
</excludes> | ||
</filter> | ||
</filters> | ||
</configuration> | ||
</execution> | ||
</executions> | ||
</plugin> | ||
<plugin> | ||
<groupId>net.alchim31.maven</groupId> | ||
<artifactId>scala-maven-plugin</artifactId> | ||
</plugin> | ||
<plugin> | ||
<groupId>org.apache.rat</groupId> | ||
<artifactId>apache-rat-plugin</artifactId> | ||
</plugin> | ||
</plugins> | ||
</build> | ||
|
||
</project> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
#!/bin/bash | ||
# | ||
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
|
||
set -ex | ||
|
||
Should we do something to make sure that the directory is the correct one? This assumes that you are in the root directory calling |
||
# Install all the versions we support | ||
mvn -U -Dbuildver=302 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
I'm not sure if this was discussed before, but if we don't want rat, or tests or javadocs/etc, why do we have them on by default in the maven build at all? I would much rather see a way to pass parameters to the shell script on to the maven build so I can decide what I want and what I don't instead. |
yeah the build script is very basic right now just to get started, lots of improvements to it need to be done |
||
mvn -U -Dbuildver=303 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
mvn -U -Dbuildver=304 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
mvn -U -Dbuildver=311 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
mvn -U -Dbuildver=312 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
mvn -U -Dbuildver=313 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
mvn -U -Dbuildver=311cdh clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
mvn -U -Dbuildver=320 clean install -Drat.skip=true -DskipTests -Dmaven.javadoc.skip=true -Dskip -pl aggregator -am | ||
mvn -U -Dbuildver=301 clean install -DskipTests -Psnapshots |
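The two review comments on this script ask for a check that it runs from the repository root and for a way to pass Maven options through instead of hardcoding them. A minimal sketch of both ideas follows. The names `BUILDVERS`, `check_root`, and `build_cmd` are hypothetical and not part of this PR. The root check assumes the root directory contains `pom.xml` and an `aggregator/` directory. The sketch only prints the Maven commands rather than running them, and it leaves out the final 301 build with `-Psnapshots`.

```shell
# Sketch only: prints the Maven commands instead of running them.
# BUILDVERS, check_root and build_cmd are hypothetical names, not part of this PR.

BUILDVERS="302 303 304 311 312 313 311cdh 320"

# Succeeds only if the given directory looks like the repository root.
# Assumption: the root holds pom.xml and an aggregator/ directory.
check_root() {
  [ -f "$1/pom.xml" ] && [ -d "$1/aggregator" ]
}

# Prints the Maven command line for one buildver.
# Any further arguments are passed through to Maven unchanged.
build_cmd() {
  buildver="$1"
  shift
  echo "mvn -U -Dbuildver=${buildver} clean install${*:+ $*} -pl aggregator -am"
}

if check_root "$PWD"; then
  for v in $BUILDVERS; do
    build_cmd "$v" "$@"
  done
else
  # A real script would likely exit with a non-zero status here.
  echo "not in the repository root: $PWD" >&2
fi
```

Called with `-DskipTests -Drat.skip=true` as arguments, the sketch prints one command per buildver with those options included; called with no arguments, it prints the plain `clean install` commands, leaving the choice of what to skip to the caller.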
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
--- | ||
layout: page | ||
title: Testing | ||
nav_order: 1 | ||
parent: Developer Overview | ||
--- | ||
# RAPIDS Accelerator for Apache Spark Distribution Packaging | ||
|
||
The distribution module combines support for the Spark versions you need into a single jar. | ||
|
||
See the [CONTRIBUTING.md](../CONTRIBUTING.md) doc for details on building and profiles available to build an uber jar. | ||
|
||
Note that when you use the profiles to build an uber jar, some hardcoded service provider files are currently put into place, one file for each of | ||
those profiles. You will need to update these files when adding or removing support for a Spark version. | ||
|
||
Files are: `com.nvidia.spark.rapids.SparkShimServiceProvider.sparkNonSnapshot`, `com.nvidia.spark.rapids.SparkShimServiceProvider.sparkSnapshot`, `com.nvidia.spark.rapids.SparkShimServiceProvider.sparkNonSnapshotDB`, and `com.nvidia.spark.rapids.SparkShimServiceProvider.sparkSnapshotDB`. | ||
|
||
The new uber jar is structured as follows: | ||
|
||
1. Base common classes, which are the user-visible classes. For these we use the Spark 3.0.1 versions. | ||
2. `META-INF/services`. This contains a file that must list all the shim versions supported by this jar. The per-profile files described above are put into place here for uber jars. | ||
3. META-INF base files (maven, LICENSE, NOTICE, etc.), taken from the Spark 3.0.1 build. | ||
4. Shaded dependencies for Spark 3.0.1, in case the base common classes need them. | ||
5. A Spark-specific directory for each version of Spark supported in the jar, e.g. spark301/, spark302/, spark311/. | ||
|
||
If you have to change the contents of the uber jar, the following files control which classes go into the base jar unshaded. | ||
|
||
1. `unshimmed-base.txt` - This lists classes and files that should go into the base jar with their normal package name (not shaded). This includes user-visible classes (e.g. com/nvidia/spark/SQLPlugin), Python files, and other files that aren't version specific. The Spark 3.0.1 built jar is used for these base classes. | ||
2. `unshimmed-extras.txt` - This is applied to each of the other Spark-specific version jars to pull out any files that need to go into the base of the jar rather than into the Spark-specific directory. |
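The layout described in this doc can be spot-checked from a jar listing. The following is a minimal sketch, assuming the listing comes from something like `jar tf` on the built uber jar; `missing_shim_dirs` is a hypothetical helper name, and the sample listing is made up for illustration.

```shell
# Hypothetical helper: reads a jar listing (for example the output of
# `jar tf <uber jar>`) on stdin and prints every buildver given as an
# argument that has no sparkXXX/ directory in the listing.
missing_shim_dirs() {
  listing="$(cat)"
  for v in "$@"; do
    if ! printf '%s\n' "$listing" | grep -q "^spark${v}/"; then
      echo "$v"
    fi
  done
}

# Illustrative listing only; a real one would come from the built jar.
printf 'META-INF/LICENSE\nspark301/Example.class\nspark302/Example.class\n' |
  missing_shim_dirs 301 302 311
```

Here the helper reports `311`, because the sample listing has no `spark311/` entries.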
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
com.nvidia.spark.rapids.shims.spark301.SparkShimServiceProvider | ||
Note: these are hardcoded for now; hopefully we can make this smarter and generate them on the fly later. |
||
com.nvidia.spark.rapids.shims.spark302.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark303.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark311.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark312.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark311cdh.SparkShimServiceProvider |
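The comment above mentions generating this file on the fly later. One possible shape for that, as a sketch only: the entries follow the `com.nvidia.spark.rapids.shims.spark<buildver>.SparkShimServiceProvider` pattern visible in the file, so they can be derived from a list of buildvers. The function name `gen_shim_services` is hypothetical and is not how this PR does it.

```shell
# Hypothetical helper: prints one SparkShimServiceProvider entry per buildver,
# following the naming pattern used by the hardcoded files in this PR.
gen_shim_services() {
  for v in "$@"; do
    echo "com.nvidia.spark.rapids.shims.spark${v}.SparkShimServiceProvider"
  done
}

# Reproduces the six entries of the file above.
gen_shim_services 301 302 303 311 312 311cdh
```

Redirecting the output into the service provider file during the build would replace the hardcoded copy, at the cost of keeping the buildver list in one more place unless it is derived from the active profile.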
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
com.nvidia.spark.rapids.shims.spark301.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark302.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark303.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark311.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark312.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark311cdh.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark301db.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark311db.SparkShimServiceProvider |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
com.nvidia.spark.rapids.shims.spark301.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark302.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark303.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark304.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark311.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark312.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark313.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark311cdh.SparkShimServiceProvider | ||
com.nvidia.spark.rapids.shims.spark320.SparkShimServiceProvider |
Is `-Drat.skip=true` not needed for building 301?
we want it to run with one of the builds