[BUG] Dist jar pom lists aggregator jar as dependency #4347

Closed
jlowe opened this issue Dec 11, 2021 · 1 comment · Fixed by #4355 or #4371
Labels
bug Something isn't working P0 Must have for release

Comments


jlowe commented Dec 11, 2021

A recent spark-rapids-ml test job failed with the following error:

java.lang.ClassNotFoundException: org.apache.spark.sql.catalyst.expressions.TimeSub

This happened because the spark301 plugin jars were on the classpath even though the Spark version in use was 3.1.2. It looks like the dist jar was pulling in the aggregator jar, and the pom defaults to spark.version=3.0.1.

The incorrect aggregator dependency in the published pom should have been fixed by #4265, but the pom published for dist snapshots still lists the aggregator jar. From the most recent snapshot pom, on 1206:

    <dependencies>
        <dependency>
            <groupId>com.nvidia</groupId>
            <artifactId>rapids-4-spark-aggregator_${scala.binary.version}</artifactId>
            <version>${project.version}</version>
            <classifier>${spark.version.classifier}</classifier>
            <scope>compile</scope>
        </dependency>

        <!--
            manually promoting provided cudf as a direct dependency
        -->
        <dependency>
            <groupId>ai.rapids</groupId>
            <artifactId>cudf</artifactId>
            <version>${cudf.version}</version>
            <classifier>${cuda.version}</classifier>
            <scope>compile</scope>
        </dependency>
    </dependencies>
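
A check like the one above could be scripted so a bad pom is caught before a downstream job fails. The following is only a minimal sketch: the helper names (`local_name`, `dependency_artifact_ids`, `aggregator_dependencies`) and the `SAMPLE_POM` text are hypothetical and not part of the spark-rapids build, and matching on the substring "aggregator" is an assumption based on the artifactId shown above. It also scans every `<dependency>` element, so it would flag entries under `dependencyManagement` or plugin sections as well.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def dependency_artifact_ids(pom_xml):
    """Return the artifactId of every <dependency> element found in the pom text."""
    root = ET.fromstring(pom_xml)
    ids = []
    for elem in root.iter():
        if local_name(elem.tag) != "dependency":
            continue
        for child in elem:
            if local_name(child.tag) == "artifactId":
                ids.append((child.text or "").strip())
    return ids


def aggregator_dependencies(pom_xml):
    """Return the dependency artifactIds that look like the aggregator jar."""
    return [a for a in dependency_artifact_ids(pom_xml) if "aggregator" in a]


# Hypothetical pom text, abbreviated from the snippet quoted in this issue.
SAMPLE_POM = """<project>
    <dependencies>
        <dependency>
            <groupId>com.nvidia</groupId>
            <artifactId>rapids-4-spark-aggregator_${scala.binary.version}</artifactId>
            <version>${project.version}</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>ai.rapids</groupId>
            <artifactId>cudf</artifactId>
            <version>${cudf.version}</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>
</project>"""

if __name__ == "__main__":
    offending = aggregator_dependencies(SAMPLE_POM)
    if offending:
        print("pom still lists aggregator dependencies:", offending)
```

A publishing job could fail the build whenever `aggregator_dependencies` returns a non-empty list for the dist pom.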

I checked the rapids4spark-version-info.properties in the dist snapshot jar:

version=21.12.0-SNAPSHOT
cudf_version=21.12.0
user=
revision=7bea7c80362fe54055b7e0cd7ffa672098118f76
branch=HEAD
date=2021-12-06T08:04:04Z
url=https://github.com/NVIDIA/spark-rapids.git

which indicates the jar should have been built with the fix in #4265, yet the published dist pom still contains the aggregator jar dependency.
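
For anyone repeating this check, the properties file can be read straight out of the jar, since a jar is a zip archive. This is a rough sketch under stated assumptions: the function names are hypothetical, the entry is located by base name because its path inside the jar is not given here, and the parser only handles simple `key=value` lines (no escapes, `:` separators, or line continuations).

```python
import zipfile


def parse_properties(text):
    """Parse simple key=value lines; a simplified subset of the .properties format."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")  # split on the first '=' only
        props[key.strip()] = value.strip()
    return props


def read_version_info(jar, name="rapids4spark-version-info.properties"):
    """Return the parsed properties entry whose base name matches `name`.

    `jar` may be a path or a binary file-like object. The entry's directory
    inside the jar is not assumed; the first entry with a matching base name wins.
    """
    with zipfile.ZipFile(jar) as zf:
        for entry in zf.namelist():
            if entry.rsplit("/", 1)[-1] == name:
                return parse_properties(zf.read(entry).decode("utf-8"))
    raise FileNotFoundError(name)
```

The `revision` value returned this way can then be compared against the commit that merged the fix to confirm whether the snapshot was built after it.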

jlowe added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels Dec 11, 2021

jlowe commented Dec 11, 2021

I checked the official 21.12.0 artifact in the staging directory, and that pom is correct. The problem appears to be specific to snapshot artifacts. Moving this to 22.02.
