[BUG] compile-time references to classes potentially unavailable at run time #5648

Closed
Tracked by #5757
gerashegalov opened this issue May 26, 2022 · 0 comments · Fixed by #5723
Labels: bug (Something isn't working), reliability (Features to improve reliability or bugs that severely impact the reliability of the plugin)

gerashegalov (Collaborator) commented May 26, 2022

Describe the bug
There is code that loads a class by its String name:

Try(loader.loadClass("org.apache.spark.sql.v2.avro.AvroScan")) match {

to check whether an optional module such as spark-avro is available at run time.

However, the same code also contains compile-time references to AvroScan.

I don't think there is any guarantee that a more aggressive JIT compiler, or a different class-loading strategy in some JVM implementation, would not trigger loading of AvroScan through the classloader of object ExternalSource, which is not necessarily the right one.
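
As a rough illustration of the by-name check quoted above, here is a minimal standalone sketch in plain Scala. `OptionalClassCheck`, `isClassAvailable` and `contextOrDefaultLoader` are hypothetical names, not part of the plugin. The point is only that a check written this way refers to the optional class by String, so the checking code itself carries no compile-time reference to it.

```scala
import scala.util.Try

// Hypothetical helper for illustration; not the plugin's actual code.
object OptionalClassCheck {
  // Prefer the thread context classloader, falling back to the loader of this object.
  def contextOrDefaultLoader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  // True if the named class can be loaded. The class is named only by String,
  // so this method's bytecode does not reference it.
  def isClassAvailable(className: String, loader: ClassLoader): Boolean =
    Try(loader.loadClass(className)).isSuccess
}
```

The check stays safe only as long as the surrounding class also avoids naming the optional type directly, which is the concern raised here.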

It would be cleaner if ExternalSource, after checking hasSparkAvroJar, loaded some other class by name (e.g. "GpuAvroScans") using Utils.getContextOrSparkClassLoader. That class may safely use compile-time references to AvroScan.
Something to the tune of:

ScansProvider.scala

trait ScansProvider {
  def getScans: Map[Class[_ <: Scan], ScanRule[_ <: Scan]]
}

AvroScansProvider.scala

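class AvroScansProvider extends ScansProvider {
  // Build the rules as a Seq, then key each rule by the Scan class it replaces
  def getScans: Map[Class[_ <: Scan], ScanRule[_ <: Scan]] = Seq(
    GpuOverrides.scan[AvroScan](...)
  ).map(r => (r.getClassFor.asSubclass(classOf[Scan]), r)).toMap
}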

Then change ExternalSource:

def getScans: Map[Class[_ <: Scan], ScanRule[_ <: Scan]] = {
    if (hasSparkAvroJar) {
      ShimLoader.newInstance[ScansProvider]("com.nvidia.spark.rapids.AvroScansProvider")
        .getScans
    } else Map.empty  
}
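
The proposal can be tried out without Spark on the classpath. The following is a minimal sketch under stated assumptions: `RulesProvider`, `OptionalRulesProvider` and `ProviderLoader` are hypothetical stand-ins for ScansProvider, AvroScansProvider and the ExternalSource/ShimLoader pair, and the rule map is simplified to strings. It is not the plugin's implementation.

```scala
import scala.util.Try

// Stand-in for the ScansProvider trait; rules are simplified to strings.
trait RulesProvider {
  def getRules: Map[String, String]
}

// Stand-in for AvroScansProvider: in the real design this would be the only
// class with compile-time references to the optional dependency.
class OptionalRulesProvider extends RulesProvider {
  def getRules: Map[String, String] = Map("optional-scan" -> "example rule")
}

// Stand-in for ExternalSource: it names both classes only by String.
object ProviderLoader {
  private def loader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  def isAvailable(className: String): Boolean =
    Try(loader.loadClass(className)).isSuccess

  // Reflectively instantiate a class that has a public no-arg constructor.
  def newInstance[T](className: String): T =
    loader.loadClass(className)
      .getDeclaredConstructor()
      .newInstance()
      .asInstanceOf[T]

  // Load the provider only when the marker class of the optional module is present.
  def getRules(markerClass: String, providerClass: String): Map[String, String] =
    if (isAvailable(markerClass)) newInstance[RulesProvider](providerClass).getRules
    else Map.empty
}
```

A caller would pass the marker class of the optional module and the provider's fully qualified name, both as Strings. When the marker class is missing, the provider class is never loaded, so its references to the optional dependency are never resolved.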

Steps/Code to reproduce bug

Expected behavior
The ExternalSource bytecode should not contain any compile-time references to AvroScan.

Environment details (please complete the following information)
Any

Additional context
Similar pattern in GpuHiveOverrides

gerashegalov added the bug and "? - Needs Triage" labels May 26, 2022
gerashegalov changed the title from "[BUG] Direct references to classes potentially unavailable at run time" to "[BUG] compile-time references to classes potentially unavailable at run time" May 26, 2022
sameerz added the reliability label and removed the "? - Needs Triage" label May 31, 2022
res-life self-assigned this Jun 1, 2022