-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] test_select_complex_field fails in MT tests #9103
Comments
ExecutionPlanCaptureCallback, an unshimmed class, is referencing GpuFileSourceScanExec directly which is a shimmed class. At this point I recommend refactoring ExecutionPlanCaptureCallback so it is just a wrapper class that loads an underlying ExecutionPlanCaptureCallbackImpl class via ShimLoader to service the public methods. Otherwise every time we update this class, we're likely to accidentally touch a shimmed class from this unshimmed class. A better solution is to refactor the code so that unshimmed classes are compiled in a separate module and the shimmed classes are in a separate module that depend on that submodule. Then the build will automatically fail if any unshimmed class tries to reference a shimmed class directly. I thought there was already an issue for this, but I couldn't find it. Filed #9104 to track that approach. |
I have a patch that implements the first suggestion from @jlowe (https://github.com/NVIDIA/spark-rapids/compare/branch-23.10...abellina:instantiate_ExecutionPlanCaptureCallback_via_ShimLoader?expand=1). But it doesn't work:
I think it is because of how the app is getting launched likely. I had to add the jar to extraClassPath in order to cause this exception, so this is likely an order-of-instantiation + class loader issue that also plagues the shuffle manager. In the python tests it is getting added to |
The problem with your patch is that the unshimmed rules are picking up not only ExecutionPlanCaptureCallback but also ExecutionPlanCaptureCallbackImpl, so even the implementation is unshimmed. Yeah, not the best suggestion of a class name on my part. So either we need to update dist/unshimmed-common-from-spark311.txt to only pick up the ExecutionPlanCaptureCallback object class and not the Impl as well, or change the implementation class name to something the wildcard unshim rule for ExecutionPlanCaptureCallback won't also pick up. |
Thanks @jlowe! I am testing changing the name of the impl class to have "Shimmed" as a prefix in the name. |
The MT standalone CI is consistently failing with this test that was added last week (#9018):
Here's the captured stdout with a NoClassDefFoundError. I think we need to move some of the code that was added here under the shim, so we can load the classes. I guess we could also add
GpuFileSourceScanExec
to the unshimmed list, but I am not sure what the ramifications are of that, I'd have to spend more time looking at this.The text was updated successfully, but these errors were encountered: