[FEA] RAPIDS accelerated ScalaUDF #1594

Closed
jlowe opened this issue Jan 26, 2021 · 0 comments · Fixed by #1636
Labels: feature request (New feature or request), performance (A performance related task/issue)

jlowe commented Jan 26, 2021

Is your feature request related to a problem? Please describe.
#1393 added support for users supplying an alternate RAPIDS implementation of a Hive UDF, and that works well for queries that are already in SQL. However, if the query is written against the DataFrame API and uses a Scala UDF (seen in the Catalyst plan as ScalaUDF), there is no way to provide a RAPIDS alternative implementation in the UDF. Spark wraps the user's code in at least one layer of lambda functions, so it's tricky to get access to the user's original class that implements their UDF code to see if it implements the RapidsUDF interface.

Describe the solution you'd like
Ideally the RAPIDS Accelerator plugin would be able to automatically identify whether the user code behind a ScalaUDF instance implements the RapidsUDF interface and therefore has a RAPIDS-accelerated implementation. Due to the use of wrapping lambdas, we may need to examine the bytecode and 'peel off' the lambda layers (if this is possible).
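
To illustrate what 'peeling off' lambda layers could look like, here is a minimal JVM-level sketch in plain Java with no Spark dependency. Everything in it is a hypothetical stand-in: `AcceleratedUdf` plays the role of RapidsUDF, `wrap` mimics a framework wrapping the user's function in a lambda, and `peel` is not plugin code. It relies on serializable lambdas carrying a synthetic `writeReplace` method that returns a `java.lang.invoke.SerializedLambda`, whose captured arguments can be inspected. Scala 2.12+ closures are, as far as I know, compiled through the same mechanism, but whether this is sufficient for the wrappers Spark actually generates is an assumption that would need checking.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

public class LambdaPeeling {
    /** Hypothetical marker interface standing in for RapidsUDF. */
    public interface AcceleratedUdf { }

    /** Serializable function type, so lambdas of this type get a writeReplace method. */
    public interface SerFunction<T, R> extends Function<T, R>, Serializable { }

    /** A user UDF class that also advertises an accelerated implementation. */
    public static class PlusOne implements SerFunction<Integer, Integer>, AcceleratedUdf {
        @Override
        public Integer apply(Integer x) { return x + 1; }
    }

    /** Mimics a framework wrapping user code in one more lambda layer. */
    public static <T, R> SerFunction<T, R> wrap(SerFunction<T, R> f) {
        return x -> f.apply(x); // the lambda captures f
    }

    /** Returns the SerializedLambda form of o, or null if o is not a serializable lambda. */
    static SerializedLambda serializedForm(Object o) {
        if (o == null) return null;
        try {
            Method m = o.getClass().getDeclaredMethod("writeReplace");
            m.setAccessible(true);
            Object replaced = m.invoke(o);
            return replaced instanceof SerializedLambda ? (SerializedLambda) replaced : null;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null; // no writeReplace, or not accessible: treat as "not a lambda"
        }
    }

    /**
     * Walks captured arguments of nested lambdas, up to maxDepth layers,
     * looking for an object that implements AcceleratedUdf.
     */
    public static Object peel(Object fn, int maxDepth) {
        if (fn instanceof AcceleratedUdf) return fn;
        if (maxDepth <= 0) return null;
        SerializedLambda lambda = serializedForm(fn);
        if (lambda == null) return null;
        for (int i = 0; i < lambda.getCapturedArgCount(); i++) {
            Object found = peel(lambda.getCapturedArg(i), maxDepth - 1);
            if (found != null) return found;
        }
        return null;
    }
}
```

With this sketch, `LambdaPeeling.peel(LambdaPeeling.wrap(LambdaPeeling.wrap(new LambdaPeeling.PlusOne())), 8)` returns the original `PlusOne` instance, even though the wrapped lambda itself does not implement the marker interface. A wrapped lambda with no accelerated class underneath yields `null`.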

Describe alternatives you've considered
We could provide a separate method, specific to the RAPIDS Accelerator plugin, where users could register their UDFs, but that has two main drawbacks:

  • It requires the users to modify their code to additionally register the UDF
  • We'd still have to match up the code found in a ScalaUDF instance with the code registered via the separate interface.
jlowe added the "feature request", "? - Needs Triage", and "performance" labels Jan 26, 2021
jlowe removed the "? - Needs Triage" label Jan 29, 2021
jlowe self-assigned this Jan 29, 2021
jlowe changed the title from "[FEA] RAPIDS-accelerated ScalaUDF" to "[FEA] RAPIDS accelerated ScalaUDF" Jan 29, 2021