[FEA] Support conv function #8511

Open
2 of 4 tasks
nvliyuan opened this issue Jun 6, 2023 · 2 comments · Fixed by #8925
Assignees
Labels
feature request New feature or request

Comments

@nvliyuan
Collaborator

nvliyuan commented Jun 6, 2023

It would be good to support the conv function.
Code:

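from pyspark.sql.functions import conv
# Assumes an active SparkSession bound to `spark` (as in the pyspark shell).
df = spark.createDataFrame([("010101",)], ['n'])
df.select(conv(df.n, 2, 16).alias('hex')).collect()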

Driver logs:

@Expression <Alias> conv(n#0, 2, 16) AS hex#2 could run on GPU
    ! <Conv> conv(n#0, 2, 16) cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.catalyst.expressions.Conv
      @Expression <AttributeReference> n#0 could run on GPU

Tasks

@nvliyuan nvliyuan added feature request New feature or request ? - Needs Triage Need team to review and classify labels Jun 6, 2023
@mattahrens
Collaborator

Scope would include a new JNI kernel.

@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Jun 6, 2023
@gerashegalov gerashegalov self-assigned this Jun 15, 2023
@gerashegalov
Collaborator

gerashegalov commented Jun 26, 2023

This can be implemented using a generalized version of hex_to_integers followed by integers_to_hex, where the radices are not hard-coded to base 16.
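As a rough CPU-side illustration of that two-step idea (parse with an arbitrary source radix, then format with an arbitrary target radix), here is a minimal Python sketch. The names string_to_integer, integer_to_string and conv_sketch are hypothetical and are not libcudf or spark-rapids-jni APIs. The sketch assumes non-negative inputs, radices between 2 and 36, and that parsing stops at the first character that is not a valid digit. It does not attempt to reproduce Spark's overflow or sign handling.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def string_to_integer(s: str, from_base: int) -> int:
    """Parse the leading digits of `s` that are valid in `from_base`.

    Assumption for this sketch: parsing stops at the first invalid character.
    """
    value = 0
    for ch in s:
        d = DIGITS.find(ch.upper())
        if d < 0 or d >= from_base:
            break
        value = value * from_base + d
    return value


def integer_to_string(value: int, to_base: int) -> str:
    """Format a non-negative integer using upper-case digits in `to_base`."""
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, rem = divmod(value, to_base)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def conv_sketch(s: str, from_base: int, to_base: int) -> str:
    """Hypothetical helper: radix conversion as parse followed by format."""
    if not (2 <= from_base <= 36 and 2 <= to_base <= 36):
        raise ValueError("radices must be in the range 2..36")
    return integer_to_string(string_to_integer(s, from_base), to_base)


# The example from the issue: binary "010101" is 21, which is "15" in hex.
print(conv_sketch("010101", 2, 16))
```

Python integers are unbounded, so this sketch never overflows; a GPU kernel working with fixed-width integers would need explicit overflow handling to match the CPU.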

Demo of current functionality

gerashegalov added a commit to gerashegalov/spark-rapids-jni that referenced this issue Aug 3, 2023
Contributes to NVIDIA/spark-rapids#8511

Signed-off-by: Gera Shegalov <gera@apache.org>
@gerashegalov gerashegalov linked a pull request Aug 3, 2023 that will close this issue
gerashegalov added a commit to NVIDIA/spark-rapids-jni that referenced this issue Aug 17, 2023
Contributes to NVIDIA/spark-rapids#8511

POC supporting from/to radices 10 and 16, leveraging the existing libcudf API

Signed-off-by: Gera Shegalov <gera@apache.org>
gerashegalov added a commit that referenced this issue Aug 25, 2023
Contributes to #8511 

The POC only supports 10/16<->10/16 radix conversions. Without overflow checks, it is guaranteed to produce results identical to the CPU only for:
- decimal strings not longer than 18 characters 
- hexadecimal strings not longer than 15 characters 

Signed-off-by: Gera Shegalov <gera@apache.org>
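The length limits in the commit message above are consistent with an intermediate value held in a signed 64-bit integer. That is an assumption here, since the commit message does not state the integer type. Under that assumption, the short check below finds the longest digit string in a given base whose largest possible value still fits. The helper names are hypothetical.

```python
INT64_MAX = 2**63 - 1  # assumed intermediate type: signed 64-bit integer


def max_value(num_digits: int, base: int) -> int:
    """Largest value representable with `num_digits` digits in `base`."""
    return base**num_digits - 1


def longest_safe_length(base: int, limit: int = INT64_MAX) -> int:
    """Longest digit count whose maximum value does not exceed `limit`."""
    n = 0
    while max_value(n + 1, base) <= limit:
        n += 1
    return n


print(longest_safe_length(10))  # decimal strings
print(longest_safe_length(16))  # hexadecimal strings
```

Strings longer than these lengths can exceed the assumed 64-bit range, which is where a missing overflow check could make GPU and CPU results diverge.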
@gerashegalov gerashegalov reopened this Aug 25, 2023