Enable regular expressions on GPU by default [databricks] #4740

Merged 9 commits on Feb 10, 2022

Conversation

andygrove (Contributor)

Signed-off-by: Andy Grove <andygrove@nvidia.com>

This PR adds a new config option, spark.rapids.sql.regexp.enabled, which defaults to true and enables all existing regular expression support on the GPU, such as RLike, RegExp, RegExpReplace, and RegExpExtract. This replaces the previous approach of disabling individual expression classes by default.
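
As a rough illustration of the default-on behaviour described above, here is a minimal sketch in which a plain Map stands in for the real configuration. Only the key name comes from this PR; `RegexpConfSketch` and `isRegexpEnabled` are hypothetical names and not the plugin's actual API.

```scala
// Minimal sketch: a plain Map stands in for the Spark/RAPIDS configuration.
// RegexpConfSketch and isRegexpEnabled are hypothetical names, not plugin APIs.
object RegexpConfSketch {
  // Config key introduced by this PR.
  val RegexpEnabledKey: String = "spark.rapids.sql.regexp.enabled"

  // Returns the configured value, falling back to true when the key is unset.
  def isRegexpEnabled(conf: Map[String, String]): Boolean =
    conf.get(RegexpEnabledKey).map(_.trim.toBoolean).getOrElse(true)
}
```

Under this sketch, regular expressions stay enabled unless a user explicitly opts out, for example by passing `--conf spark.rapids.sql.regexp.enabled=false` when submitting a job.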

andygrove added this to the Jan 31 - Feb 11 milestone on Feb 9, 2022
andygrove self-assigned this on Feb 9, 2022
revans2 (Collaborator) previously approved these changes on Feb 10, 2022 and left a comment:

Just a nit on the doc comments. Also the check appears to be the same everywhere and it might be nice to have common code for it, but that is really minor.
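
One way the "common code" suggestion could look is sketched below. This is an assumption about the shape of such a helper, not the code in this PR: `Conf`, `ExprMeta`, and `checkRegexpEnabled` are hypothetical stand-ins that only model the idea of each regexp expression calling one shared check.

```scala
import scala.collection.mutable.ArrayBuffer

object RegexpCheckSketch {
  // Hypothetical stand-in for the plugin configuration.
  final case class Conf(isRegexpEnabled: Boolean)

  // Hypothetical stand-in for per-expression metadata that records
  // the reasons an expression has to stay on the CPU.
  final class ExprMeta(val name: String, val conf: Conf) {
    private val reasons = ArrayBuffer.empty[String]
    def willNotWorkOnGpu(reason: String): Unit = { reasons += reason; () }
    def canRunOnGpu: Boolean = reasons.isEmpty
    def fallbackReasons: List[String] = reasons.toList
  }

  // Shared check that every regexp expression rule could call,
  // instead of repeating the same condition in each rule.
  def checkRegexpEnabled(meta: ExprMeta): Unit =
    if (!meta.conf.isRegexpEnabled) {
      meta.willNotWorkOnGpu(
        s"${meta.name}: regular expression support is disabled; " +
          "set spark.rapids.sql.regexp.enabled=true to enable it")
    }
}
```

With a helper like this, the rules for RLike, RegExpReplace, and RegExpExtract would each make a single call, so the fallback message stays consistent across shims.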

@@ -351,14 +351,12 @@ abstract class Spark30XdbShims extends Spark30XdbShimsBase with Logging {
         override def convertToGpu(child: Expression): GpuExpression = GpuAbs(child, false)
       }),
     GpuOverrides.expr[RegExpReplace](
-      "RegExpReplace support for string literal input patterns",
+      "RegExpReplace",

I personally would rather have a description of what this does and not the name of it again.

"String replace using a regular expression"

@@ -297,14 +297,12 @@ abstract class Spark30XShims extends Spark301until320Shims with Logging {
         override def convertToGpu(child: Expression): GpuExpression = GpuAbs(child, false)
       }),
     GpuOverrides.expr[RegExpReplace](
-      "RegExpReplace support for string literal input patterns",
+      "RegExpReplace",

same here

@@ -207,17 +207,15 @@ abstract class Spark31XShims extends Spark301until320Shims with Logging {
         override def convertToGpu(child: Expression): GpuExpression = GpuAbs(child, false)
       }),
     GpuOverrides.expr[RegExpReplace](
-      "RegExpReplace support for string literal input patterns",
+      "RegExpReplace",

and here

@@ -206,17 +206,15 @@ abstract class Spark31XdbShims extends Spark31XdbShimsBase with Logging {
         override def convertToGpu(child: Expression): GpuExpression = GpuAbs(child, false)
       }),
     GpuOverrides.expr[RegExpReplace](
-      "RegExpReplace support for string literal input patterns",
+      "RegExpReplace",

and here

@@ -309,7 +309,7 @@ trait Spark320PlusShims extends SparkShims with RebaseShims with Logging {
         override def convertToGpu(child: Expression): GpuExpression = GpuAbs(child, ansiEnabled)
       }),
     GpuOverrides.expr[RegExpReplace](
-      "RegExpReplace support for string literal input patterns",
+      "RegExpReplace",

and here too

…nctions.scala

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
andygrove changed the title from "Enable regular expressions on GPU by default" to "Enable regular expressions on GPU by default [databricks]" on Feb 10, 2022
jlowe (Member) commented on Feb 10, 2022

build

andygrove merged commit 2053dc7 into NVIDIA:branch-22.04 on Feb 10, 2022
andygrove deleted the enable-regexp branch on February 10, 2022 20:41
sameerz added the "task" label (Work required that improves the product but is not user facing) on Feb 15, 2022