Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GPU version of ToPrettyString [databricks] #9221

Merged
merged 28 commits into from
Sep 28, 2023

Conversation

razajafri
Copy link
Collaborator

@razajafri razajafri commented Sep 12, 2023

This PR adds a GpuToPrettyString so we don't fall back to the CPU when show() is called in REPL.

The PR was manually tested and unit tests were added

fixes #9065
fixes #9171
fixes #9220

@razajafri razajafri self-assigned this Sep 12, 2023
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri razajafri changed the title Add GPU version of ToPrettyString Add GPU version of ToPrettyString [databricks] Sep 12, 2023
@razajafri razajafri marked this pull request as draft September 12, 2023 04:15
Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick pass through much of the code and it feels like there is a lot of refactoring that I have a hard time seeing any value with. I understand ToStringBase so you can override how some functions are implemented, but everything else feels unnecessary.

@sameerz sameerz added the Spark 3.5+ Spark 3.5+ issues label Sep 13, 2023
@razajafri razajafri marked this pull request as ready for review September 20, 2023 19:17
@razajafri
Copy link
Collaborator Author

@revans2 and @andygrove thank you for reviewing.

I have addressed all of your comments I think. PTAL

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only non-nit I have left is what Andy showed about ANSI mode for cast.

val cast = meta.wrapped.asInstanceOf[UnaryExpression]
val from = cast.child.dataType
val to = StringType
if (!gpuCanCast(from, to)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is going to show up in the generated docs, even if we tried to build them with 3.5.0. We should at least have a follow on issue to figure that out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if there is a valid need to print ToPrettyString as part of the supported_ops. It is a sub set of what GpuCast can do.

@razajafri
Copy link
Collaborator Author

@andygrove @revans2 PTAL.

@razajafri
Copy link
Collaborator Author

build

@razajafri razajafri marked this pull request as draft September 27, 2023 16:20
@razajafri
Copy link
Collaborator Author

@revans2 @andygrove I have addressed your concerns. PTAL

I have also converted this to draft until rapidsai/cudf#14205 is merged

@@ -25,6 +25,7 @@ import org.scalatest.funsuite.AnyFunSuite
import org.apache.spark.sql.Row
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make changes to this file because of the way InternalRow accesses Arrays, Maps and Structs when using BoundReference

@andygrove
Copy link
Contributor

@mimaomao @kevnzhao fyi, this PR is ready to go, but we are just waiting on rapidsai/cudf#14205 to be merged before we merge this one.

@razajafri
Copy link
Collaborator Author

build

@razajafri razajafri marked this pull request as ready for review September 28, 2023 16:47
@razajafri razajafri merged commit d8a5e5e into NVIDIA:branch-23.10 Sep 28, 2023
29 checks passed
@razajafri razajafri deleted the SP-9171-toprettystring branch September 28, 2023 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Spark 3.5+ Spark 3.5+ issues
Projects
None yet
4 participants