-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Struct to string casting functionality #1814
Conversation
4079523
to
ab41687
Compare
ab41687
to
c89599b
Compare
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/TypeChecks.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Ryan Lee <ryanlee@nvidia.com>
Signed-off-by: Ryan Lee <ryanlee@nvidia.com>
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
columns += ColumnVector.fromScalar(bracketScalar, input.getRowCount().toInt) | ||
} | ||
withResource(input.getBase().getChildColumnView(0)) { childView => | ||
withResource(childView.copyToColumnVector()) { childVector => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to copy this to a column vector? I understand because of the APIs we have right now it is needed, but it would be nice to avoid the copy if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we would not copy this column vector, which is why I had been using toString
in earlier iterations. How might I avoid this copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signature of doColumnar
takes a GpuColumnVector
It uses the GpuColumnVector
to extract its Spark data type so it can do the cast. A ColumnVector
is a ColumnView
. As such I would propose that we insert another API in between that we could use for recursive calls.
override def doColumnar(input: GpuColumnVector): ColumnVector =
doColumnar(input.getBase, input.dataType())
private def doColumnar(input: ColumnView, sparkType: DataType): COlumnVector = {
(sparkType, dataType) match {
...
}
}
Then you can call the new doColumnar without making copy of the data.
withResource(input.getBase().getChildColumnView(0)) { childView =>
columns += doColumnar(childView, inputSchema(0).dataType)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There a few things around decimal casting that didn't translate nicely. In some cases they want to do a zero-copy, so I've filtered them out in the GpuColumnVector
wrapper API. In those cases, if those cases are found when doing a ColumnView
cast, it's copied to a ColumnVector. Reworking the decimal support to be ColumnView
friendly version can be done in a follow up PR if necessary, but it seemed a bit out of scope and was non-trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have a follow on issue to clean this up? The issue appears to be because of incRefCount, and there should be ways for us to work around that without too much difficulty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
build |
Signed-off-by: Ryan Lee <ryanlee@nvidia.com> Co-authored-by: Ryan Lee <ryanlee@nvidia.com>
Signed-off-by: Ryan Lee <ryanlee@nvidia.com> Co-authored-by: Ryan Lee <ryanlee@nvidia.com>
Resolves #1698
Requires rapidsai/cudf#7406 -- draft until that PR is merged
Inherits incompatibility of base casting -- e.g. floats -- because casting is used recursively