[BUG] crash if we have a decimal128 in a struct in an array #4600

Closed · revans2 opened this issue Jan 21, 2022 · 2 comments
Labels: bug (Something isn't working), P0 (Must have for release)

revans2 commented Jan 21, 2022

Describe the bug

spark.range(10).selectExpr("id", "array(struct('a', CAST(9 as DECIMAL(29,0))))").collect().length

throws

java.lang.IllegalStateException: Unexpected element type: class java.math.BigInteger
  at ai.rapids.cudf.HostColumnVector$ColumnBuilder.appendChildOrNull(HostColumnVector.java:1146)
  at ai.rapids.cudf.HostColumnVector$ColumnBuilder.append(HostColumnVector.java:1053)
  at ai.rapids.cudf.HostColumnVector$ColumnBuilder.appendStructValues(HostColumnVector.java:926)
  at ai.rapids.cudf.HostColumnVector.fromStructs(HostColumnVector.java:302)
  at ai.rapids.cudf.ColumnVector.fromStructs(ColumnVector.java:1193)
  at com.nvidia.spark.rapids.GpuScalar$.columnVectorFromLiterals(literals.scala:202)
  at com.nvidia.spark.rapids.GpuScalar$.from(literals.scala:325)
  at com.nvidia.spark.rapids.GpuScalar.getBase(literals.scala:469)
  at com.nvidia.spark.rapids.GpuColumnVector.from(GpuColumnVector.java:877)
  at com.nvidia.spark.rapids.GpuExpressionsUtils$.$anonfun$resolveColumnVector$1(GpuExpressions.scala:74)
  at com.nvidia.spark.rapids.Arm.withResourceIfAllowed(Arm.scala:73)
  at com.nvidia.spark.rapids.Arm.withResourceIfAllowed$(Arm.scala:71)
  at com.nvidia.spark.rapids.GpuExpressionsUtils$.withResourceIfAllowed(GpuExpressions.scala:30)
  at com.nvidia.spark.rapids.GpuExpressionsUtils$.resolveColumnVector(GpuExpressions.scala:72)
  at com.nvidia.spark.rapids.GpuExpressionsUtils$.columnarEvalToColumn(GpuExpressions.scala:93)
  at com.nvidia.spark.rapids.GpuProjectExec$.projectSingle(basicPhysicalOperators.scala:102)
  at com.nvidia.spark.rapids.GpuProjectExec$.$anonfun$project$1(basicPhysicalOperators.scala:109)
  at com.nvidia.spark.rapids.RapidsPluginImplicits$MapsSafely.$anonfun$safeMap$1(implicits.scala:162)
  at com.nvidia.spark.rapids.RapidsPluginImplicits$MapsSafely.$anonfun$safeMap$1$adapted(implicits.scala:159)
  at scala.collection.immutable.List.foreach(List.scala:431)
  at com.nvidia.spark.rapids.RapidsPluginImplicits$MapsSafely.safeMap(implicits.scala:159)
  at com.nvidia.spark.rapids.RapidsPluginImplicits$AutoCloseableProducingSeq.safeMap(implicits.scala:194)
  at com.nvidia.spark.rapids.GpuProjectExec$.project(basicPhysicalOperators.scala:109)
  at com.nvidia.spark.rapids.GpuProjectExec$.projectAndClose(basicPhysicalOperators.scala:73)
  at com.nvidia.spark.rapids.GpuProjectExec.$anonfun$doExecuteColumnar$1(basicPhysicalOperators.scala:149)
  at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
  at com.nvidia.spark.rapids.ColumnarToRowIterator.$anonfun$fetchNextBatch$2(GpuColumnarToRowExec.scala:242)
  at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
  at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
  at com.nvidia.spark.rapids.ColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:188)
  at com.nvidia.spark.rapids.ColumnarToRowIterator.fetchNextBatch(GpuColumnarToRowExec.scala:239)
  at com.nvidia.spark.rapids.ColumnarToRowIterator.loadNextBatch(GpuColumnarToRowExec.scala:216)
  at com.nvidia.spark.rapids.ColumnarToRowIterator.hasNext(GpuColumnarToRowExec.scala:256)
  at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:349)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:131)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)

This was found as a part of #4470
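
The exception message points at the root cause: a decimal with precision above 18 no longer fits in a 64-bit long, so the unscaled value presumably reaches the host-side builder as a java.math.BigInteger, which appendChildOrNull has no branch for when walking struct children. For orientation only, here is a minimal standalone Scala sketch (not the cudf builder code; Decimal128Words and toWords are hypothetical names) of how such an unscaled BigInteger maps onto the two 64-bit words that back a 128-bit decimal column:

  import java.math.{BigDecimal => JBigDecimal, BigInteger}

  object Decimal128Words {
    // Split a BigDecimal's unscaled value into the (low, high) 64-bit words of
    // a 128-bit two's-complement integer, the layout a DECIMAL128 column stores.
    def toWords(dec: JBigDecimal): (Long, Long) = {
      val unscaled = dec.unscaledValue()
      require(unscaled.bitLength() <= 127, s"value does not fit in 128 bits: $unscaled")
      val be  = unscaled.toByteArray                  // big-endian two's complement
      val buf = new Array[Byte](16)
      val pad = if (unscaled.signum() < 0) 0xFF.toByte else 0.toByte // sign-extend
      java.util.Arrays.fill(buf, 0, 16 - be.length, pad)
      System.arraycopy(be, 0, buf, 16 - be.length, be.length)
      val high = buf.take(8).foldLeft(0L)((acc, b) => (acc << 8) | (b & 0xFF))
      val low  = buf.drop(8).foldLeft(0L)((acc, b) => (acc << 8) | (b & 0xFF))
      (low, high)
    }

    def main(args: Array[String]): Unit = {
      val v = new JBigDecimal(BigInteger.valueOf(9), 0) // the CAST(9 AS DECIMAL(29,0)) literal
      println(toWords(v))                               // prints (9,0)
    }
  }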

revans2 added the bug (Something isn't working), ? - Needs Triage (Need team to review and classify), and P0 (Must have for release) labels on Jan 21, 2022
revans2 self-assigned this on Jan 21, 2022
revans2 removed the ? - Needs Triage (Need team to review and classify) label on Jan 21, 2022
revans2 added this to the Jan 10 - Jan 28 milestone on Jan 21, 2022

revans2 commented Jan 21, 2022

rapidsai/cudf#10105 was just merged into cudf and should fix the issue. I just kicked off the build; once it is in and the premerge build for #4470 passes, I'll call it good.
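
A quick sanity check in spark-shell once the new cudf jar is picked up (a sketch, assuming the RAPIDS plugin is on the classpath and enabled; the arr alias is added here only for readability):

  val rows = spark.range(10)
    .selectExpr("id", "array(struct('a', CAST(9 AS DECIMAL(29,0)))) AS arr")
    .collect()
  assert(rows.length == 10)
  // With the fix in place the literal should round-trip instead of throwing:
  println(rows.head.getSeq[org.apache.spark.sql.Row](1).head)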


nartal1 commented Jan 21, 2022

Thanks @revans2 for fixing this issue.

@revans2 revans2 closed this as completed Jan 21, 2022