[BUG] array_test.py::test_array_transform_non_deterministic failed with non-UTC time zone #10055

Closed
NvTimLiu opened this issue Dec 14, 2023 · 1 comment · Fixed by #10060
Labels
bug Something isn't working

Comments

@NvTimLiu
Collaborator

NvTimLiu commented Dec 14, 2023

Describe the bug

 FAILED ../../src/main/python/array_test.py::test_array_transform_non_deterministic[DATAGEN_SEED=1702503689, INJECT_OOM] - galArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec
 Project [transform(sequence(0, (cast((rand(5) * 10.0) as int) + 1), None, Some(Iran)), lambdafunction((lambda x#4689 * 22),  AS t#4688]
 +- GpuColumnarToRow false
    +- GpuRange (0, 1, step=1, splits=2)
 FAILED ../../src/main/python/array_test.py::test_array_transform_non_deterministic_second_param[DATAGEN_SEED=1702503689] - galArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec
 Project [transform(sequence(0, (cast((rand(5) * 10.0) as int) + 1), None, Some(Iran)), lambdafunction((lambda x#4700 +  x#4700, lambda i#4701, false)) AS t#4699]
 +- GpuColumnarToRow false
    +- GpuRange (0, 1, step=1, splits=2)

 =================================== FAILURES ===================================
 ____________________ test_array_transform_non_deterministic ____________________
 [gw2] linux -- Python 3.9.18 /opt/conda/bin/python
 
     def test_array_transform_non_deterministic():
 >       assert_gpu_and_cpu_are_equal_collect(
                 lambda spark : spark.range(1).selectExpr("transform(sequence(0, cast(rand(5)*10 as int) + 1), x -> x * 22) 
                 conf={'spark.rapids.sql.castFloatToIntegralTypes.enabled': True})
 
 ../../src/main/python/array_test.py:336: 
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 ../../src/main/python/asserts.py:588: in assert_gpu_and_cpu_are_equal_collect
     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first, nc_before_compare=result_canonicalize_func_before_compare)
 ../../src/main/python/asserts.py:494: in _assert_gpu_and_cpu_are_equal
     from_gpu = run_on_gpu()
 ../../src/main/python/asserts.py:487: in run_on_gpu
     from_gpu = with_gpu_session(bring_back, conf=conf)
 ../../src/main/python/spark_session.py:122: in with_gpu_session
     return with_spark_session(func, conf=copy)
 ../../src/main/python/spark_session.py:89: in with_spark_session
     ret = func(_spark)
 ../../src/main/python/asserts.py:205: in <lambda>
     bring_back = lambda spark: limit_func(spark).collect()
 ../../../spark-3.1.1-bin-hadoop3.2/python/pyspark/sql/dataframe.py:677: in collect
     sock_info = self._jdf.collectToPython()
 /home/jenkins/agent/workspace/jenkins-rapids_it-non-utc-dev-12/jars/spark-3.1.1-bin-hadoop3.2/python/lib/py4j-0.10.9-way.py:1304: in __call__
     return_value = get_return_value(
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
 a = ('xro42497', <py4j.java_gateway.GatewayClient object at 0x7f6caea38430>, 'o42496', 'collectToPython')
 kw = {}
 converted = IllegalArgumentException('Part of the plan is not columnar class xecution.ProjectExec\nProject [...:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat hread.java:750)\n', None)
 
     def deco(*a, **kw):
         try:
             return f(*a, **kw)
         except py4j.protocol.Py4JJavaError as e:
             converted = convert_exception(e.java_exception)
             if not isinstance(converted, UnknownException):
                 # Hide where the exception came from that shows a non-Pythonic
                 # JVM exception message.
 >               raise converted from None
 E               pyspark.sql.utils.IllegalArgumentException: Part of the plan is not columnar class xecution.ProjectExec
 E               Project [transform(sequence(0, (cast((rand(5) * 10.0) as int) + 1), None, Some(Iran)),  x#4689 * 22), lambda x#4689, false)) AS t#4688]
 E               +- GpuColumnarToRow false
 E                  +- GpuRange (0, 1, step=1, splits=2)
 
 ../../../spark-3.1.1-bin-hadoop3.2/python/pyspark/sql/utils.py:117: IllegalArgumentException
 ----------------------------- Captured stdout call -----------------------------
 ### CPU RUN ###
 ### GPU RUN ###
 _____________ test_array_transform_non_deterministic_second_param ______________
 [gw2] linux -- Python 3.9.18 /opt/conda/bin/python
 
     def test_array_transform_non_deterministic_second_param():
 >       assert_gpu_and_cpu_are_equal_collect(
                 lambda spark : debug_df(spark.range(1).selectExpr("transform(sequence(0, cast(rand(5)*10 as int) + 1), (x, 
                 conf={'spark.rapids.sql.castFloatToIntegralTypes.enabled': True})
 
 ../../src/main/python/array_test.py:341: 
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 ../../src/main/python/asserts.py:588: in assert_gpu_and_cpu_are_equal_collect
     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first, nc_before_compare=result_canonicalize_func_before_compare)
 ../../src/main/python/asserts.py:494: in _assert_gpu_and_cpu_are_equal
     from_gpu = run_on_gpu()
 ../../src/main/python/asserts.py:487: in run_on_gpu
     from_gpu = with_gpu_session(bring_back, conf=conf)
 ../../src/main/python/spark_session.py:122: in with_gpu_session
     return with_spark_session(func, conf=copy)
 ../../src/main/python/spark_session.py:89: in with_spark_session
     ret = func(_spark)
 ../../src/main/python/asserts.py:205: in <lambda>
     bring_back = lambda spark: limit_func(spark).collect()
 ../../src/main/python/array_test.py:342: in <lambda>
     lambda spark : debug_df(spark.range(1).selectExpr("transform(sequence(0, cast(rand(5)*10 as int) + 1), (x, i) -> x + i) 
 ../../src/main/python/data_gen.py:910: in debug_df
     print('COLLECTED\n{}'.format(df.collect()))
 ../../../spark-3.1.1-bin-hadoop3.2/python/pyspark/sql/dataframe.py:677: in collect
     sock_info = self._jdf.collectToPython()
 /home/jenkins/agent/workspace/jenkins-rapids_it-non-utc-dev-12/jars/spark-3.1.1-bin-hadoop3.2/python/lib/py4j-0.10.9-way.py:1304: in __call__
     return_value = get_return_value(
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
 a = ('xro42665', <py4j.java_gateway.GatewayClient object at 0x7f6caea38430>, 'o42664', 'collectToPython')
 kw = {}
 converted = IllegalArgumentException('Part of the plan is not columnar class xecution.ProjectExec\nProject [...:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat hread.java:750)\n', None)
 
     def deco(*a, **kw):
         try:
             return f(*a, **kw)
         except py4j.protocol.Py4JJavaError as e:
             converted = convert_exception(e.java_exception)
             if not isinstance(converted, UnknownException):
                 # Hide where the exception came from that shows a non-Pythonic
                 # JVM exception message.
 >               raise converted from None
 E               pyspark.sql.utils.IllegalArgumentException: Part of the plan is not columnar class xecution.ProjectExec
 E               Project [transform(sequence(0, (cast((rand(5) * 10.0) as int) + 1), None, Some(Iran)),  x#4700 + lambda i#4701), lambda x#4700, lambda i#4701, false)) AS t#4699]
 E               +- GpuColumnarToRow false
 E                  +- GpuRange (0, 1, step=1, splits=2)
 
 ../../../spark-3.1.1-bin-hadoop3.2/python/pyspark/sql/utils.py:117: IllegalArgumentException
 ----------------------------- Captured stdout call -----------------------------
 ### CPU RUN ###
 COLLECTED
 [Row(t=[0, 2])]
 == Physical Plan ==
 Project [transform(sequence(0, (cast((rand(5) * 10.0) as int) + 1), None, Some(Iran)), lambdafunction((lambda x#4694 +  x#4694, lambda i#4695, false)) AS t#4693]
 +- *(1) Range (0, 1, step=1, splits=2)
 
 
 root
  |-- t: array (nullable = true)
  |    |-- element: integer (containsNull = false)
 
 ### GPU RUN ###
NvTimLiu added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels on Dec 14, 2023
NVnavkumar changed the title from "[BUG] array_test.py::test_array_transform_non_deterministic failed" to "[BUG] array_test.py::test_array_transform_non_deterministic failed with non-UTC time zone" on Dec 14, 2023
res-life self-assigned this on Dec 15, 2023
@res-life
Collaborator

Fix: #10060

mattahrens removed the "? - Needs Triage" (Need team to review and classify) label on Dec 20, 2023