[BUG] array_item test failures on Spark 3.3.x #8652

Closed
jlowe opened this issue Jul 3, 2023 · 0 comments · Fixed by #11054
Labels
bug Something isn't working

Comments

jlowe (Member) commented Jul 3, 2023

The following tests are failing on Spark 3.3.x because each test expects an exception to be thrown, but the CPU Spark session does not throw it. These failures were observed on Spark 3.3.0 and Spark 3.3.2.
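
For reference, here is a minimal standalone sketch of the behavior the tests expect (assuming a local CPU-only Spark 3.3.x session; the literal array and index are illustrative and not taken from the tests' data generators):

# Sketch: with ANSI mode and the strict index operator enabled, Spark 3.3.x
# should raise SparkArrayIndexOutOfBoundsException for an out-of-bounds
# array index. The failures below show the CPU session not raising it.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("local[1]")
         .config("spark.sql.ansi.enabled", "true")
         .config("spark.sql.ansi.strictIndexOperator", "true")
         .getOrCreate())

try:
    # index 100 is out of bounds for a 3-element array
    spark.sql("SELECT array(1, 2, 3)[100]").collect()
    print("no exception raised, matching the failures reported below")
except Exception as e:
    print(f"raised {type(e).__name__}: {e}")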

==================================================================== FAILURES =====================================================================
___________________________________________________ test_array_item_with_strict_index[-2-True] ____________________________________________________
[gw4] linux -- Python 3.8.10 /home/jlowe/miniconda3/envs/cudf_dev/bin/python

strict_index_enabled = True, index = -2

    @pytest.mark.skipif(not is_spark_33X() or is_databricks_runtime(), reason="'strictIndexOperator' is introduced from Spark 3.3.0 and removed in Spark 3.4.0 and DB11.3")
    @pytest.mark.parametrize('strict_index_enabled', [True, False])
    @pytest.mark.parametrize('index', [-2, 100, array_neg_index_gen, array_out_index_gen], ids=idfn)
    def test_array_item_with_strict_index(strict_index_enabled, index):
        message = "SparkArrayIndexOutOfBoundsException"
        if isinstance(index, int):
            test_df = lambda spark: unary_op_df(spark, ArrayGen(int_gen)).select(col('a')[index])
        else:
            test_df = lambda spark: two_col_df(spark, ArrayGen(int_gen), index).selectExpr('a[b]')
    
        test_conf = copy_and_update(ansi_enabled_conf, {'spark.sql.ansi.strictIndexOperator': strict_index_enabled})
    
        if strict_index_enabled:
>           assert_gpu_and_cpu_error(
                lambda spark: test_df(spark).collect(),
                conf=test_conf,
                error_message=message)

../../src/main/python/array_test.py:136: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../src/main/python/asserts.py:626: in assert_gpu_and_cpu_error
    assert_spark_exception(lambda: with_cpu_session(df_fun, conf), error_message)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

func = <function assert_gpu_and_cpu_error.<locals>.<lambda> at 0x7f91f460e790>, error_message = 'SparkArrayIndexOutOfBoundsException'

    def assert_spark_exception(func, error_message):
        """
        Assert that a specific Java exception is thrown
        :param func: a function to be verified
        :param error_message: a string such as the one produced by java.lang.Exception.toString
        :return: Assertion failure if no exception matching error_message has occurred.
        """
        with pytest.raises(Exception) as excinfo:
>           func()
E           Failed: DID NOT RAISE <class 'Exception'>

../../src/main/python/asserts.py:613: Failed
___________________________________________________ test_array_item_with_strict_index[100-True] ___________________________________________________
[gw4] linux -- Python 3.8.10 /home/jlowe/miniconda3/envs/cudf_dev/bin/python

strict_index_enabled = True, index = 100

    @pytest.mark.skipif(not is_spark_33X() or is_databricks_runtime(), reason="'strictIndexOperator' is introduced from Spark 3.3.0 and removed in Spark 3.4.0 and DB11.3")
    @pytest.mark.parametrize('strict_index_enabled', [True, False])
    @pytest.mark.parametrize('index', [-2, 100, array_neg_index_gen, array_out_index_gen], ids=idfn)
    def test_array_item_with_strict_index(strict_index_enabled, index):
        message = "SparkArrayIndexOutOfBoundsException"
        if isinstance(index, int):
            test_df = lambda spark: unary_op_df(spark, ArrayGen(int_gen)).select(col('a')[index])
        else:
            test_df = lambda spark: two_col_df(spark, ArrayGen(int_gen), index).selectExpr('a[b]')
    
        test_conf = copy_and_update(ansi_enabled_conf, {'spark.sql.ansi.strictIndexOperator': strict_index_enabled})
    
        if strict_index_enabled:
>           assert_gpu_and_cpu_error(
                lambda spark: test_df(spark).collect(),
                conf=test_conf,
                error_message=message)

../../src/main/python/array_test.py:136: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../src/main/python/asserts.py:626: in assert_gpu_and_cpu_error
    assert_spark_exception(lambda: with_cpu_session(df_fun, conf), error_message)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

func = <function assert_gpu_and_cpu_error.<locals>.<lambda> at 0x7f91cdb680d0>, error_message = 'SparkArrayIndexOutOfBoundsException'

    def assert_spark_exception(func, error_message):
        """
        Assert that a specific Java exception is thrown
        :param func: a function to be verified
        :param error_message: a string such as the one produced by java.lang.Exception.toString
        :return: Assertion failure if no exception matching error_message has occurred.
        """
        with pytest.raises(Exception) as excinfo:
>           func()
E           Failed: DID NOT RAISE <class 'Exception'>

../../src/main/python/asserts.py:613: Failed
___________________________________________________ test_array_item_ansi_fail_invalid_index[-2] ___________________________________________________
[gw4] linux -- Python 3.8.10 /home/jlowe/miniconda3/envs/cudf_dev/bin/python

index = -2

    @pytest.mark.parametrize('index', [-2, 100, array_neg_index_gen, array_out_index_gen], ids=idfn)
    def test_array_item_ansi_fail_invalid_index(index):
        message = "SparkArrayIndexOutOfBoundsException" if (is_databricks104_or_later() or is_spark_330_or_later()) else "java.lang.ArrayIndexOutOfBoundsException"
        if isinstance(index, int):
            test_func = lambda spark: unary_op_df(spark, ArrayGen(int_gen)).select(col('a')[index]).collect()
        else:
            test_func = lambda spark: two_col_df(spark, ArrayGen(int_gen), index).selectExpr('a[b]').collect()
>       assert_gpu_and_cpu_error(
            test_func,
            conf=ansi_enabled_conf,
            error_message=message)

../../src/main/python/array_test.py:153: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../src/main/python/asserts.py:626: in assert_gpu_and_cpu_error
    assert_spark_exception(lambda: with_cpu_session(df_fun, conf), error_message)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

func = <function assert_gpu_and_cpu_error.<locals>.<lambda> at 0x7f91cf36dee0>, error_message = 'SparkArrayIndexOutOfBoundsException'

    def assert_spark_exception(func, error_message):
        """
        Assert that a specific Java exception is thrown
        :param func: a function to be verified
        :param error_message: a string such as the one produced by java.lang.Exception.toString
        :return: Assertion failure if no exception matching error_message has occurred.
        """
        with pytest.raises(Exception) as excinfo:
>           func()
E           Failed: DID NOT RAISE <class 'Exception'>

../../src/main/python/asserts.py:613: Failed
__________________________________________________ test_array_item_ansi_fail_invalid_index[100] ___________________________________________________
[gw4] linux -- Python 3.8.10 /home/jlowe/miniconda3/envs/cudf_dev/bin/python

index = 100

    @pytest.mark.parametrize('index', [-2, 100, array_neg_index_gen, array_out_index_gen], ids=idfn)
    def test_array_item_ansi_fail_invalid_index(index):
        message = "SparkArrayIndexOutOfBoundsException" if (is_databricks104_or_later() or is_spark_330_or_later()) else "java.lang.ArrayIndexOutOfBoundsException"
        if isinstance(index, int):
            test_func = lambda spark: unary_op_df(spark, ArrayGen(int_gen)).select(col('a')[index]).collect()
        else:
            test_func = lambda spark: two_col_df(spark, ArrayGen(int_gen), index).selectExpr('a[b]').collect()
>       assert_gpu_and_cpu_error(
            test_func,
            conf=ansi_enabled_conf,
            error_message=message)

../../src/main/python/array_test.py:153: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../src/main/python/asserts.py:626: in assert_gpu_and_cpu_error
    assert_spark_exception(lambda: with_cpu_session(df_fun, conf), error_message)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

func = <function assert_gpu_and_cpu_error.<locals>.<lambda> at 0x7f91cdb65820>, error_message = 'SparkArrayIndexOutOfBoundsException'

    def assert_spark_exception(func, error_message):
        """
        Assert that a specific Java exception is thrown
        :param func: a function to be verified
        :param error_message: a string such as the one produced by java.lang.Exception.toString
        :return: Assertion failure if no exception matching error_message has occurred.
        """
        with pytest.raises(Exception) as excinfo:
>           func()
E           Failed: DID NOT RAISE <class 'Exception'>

../../src/main/python/asserts.py:613: Failed
---- generated xml file: /home/jlowe/src/spark-rapids/integration_tests/target/run_dir-20230703093946-9jfJ/TEST-pytest-1688395186391743284.xml ----
============================================================= short test summary info =============================================================
FAILED ../../src/main/python/array_test.py::test_array_item_with_strict_index[-2-True][INJECT_OOM] - Failed: DID NOT RAISE <class 'Exception'>
FAILED ../../src/main/python/array_test.py::test_array_item_with_strict_index[100-True] - Failed: DID NOT RAISE <class 'Exception'>
FAILED ../../src/main/python/array_test.py::test_array_item_ansi_fail_invalid_index[-2] - Failed: DID NOT RAISE <class 'Exception'>
FAILED ../../src/main/python/array_test.py::test_array_item_ansi_fail_invalid_index[100] - Failed: DID NOT RAISE <class 'Exception'>