[BUG] assert failed test_struct_self_join and test_computation_in_grpby_columns #5286

Closed
pxLi opened this issue Apr 20, 2022 · 7 comments · Fixed by #5400
Assignees
Labels
bug Something isn't working cudf_dependency An issue or PR with this label depends on a new feature in cudf P0 Must have for release

Comments

@pxLi
Collaborator

pxLi commented Apr 20, 2022

Describe the bug
These failures occurred in the integration tests and seem to be related to recent cudf changes.

[2022-04-20T02:45:21.259Z] FAILED ../../src/main/python/hash_aggregate_test.py::test_computation_in_grpby_columns[IGNORE_ORDER]
[2022-04-20T02:45:21.260Z] FAILED ../../src/main/python/join_test.py::test_struct_self_join[IGNORE_ORDER({'local': True})]

join_test.py::test_struct_self_join[IGNORE_ORDER({'local': True})]:

[2022-04-20T02:45:21.257Z] ### COLLECT: GPU TOOK 0.4855940341949463 CPU TOOK 0.5880005359649658 ###
[2022-04-20T02:45:21.257Z] CPU OUTPUT: [Row(col=Row(name=Row(firstname='Adam ', middlename='', lastname='Green'), newname=Row(firstname='Adam ', lastname='Green')), name=Row(firstname='Adam ', middlename='', lastname='Green')), Row(col=Row(name=Row(firstname='Bob ', middlename='Middle', lastname='Green'), newname=Row(firstname='Bob ', lastname='Green')), name=Row(firstname='Bob ', middlename='Middle', lastname='Green')), Row(col=Row(name=Row(firstname='Cathy ', middlename='', lastname='Green'), newname=Row(firstname='Cathy ', lastname='Green')), name=Row(firstname='Cathy ', middlename='', lastname='Green'))]
[2022-04-20T02:45:21.258Z] GPU OUTPUT: [Row(col=Row(name=Row(firstname='Adam ', middlename=None, lastname='Green'), newname=Row(firstname='Adam ', lastname='Green')), name=Row(firstname='Adam ', middlename=None, lastname='Green')), Row(col=Row(name=Row(firstname='Bob ', middlename='Middle', lastname='Green'), newname=Row(firstname='Bob ', lastname='Green')), name=Row(firstname='Bob ', middlename='Middle', lastname='Green')), Row(col=Row(name=Row(firstname='Cathy ', middlename=None, lastname='Green'), newname=Row(firstname='Cathy ', lastname='Green')), name=Row(firstname='Cathy ', middlename=None, lastname='Green'))]

hash_aggregate_test.py::test_computation_in_grpby_columns[IGNORE_ORDER]:

[2022-04-20T02:45:21.255Z] ### COLLECT: GPU TOOK 2.2361795902252197 CPU TOOK 0.8191990852355957 ###
[2022-04-20T02:45:21.255Z] CPU OUTPUT: [Row(substring(a, 2, 10)=None, sum(b)=306548), Row(substring(a, 2, 10)='', sum(b)=-148367), Row(substring(a, 2, 10)='a', sum(b)=198339), Row(substring(a, 2, 10)='aa', sum(b)=-213522), Row(substring(a, 2, 10)='aaa', sum(b)=462), Row(substring(a, 2, 10)='aaaa', sum(b)=-210714), Row(substring(a, 2, 10)='aaaaa', sum(b)=-48129), Row(substring(a, 2, 10)='aaaaaa', sum(b)=-137496), Row(substring(a, 2, 10)='aaaaaaa', sum(b)=-30235), Row(substring(a, 2, 10)='aaaaaaaa', sum(b)=163441), Row(substring(a, 2, 10)='aaaaaaaaa', sum(b)=206396), Row(substring(a, 2, 10)='aaaaaaaaaa', sum(b)=-807799)]
[2022-04-20T02:45:21.255Z] GPU OUTPUT: [Row(substring(a, 2, 10)=None, sum(b)=-67160), Row(substring(a, 2, 10)=None, sum(b)=306548), Row(substring(a, 2, 10)='', sum(b)=-81207), Row(substring(a, 2, 10)='a', sum(b)=198339), Row(substring(a, 2, 10)='aa', sum(b)=-213522), Row(substring(a, 2, 10)='aaa', sum(b)=462), Row(substring(a, 2, 10)='aaaa', sum(b)=-210714), Row(substring(a, 2, 10)='aaaaa', sum(b)=-48129), Row(substring(a, 2, 10)='aaaaaa', sum(b)=-137496), Row(substring(a, 2, 10)='aaaaaaa', sum(b)=-30235), Row(substring(a, 2, 10)='aaaaaaaa', sum(b)=163441), Row(substring(a, 2, 10)='aaaaaaaaa', sum(b)=206396), Row(substring(a, 2, 10)='aaaaaaaaaa', sum(b)=-807799)]
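
One thing worth noting in the two outputs above: the GPU result has an extra `None` group, and the sums of the two GPU groups that do not match the CPU add up to the CPU sum for `''`. That would be consistent with some empty-string keys ending up in a null group on the GPU, although the numbers alone do not prove it. A quick check, using only values copied from the log (the variable names are just for illustration):

```python
# (key, sum) pairs copied from the CPU and GPU outputs above; only the
# None and '' groups are shown because every other group matches.
cpu_groups = {None: 306548, "": -148367}
gpu_groups = [(None, -67160), (None, 306548), ("", -81207)]

# GPU rows whose (key, sum) pair does not appear in the CPU result.
mismatched = [(k, s) for k, s in gpu_groups if cpu_groups.get(k) != s]

# Adding the mismatched sums back together recovers the CPU sum for ''.
recombined = sum(s for _, s in mismatched)

print(mismatched)   # [(None, -67160), ('', -81207)]
print(recombined)   # -148367
```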

Detailed log:

[2022-04-20T02:45:21.254Z] =================================== FAILURES ===================================
[2022-04-20T02:45:21.254Z] ______________________ test_computation_in_grpby_columns _______________________
[2022-04-20T02:45:21.254Z] [gw2] linux -- Python 3.8.13 /databricks/conda/envs/cudf-udf/bin/python
[2022-04-20T02:45:21.254Z] 
[2022-04-20T02:45:21.254Z]     @ignore_order
[2022-04-20T02:45:21.254Z]     def test_computation_in_grpby_columns():
[2022-04-20T02:45:21.254Z]         conf = {'spark.rapids.sql.batchSizeBytes' : '250'}
[2022-04-20T02:45:21.254Z]         data_gen = [
[2022-04-20T02:45:21.254Z]                 ('a', RepeatSeqGen(StringGen('a{1,20}'), length=50)),
[2022-04-20T02:45:21.254Z]                 ('b', short_gen)]
[2022-04-20T02:45:21.254Z] >       assert_gpu_and_cpu_are_equal_collect(
[2022-04-20T02:45:21.254Z]             lambda spark: gen_df(spark, data_gen).groupby(f.substring(f.col('a'), 2, 10)).agg(f.sum('b')),
[2022-04-20T02:45:21.254Z]             conf = conf)
[2022-04-20T02:45:21.254Z] 
[2022-04-20T02:45:21.254Z] ../../src/main/python/hash_aggregate_test.py:352: 
[2022-04-20T02:45:21.254Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-04-20T02:45:21.254Z] ../../src/main/python/asserts.py:508: in assert_gpu_and_cpu_are_equal_collect
[2022-04-20T02:45:21.254Z]     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
[2022-04-20T02:45:21.254Z] ../../src/main/python/asserts.py:439: in _assert_gpu_and_cpu_are_equal
[2022-04-20T02:45:21.254Z]     assert_equal(from_cpu, from_gpu)
[2022-04-20T02:45:21.254Z] ../../src/main/python/asserts.py:106: in assert_equal
[2022-04-20T02:45:21.254Z]     _assert_equal(cpu, gpu, float_check=get_float_check(), path=[])
[2022-04-20T02:45:21.254Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-04-20T02:45:21.254Z] 
[2022-04-20T02:45:21.254Z] cpu = [Row(substring(a, 2, 10)=None, sum(b)=306548), Row(substring(a, 2, 10)='', sum(b)=-148367), Row(substring(a, 2, 10)='a...aa', sum(b)=-213522), Row(substring(a, 2, 10)='aaa', sum(b)=462), Row(substring(a, 2, 10)='aaaa', sum(b)=-210714), ...]
[2022-04-20T02:45:21.254Z] gpu = [Row(substring(a, 2, 10)=None, sum(b)=-67160), Row(substring(a, 2, 10)=None, sum(b)=306548), Row(substring(a, 2, 10)='...0)='a', sum(b)=198339), Row(substring(a, 2, 10)='aa', sum(b)=-213522), Row(substring(a, 2, 10)='aaa', sum(b)=462), ...]
[2022-04-20T02:45:21.254Z] float_check = <function get_float_check.<locals>.<lambda> at 0x7fae16be90d0>
[2022-04-20T02:45:21.254Z] path = []
[2022-04-20T02:45:21.254Z] 
[2022-04-20T02:45:21.254Z]     def _assert_equal(cpu, gpu, float_check, path):
[2022-04-20T02:45:21.255Z]         t = type(cpu)
[2022-04-20T02:45:21.255Z]         if (t is Row):
[2022-04-20T02:45:21.255Z]             assert len(cpu) == len(gpu), "CPU and GPU row have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
[2022-04-20T02:45:21.255Z]             if hasattr(cpu, "__fields__") and hasattr(gpu, "__fields__"):
[2022-04-20T02:45:21.255Z]                 assert cpu.__fields__ == gpu.__fields__, "CPU and GPU row have different fields at {} CPU: {} GPU: {}".format(path, cpu.__fields__, gpu.__fields__)
[2022-04-20T02:45:21.255Z]                 for field in cpu.__fields__:
[2022-04-20T02:45:21.255Z]                     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
[2022-04-20T02:45:21.255Z]             else:
[2022-04-20T02:45:21.255Z]                 for index in range(len(cpu)):
[2022-04-20T02:45:21.255Z]                     _assert_equal(cpu[index], gpu[index], float_check, path + [index])
[2022-04-20T02:45:21.255Z]         elif (t is list):
[2022-04-20T02:45:21.255Z] >           assert len(cpu) == len(gpu), "CPU and GPU list have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
[2022-04-20T02:45:21.255Z] E           AssertionError: CPU and GPU list have different lengths at [] CPU: 12 GPU: 13
[2022-04-20T02:45:21.255Z] 
[2022-04-20T02:45:21.255Z] ../../src/main/python/asserts.py:40: AssertionError
[2022-04-20T02:45:21.255Z] ----------------------------- Captured stdout call -----------------------------
[2022-04-20T02:45:21.255Z] ### CPU RUN ###
[2022-04-20T02:45:21.255Z] ### GPU RUN ###
[2022-04-20T02:45:21.255Z] ### COLLECT: GPU TOOK 2.2361795902252197 CPU TOOK 0.8191990852355957 ###
[2022-04-20T02:45:21.255Z] CPU OUTPUT: [Row(substring(a, 2, 10)=None, sum(b)=306548), Row(substring(a, 2, 10)='', sum(b)=-148367), Row(substring(a, 2, 10)='a', sum(b)=198339), Row(substring(a, 2, 10)='aa', sum(b)=-213522), Row(substring(a, 2, 10)='aaa', sum(b)=462), Row(substring(a, 2, 10)='aaaa', sum(b)=-210714), Row(substring(a, 2, 10)='aaaaa', sum(b)=-48129), Row(substring(a, 2, 10)='aaaaaa', sum(b)=-137496), Row(substring(a, 2, 10)='aaaaaaa', sum(b)=-30235), Row(substring(a, 2, 10)='aaaaaaaa', sum(b)=163441), Row(substring(a, 2, 10)='aaaaaaaaa', sum(b)=206396), Row(substring(a, 2, 10)='aaaaaaaaaa', sum(b)=-807799)]
[2022-04-20T02:45:21.255Z] GPU OUTPUT: [Row(substring(a, 2, 10)=None, sum(b)=-67160), Row(substring(a, 2, 10)=None, sum(b)=306548), Row(substring(a, 2, 10)='', sum(b)=-81207), Row(substring(a, 2, 10)='a', sum(b)=198339), Row(substring(a, 2, 10)='aa', sum(b)=-213522), Row(substring(a, 2, 10)='aaa', sum(b)=462), Row(substring(a, 2, 10)='aaaa', sum(b)=-210714), Row(substring(a, 2, 10)='aaaaa', sum(b)=-48129), Row(substring(a, 2, 10)='aaaaaa', sum(b)=-137496), Row(substring(a, 2, 10)='aaaaaaa', sum(b)=-30235), Row(substring(a, 2, 10)='aaaaaaaa', sum(b)=163441), Row(substring(a, 2, 10)='aaaaaaaaa', sum(b)=206396), Row(substring(a, 2, 10)='aaaaaaaaaa', sum(b)=-807799)]
[2022-04-20T02:45:21.255Z] ____________________________ test_struct_self_join _____________________________
[2022-04-20T02:45:21.255Z] [gw0] linux -- Python 3.8.13 /databricks/conda/envs/cudf-udf/bin/python
[2022-04-20T02:45:21.255Z] 
[2022-04-20T02:45:21.255Z] spark_tmp_table_factory = <conftest.TmpTableFactory object at 0x7f4a624ae940>
[2022-04-20T02:45:21.255Z] 
[2022-04-20T02:45:21.255Z]     @ignore_order(local=True)
[2022-04-20T02:45:21.255Z]     def test_struct_self_join(spark_tmp_table_factory):
[2022-04-20T02:45:21.255Z]         def do_join(spark):
[2022-04-20T02:45:21.255Z]             data = [
[2022-04-20T02:45:21.255Z]                 (("Adam ", "", "Green"), "1", "M", 1000),
[2022-04-20T02:45:21.255Z]                 (("Bob ", "Middle", "Green"), "2", "M", 2000),
[2022-04-20T02:45:21.255Z]                 (("Cathy ", "", "Green"), "3", "F", 3000)
[2022-04-20T02:45:21.255Z]             ]
[2022-04-20T02:45:21.255Z]             schema = (StructType()
[2022-04-20T02:45:21.255Z]                       .add("name", StructType()
[2022-04-20T02:45:21.255Z]                            .add("firstname", StringType())
[2022-04-20T02:45:21.256Z]                            .add("middlename", StringType())
[2022-04-20T02:45:21.256Z]                            .add("lastname", StringType()))
[2022-04-20T02:45:21.256Z]                       .add("id", StringType())
[2022-04-20T02:45:21.256Z]                       .add("gender", StringType())
[2022-04-20T02:45:21.256Z]                       .add("salary", IntegerType()))
[2022-04-20T02:45:21.256Z]             df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
[2022-04-20T02:45:21.256Z]             df_name = spark_tmp_table_factory.get()
[2022-04-20T02:45:21.256Z]             df.createOrReplaceTempView(df_name)
[2022-04-20T02:45:21.256Z]             resultdf = spark.sql(
[2022-04-20T02:45:21.256Z]                 "select struct(name, struct(name.firstname, name.lastname) as newname)" +
[2022-04-20T02:45:21.256Z]                 " as col,name from " + df_name + " union" +
[2022-04-20T02:45:21.256Z]                 " select struct(name, struct(name.firstname, name.lastname) as newname) as col,name" +
[2022-04-20T02:45:21.256Z]                 " from " + df_name)
[2022-04-20T02:45:21.256Z]             resultdf_name = spark_tmp_table_factory.get()
[2022-04-20T02:45:21.256Z]             resultdf.createOrReplaceTempView(resultdf_name)
[2022-04-20T02:45:21.256Z]             return spark.sql("select a.* from {} a, {} b where a.name=b.name".format(
[2022-04-20T02:45:21.256Z]                 resultdf_name, resultdf_name))
[2022-04-20T02:45:21.256Z] >       assert_gpu_and_cpu_are_equal_collect(do_join)
[2022-04-20T02:45:21.256Z] 
[2022-04-20T02:45:21.256Z] ../../src/main/python/join_test.py:771: 
[2022-04-20T02:45:21.256Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-04-20T02:45:21.256Z] ../../src/main/python/asserts.py:508: in assert_gpu_and_cpu_are_equal_collect
[2022-04-20T02:45:21.256Z]     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
[2022-04-20T02:45:21.256Z] ../../src/main/python/asserts.py:439: in _assert_gpu_and_cpu_are_equal
[2022-04-20T02:45:21.256Z]     assert_equal(from_cpu, from_gpu)
[2022-04-20T02:45:21.256Z] ../../src/main/python/asserts.py:106: in assert_equal
[2022-04-20T02:45:21.256Z]     _assert_equal(cpu, gpu, float_check=get_float_check(), path=[])
[2022-04-20T02:45:21.256Z] ../../src/main/python/asserts.py:42: in _assert_equal
[2022-04-20T02:45:21.256Z]     _assert_equal(cpu[index], gpu[index], float_check, path + [index])
[2022-04-20T02:45:21.256Z] ../../src/main/python/asserts.py:35: in _assert_equal
[2022-04-20T02:45:21.256Z]     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
[2022-04-20T02:45:21.256Z] ../../src/main/python/asserts.py:35: in _assert_equal
[2022-04-20T02:45:21.256Z]     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
[2022-04-20T02:45:21.256Z] ../../src/main/python/asserts.py:35: in _assert_equal
[2022-04-20T02:45:21.256Z]     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
[2022-04-20T02:45:21.256Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-04-20T02:45:21.256Z] 
[2022-04-20T02:45:21.256Z] cpu = '', gpu = None
[2022-04-20T02:45:21.256Z] float_check = <function get_float_check.<locals>.<lambda> at 0x7f4a6780d430>
[2022-04-20T02:45:21.256Z] path = [0, 'col', 'name', 'middlename']
[2022-04-20T02:45:21.256Z] 
[2022-04-20T02:45:21.256Z]     def _assert_equal(cpu, gpu, float_check, path):
[2022-04-20T02:45:21.256Z]         t = type(cpu)
[2022-04-20T02:45:21.256Z]         if (t is Row):
[2022-04-20T02:45:21.256Z]             assert len(cpu) == len(gpu), "CPU and GPU row have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
[2022-04-20T02:45:21.256Z]             if hasattr(cpu, "__fields__") and hasattr(gpu, "__fields__"):
[2022-04-20T02:45:21.256Z]                 assert cpu.__fields__ == gpu.__fields__, "CPU and GPU row have different fields at {} CPU: {} GPU: {}".format(path, cpu.__fields__, gpu.__fields__)
[2022-04-20T02:45:21.256Z]                 for field in cpu.__fields__:
[2022-04-20T02:45:21.256Z]                     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
[2022-04-20T02:45:21.256Z]             else:
[2022-04-20T02:45:21.256Z]                 for index in range(len(cpu)):
[2022-04-20T02:45:21.256Z]                     _assert_equal(cpu[index], gpu[index], float_check, path + [index])
[2022-04-20T02:45:21.256Z]         elif (t is list):
[2022-04-20T02:45:21.256Z]             assert len(cpu) == len(gpu), "CPU and GPU list have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
[2022-04-20T02:45:21.257Z]             for index in range(len(cpu)):
[2022-04-20T02:45:21.257Z]                 _assert_equal(cpu[index], gpu[index], float_check, path + [index])
[2022-04-20T02:45:21.257Z]         elif (t is tuple):
[2022-04-20T02:45:21.257Z]             assert len(cpu) == len(gpu), "CPU and GPU list have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
[2022-04-20T02:45:21.257Z]             for index in range(len(cpu)):
[2022-04-20T02:45:21.257Z]                 _assert_equal(cpu[index], gpu[index], float_check, path + [index])
[2022-04-20T02:45:21.257Z]         elif (t is pytypes.GeneratorType):
[2022-04-20T02:45:21.257Z]             index = 0
[2022-04-20T02:45:21.257Z]             # generator has no zip :( so we have to do this the hard way
[2022-04-20T02:45:21.257Z]             done = False
[2022-04-20T02:45:21.257Z]             while not done:
[2022-04-20T02:45:21.257Z]                 sub_cpu = None
[2022-04-20T02:45:21.257Z]                 sub_gpu = None
[2022-04-20T02:45:21.257Z]                 try:
[2022-04-20T02:45:21.257Z]                     sub_cpu = next(cpu)
[2022-04-20T02:45:21.257Z]                 except StopIteration:
[2022-04-20T02:45:21.257Z]                     done = True
[2022-04-20T02:45:21.257Z]     
[2022-04-20T02:45:21.257Z]                 try:
[2022-04-20T02:45:21.257Z]                     sub_gpu = next(gpu)
[2022-04-20T02:45:21.257Z]                 except StopIteration:
[2022-04-20T02:45:21.257Z]                     done = True
[2022-04-20T02:45:21.257Z]     
[2022-04-20T02:45:21.257Z]                 if done:
[2022-04-20T02:45:21.257Z]                     assert sub_cpu == sub_gpu and sub_cpu == None, "CPU and GPU generators have different lengths at {}".format(path)
[2022-04-20T02:45:21.257Z]                 else:
[2022-04-20T02:45:21.257Z]                     _assert_equal(sub_cpu, sub_gpu, float_check, path + [index])
[2022-04-20T02:45:21.257Z]     
[2022-04-20T02:45:21.257Z]                 index = index + 1
[2022-04-20T02:45:21.257Z]         elif (t is dict):
[2022-04-20T02:45:21.257Z]             # The order of key/values is not guaranteed in python dicts, nor are they guaranteed by Spark
[2022-04-20T02:45:21.257Z]             # so sort the items to do our best with ignoring the order of dicts
[2022-04-20T02:45:21.257Z]             cpu_items = list(cpu.items()).sort(key=_RowCmp)
[2022-04-20T02:45:21.257Z]             gpu_items = list(gpu.items()).sort(key=_RowCmp)
[2022-04-20T02:45:21.257Z]             _assert_equal(cpu_items, gpu_items, float_check, path + ["map"])
[2022-04-20T02:45:21.257Z]         elif (t is int):
[2022-04-20T02:45:21.257Z]             assert cpu == gpu, "GPU and CPU int values are different at {}".format(path)
[2022-04-20T02:45:21.257Z]         elif (t is float):
[2022-04-20T02:45:21.257Z]             if (math.isnan(cpu)):
[2022-04-20T02:45:21.257Z]                 assert math.isnan(gpu), "GPU and CPU float values are different at {}".format(path)
[2022-04-20T02:45:21.257Z]             else:
[2022-04-20T02:45:21.257Z]                 assert float_check(cpu, gpu), "GPU and CPU float values are different {}".format(path)
[2022-04-20T02:45:21.257Z]         elif isinstance(cpu, str):
[2022-04-20T02:45:21.257Z] >           assert cpu == gpu, "GPU and CPU string values are different at {}".format(path)
[2022-04-20T02:45:21.257Z] E           AssertionError: GPU and CPU string values are different at [0, 'col', 'name', 'middlename']
[2022-04-20T02:45:21.257Z] 
[2022-04-20T02:45:21.257Z] ../../src/main/python/asserts.py:84: AssertionError
[2022-04-20T02:45:21.257Z] ----------------------------- Captured stdout call -----------------------------
[2022-04-20T02:45:21.257Z] ### CPU RUN ###
[2022-04-20T02:45:21.257Z] ### GPU RUN ###
[2022-04-20T02:45:21.257Z] ### COLLECT: GPU TOOK 0.4855940341949463 CPU TOOK 0.5880005359649658 ###
[2022-04-20T02:45:21.257Z] CPU OUTPUT: [Row(col=Row(name=Row(firstname='Adam ', middlename='', lastname='Green'), newname=Row(firstname='Adam ', lastname='Green')), name=Row(firstname='Adam ', middlename='', lastname='Green')), Row(col=Row(name=Row(firstname='Bob ', middlename='Middle', lastname='Green'), newname=Row(firstname='Bob ', lastname='Green')), name=Row(firstname='Bob ', middlename='Middle', lastname='Green')), Row(col=Row(name=Row(firstname='Cathy ', middlename='', lastname='Green'), newname=Row(firstname='Cathy ', lastname='Green')), name=Row(firstname='Cathy ', middlename='', lastname='Green'))]
[2022-04-20T02:45:21.258Z] GPU OUTPUT: [Row(col=Row(name=Row(firstname='Adam ', middlename=None, lastname='Green'), newname=Row(firstname='Adam ', lastname='Green')), name=Row(firstname='Adam ', middlename=None, lastname='Green')), Row(col=Row(name=Row(firstname='Bob ', middlename='Middle', lastname='Green'), newname=Row(firstname='Bob ', lastname='Green')), name=Row(firstname='Bob ', middlename='Middle', lastname='Green')), Row(col=Row(name=Row(firstname='Cathy ', middlename=None, lastname='Green'), newname=Row(firstname='Cathy ', lastname='Green')), name=Row(firstname='Cathy ', middlename=None, lastname='Green'))]
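
The path in the assertion message reads as: row 0, field `col`, field `name`, field `middlename`, where the CPU has `''` and the GPU has `None`. As a rough illustration of how such a path is built up, here is a simplified stand-in for the recursive comparison. It is not the `asserts.py` code: it uses plain dicts and lists instead of `Row`, and `first_mismatch` and `person` are hypothetical helper names.

```python
def first_mismatch(cpu, gpu, path=()):
    """Return the path to the first differing value, or None if equal."""
    if isinstance(cpu, dict) and isinstance(gpu, dict):
        for key in cpu:
            found = first_mismatch(cpu[key], gpu.get(key), path + (key,))
            if found is not None:
                return found
        return None
    if isinstance(cpu, list) and isinstance(gpu, list):
        if len(cpu) != len(gpu):
            return list(path)
        for index, (c, g) in enumerate(zip(cpu, gpu)):
            found = first_mismatch(c, g, path + (index,))
            if found is not None:
                return found
        return None
    # Leaf values: '' and None compare as different, as in the log.
    return None if cpu == gpu else list(path)


def person(middlename):
    """Build one result row shaped like the first row of the outputs above."""
    name = {"firstname": "Adam ", "middlename": middlename, "lastname": "Green"}
    newname = {"firstname": "Adam ", "lastname": "Green"}
    return {"col": {"name": name, "newname": newname}, "name": dict(name)}


cpu_rows = [person("")]     # CPU keeps the empty string
gpu_rows = [person(None)]   # GPU returned None instead

print(first_mismatch(cpu_rows, gpu_rows))  # [0, 'col', 'name', 'middlename']
```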
@pxLi added the "bug" and "? - Needs Triage" labels on Apr 20, 2022
@pxLi changed the title from "[BUG] assert failed test_struct_self_join and test_computation_in_grpby_columns in databricks runtime" to "[BUG] assert failed test_struct_self_join and test_computation_in_grpby_columns" on Apr 20, 2022
@abellina self-assigned this on Apr 20, 2022
@abellina added the "P0" label on Apr 20, 2022
@abellina
Collaborator

It looks like the issue is in this diff: https://github.com/rapidsai/cudf/compare/6c79b5902d55bab599731a9bded7e89b9c4875c5..65b1cbdeda9cab57243d0a98e646c860ef86039e#diff-50ba2711690aca8e4f28d7b491373a4dd76443127c8b452a77b6c1fe2388d9e3.

There were some string changes here that could be related, so I am reverting those to confirm.

@abellina
Collaborator

Reverting rapidsai/cudf#10673 fixes the test failure. The failure is specific to joins on rows that contain empty strings; regular projections work fine.

@pxLi
Collaborator Author

pxLi commented Apr 21, 2022

Thanks for looking into this! I wonder whether we could add some unit tests on the cudf JNI side so that we catch this kind of error earlier.

@abellina
Collaborator

Quick update: here's a minimal repro case in Java. This test fails, whereas we should be getting a table with a single row and a single column containing the empty string.

I'll move to working on this in cuDF.

  @Test
  void testPartitionStrings() {
    // One string column with a single row holding "" (a valid, non-null value).
    try (Table t = new Table.TestBuilder().column("").build();
         ContiguousTable ct = t.contiguousSplit()[0]) {
      // Send the only row to partition 0 of 2.
      try (ColumnVector parts = ColumnVector.fromInts(0);
           PartitionedTable pt = ct.getTable().partition(parts, 2)) {
        ColumnVector partitioned = pt.getTable().getColumn(0);
        try (HostColumnVector hostP = partitioned.copyToHost()) {
          // The empty string should still be valid here; this is the check that fails.
          assert(!hostP.isNull(0));
        }
      }
    }
  }

@abellina
Collaborator

abellina commented Apr 22, 2022

> Thanks for looking into this! I wonder whether we could add some unit tests on the cudf JNI side so that we catch this kind of error earlier.

@pxLi I'll try, but this is a chain of things: I need a string column with an empty-string row, then I need to call contiguous split, and finally I need to call partition.

Removing rapidsai/cudf#10673 fixes the issue, removing contiguous split also fixes the issue, and if the row isn't a string, or it is a non-empty string, it all works. It seems we are assuming that "" (a size 0 string) is null, so we are losing track of the fact that it is a valid string.
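
To make the distinction concrete, here is a toy model in Python. It is not cuDF code and does not reproduce the actual code path; `StringColumn`, `correct_copy` and `buggy_copy` are hypothetical names. It assumes a string column stored as character data, offsets and a separate validity mask, so an empty string has size 0 but is still valid. Inferring validity from size is what turns `""` into null.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StringColumn:
    """Toy string column: concatenated chars, row offsets, validity mask."""
    chars: str
    offsets: List[int]
    valid: List[bool]

    @classmethod
    def from_values(cls, values: List[Optional[str]]) -> "StringColumn":
        chars, offsets, valid = "", [0], []
        for v in values:
            chars += v or ""              # nulls contribute no characters
            offsets.append(len(chars))
            valid.append(v is not None)   # "" is valid, None is not
        return cls(chars, offsets, valid)

    def to_values(self) -> List[Optional[str]]:
        return [
            self.chars[self.offsets[i]:self.offsets[i + 1]] if self.valid[i] else None
            for i in range(len(self.valid))
        ]


def correct_copy(col: StringColumn) -> StringColumn:
    # Carries the validity mask over unchanged.
    return StringColumn(col.chars, list(col.offsets), list(col.valid))


def buggy_copy(col: StringColumn) -> StringColumn:
    # Hypothetical bug: treats every size-0 string as null.
    sizes = [col.offsets[i + 1] - col.offsets[i] for i in range(len(col.valid))]
    valid = [v and size > 0 for v, size in zip(col.valid, sizes)]
    return StringColumn(col.chars, list(col.offsets), valid)


col = StringColumn.from_values(["Adam ", "", None])
print(correct_copy(col).to_values())  # ['Adam ', '', None]
print(buggy_copy(col).to_values())    # ['Adam ', None, None]
```

In this model the empty string and the null have identical offsets, so the validity mask is the only thing that tells them apart.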

@abellina added the "cudf_dependency" label on Apr 25, 2022
@sameerz removed the "? - Needs Triage" label on Apr 26, 2022
@sameerz
Collaborator

sameerz commented Apr 29, 2022

@abellina is this resolved?

@jlowe
Member

jlowe commented Apr 29, 2022

> @abellina is this resolved?

Almost. The cudf change is in, but we still need to re-enable the disabled tests. I'll be posting a PR shortly.
