FAILED ../../../../integration_tests/src/main/python/cast_test.py::test_cast_string_ts_valid_format[String2][DATAGEN_SEED=1699978422, INJECT_OOM] - AssertionError: GPU and CPU timestamp values are different at [1614, 'a']
I looked into this a bit, and I think there are two problems here. First, the test generates strings that are almost always invalid timestamps, so the test is not very useful in practice. However, with this specific datagen seed it happens to generate a valid timestamp in the third row, specifically 7141-09-13 08:15:02+121024, which parses to 7141-09-12 15:04:38 on the CPU but to null on the GPU.
The generator used in the failing case is: StringGen('[0-9]{1,4}-[0-3][0-9]-[0-5][0-9][ |T][0-3][0-9]:[0-6][0-9]:[0-6][0-9].[0-9]{0,6}Z?')
The . before the fractional seconds is meant to be a literal ., but unescaped it is a wildcard in regex, so the generator can emit any character in that position, which also leads to many invalid timestamps.
In the failing case the wildcard generated '+', producing 7141-09-13 08:15:02+121024, and Spark on the CPU somehow accepts this format.
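To illustrate the wildcard issue outside of Spark, here is a minimal sketch using Python's standard re module. It only checks which strings belong to the pattern's language; it assumes the data generator's regex dialect treats . and \. the same way Python does, and the ESCAPED pattern and the accepts helper are hypothetical names introduced for this example, not something taken from the test suite.

```python
import re

# Pattern copied from the StringGen above: the '.' before the fraction is unescaped.
ORIGINAL = (r'[0-9]{1,4}-[0-3][0-9]-[0-5][0-9][ |T]'
            r'[0-3][0-9]:[0-6][0-9]:[0-6][0-9].[0-9]{0,6}Z?')

# Hypothetical variant with the dot escaped so it only matches a literal '.'.
ESCAPED = (r'[0-9]{1,4}-[0-3][0-9]-[0-5][0-9][ |T]'
           r'[0-3][0-9]:[0-6][0-9]:[0-6][0-9]\.[0-9]{0,6}Z?')


def accepts(pattern: str, value: str) -> bool:
    """Return True if the whole string is in the language of the pattern."""
    return re.fullmatch(pattern, value) is not None


failing = '7141-09-13 08:15:02+121024'   # string from the failing row
intended = '7141-09-13 08:15:02.121024'  # what the pattern presumably meant to produce

print(accepts(ORIGINAL, failing))   # the wildcard lets '+' through
print(accepts(ESCAPED, failing))    # the literal dot rejects '+'
print(accepts(ESCAPED, intended))   # the intended form is still accepted
```

Escaping the dot would only remove the stray separator characters. The other character classes (for example [0-3][0-9] for the month) would still produce mostly invalid timestamps, and the CPU/GPU difference on the +121024 form is a separate question.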
Repro: