
Fix test_parquet_check_schema_compatibility [databricks] #5484

Closed

Conversation

sperlingxx
Collaborator

Fixes #5481

The tests failed on error-message matching because the Databricks runtime throws a different exception from Spark, with a modified error message.


Signed-off-by: sperlingxx <lovedreamf@gmail.com>
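The fix pattern can be sketched as follows. This is a minimal illustration, not the actual plugin test code: the helper name `error_matches` and both message strings are assumptions standing in for the real Spark and Databricks error texts. The idea is to match the exception message against either runtime's wording instead of one exact string:

```python
import re

# Assumed placeholder messages: the same schema-compatibility failure is
# reported with different wording by Apache Spark and the Databricks runtime.
SPARK_MSG = "Parquet column cannot be converted"
DATABRICKS_MSG = "PARQUET_SCHEMA_MISMATCH"  # hypothetical Databricks variant

# Accept either variant by combining the escaped literals into one pattern.
ERROR_PATTERN = re.compile(
    "({})|({})".format(re.escape(SPARK_MSG), re.escape(DATABRICKS_MSG)))

def error_matches(message: str) -> bool:
    """Return True if the exception message matches either runtime's wording."""
    return ERROR_PATTERN.search(message) is not None
```

A test asserting on the error would then check `error_matches(str(exc))` rather than comparing against a single runtime's message verbatim.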
@sperlingxx sperlingxx requested a review from pxLi May 13, 2022 07:42
@pxLi pxLi added bug Something isn't working test Only impacts tests labels May 13, 2022
Collaborator

@pxLi pxLi left a comment


the phrase 🤦

@pxLi
Collaborator

pxLi commented May 13, 2022

build

@pxLi
Collaborator

pxLi commented May 13, 2022

Hmm, an unrelated case failed in 312db and 321db.
Either the assertion-error cases caused side effects for other tests,
or Databricks just applied some changes to the runtimes...

[2022-05-13T09:46:49.991Z] FAILED ../../src/main/python/hash_aggregate_test.py::test_groupby_std_variance[{'spark.rapids.sql.variableFloatAgg.enabled': 'true', 'spark.rapids.sql.hasNans': 'false', 'spark.rapids.sql.castStringToFloat.enabled': 'true', 'spark.rapids.sql.batchSizeBytes': '250'}-[('a', Decimal(not_null)(18,0)), ('b', Decimal(not_null)(18,0)), ('c', Decimal(not_null)(18,0))]][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
[2022-05-13T09:46:49.989Z] 22/05/13 09:00:13 ERROR Executor: Exception in task 0.0 in stage 5647.0 (TID 16756)
[2022-05-13T09:46:49.989Z] java.net.SocketException: Broken pipe (Write failed)
[2022-05-13T09:46:49.989Z] 	at java.net.SocketOutputStream.socketWrite0(Native Method)
[2022-05-13T09:46:49.989Z] 	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
[2022-05-13T09:46:49.989Z] 	at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
[2022-05-13T09:46:49.989Z] 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
[2022-05-13T09:46:49.989Z] 	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
[2022-05-13T09:46:49.989Z] 	at java.io.DataOutputStream.flush(DataOutputStream.java:123)
[2022-05-13T09:46:49.989Z] 	at org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:576)
[2022-05-13T09:46:49.989Z] 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2264)
[2022-05-13T09:46:49.989Z] 	at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:365)
[2022-05-13T09:46:49.989Z] 22/05/13 09:00:13 WARN TaskSetManager: Lost task 0.0 in stage 5647.0 (TID 16756) (10.2.128.4 executor driver): java.net.SocketException: Broken pipe (Write failed)
[2022-05-13T09:46:49.989Z] 	at java.net.SocketOutputStream.socketWrite0(Native Method)
[2022-05-13T09:46:49.989Z] 	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
[2022-05-13T09:46:49.989Z] 	at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
[2022-05-13T09:46:49.989Z] 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
[2022-05-13T09:46:49.989Z] 	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
[2022-05-13T09:46:49.989Z] 	at java.io.DataOutputStream.flush(DataOutputStream.java:123)
[2022-05-13T09:46:49.989Z] 	at org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:576)
[2022-05-13T09:46:49.989Z] 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2264)
[2022-05-13T09:46:49.989Z] 	at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:365)

@sperlingxx
Collaborator Author

hmm failed unrelated case in 312db and 321db. Might be the assert error cases caused some side effects for other test, or databricks just applied some changes to runtimes...


Really weird ....

@sperlingxx
Collaborator Author

build

@revans2
Collaborator

revans2 commented May 13, 2022

Looks like changes to the runtime. There was a date cast test that just failed too with a new error.

@sameerz sameerz added this to the May 2 - May 20 milestone May 13, 2022
@sperlingxx
Collaborator Author

Yes, the failed case test_cast_string_date_invalid_ansi_before_320 will be fixed by #5494. So it looks like a deadlock scenario: two failing cases are fixed by two separate PRs.

@pxLi
Collaborator

pxLi commented May 16, 2022

Closing this one, as #5494 includes both fixes.

@pxLi pxLi closed this May 16, 2022
Labels
bug Something isn't working test Only impacts tests
Development

Successfully merging this pull request may close these issues.

[BUG] test_parquet_check_schema_compatibility failed in databricks runtimes
4 participants