[BUG] Test failures for 0.2 when run with multiple executors #812

Closed
sameerz opened this issue Sep 18, 2020 · 9 comments · Fixed by #944
Labels: bug (Something isn't working), P1 (Nice to have for release), test (Only impacts tests)

Comments

Collaborator

sameerz commented Sep 18, 2020

There are 15 test failures on Dataproc-preview2 on Ubuntu 18 when running our integration tests. The following tests fail:

cache_test.py (1 failure)
join_test.py (1 failure)
window_function_test.py (1 failure)
qa_nightly_select_test.py (12 failures)

I ran the integration tests twice, and got the same first three tests failing, but a different set of qa_nightly_select_tests failing. I am attaching my cluster creation script, a log of the tests, and the spark-default.conf.
dataproc-integration-tests.tar.gz

@sameerz sameerz added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels Sep 18, 2020
@revans2 revans2 self-assigned this Sep 18, 2020
Collaborator

revans2 commented Sep 18, 2020

I reran myself and ended up with 19 failures. I'll start trying to debug them.

FAILED ../../src/main/python/cache_test.py::test_passing_gpuExpr_as_Expr - As...
FAILED ../../src/main/python/join_test.py::test_join_bucketed_table[false][IGNORE_ORDER, ALLOW_NON_GPU(DataWritingCommandExec)]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_needs_sort_select[SUM(byteF) OVER (PARTITION BY byteF ORDER BY CAST(dateF AS TIMESTAMP) RANGE BETWEEN INTERVAL 1 DAYS PRECEDING AND INTERVAL 1 DAYS FOLLOWING ) as sum_total][IGNORE_ORDER, INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(byteF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(intF) GROUP BY byteF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(longF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(floatF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(doubleF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(strF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(byteF) GROUP BY intF, shortF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[FIRST(shortF) GROUP BY intF, byteF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[LAST(byteF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[LAST(intF) GROUP BY byteF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[LAST(floatF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[LAST(doubleF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[LAST(strF) GROUP BY intF][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[byteF, SUM(byteF) OVER (PARTITION BY shortF ORDER BY intF ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING ) as res][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/qa_nightly_select_test.py::test_select_first_last[SUM(intF) OVER (PARTITION BY byteF ORDER BY byteF ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) as res][IGNORE_ORDER({'local': True}), INCOMPAT, APPROXIMATE_FLOAT]
FAILED ../../src/main/python/window_function_test.py::test_window_aggs_for_ranges[[('a', RepeatSeq), ('b', Date), ('c', Integer)]1][IGNORE_ORDER]

Collaborator

revans2 commented Sep 18, 2020

All of the first/last test failures were fixed when I configured it to use a single executor. It looks like those are issues with the tests themselves when they run across too many executors. I'll check whether the others are similar.
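One plausible reading of why these pass on a single executor: FIRST and LAST return whichever row the aggregation happens to see first or last, so the answer can change with how rows are split and in which order the pieces arrive. A minimal pure-Python sketch of that effect follows. It does not use Spark, and the `first_per_group` helper and the sample rows are made up for illustration.

```python
from itertools import chain


def first_per_group(partitions):
    """Return the first value seen per key, scanning partitions in the given order."""
    result = {}
    for key, value in chain.from_iterable(partitions):
        # setdefault keeps the value from the first row seen for each key.
        result.setdefault(key, value)
    return result


rows = [(1, "a"), (2, "b"), (1, "c"), (2, "d")]

# One executor: a single partition, rows are seen in their original order.
single = first_per_group([rows])

# Two executors: the same rows split in two, with the second split arriving first.
split_a, split_b = rows[:2], rows[2:]
multi = first_per_group([split_b, split_a])

print(single)
print(multi)
```

Both results are valid answers for FIRST, but they differ, so a test that compares CPU and GPU output row for row can fail even when neither side is wrong.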

Collaborator

revans2 commented Sep 18, 2020

test_passing_gpuExpr_as_Expr is also related to that same issue.

Collaborator

revans2 commented Sep 18, 2020

test_window_aggs_for_ranges also passed with a single executor.

Collaborator

revans2 commented Sep 18, 2020

test_join_bucketed_table is still failing, but I suspect that it is caused by running with an older version of the plugin. I'll try and clean things up on all of the nodes so we are running with a newer version of the plugin everywhere.

Collaborator

revans2 commented Sep 18, 2020

Yup that was it. The jar on the nodes was from a few days ago and it didn't have the bucketing fix in it. Once I replaced the plugin jar with the one that has the fix the test passes. I will rerun all of the tests again with a single executor just to be sure. After that I'll turn this into an issue around the tests because they should either be updated so that they can pass with multiple executors or they should have a way to detect that they are being run incorrectly and skip themselves.
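A rough sketch of the "detect and skip" option, assuming the test harness can read the Spark settings as a plain dict. The helper names `executor_count` and `should_skip` are hypothetical and are not part of the plugin's test framework; `spark.master` and `spark.executor.instances` are standard Spark settings, but with dynamic allocation the latter may not reflect the real executor count.

```python
def executor_count(conf):
    """Best-effort executor count from a dict of Spark settings (hypothetical helper)."""
    master = conf.get("spark.master", "")
    # Plain local mode runs in a single JVM; local-cluster mode does not.
    if master.startswith("local") and not master.startswith("local-cluster"):
        return 1
    instances = conf.get("spark.executor.instances")
    return int(instances) if instances is not None else None


def should_skip(conf):
    """True when an order-sensitive test should skip itself."""
    count = executor_count(conf)
    # An unknown count is treated as "possibly many", so skip to be safe.
    return count is None or count > 1


print(should_skip({"spark.master": "local[4]"}))
print(should_skip({"spark.master": "yarn", "spark.executor.instances": "4"}))
```

With pytest, a check like this could feed `pytest.mark.skipif(should_skip(conf), reason="requires a single executor")`, where `conf` is whatever settings dict the harness exposes.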

@revans2 revans2 added the "test" (Only impacts tests) label and removed the "? - Needs Triage" (Need team to review and classify) label Sep 18, 2020
Collaborator

revans2 commented Sep 18, 2020

Moved this to 0.3 as these are test issues, not correctness issues with the plugin.

@revans2 revans2 added the P1 Nice to have for release label Sep 18, 2020
@revans2 revans2 removed their assignment Sep 18, 2020
@tgravescs tgravescs changed the title from "[BUG] Test failures for 0.2 on Dataproc" to "[BUG] Test failures for 0.2 when run with multiple executors" Oct 8, 2020
@revans2 revans2 self-assigned this Oct 9, 2020
@sameerz sameerz added this to the Oct 12 - Oct 23 milestone Oct 9, 2020
Collaborator

revans2 commented Oct 12, 2020

I have found that one of the issues we are running into is that Spark's sort is stable, but ours is not. I need to add Java bindings for a stable sort and update the plugin to use them.
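To illustrate the stable versus unstable distinction, here is a small pure-Python sketch. It does not use Spark or the plugin, and `valid_but_unstable_sort` is a made-up stand-in for a sort that orders keys correctly but does not preserve the input order of ties.

```python
rows = [("x", 2), ("y", 1), ("z", 2), ("w", 1)]


def sort_key(row):
    return row[1]


# Python's sorted() is stable: rows with equal keys keep their input order.
stable = sorted(rows, key=sort_key)


def valid_but_unstable_sort(items, key):
    """Stand-in for an unstable sort: correct by key, ties in a different order."""
    return sorted(reversed(items), key=key)


unstable = valid_but_unstable_sort(rows, key=sort_key)

# Both outputs are correctly ordered by key...
same_keys = [sort_key(r) for r in stable] == [sort_key(r) for r in unstable]
# ...but a row-for-row comparison fails because ties landed in a different order.
same_rows = stable == unstable
# Comparing after sorting on the whole row removes the dependence on tie order.
same_after_full_sort = sorted(stable) == sorted(unstable)

print(same_keys, same_rows, same_after_full_sort)
```

Any downstream operation that depends on row position, such as FIRST, LAST, or a ROWS-based window, can then see different values even though both sorts are correct.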

Collaborator

revans2 commented Oct 12, 2020

Actually, I will file a follow-on issue for that. I think the issue is more related to unnecessary sorts in some of the aggregates. I'll see if I can fix it that way first.
