[SPARK-26148][PYTHON][TESTS] Increases default parallelism in PySpark tests to speed up #23111

HyukjinKwon · 2018-11-22T00:11:12Z

What changes were proposed in this pull request?

This PR proposes to increase the default parallelism in PySpark tests from 4 to 8 to speed them up.

It decreases the elapsed time from

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99163/consoleFull
Tests passed in 1770 seconds

to

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99186/testReport/
Tests passed in 1027 seconds
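For readers unfamiliar with what "parallelism" means here, the following is a minimal sketch of a runner that drains a queue of test goals with a fixed number of worker threads. It is an illustration only, not the code changed by this PR: `run_goals`, `fake_test`, and the `module_N` goal names are hypothetical, and the stand-in test simply sleeps.

```python
import queue
import threading
import time


def run_goals(goals, run_one, parallelism=8):
    """Run every goal using `parallelism` worker threads; return {goal: result}."""
    pending = queue.Queue()
    for goal in goals:
        pending.put(goal)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                goal = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to run
            outcome = run_one(goal)
            with lock:
                results[goal] = outcome

    threads = [threading.Thread(target=worker) for _ in range(parallelism)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fake_test(goal):
    # Hypothetical stand-in for launching one test module.
    time.sleep(0.05)
    return "ok"


if __name__ == "__main__":
    goals = ["module_%d" % i for i in range(16)]
    for n in (4, 8):
        start = time.time()
        run_goals(goals, fake_test, parallelism=n)
        print("parallelism=%d took %.2fs" % (n, time.time() - start))
```

With more workers, more goals run concurrently, so wall-clock time drops as long as the machine has spare capacity; the real gain depends on how the goals contend for CPU and memory.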

How was this patch tested?

Jenkins tests

SparkQA · 2018-11-22T01:00:41Z

Test build #99149 has finished for PR 23111 at commit ec0f730.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-11-22T01:04:02Z

retest this please

SparkQA · 2018-11-22T02:02:15Z

Test build #99151 has finished for PR 23111 at commit ec0f730.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-11-22T02:08:32Z

retest this please

SparkQA · 2018-11-22T03:05:51Z

Test build #99156 has finished for PR 23111 at commit ec0f730.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-11-22T04:25:42Z

Test build #99159 has finished for PR 23111 at commit d3db950.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-11-22T06:08:33Z

retest this please

SparkQA · 2018-11-22T06:52:06Z

Test build #99163 has finished for PR 23111 at commit d3db950.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-11-22T08:05:02Z

Test build #99164 has finished for PR 23111 at commit 973f6da.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-11-22T08:42:44Z

retest this please

SparkQA · 2018-11-22T13:03:30Z

Test build #99172 has finished for PR 23111 at commit 973f6da.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-11-22T14:47:11Z

retest this please

HyukjinKwon · 2018-11-22T14:48:07Z

Oh, the elapsed time decreases drastically, for instance, from

Tests passed in 1770 seconds

to

Tests passed in 1171 seconds

HyukjinKwon · 2018-11-22T14:57:10Z

cc @rxin, @BryanCutler, @squito. This decreases the elapsed time (it is even faster than before splitting the tests).

SparkQA · 2018-11-22T18:04:36Z

Test build #99187 has finished for PR 23111 at commit 3cb6d0f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-11-22T18:44:53Z

Test build #99186 has finished for PR 23111 at commit 973f6da.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-11-22T23:58:19Z

Yeah, the improvement looks consistent:

Tests passed in 1027 seconds

HyukjinKwon · 2018-11-23T15:48:01Z

Hey all, I will merge this in a few days if there are no more comments. It's going to speed up the tests by roughly 12 to 15 minutes.
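As a quick check of that estimate against the timings quoted earlier in this thread, here is a small sketch of the arithmetic (`saved_minutes` is a hypothetical helper name; the three timings are the "Tests passed in N seconds" figures above):

```python
def saved_minutes(before_s, after_s):
    """Return the wall-clock time saved, in minutes."""
    return (before_s - after_s) / 60.0


# Timings quoted in this thread.
baseline = 1770
for after in (1171, 1027):
    print("%d s -> %d s saves %.1f min"
          % (baseline, after, saved_minutes(baseline, after)))
# prints:
# 1770 s -> 1171 s saves 10.0 min
# 1770 s -> 1027 s saves 12.4 min
```

The 1027-second run corresponds to about 12.4 minutes saved, which sits at the low end of the range given above.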

HyukjinKwon · 2018-11-25T15:26:05Z

Merged to master.

squito · 2018-11-26T20:14:44Z

Wow, that's great! Glad there is a big speedup.

squito · 2018-11-26T20:15:22Z

We might need to be careful that this doesn't unintentionally overload the Jenkins workers, so that we end up hitting more timeouts from too many things running concurrently (I don't know how isolated the workers are).

HyukjinKwon · 2018-11-26T23:29:15Z

Ah, it just increases the number of Python threads that run each PySpark-only test suite. It shouldn't be a big deal. I'm keeping my eyes on the build.

…essionWithSGDTests.test_training_and_prediction test

## What changes were proposed in this pull request?

This test looks flaky:

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99704/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99569/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99644/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99548/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99454/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99609/console

```
======================================================================
FAIL: test_training_and_prediction (pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests)
Test that the model improves on toy data with no. of batches
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 367, in test_training_and_prediction
    self._eventually(condition)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 78, in _eventually
    % (timeout, lastValue))
AssertionError: Test failed due to timeout after 30 sec, with last condition returning: Latest errors: 0.67, 0.71, 0.78, 0.7, 0.75, 0.74, 0.73, 0.69, 0.62, 0.71, 0.69, 0.75, 0.72, 0.77, 0.71, 0.74

----------------------------------------------------------------------
Ran 13 tests in 185.051s

FAILED (failures=1, skipped=1)
```

This appears to have started after the parallelism in Jenkins was increased in #23111. I am able to reproduce this manually when the resource usage is heavy (with a manual decrease of the timeout).

## How was this patch tested?

Manually tested by

```
cd python
./run-tests --testnames 'pyspark.mllib.tests.test_streaming_algorithms StreamingLogisticRegressionWithSGDTests.test_training_and_prediction' --python-executables=python
```

Closes #23236 from HyukjinKwon/SPARK-26275.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
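The traceback above comes from a polling helper that retries a condition until a timeout expires. Below is a minimal sketch of that pattern. It assumes nothing about the real `_eventually` beyond what the traceback shows: the function name `eventually`, the `poll_interval` parameter, and the "return True to succeed" convention are hypothetical.

```python
import time


def eventually(condition, timeout=30.0, poll_interval=0.01):
    """Call `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.time() + timeout
    last_value = None
    while time.time() < deadline:
        last_value = condition()
        if last_value is True:
            return
        time.sleep(poll_interval)
    # Report the last non-True value so the failure is easier to diagnose.
    raise AssertionError(
        "Test failed due to timeout after %g sec, with last condition returning: %s"
        % (timeout, last_value))
```

When the machine is heavily loaded, the work being polled makes less progress per second, so a fixed timeout fails more often even though the test logic is unchanged. A longer timeout trades a slower worst case for fewer spurious failures.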

… tests to speed up

## What changes were proposed in this pull request?

This PR proposes to increase parallelism in PySpark tests to speed up from 4 to 8.

It decreases the elapsed time from

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99163/consoleFull

Tests passed in 1770 seconds

to

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99186/testReport/

Tests passed in 1027 seconds

## How was this patch tested?

Jenkins tests

Closes apache#23111 from HyukjinKwon/parallelism.

Authored-by: hyukjinkwon <gurwls223@apache.org>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>


Increases default parallelism in PySpark tests

ec0f730

HyukjinKwon added 2 commits November 22, 2018 11:28

change to 4

29d15d8

but leave a change in python

d3db950


8 parallelism

973f6da

revert python change back

3cb6d0f

HyukjinKwon changed the title ~~[DO-NOT-MERGE] Increases default parallelism in PySpark tests~~ [SPARK-26148][PYTHON][TESTS] Increases default parallelism in PySpark tests to speed up Nov 22, 2018

asfgit closed this in 41d5aae Nov 25, 2018

HyukjinKwon mentioned this pull request Dec 5, 2018

[SPARK-26275][PYTHON][ML] Increases timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test #23236

Closed

HyukjinKwon deleted the parallelism branch March 3, 2020 01:20
