[SPARK-26363][WebUI] Avoid duplicated KV store lookups in method `taskList` #23310

gengliangwang · 2018-12-13T14:55:13Z

What changes were proposed in this pull request?

In the method taskList (since #21688), the executor log value is queried from the KV store for every task (method constructTaskData).
This PR proposes using a hash map to reduce duplicated KV store lookups in the method.

How was this patch tested?

Manual check
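
The hash map approach in the description can be illustrated outside Spark. The actual change lives in Spark's Scala backend; what follows is only a minimal JavaScript sketch of the technique. `makeStore`, `kvStoreLookup`, `taskListWithCache`, and the record shapes are hypothetical stand-ins, not Spark APIs. The map name `executorIdToLogs` is borrowed from the review discussion. The cache lives for one call, so tasks that share an executor trigger a single lookup.

```javascript
// Hypothetical stand-in for a KV store that counts how often it is queried.
function makeStore(executors) {
  let lookups = 0;
  return {
    // Simulates one (potentially expensive) KV store read.
    kvStoreLookup(executorId) {
      lookups += 1;
      return executors[executorId] || null;
    },
    lookupCount() {
      return lookups;
    },
  };
}

// Builds task rows, resolving executor logs through a per-call cache.
function taskListWithCache(tasks, store) {
  const executorIdToLogs = new Map(); // lives only for this call
  return tasks.map((task) => {
    if (!executorIdToLogs.has(task.executorId)) {
      const executor = store.kvStoreLookup(task.executorId);
      executorIdToLogs.set(task.executorId, executor ? executor.executorLogs : {});
    }
    return { ...task, executorLogs: executorIdToLogs.get(task.executorId) };
  });
}

// Illustrative data: four tasks spread over two executors.
const store = makeStore({
  "1": { executorLogs: { stdout: "http://host-a/stdout" } },
  "2": { executorLogs: { stdout: "http://host-b/stdout" } },
});
const tasks = [
  { taskId: 0, executorId: "1" },
  { taskId: 1, executorId: "1" },
  { taskId: 2, executorId: "2" },
  { taskId: 3, executorId: "1" },
];
const rows = taskListWithCache(tasks, store);
```

In this toy data set the store is read once per distinct executor rather than once per task, which is the saving the PR aims for when many tasks run on the same executor.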

gengliangwang · 2018-12-13T14:56:17Z

@srowen @tgravescs @pgandhi999

SparkQA · 2018-12-13T15:01:39Z

Test build #100095 has started for PR 23310 at commit 1cfd872.

srowen · 2018-12-13T15:07:30Z

Agree -- CC @pgandhi999

gengliangwang · 2018-12-14T03:07:40Z

retest this please.

SparkQA · 2018-12-14T07:25:21Z

Test build #100120 has finished for PR 23310 at commit 1cfd872.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HeartSaVioR

LGTM. Btw, given that this changes the response of a public API, might this PR need to be treated as a backward-incompatible change?

gengliangwang · 2018-12-14T08:18:25Z

@HeartSaVioR #21688 is only on master. IMO we don't need to consider backward incompatibility.

HeartSaVioR · 2018-12-14T08:20:19Z

@gengliangwang Ah my bad. Thanks for noticing. LGTM.

tgravescs · 2018-12-14T14:37:24Z

Overall this makes sense. The only downside is that less is done on the server side, but that can always be added back later.

Were you seeing specific performance issue or slow load time with this?

gengliangwang · 2018-12-14T17:34:16Z

@tgravescs I found the issue when I merged the code changes into our product. I haven't tried it on a large application to measure the performance difference. Performance is not the key motivation. We can use a hash map to reduce the number of KV store lookups.

The main point of this PR is that the data structure TaskData should not include the field executorLogs, which is redundant and does not belong to the scope of task data.
I can understand your concern about less output on the server side. But the solution in this PR is quite clean and simple.

tgravescs · 2018-12-14T18:17:49Z

I'm not against the change; one can argue both ways whether it should be in the scope of task data or not. I personally don't see that as a problem based on how we are trying to do server-side stuff here. In many ways it makes sense for the REST API to return exactly what you want for your UI so you don't have to do joins or lookups on other tables. Logs are directly related to tasks, so from a logical perspective they do belong there. Actually, I hate how the log links go away when we stop tracking executors to save memory. It's very annoying from a debugging point of view.

Reducing the number of lookups should be good. I just wanted to know whether you actually saw a performance issue with this. I can change any code I want because I think it's better, but unless I measure it, that doesn't prove it is better or necessary.

In this case, since we don't do the executor table on the server side, I think this is OK. Theoretically that could get out of sync with the task table, since the task table does server-side lookups and doesn't reload the entire page. This change could make that slightly worse if you get new executors that are not in that table. But until/if we convert everything to server side, I think that is OK.

gengliangwang · 2018-12-14T18:24:34Z

@tgravescs Thanks for the explanation. Totally agree.

dongjoon-hyun · 2018-12-17T02:15:27Z

Retest this please.

dongjoon-hyun · 2018-12-17T05:38:22Z

core/src/main/resources/org/apache/spark/ui/static/stagepage.js

@@ -348,9 +348,9 @@ $(document).ready(function () {

            // prepare data for executor summary table
            stageExecutorSummaryInfoKeys = Object.keys(responseBody.executorSummary);
+            var executorDetailsMap = {};


@gengliangwang. Is this always filled correctly? I'm hitting the following error while reviewing this patch.

gengliangwang · 2018-12-17

I think it is caused by the browser cache. The page might be using a cached copy of the JavaScript.

gengliangwang · 2018-12-17

Loading the page without the cache should work: https://en.wikipedia.org/wiki/Wikipedia:Bypass_your_cache

srowen · 2018-12-19

@gengliangwang are you sure it works locally? I just want to make sure, as I'm not sure how well the tests cover this case.

gengliangwang · 2018-12-19

@srowen I tested locally with the event logs under core/src/test/resources/spark-events and manually compared the log URLs on the stage pages of application_1538416563558_0014. I also checked for errors and warnings in the Chrome console.

Everything works well.

SparkQA · 2018-12-17T06:48:49Z

Test build #100213 has finished for PR 23310 at commit 1cfd872.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tgravescs · 2018-12-20T15:11:06Z

core/src/main/resources/org/apache/spark/ui/static/stagepage.js

-                        {data : "executorLogs", name: "Logs", render: formatLogsCells},
+                        {
+                            data : function (row, type) {
+                                if(executorDetailsMap[row.executorId] && executorDetailsMap[row.executorId]["executorLogs"]) {


missing space after if

tgravescs · 2018-12-20T15:12:23Z

core/src/main/resources/org/apache/spark/ui/static/stagepage.js

            $.getJSON(createRESTEndPointForExecutorsPage(appId),
              function(executorSummaryResponse, status, jqXHR) {
-                var executorDetailsMap = {};
                executorSummaryResponse.forEach(function (executorDetail) {
                    executorDetailsMap[executorDetail.id] = executorDetail;


One issue I see here is that this map is filled in asynchronously with respect to the task table, which references this field. You could potentially have issues where this hasn't finished by the time the task table wants to use it.
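
The race described in this comment can be sketched without jQuery or DataTables. In this simplified model, `makeFakeRequest` stands in for the asynchronous executors request and `logsForRow` for the column's `data` function. Both are hypothetical helpers, not the actual stagepage.js code, and the empty-object fallback is an assumption because the reviewed diff does not show the else branch. The fake request delivers its response only when `flush()` is called, which makes the ordering explicit.

```javascript
// Minimal fake of an asynchronous request: the callback is stored and only
// runs when flush() is called, which models the response arriving later.
function makeFakeRequest(response) {
  let pending = null;
  return {
    getJSON(callback) {
      pending = callback;
    },
    flush() {
      if (pending) {
        const callback = pending;
        pending = null;
        callback(response);
      }
    },
  };
}

// Looks up the logs for a task row. The {} fallback is assumed for
// illustration; it is used when the executor is not (yet) in the map.
function logsForRow(executorDetailsMap, row) {
  const detail = executorDetailsMap[row.executorId];
  return detail && detail.executorLogs ? detail.executorLogs : {};
}

const executorDetailsMap = {};
const request = makeFakeRequest([
  { id: "1", executorLogs: { stdout: "http://host-a/stdout" } },
]);

// Kick off the request; the map is filled only once the response arrives.
request.getJSON((executors) => {
  executors.forEach((executorDetail) => {
    executorDetailsMap[executorDetail.id] = executorDetail;
  });
});

const row = { taskId: 0, executorId: "1" };
const renderedEarly = logsForRow(executorDetailsMap, row); // before the response
request.flush();
const renderedLate = logsForRow(executorDetailsMap, row); // after the response
```

A row rendered before the response arrives gets no log links, while the same row rendered afterwards gets them, so the result depends on timing rather than on the data.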

gengliangwang · 2018-12-20

Oh, sorry about that.
Then I think we should just use a hash map in the backend.

This reverts commit 1cfd872c9f64b214ef4a3a17b507033bf296d60b.

SparkQA · 2018-12-20T21:07:44Z

Test build #100339 has finished for PR 23310 at commit 40582c1.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-12-20T21:49:19Z

Test build #100338 has finished for PR 23310 at commit 67992a5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2018-12-21T14:16:16Z

It looks like the change now doesn't match the title and description. Does this actually address the original issue?

tgravescs · 2018-12-21T14:45:22Z

At a high level it could be better, because he is storing the executor info in the executorIdToLogs hash map as he goes through each executor, so if you have multiple tasks on the same executor it will already have loaded that info from the KV store. So there could be fewer lookups. I need to look in more detail.

pgandhi999 · 2018-12-21T14:52:28Z

LGTM. I see that this will reduce the loading time of the page significantly, especially when there are multiple tasks running on the same executor.

gengliangwang · 2018-12-21T14:55:44Z

@srowen The new changes use a hash map to reduce the duplicated KV store lookups inside a single call of the method taskList.
To further improve it, we can either:

1. Keep the original change, and make the request to "/allexecutors" synchronous.
2. Use a global hash map from executor ID to executor log, which should work for most cases. I'm not sure whether the mapping can change, e.g. an executor removed and added back with the old ID but a different log. I am not sure if this can happen in Spark.

What do you think?
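
The trade-off behind the second option can be sketched as follows. This is a hypothetical JavaScript model, not Spark code: `backing`, `globalLogsCache`, `logsViaGlobalCache`, and `logsViaPerCallCache` are invented names, and whether an executor ID can ever map to different logs in Spark is exactly the open question raised above. The sketch only shows what would happen if it did.

```javascript
// Hypothetical backing store whose entries can be replaced over time.
const backing = new Map([["1", { stdout: "http://host-a/stdout" }]]);

// Long-lived cache shared across calls (the "global hash map" option).
const globalLogsCache = new Map();

function logsViaGlobalCache(executorId) {
  if (!globalLogsCache.has(executorId)) {
    globalLogsCache.set(executorId, backing.get(executorId));
  }
  return globalLogsCache.get(executorId);
}

// Per-call cache: rebuilt on every call, so each call sees current data.
function logsViaPerCallCache(executorIds) {
  const cache = new Map();
  return executorIds.map((id) => {
    if (!cache.has(id)) {
      cache.set(id, backing.get(id));
    }
    return cache.get(id);
  });
}

const before = logsViaGlobalCache("1").stdout;

// Hypothetical scenario: the same executor ID now maps to different logs.
backing.set("1", { stdout: "http://host-b/stdout" });

const staleGlobal = logsViaGlobalCache("1").stdout;
const freshPerCall = logsViaPerCallCache(["1", "1"])[0].stdout;
```

Under that assumption the global cache keeps serving the old entry, while the per-call cache picks up the new one at the cost of one lookup per executor per call.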

srowen · 2018-12-27T17:01:20Z

@gengliangwang OK this is a narrower change than it was originally. That's fine if it's an improvement. I'd just suggest you modify the JIRA and/or PR description to reflect the current change as needed.

gengliangwang · 2018-12-27T17:36:34Z

Hi @srowen,
thanks for the suggestion. I have updated the description in this PR and in the JIRA.

SparkQA · 2018-12-28T19:07:46Z

Test build #4488 has finished for PR 23310 at commit 40582c1.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2018-12-30T03:48:11Z

Merged to master

gengliangwang · 2018-12-30T14:59:05Z

@srowen @tgravescs @pgandhi999 Thanks for the review. And again, sorry for the mistake.

…kList`

## What changes were proposed in this pull request?

In the method `taskList`(since apache#21688), the executor log value is queried in KV store for every task(method `constructTaskData`). This PR propose to use a hashmap for reducing duplicated KV store lookups in the method.

![image](https://user-images.githubusercontent.com/1097932/49946230-841c7680-ff29-11e8-8b83-d8f7553bfe5e.png)

## How was this patch tested?

Manual check

Closes apache#23310 from gengliangwang/removeExecutorLog.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>

gengliangwang changed the title ~~[SPARK-26363][WebUI] Remove redundant field executorLogs in TaskData~~ [SPARK-26363][WebUI] Avoid duplicated KV store lookup for task table Dec 13, 2018

gengliangwang changed the title ~~[SPARK-26363][WebUI] Avoid duplicated KV store lookup for task table~~ [SPARK-26363][WebUI] Avoid duplicated KV store lookups for task table Dec 13, 2018

HeartSaVioR approved these changes Dec 14, 2018


dongjoon-hyun reviewed Dec 17, 2018


srowen approved these changes Dec 18, 2018


tgravescs reviewed Dec 20, 2018


gengliangwang added 3 commits December 21, 2018 00:00

remove executorLogs in TaskData

908bafd

Revert "remove executorLogs in TaskData"

e89040f

This reverts commit 1cfd872c9f64b214ef4a3a17b507033bf296d60b.

use hash map to reduce kv store lookup

67992a5

gengliangwang force-pushed the removeExecutorLog branch from 1cfd872 to 67992a5 Compare December 20, 2018 17:10

remove unused code

40582c1

gengliangwang changed the title ~~[SPARK-26363][WebUI] Avoid duplicated KV store lookups for task table~~ [SPARK-26363][WebUI] Avoid duplicated KV store lookups in method taskList Dec 27, 2018

srowen closed this in 240817b Dec 30, 2018
