Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-26363][WebUI] Avoid duplicated KV store lookups in method taskList #23310

Closed
wants to merge 4 commits into from

Conversation

gengliangwang
Copy link
Member

@gengliangwang gengliangwang commented Dec 13, 2018

What changes were proposed in this pull request?

In the method taskList(since #21688), the executor log value is queried in KV store for every task(method constructTaskData).
This PR propose to use a hashmap for reducing duplicated KV store lookups in the method.

image

How was this patch tested?

Manual check

@gengliangwang
Copy link
Member Author

@gengliangwang gengliangwang changed the title [SPARK-26363][WebUI] Remove redundant field executorLogs in TaskData [SPARK-26363][WebUI] Avoid duplicated KV store lookup for task table Dec 13, 2018
@gengliangwang gengliangwang changed the title [SPARK-26363][WebUI] Avoid duplicated KV store lookup for task table [SPARK-26363][WebUI] Avoid duplicated KV store lookups for task table Dec 13, 2018
@SparkQA
Copy link

SparkQA commented Dec 13, 2018

Test build #100095 has started for PR 23310 at commit 1cfd872.

@srowen
Copy link
Member

srowen commented Dec 13, 2018

Agree -- CC @pgandhi999

@gengliangwang
Copy link
Member Author

retest this please.

@SparkQA
Copy link

SparkQA commented Dec 14, 2018

Test build #100120 has finished for PR 23310 at commit 1cfd872.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Btw, given that there's change on response on public API, might this PR need to be treated as backward incompatible change?

@gengliangwang
Copy link
Member Author

@HeartSaVioR #21688 is only on master. IMO we don't need to consider backward incompatibility.

@HeartSaVioR
Copy link
Contributor

HeartSaVioR commented Dec 14, 2018

@gengliangwang Ah my bad. Thanks for noticing. LGTM.

@tgravescs
Copy link
Contributor

overall makes sense, the only downside is less on server side, but can always be added back later.

Were you seeing specific performance issue or slow load time with this?

@gengliangwang
Copy link
Member Author

gengliangwang commented Dec 14, 2018

@tgravescs I found the issue when I merge the code changes to our product. I don't try it in large application to get performance difference. Performance is not the key motivation. We can use a hash map to reduce the number of KV store lookups.

The main point of this PR is about the data structure TaskData should not conclude the field executorLogs, which is redundant and not belong to the scope of task data.
I can understand your concern about less output on server side. But the solution in this PR is quite clean and simple.

@tgravescs
Copy link
Contributor

I'm not against the change, one can argue both ways whether it should be in the scope of task data or not. I personally don't see that as a problem based on how we are trying to do server side stuff here. In many ways it makes sense for the rest api to return exactly what you want for your UI so you don't have to do joins or lookups on other tables. logs are directly related to tasks so from a logical perspective they do belong there. Actually I hate how when we stop tracking executors to save memory the log links go away. Its very annoying from a debugging point of view.

Reducing the # of lookups should be good. I was just wanting to know if you actually saw a performance issue with this or not. I can change any code I want because I think its better but unless I measure it to prove that it doesn't mean it does or is necessary.

In this case since we don't do the executor table on the server side I think this is ok, theoretically that could get out of sync with the task table since its doing server side lookups and not reloading the entire page. This change could make that slightly worse if you get new executors not in that table. But until/if we convert everything to server side I think that is ok.

@gengliangwang
Copy link
Member Author

@tgravescs Thanks for the explanation. Totally agree.

@dongjoon-hyun
Copy link
Member

Retest this please.

@@ -348,9 +348,9 @@ $(document).ready(function () {

// prepare data for executor summary table
stageExecutorSummaryInfoKeys = Object.keys(responseBody.executorSummary);
var executorDetailsMap = {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gengliangwang . Is this filled correctly always? I'm hitting the following error during reviewing this patch.

screenshot

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is caused by the cache of browser. The javascript might be using the cached one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try loading the page without cache should work: https://en.wikipedia.org/wiki/Wikipedia:Bypass_your_cache

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gengliangwang you're sure it works locally? I just want to make sure as I'm not sure how well tests cover this case

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen I tested locally with the event logs under core/src/test/resources/spark-events, manually compare the log URL of stage pages in application_1538416563558_0014. Also check if any error/warning in chrome console.

Everything works well.

@SparkQA
Copy link

SparkQA commented Dec 17, 2018

Test build #100213 has finished for PR 23310 at commit 1cfd872.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

{data : "executorLogs", name: "Logs", render: formatLogsCells},
{
data : function (row, type) {
if(executorDetailsMap[row.executorId] && executorDetailsMap[row.executorId]["executorLogs"]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing space after if

$.getJSON(createRESTEndPointForExecutorsPage(appId),
function(executorSummaryResponse, status, jqXHR) {
var executorDetailsMap = {};
executorSummaryResponse.forEach(function (executorDetail) {
executorDetailsMap[executorDetail.id] = executorDetail;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one issue I think here is that this is being filled in async from what the task table is being filled in and referencing this field. You could potentially have issues where this hasn't finished by the time the task table wants to use it.

Copy link
Member Author

@gengliangwang gengliangwang Dec 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry about that.
Then I think we should just use a hash map in the backend.

@SparkQA
Copy link

SparkQA commented Dec 20, 2018

Test build #100339 has finished for PR 23310 at commit 40582c1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 20, 2018

Test build #100338 has finished for PR 23310 at commit 67992a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Dec 21, 2018

It looks like the change now doesn't match the title and description. Does this actually address the original issue?

@tgravescs
Copy link
Contributor

At a high level it could be better because he is storing the executor info in executorIdToLogs hashmap as he goes through each executor and if you have multiple tasks on the same executor it would have already loaded that info from the kv store. So it could be less lookups. Need to look in more detail.

@pgandhi999
Copy link

LGTM. I see that this will reduce the loading time of the page significantly especially when there are multiple tasks running on same executor.

@gengliangwang
Copy link
Member Author

gengliangwang commented Dec 21, 2018

@srowen The new changes use a hash map to reduce the duplicated KV store lookups inside one method call taskList.
To further improve it, we can either:

  1. keep the original change, and make the request to "/allexecutors" synchronous.
  2. Use a global hash map for executor ID to executor log, which should be working for most cases. Not sure if the mapping can be changed. E.g. executor removed and added back with the old ID but different log? I am not sure if this can happen in Spark.

What do you think?

@srowen
Copy link
Member

srowen commented Dec 27, 2018

@gengliangwang OK this is a narrower change than it was originally. That's fine if it's an improvement. I'd just suggest you modify the JIRA and/or PR description to reflect the current change as needed.

@gengliangwang gengliangwang changed the title [SPARK-26363][WebUI] Avoid duplicated KV store lookups for task table [SPARK-26363][WebUI] Avoid duplicated KV store lookups in method taskList Dec 27, 2018
@gengliangwang
Copy link
Member Author

Hi @srowen ,
thanks for the suggestion. I have updated the description in this PR and the Jira

@SparkQA
Copy link

SparkQA commented Dec 28, 2018

Test build #4488 has finished for PR 23310 at commit 40582c1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Dec 30, 2018

Merged to master

@srowen srowen closed this in 240817b Dec 30, 2018
@gengliangwang
Copy link
Member Author

@srowen @tgravescs @pgandhi999 Thanks for the review. And again, sorry for the mistake.

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…kList`

## What changes were proposed in this pull request?

In the method `taskList`(since apache#21688),  the executor log value is queried in KV store  for every task(method `constructTaskData`).
This PR propose to use a hashmap for reducing duplicated KV store lookups in the method.

![image](https://user-images.githubusercontent.com/1097932/49946230-841c7680-ff29-11e8-8b83-d8f7553bfe5e.png)

## How was this patch tested?

Manual check

Closes apache#23310 from gengliangwang/removeExecutorLog.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants