Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-12149] [Web UI] Executor UI improvement suggestions - Color UI #10154

Closed
wants to merge 22 commits into from

Conversation

ajbozarth
Copy link
Member

Added color coding to the Executors page for Active Tasks, Failed Tasks, Completed Tasks and Task Time.

Active Tasks is shaded blue with it's range based on percentage of total cores used.
Failed Tasks is shaded red ranging over the first 10% of total tasks failed
Completed Tasks is shaded green ranging over 10% of total tasks including failed and active tasks, but only when there are active or failed tasks on that executor.
Task Time is shaded red when GC Time goes over 10% of total time with it's range directly corresponding to the percent of total time.

@ajbozarth
Copy link
Member Author

Few discussion issues to raise:

  1. I chose to use opacity (alpha channel) to range the colors, what do people think of this? I felt it was simpler than figuring out a range of rgb.
  2. I chose a 10% threshold for the GC color appearing, is that a good threshold or should I change it?
  3. What do people think of my choices for determining the color ranges?

// activeTasks range from 0 to all cores
activeTasksAlpha = (info.activeTasks.toDouble / info.totalCores) * 0.5 + 0.5
}
if (info.totalTasks > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also assign these like

val (failedTasksAlpha, completedTasksAlpha) = 
  if (...) {
    (..., ...)
  } else {
    (1.0, 1.0)
  }

but whether that's clearer is a matter of taste.
It might be better to cap alpha at 1.0 just for tidiness.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to do that but the style guide wasn't clear on whether it was allow, I'll update it since it's cleaner. And I'll add the 1.0 cap, I left it off originally because the code was getting busy and hard to read originally.

@SparkQA
Copy link

SparkQA commented Dec 5, 2015

Test build #2173 has finished for PR 10154 at commit d630cac.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor

Can you post screenshots?

@@ -4,11 +4,13 @@
"rddBlocks" : 8,
"memoryUsed" : 28000128,
"diskUsed" : 0,
"totalCores" : 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expected number of cores is 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated that file according to the javadocs for HistoryServerSuite, the test that uses it. I would assume it's 0 since it's the driver and the driver has no cores assigned to run tasks.

@ajbozarth
Copy link
Member Author

screen shot 2015-12-07 at 11 11 22 am
screen shot 2015-12-07 at 11 11 33 am
screen shot 2015-12-07 at 11 14 30 am
Here's a few screen shots, note that I lowered the threshold and range for the GC Time Color so you can see it easier for this example (1% threshold with color range over first 5% instead of 10% threshold with range over first 50%)

@srowen
Copy link
Member

srowen commented Dec 7, 2015

Thanks for the screenshots. The color really 'pops', so my concern is just making sure it doesn't get overwhelming.

I can see using the red to very clearly call attention to failed things, or excessive GC. It also sort of feels write to call attention to active things while they're active. (If green = good and red = bad and blue = neutral, is blue more appropriate for an active task? it could turn out to fail or complete. shrug.)

I wasn't as sure about coloring the completed tasks column therefore, but it's a matter of taste.

I kind of like how this looks, but if anyone else is concerned about the amount of color, I suppose it's coherent to make the text colored instead of the background.

@ajbozarth
Copy link
Member Author

I ran the MiMa tests locally (I didn't know they didn't run before) and they're failing because I changed api.scala and added two vals to ExecutorSummary. How do I go about fixing the MiMa failure?

@srowen
Copy link
Member

srowen commented Dec 7, 2015

It must be a false positive or else some problem with how mima got invoked. I don't see any API changes here. If they're "real" false positives you'll need to add an exclusion but it tells you what to add. Let's see what Jenkins says.

@SparkQA
Copy link

SparkQA commented Dec 7, 2015

Test build #2178 has finished for PR 10154 at commit 2887aca.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ajbozarth
Copy link
Member Author

added the MiMa exclude, please advise if I did it correctly

@@ -33,11 +33,13 @@ private[ui] case class ExecutorSummaryInfo(
rddBlocks: Int,
memoryUsed: Long,
diskUsed: Long,
totalCores: Int,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the comment for this case class says it isn't used anymore - do we really need to update it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some checks, MiMa exclude is still needed even without this change and this change did not affect any code, therefore I've removed the change.

@ajbozarth
Copy link
Member Author

Is there a way someone could get the tests to run to check my change? I assumed they would run on my commit, but they didn't.

@srowen
Copy link
Member

srowen commented Dec 17, 2015

Jenkins test this please

@srowen
Copy link
Member

srowen commented Dec 17, 2015

Jenkins add to whitelist

@SparkQA
Copy link

SparkQA commented Dec 17, 2015

Test build #47913 has finished for PR 10154 at commit 3208d85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ajbozarth
Copy link
Member Author

Thanks @srowen , Is there anyone we should pull into this to get more opinions on the change?

@srowen
Copy link
Member

srowen commented Dec 21, 2015

My only reservations are: active vs complete being colored green vs blue instead of the other way around, and also coloring the complete column at all, since it would always be colored after starting.

@ajbozarth
Copy link
Member Author

I'll switch the green and blue and post some screen shots to see the difference.

As for the completed column always being blue:
I could make it conditional so it's only blue if there are either active or failed tasks to compare too, but I feel like that would end up looking odd when some executors' completed tasks are colored and some aren't.
Or I could remove the coloring from completed tasks entirely, I'll post some screenshots of this as well before pushing any changes.

@ajbozarth
Copy link
Member Author

screen shot 2015-12-21 at 12 20 30 pm
screen shot 2015-12-21 at 12 10 57 pm
screen shot 2015-12-21 at 12 04 21 pm
There's three screenshots of what switching the blue and green would look like, I actually think I like it better.

@ajbozarth
Copy link
Member Author

And here's screenshots of completed only being colored when comparing to active or failed, examples with both color choices.
screen shot 2015-12-21 at 12 43 58 pm
screen shot 2015-12-21 at 12 57 11 pm
screen shot 2015-12-21 at 12 45 05 pm
screen shot 2015-12-21 at 12 57 19 pm
Personally I don't know how I feel about this, I think it looks better visually, but I'm not sure about usability.

@ajbozarth
Copy link
Member Author

Thanks, I only tested using a standalone cluster, I have Hadoop and Yarn set up on my remote testing cluster so I can test that now.

@tgravescs
Copy link
Contributor

So after seeing this run I'm not sure how useful the Completed Task column shading it. Especially when there aren't any tasks running. I think it draws your attention to that column for no reason. So I think I would like to see the shading on completed tasks removed unless someone has a strong use case for it, thoughts?

I still need to try out the failed and GC time.

@ajbozarth
Copy link
Member Author

I had an implementation a handful of commits ago that only shaded completed when either active or failed was also colored for that row, allowing for quick comparison. What do you think of that option (there are screen shots of it above)? Also if we got rid of coloring completed tasks do we want to switch active back to green or leave it as blue?

@ajbozarth
Copy link
Member Author

Tested it with Yarn and it behaves the same way as Standalone

@tgravescs
Copy link
Contributor

thanks, I saw the above changes for only showing completed when active or failed but again I'm not sure how useful it is. I think just dropping the color for now makes sense. I think leave the color blue in active.

If we were to keep completed tasks I think it makes more sense to shade vs other executors. The times I can think of wanting to know completed tasks is different is perhaps when that executor is being slow or skewed data. So if it doesn't have as many completed tasks as other executors. I don't think completed vs total on same executor is really that useful. But I think for now we just remove it.

@ajbozarth
Copy link
Member Author

Screen shot of my incoming update:
screen shot 2016-01-19 at 3 44 10 pm

@SparkQA
Copy link

SparkQA commented Jan 20, 2016

Test build #49727 has finished for PR 10154 at commit 2f54a33.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ajbozarth
Copy link
Member Author

retest please

@tgravescs
Copy link
Contributor

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Jan 20, 2016

Test build #49788 has finished for PR 10154 at commit 2f54a33.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor

I haven't forgot about this, just busy, I want to take one last pass over this.

// failedTasks range max at 10% failure, alpha max = 1
val failedTasksAlpha =
if (totalTasks > 0) {
math.min (10 * failedTasks.toDouble / totalTasks, 1) * 0.5 + 0.5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove space after min

@tgravescs
Copy link
Contributor

A few minor nits, otherwise looks good.

@ajbozarth
Copy link
Member Author

Fixed the little issues, don't know when the space after min got there though. And this also showed me my auto-format in IntelliJ isn't 100% correct.

@SparkQA
Copy link

SparkQA commented Jan 25, 2016

Test build #49996 has finished for PR 10154 at commit 9293c41.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ajbozarth
Copy link
Member Author

retest this please

@tgravescs
Copy link
Contributor

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Jan 25, 2016

Test build #50000 has finished for PR 10154 at commit 9293c41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor

+1 committed this, thanks @ajbozarth

@rxin
Copy link
Contributor

rxin commented Oct 18, 2016

So I looked at the UI today and I have to say the color code is extremely confusing. On a normal cluster I saw a bunch of reds, which is usually reserved for errors.

(I'm going to leave a comment on the JIRA ticket too)

@ajbozarth
Copy link
Member Author

@rxin The reds only show up on failed tasks or a high gc time, what would you expect from each color? Also this was a while ago (and one of my first prs) so I'm willing to revisit how this was done. If we want to open a new JIRA to revisit it I'll work on it.

(responding here since this is where all the reviewers are looped in)

@srowen
Copy link
Member

srowen commented Oct 19, 2016

Yes, I have not noticed the color to be confusing. What type of state are you looking at @rxin? it could be that with enough activity, failures are inevitable, and marking anything > 0 failures as red is a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants