Some metrics improvements and timeline reporting #4451
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
LGTM, minor comments
yStart: Long,
minStart: Long,
fileWriter: ToolTextFileWriter): Unit = {
val x = xStart + (startTime - minStart)/MS_PER_PIXEL
nit: spaces around /
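For context, the line under review maps a task start time to a horizontal pixel position in the timeline. A minimal sketch of that mapping, with the spacing the nit asks for, might look like the following. The object name `TimelineScale`, the helper `widthFor`, and the value chosen for `MS_PER_PIXEL` are assumptions made for illustration; the actual constant and surrounding code in `GenerateTimeline.scala` are not shown in this conversation.

```scala
object TimelineScale {
  // Assumed value for illustration only; the real constant is not shown in this PR excerpt.
  val MS_PER_PIXEL: Long = 5L

  /** Map a start time in milliseconds to an x offset in pixels,
    * measured from the earliest start time on the timeline.
    * Integer division truncates toward zero.
    */
  def xFor(xStart: Long, startTime: Long, minStart: Long): Long =
    xStart + (startTime - minStart) / MS_PER_PIXEL

  /** Hypothetical helper: width in pixels of a span lasting durationMs,
    * clamped so that very short spans still occupy one pixel.
    */
  def widthFor(durationMs: Long): Long =
    math.max(1L, durationMs / MS_PER_PIXEL)
}
```

Under these assumptions, a task that starts 50 ms after the earliest task is drawn 10 pixels to the right of `xStart`, for example `TimelineScale.xFor(100L, 1050L, 1000L)`.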
docs/tuning-guide.md
| Key | Name | Description |
|-----|------|-------------|
| bufferTime | buffer time | Time spent buffering input from file data sources. This buffering time happens on the CPU, typically with no GPU semaphore held. |
| readFsTime | time to read fs data | Time spent actually reading the data and writing it to on heap memory. This is a part of `bufferTime` |
The hyphenated spelling (on-heap, off-heap) is easier to parse.
build
Converting to draft because I found some issues with op time for join that I want to understand better.
build
sql-plugin/src/main/scala/com/nvidia/spark/rapids/AbstractGpuJoinIterator.scala
tools/src/main/scala/com/nvidia/spark/rapids/tool/profiling/GenerateTimeline.scala
sql-plugin/src/main/scala/com/nvidia/spark/rapids/ColumnarOutputWriter.scala
build
This is a result of trying to find a heuristic to optimize Parquet/ORC splits and of looking at buffering times to understand whether there are more optimizations we could do for HDFS and other distributed file systems.
It fixes some metrics and offers a way to visualize the metrics in the timeline view from the profiling tool. The colors on the timeline were already really bad, and this does not help at all.
I don't consider this completely done because I have not documented the metrics reporting yet. I held off because I wasn't sure if the colors I picked are okay. I was also not sure if we want to add a pattern in addition to a color to make the timeline easier to read. I am also not sure if this is something we want on by default, especially because the semaphore time only shows up when debug metrics are enabled. Here is a high-level overview.
The bottom half of each task shows the amount of time taken as reported by various metrics.
Feedback is appreciated.