Qualification tool hook up final output based on per exec analysis (#5550)

* QualificationTool. Add speedup information to AppSummaryInfo

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* address review comments

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* debug

* check for dataset

* change dataset check for all

Signed-off-by: Thomas Graves <tgraves@apache.org>

* test unsupported time

* more changes

* fix to string

* fix including wholestage codegen

* put unsupported Dur

* calculate duration of non sql stages

* change to get stage task time

* combine

* hooking up final output

* initial scores changes

* logging

* logging

* update factor

* turn off some logging

* debug

* track execs without stages

* Add in exec info output

* fix output

* add sorting

* fix output

* fix output sizes

* use plan infos without execs removed

* output children node ids

* output stages info

* cleanup

* fix running app

* fix stage header

Signed-off-by: Thomas Graves <tgraves@apache.org>

* Start removing unneeded fields

* Update summary table

* cleanup

* fix sorting

* update running app

* update test

* fix missing

* fix more reporting to be based on supported execs

* fixes

* fix test

* fix event processor calling base

* fix df duration

* debug

* debug

* fix double and int

* fix double

* fix merge

* fix double

* fix divide 0

* fix double to 2 precision

* fix formatting output

* fix sorting

* fix Running

* move around sorting

* remove logWarnings

* update sorting

* Add appId to execs report

* update stages output

* remove unused imports

* fix running app

* update tests

* fix sorting

* add estimated into to summary info

* fix running

* try using enum

* fix recommendation to string

* update to use recommended/strongly recommended

* fix recommendation

* fix divide 0

* fix opportunity

* fix up df task duration

* debug

* fix bug with estimated in csv

* rearrange code

* cleanup

* cleanup and start handling failures

* handle failures and cleanup

* fix output

* change sorting

* fixes

* handle test having udf

* change speedup of the *InPandas and arrow eval

* fix test

* fix ui after changing field names

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* make ExecInfo regular class

* fix commented out

* fix more execinfo

* update test schema

* change what goes to DF

* move to outer class

* fix schema

* fix limit option

* remove extra header csv

* match up test with csv

* Fix not supported read formats

* update results

* 2 places for average speedup

* update results

* update results

* comment out tests

* update test

* update operator scores

* cleanup

* fix test hang

Co-authored-by: Ahmed Hussein (amahussein) <a@ahussein.me>
tgravescs and amahussein authored May 23, 2022
1 parent ae27905 commit eb0f23e
Showing 74 changed files with 833 additions and 615 deletions.
@@ -2221,7 +2221,13 @@ object SupportedOpsForTools {
("FilterExec", "2.4"),
("HashAggregateExec", "3.4"),
("SortExec", "6.0"),
("SortMergeJoinExec", "14.9"))
("SortMergeJoinExec", "14.9"),
("ArrowEvalPythonExec", "1.2"),
("AggregateInPandasExec", "1.2"),
("FlatMapGroupsInPandasExec", "1.2"),
("MapInPandasExec", "1.2"),
("WindowInPandasExec", "1.2")
)
GpuOverrides.execs.values.toSeq.sortBy(_.tag.toString).foreach { rule =>
val checks = rule.getChecks
if (rule.isVisible && checks.forall(_.shown)) {
10 changes: 5 additions & 5 deletions tools/src/main/resources/operatorsScore.csv
@@ -28,11 +28,11 @@ BroadcastNestedLoopJoinExec,2.0
CartesianProductExec,2.0
ShuffledHashJoinExec,2.0
SortMergeJoinExec,14.9
AggregateInPandasExec,2.0
ArrowEvalPythonExec,2.0
FlatMapGroupsInPandasExec,2.0
MapInPandasExec,2.0
WindowInPandasExec,2.0
AggregateInPandasExec,1.2
ArrowEvalPythonExec,1.2
FlatMapGroupsInPandasExec,1.2
MapInPandasExec,1.2
WindowInPandasExec,1.2
WindowExec,2.0
Abs,3
Acos,3
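
The score file above is a plain list of `name,score` pairs. As a rough illustration of how such a file could be consumed, here is a minimal JavaScript sketch. `parseOperatorScores` and `speedupFor` are hypothetical names, not functions from this commit, and the fallback factor of 1.0 for unknown operators is an assumption.

```javascript
// Hypothetical helper (not part of the commit): parse "name,score" lines
// into a Map keyed by operator name.
function parseOperatorScores(csvText) {
  const scores = new Map();
  for (const line of csvText.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines
    const [name, rawScore] = trimmed.split(",");
    const score = Number.parseFloat(rawScore);
    // Lines without a numeric score (e.g. a header row) are ignored.
    if (name && Number.isFinite(score)) {
      scores.set(name.trim(), score);
    }
  }
  return scores;
}

// Assumption: an operator missing from the file gets a factor of 1.0,
// meaning "no speedup".
function speedupFor(scores, execName, fallback = 1.0) {
  return scores.has(execName) ? scores.get(execName) : fallback;
}

// Sample rows taken from the hunk above.
const sample = [
  "SortMergeJoinExec,14.9",
  "AggregateInPandasExec,1.2",
  "WindowExec,2.0",
  "Abs,3",
].join("\n");
const scores = parseOperatorScores(sample);
```

With the sample rows, `speedupFor(scores, "AggregateInPandasExec")` yields the updated factor of 1.2.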
16 changes: 6 additions & 10 deletions tools/src/main/resources/ui/html/raw.html
@@ -168,21 +168,17 @@ <h1 class="dash-title">Full Report</h1>
<th><span data-toggle="tooltip" data-placement="top"
title="This is an estimate at how much time the tasks spent doing processing on the CPU vs waiting on IO. Shaded red when it is below 40%">
Executor CPU Time Percent</span></th>
<th>SQL Duration with Potential Problems</th>
<th>Estimated GPU Duration</th>
<th>Unsupported Task Duration</th>
<th>Speed-up Duration</th>
<th>Speed-up Factor</th>
<th>Speed-up Bucket</th>
<th>Longest SqlDuration</th>
<th>SQL Ids with Failures</th>
<th>Read Score Percent</th>
<th>Read File Format Score</th>
<th>Unsupported Read File Formats and Types</th>
<th>Write Data Format</th>
<th>Complex Types</th>
<th>Nested Complex Types</th>
<th>Estimated Duration</th>
<th>Unsupported Duration</th>
<th>Speed-up Duration</th>
<th>Speed-up Factor</th>
<th>Total Speed-up</th>
<th>Speed-up Bucket</th>
<th>Longest SqlDuration</th>
</tr>
</thead>
</table>
7 changes: 6 additions & 1 deletion tools/src/main/resources/ui/js/qual-report.js
@@ -67,6 +67,11 @@ function getExpandedAppDetails(rowData) {
' <td> {{durationCollection.accelerationOpportunity}} </td>' +
' <td> ' + toolTipsValues.gpuRecommendations.details.gpuOpportunity + '</td>' +
' </tr>' +
' <tr>' +
' <th scope=\"row\">GPU Time Saved</th>' +
' <td> {{durationCollection.accelerationOpportunity}} </td>' +
' <td> ' + toolTipsValues.gpuRecommendations.details.gpuTimeSaved + '</td>' +
' </tr>' +
' </tbody>' +
'</table>';

@@ -161,7 +166,7 @@ $(document).ready(function(){
},
{
name: 'appDuration',
data: 'appDuration',
data: 'estimatedInfo.appDur',
type: 'numeric',
searchable: false,
render: function (data, type, row) {
44 changes: 20 additions & 24 deletions tools/src/main/resources/ui/js/raw-report.js
@@ -18,6 +18,8 @@

$(document).ready(function() {
let attemptArray = processRawData(qualificationRecords);
let totalSpeedupColumnName = "totalSpeedup"
let sortColumnForGPURecommend = totalSpeedupColumnName
let rawDataTableConf = {
// TODO: To use horizontal scroll for wide table
//"scrollX": true,
@@ -72,7 +74,7 @@ $(document).ready(function() {
return data;
},
fnCreatedCell: (nTd, sData, oData, _ignored_iRow, _ignored_iCol) => {
if (oData.estimated) {
if (oData.endDurationEstimated) {
$(nTd).css('color', 'blue');
}
}
@@ -88,45 +90,37 @@ $(document).ready(function() {
}
},
{
data: "sqlDurationForProblematic",
searchable: false,
},
{data: "failedSQLIds"},
{data: "readScorePercent"},
{data: "readFileFormatScore"},
{data: "readFileFormatAndTypesNotSupported"},
{
data: "writeDataFormat",
orderable: false,
data: "durationCollection.estimatedDurationWallClock",
},
{
data: "complexTypes",
orderable: false,
data: "unsupportedTaskDuration",
},
{
data: "nestedComplexTypes",
orderable: false,
data: "estimatedInfo.gpuOpportunity",
},
{
data: "estimatedDuration",
name: "totalSpeedup",
data: "estimatedInfo.estimatedGpuSpeedup",
},
{
data: "unsupportedDuration",
data: "gpuCategory",
},
{
data: "speedupDuration",
},
{
data: "speedupFactor",
data: "longestSqlDuration",
},
{data: "failedSQLIds"},
{data: "readFileFormatAndTypesNotSupported"},
{
data: "totalSpeedup",
data: "writeDataFormat",
orderable: false,
},
{
data: "speedupBucket",
data: "complexTypes",
orderable: false,
},
{
data: "longestSqlDuration",
data: "nestedComplexTypes",
orderable: false,
}
],
// dom: "<'row'<'col-sm-12 col-md-6'B><'col-sm-12 col-md-6'>>" +
@@ -139,6 +133,8 @@ $(document).ready(function() {
text: 'Export'
}]
};
rawDataTableConf.order =
[[getColumnIndex(rawDataTableConf.columns, sortColumnForGPURecommend), "desc"]];
var rawAppsTable = $('#all-apps-raw-data-table').DataTable(rawDataTableConf);
$('#all-apps-raw-data [data-toggle="tooltip"]').tooltip();
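
The hunk above sets the default sort by looking up a column by name rather than hard-coding its position. `getColumnIndex` is called in the diff, but its body is not shown here, so the version below is only a guess at its behavior: a minimal sketch that matches on each column's `name` property. Falling back to the first column when the name is missing is also an assumption.

```javascript
// Hypothetical stand-in for the getColumnIndex helper referenced in the diff:
// return the position of the column whose `name` matches, or -1 if none does.
function getColumnIndex(columns, columnName) {
  return columns.findIndex((col) => col.name === columnName);
}

// A reduced column list in the same shape as rawDataTableConf.columns.
const columns = [
  { name: "appName", data: "appName" },
  { data: "gpuCategory" }, // columns without a name are never matched
  { name: "totalSpeedup", data: "estimatedInfo.estimatedGpuSpeedup" },
];

const sortIndex = getColumnIndex(columns, "totalSpeedup");
// Assumption: sort by the first column if the named column is absent.
const order = [[sortIndex >= 0 ? sortIndex : 0, "desc"]];
```

Resolving the index by name keeps the default sort stable when columns are reordered, as they are in this commit.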

5 changes: 3 additions & 2 deletions tools/src/main/resources/ui/js/ui-config.js
@@ -74,7 +74,8 @@ let toolTipsValues = {
"speedupDuration": "Duration of SQL operations that are supported on GPU. It is calculated as (sqlDuration - unsupportedDuration)",
"unsupportedDuration": "An estimate total duration of SQL operations that are not supported on GPU",
"sqlDFDuration": "Time duration that includes only SQL-Dataframe queries.",
"gpuOpportunity": "Wall-Clock time that shows how much of the SQL duration can be speed-up on the GPU."
"gpuOpportunity": "Wall-Clock time that shows how much of the SQL duration can be speed-up on the GPU.",
"gpuTimeSaved": "Estimated Wall-Clock time saved if it was run on the GPU"
}
}
}
@@ -83,7 +84,7 @@ let UIConfig = {
"dataProcessing": {
// name of the column used to decide on the category of the app
// total SpeedUp is a factor between 1.0 and 10.0
"gpuRecommendation.appColumn": "totalSpeedup",
"gpuRecommendation.appColumn": "estimatedInfo.estimatedGpuSpeedup",
// when set to true, the JS will generate random value for recommendations
"simulateRecommendation": false
},
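
The configured column name is now a dotted path (`estimatedInfo.estimatedGpuSpeedup`), which plain bracket access on a record cannot resolve. The sketch below shows one way to read such a path from a nested record. `getByPath` is a hypothetical helper that does not appear in the commit, and the record is made-up sample data.

```javascript
// Hypothetical helper (not in the commit): walk a dotted path through nested
// objects, returning undefined as soon as a segment is missing.
function getByPath(record, path) {
  return path.split(".").reduce(
    (value, key) => (value !== null && value !== undefined ? value[key] : undefined),
    record
  );
}

// Made-up record in the hierarchical shape used after this change.
const rec = {
  appId: "app-1",
  estimatedInfo: { estimatedGpuSpeedup: "2.5", appDur: 1000 },
};

// Bracket access treats the whole string as a single key and finds nothing.
const flatLookup = rec["estimatedInfo.estimatedGpuSpeedup"];
// Path-based access reaches the nested value.
const nestedLookup = getByPath(rec, "estimatedInfo.estimatedGpuSpeedup");
```

DataTables itself accepts dotted notation in a column's `data` option, so a helper like this would only matter for code that indexes records directly.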
77 changes: 42 additions & 35 deletions tools/src/main/resources/ui/js/uiutils.js
@@ -127,7 +127,7 @@ class GpuRecommendationCategory {
// Method
isGroupOf(row) {
return row.gpuRecommendation >= this.range.low
&& row.gpuRecommendation < this.range.high;
&& row.gpuRecommendation < this.range.high;
}

toggleCollapsed() {
@@ -154,7 +154,6 @@ let recommendationContainer = [
"badge badge-pill badge-not-recommended"),
];
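
`isGroupOf` places a row in a category when its `gpuRecommendation` falls in a half-open range, with `low` inclusive and `high` exclusive. The sketch below illustrates that bucketing. The class name, category ids and threshold values are all hypothetical, since the real ranges are not visible in this diff.

```javascript
// Simplified stand-in for GpuRecommendationCategory; only the range test is
// modelled, using the same comparison as isGroupOf in the diff.
class RangeCategory {
  constructor(id, low, high) {
    this.id = id;
    this.range = { low, high };
  }

  isGroupOf(row) {
    return row.gpuRecommendation >= this.range.low
      && row.gpuRecommendation < this.range.high;
  }
}

// Hypothetical thresholds, chosen only to make the example concrete.
const categories = [
  new RangeCategory("A", 2.5, Infinity),
  new RangeCategory("B", 1.25, 2.5),
  new RangeCategory("C", -Infinity, 1.25),
];

// Return the id of the first category containing the row, or null if the
// value is not a number.
function categoryOf(row) {
  const match = categories.find((category) => category.isGroupOf(row));
  return match ? match.id : null;
}
```

Because the upper bound is exclusive, a value on a boundary such as 2.5 belongs to the higher category, so adjacent ranges never overlap.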


function createRecommendationGroups(recommendationsArr) {
let map = new Map()
recommendationsArr.forEach(object => {
@@ -167,22 +166,13 @@ let recommendationsMap = new Map(createRecommendationGroups(recommendationContai

let sparkUsers = new Map();


/* define constants for the tables configurations */
let defaultPageLength = 20;
let defaultLengthMenu = [[20, 40, 60, 100, -1], [20, 40, 60, 100, "All"]];

let appFieldAccCriterion = UIConfig.dataProcessing["gpuRecommendation.appColumn"];
let simulateRecommendationEnabled = UIConfig.dataProcessing["simulateRecommendation"];

function simulateGPURecommendations(appsArray, maxScore) {
for (let i in appsArray) {
appsArray[i]["gpuRecommendation"] =
simulateRecommendationEnabled ? getRandomIntInclusive(1, 10)
: appsArray[i][appFieldAccCriterion];
}
}

// bind the raw data top the GPU recommendations
function setGPURecommendations(appsArray) {
for (let i in appsArray) {
@@ -199,8 +189,7 @@ function setAppInfoRecord(appRecord) {
// which maps into wallclock time that shows how much of the SQL duration we think we can
// speed up on the GPU
function calculateAccOpportunityAsDuration(appRec) {
let ratio = (appRec["speedupDuration"] * 1.0) / appRec["sqlDataframeTaskDuration"];
return appRec["sqlDataFrameDuration"] * ratio;
return appRec.estimatedInfo.gpuOpportunity;
}

function setAppTaskDuration(appRec) {
@@ -211,52 +200,70 @@ function setAppTaskDuration(appRec) {
}

function calculateAccOpportunity(appRec) {
return (appRec["speedupDuration"] * 100.0) / appRec["appTaskDuration"];
return appRec.estimatedInfo.gpuOpportunity;
}

// Quick workaround to map names generated by scala to the name in UI
function mapFieldsToUI(rawAppRecord) {
rawAppRecord["speedupDuration"] = rawAppRecord["speedupOpportunity"]
rawAppRecord["cpuPercent"] = rawAppRecord["executorCPUPercent"];
// set default longestSqlDuration for backward compatibility
if (!rawAppRecord.hasOwnProperty("longestSqlDuration")) {
rawAppRecord["longestSqlDuration"] = 0;
}
// Note that appFieldAccCriterion variable does not work anymore after
// the fields became hierarchical
rawAppRecord["gpuRecommendation"] =
simulateRecommendationEnabled ? getRandomIntInclusive(1, 10)
: parseFloat(rawAppRecord.estimatedInfo.estimatedGpuSpeedup);
rawAppRecord["estimatedGPUDuration"] = parseFloat(rawAppRecord.estimatedInfo.estimatedGpuDur);
rawAppRecord["totalSpeedup"] = rawAppRecord.estimatedInfo.estimatedGpuSpeedup;
rawAppRecord["appDuration"] = rawAppRecord.estimatedInfo.appDur;
rawAppRecord["sqlDataFrameDuration"] = rawAppRecord.estimatedInfo.sqlDfDuration;
rawAppRecord["accelerationOpportunity"] = calculateAccOpportunity(rawAppRecord);
rawAppRecord["unsupportedTaskDuration"] = rawAppRecord["unsupportedSQLTaskDuration"];

}

function processRawData(rawRecords) {
let processedRecords = [];
let maxOpportunity = 0;
for (let i in rawRecords) {
let appRecord = JSON.parse(JSON.stringify(rawRecords[i]));
appRecord["estimated"] = appRecord["appDurationEstimated"];
appRecord["cpuPercent"] = appRecord["executorCPUPercent"];
// set default longestSqlDuration for backward compatibility
if (!appRecord.hasOwnProperty("longestSqlDuration")) {
appRecord["longestSqlDuration"] = 0;
}
mapFieldsToUI(appRecord)

appRecord["durationCollection"] = {
"appDuration": formatDuration(appRecord["appDuration"]),
"sqlDFDuration": formatDuration(appRecord["sqlDataFrameDuration"]),
"sqlDFTaskDuration": formatDuration(appRecord["sqlDataframeTaskDuration"]),
"sqlDurationProblems": formatDuration(appRecord["sqlDurationForProblematic"]),
"nonSqlTaskDurationAndOverhead": formatDuration(appRecord["nonSqlTaskDurationAndOverhead"]),
"estimatedDuration": formatDuration(appRecord["estimatedDuration"]),
"estimatedGPUDuration": formatDuration(appRecord["estimatedGPUDuration"]),
"estimatedDurationWallClock":
formatDuration((appRecord["appDuration"] * 1.0) / appRecord["totalSpeedup"]),
formatDuration(appRecord.estimatedInfo.estimatedGpuDur),
"accelerationOpportunity": formatDuration(calculateAccOpportunityAsDuration(appRecord)),
"unsupportedDuration": formatDuration(appRecord["unsupportedDuration"]),
"speedupDuration": formatDuration(appRecord["speedupDuration"]),
"longestSqlDuration": formatDuration(appRecord["longestSqlDuration"]),
"gpuTimeSaved": formatDuration(appRecord.estimatedInfo.estimatedGpuTimeSaved),
}

appRecord["totalSpeedup_display"] =
parseFloat(appRecord["totalSpeedup"]).toFixed(1);
setAppInfoRecord(appRecord);
maxOpportunity =
(maxOpportunity < appRecord[appFieldAccCriterion])
? appRecord[appFieldAccCriterion] : maxOpportunity;
(maxOpportunity < appRecord["gpuRecommendation"])
? appRecord["gpuRecommendation"] : maxOpportunity;
if (UIConfig.fullAppView.enabled) {
appRecord["attemptDetailsURL"] = "application.html?app_id=" + appRecord.appId;
} else {
appRecord["attemptDetailsURL"] = "#!"
}

setAppTaskDuration(appRecord);
appRecord["accelerationOpportunity"] = calculateAccOpportunity(appRecord);

processedRecords.push(appRecord)
}
simulateGPURecommendations(processedRecords, maxOpportunity);
setGPURecommendations(processedRecords);
setGlobalReportSummary(processedRecords);
return processedRecords;
@@ -282,30 +289,30 @@ function setGlobalReportSummary(processedApps) {
let recommendedCnt = 0;
let tlcCount = 0;
let totalDurations = 0;
let totalSqlDataframeTaskDuration = 0;
let totalSqlDataframeDuration = 0;
// only count apps that are recommended
let totalSpeedUpDurations = 0;
let totalGPUOpportunityDurations = 0;
for (let i in processedApps) {
// check if completedTime is estimated
if (processedApps[i]["estimated"]) {
if (processedApps[i]["endDurationEstimated"]) {
totalEstimatedApps += 1;
}
totalDurations += processedApps[i].appDuration;
totalSqlDataframeTaskDuration += processedApps[i].sqlDataframeTaskDuration;
totalSqlDataframeDuration += processedApps[i].sqlDataFrameDuration;
// check if the app is recommended or needs more information
let recommendedGroup = recommendationsMap.get(processedApps[i]["gpuCategory"])
if (recommendedGroup.id < "C") {
// this is a recommended app
// aggregate for GPU recommendation box
recommendedCnt += 1;
totalSpeedUpDurations += processedApps[i]["speedupDuration"]
totalGPUOpportunityDurations += processedApps[i]["accelerationOpportunity"]
} else {
if (recommendedGroup.id === "D") {
tlcCount += 1;
}
}

}

let estimatedPercentage = 0.0;
let gpuPercent = 0.0;
let tlcPercent = 0.0;
@@ -318,15 +325,15 @@ function setGlobalReportSummary(processedApps) {
gpuPercent = (100.0 * recommendedCnt) / processedApps.length;
// percent of apps missing information
tlcPercent = (100.0 * tlcCount) / processedApps.length;
speedUpPercent = (100.0 * totalSpeedUpDurations) / totalSqlDataframeTaskDuration;
speedUpPercent = (100.0 * totalGPUOpportunityDurations) / totalSqlDataframeDuration;
}
qualReportSummary.totalApps.numeric = processedApps.length;
qualReportSummary.totalApps.totalAppsDurations = formatDuration(totalDurations);
// speedups
qualReportSummary.speedups.numeric =
formatDuration(totalSpeedUpDurations);
formatDuration(totalGPUOpportunityDurations);
qualReportSummary.speedups.totalSqlDataframeTaskDuration =
formatDuration(totalSqlDataframeTaskDuration);
formatDuration(totalSqlDataframeDuration);
qualReportSummary.speedups.statsPercentage = twoDecimalFormatter.format(speedUpPercent)
+ qualReportSummary.speedups.statsPercentage;
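
After this change the summary percentage is the GPU opportunity of the recommended apps divided by the SQL DataFrame duration of all apps. The sketch below reproduces that arithmetic in isolation. `opportunityPercent` and the boolean `recommended` flag are hypothetical simplifications of the category lookup in the diff, and the zero-denominator guard is an assumption about how an empty or SQL-free input should be handled.

```javascript
// Hypothetical, simplified version of the aggregation in setGlobalReportSummary.
// `recommended` stands in for the recommendationsMap / gpuCategory lookup.
function opportunityPercent(apps) {
  let totalOpportunity = 0;
  let totalSqlDataframeDuration = 0;
  for (const app of apps) {
    totalSqlDataframeDuration += app.sqlDataFrameDuration;
    if (app.recommended) {
      // Only recommended apps contribute to the numerator.
      totalOpportunity += app.accelerationOpportunity;
    }
  }
  // Assumed guard: report 0% rather than dividing by zero.
  return totalSqlDataframeDuration > 0
    ? (100.0 * totalOpportunity) / totalSqlDataframeDuration
    : 0.0;
}

// Made-up durations in milliseconds.
const apps = [
  { recommended: true, accelerationOpportunity: 300, sqlDataFrameDuration: 1000 },
  { recommended: false, accelerationOpportunity: 500, sqlDataFrameDuration: 1000 },
];
const percent = opportunityPercent(apps);
```

For the sample data the result is 100 * 300 / 2000, that is 15 percent: the non-recommended app adds to the denominator but not to the numerator.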

@@ -73,7 +73,7 @@ object EventLogPathProcessor extends Logging {
}

// Databricks has the latest events in file named eventlog and then any rolled in format
// eventlog-2021-06-14--20-00.gz, here we assume that is any files start with eventlog
// eventlog-2021-06-14--20-00.gz, here we assume that if any files start with eventlog
// then the directory is a Databricks event log directory.
def isDatabricksEventLogDir(dir: FileStatus, fs: FileSystem): Boolean = {
val dbLogFiles = fs.listStatus(dir.getPath, new PathFilter {
@@ -36,6 +36,6 @@ case class AggregateInPandasExecParser(
(1.0, false)
}
// TODO - add in parsing expressions - average speedup across?
ExecInfo(sqlID, node.name, "", speedupFactor, duration, node.id, isSupported, None)
new ExecInfo(sqlID, node.name, "", speedupFactor, duration, node.id, isSupported, None)
}
}