Qualification tool hook up final output based on per exec analysis #5550
Conversation
Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>
Signed-off-by: Thomas Graves <tgraves@apache.org>
build
tools/src/main/scala/com/nvidia/spark/rapids/tool/planparser/SQLPlanParser.scala
// TODO - do we want to use this and rely on stages, but some SQL don't have stages
// so this is less than SQL DF real
val sqlDFWallClockDuration =
There is a bug here with the DF duration because it's looking at supported vs. all. I will fix it in a followup: #5570.
I checked that the UI works fine with this PR.
I approve this PR as long as we can follow up with PRs addressing the issues listed in #3792.
build
LGTM. The calculation of speedup can be improved in follow-on PRs.
build
Not sure why a test was hanging due to this, but it's fixed now.
build
fixes #5512 and fixes #5364
Lots of changes here.
We estimate based on the SQL task times, which are really the stages' task times added up, plus job overhead. To do this, we line up the SQL ID to its stages, then determine which operators were in each stage based on the accumulators. Note that not all operators have a mapping to stages.
We take the task time calculations, derive a ratio, and apply it to the wall clock times displayed to the user.
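The ratio approach above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation; the names (`StageTaskInfo`, `estimateSupportedWallClock`) and the simple sum-of-stage-times model are assumptions for clarity.

```scala
// Hypothetical sketch: map a SQL query's stages to their task times, compute
// what fraction of task time comes from supported operators, and apply that
// ratio to the wall-clock duration shown to the user.
case class StageTaskInfo(stageId: Int, taskTimeMs: Long, supportedTaskTimeMs: Long)

def estimateSupportedWallClock(stages: Seq[StageTaskInfo], sqlWallClockMs: Long): Long = {
  val totalTaskTime = stages.map(_.taskTimeMs).sum
  if (totalTaskTime == 0L) {
    // No stage/task mapping available; fall back to the raw wall clock time.
    sqlWallClockMs
  } else {
    val supportedTaskTime = stages.map(_.supportedTaskTimeMs).sum
    val ratio = supportedTaskTime.toDouble / totalTaskTime
    (sqlWallClockMs * ratio).toLong
  }
}
```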
Change some of the UDF-based operators to have a score of 1.2, since we only accelerate the data transfer and the UDFs themselves still run on the CPU.
Each exec checks for UDFs and Dataset operations individually and marks only those as unsupported, rather than flagging the entire SQL operation.
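A rough sketch of the per-exec scoring described above. The 1.2 value comes from this PR description; the function name and the `defaultSpeedup` parameter are illustrative, not the tool's actual API.

```scala
// Hypothetical per-exec speedup scoring: UDF-based and Dataset execs get a
// reduced factor of 1.2 because only the data transfer is accelerated and the
// UDF itself still runs on the CPU. Other execs keep their normal factor.
def execSpeedupFactor(containsUdf: Boolean, isDataset: Boolean,
    defaultSpeedup: Double): Double = {
  if (containsUdf || isDataset) 1.2 else defaultSpeedup
}
```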
We have four levels of recommendation: strongly recommended, recommended, not recommended, and not applicable. "Not applicable" is used when there are stage or job failures. Need to follow up and update the UI with a good way to display this.
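The four levels could be modeled as in the sketch below. Note the speedup thresholds here are placeholders for illustration only; the PR does not state the actual cutoffs.

```scala
// Hypothetical classification into the four recommendation levels.
object Recommendation extends Enumeration {
  val StronglyRecommended, Recommended, NotRecommended, NotApplicable = Value
}

def classify(estimatedSpeedup: Double, hasFailures: Boolean): Recommendation.Value = {
  if (hasFailures) {
    Recommendation.NotApplicable          // stage or job failures
  } else if (estimatedSpeedup >= 3.0) {   // threshold is illustrative
    Recommendation.StronglyRecommended
  } else if (estimatedSpeedup >= 1.3) {   // threshold is illustrative
    Recommendation.Recommended
  } else {
    Recommendation.NotRecommended
  }
}
```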
Updated the output to be what we want. The focus is on wall clock time for the human-readable side; the CSV file has the task-level details. I think we will want to put more in the CSV file, but for now this works. The exec-level details are output to one CSV file, and the stages used in SQL operations are output to another.
Need to do more testing on this, but I would like to get it in so others can start testing. A few tests were commented out to be updated later.
There is also more code cleanup and refactoring I would like to do to make this more readable but that can wait.