Use fresh SparkSession when capturing to avoid late capture of previous query #537

jlowe · 2020-08-10T22:32:42Z

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

This hopefully fixes #473.

I believe what's happening in that bug is that the test just before the failing one isn't trying to capture, yet the capture callback is still enabled. I suspect the callback for the last test's query is late, arriving while the next test is already running and after it has enabled callback capture. That causes the next test to capture the previous test's GPU run as its CPU run, and its subsequent GPU run then captures the CPU run instead, which explains why we see a CPU plan when it fails.

This updates runOnCpuAndGpuWithCapture to force a new Spark session, which should drain the listener callbacks during the session stop and create a hard boundary between the previous test and the next test that is trying to capture. The downside is that tests using captures will be slower because a new session has to be created.
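To make the suspected sequence concrete, here is a minimal sketch in plain Scala with no Spark dependency. Everything in it (`PlanCapture`, `ToySession`, `LateCaptureDemo`) is a hypothetical stand-in, not the actual test harness or the Spark listener API. It only assumes that query callbacks are delivered some time after the query runs and that stopping a session delivers whatever is still pending.

```scala
import scala.collection.mutable

// Hypothetical global capture hook: when armed, it records the next plan it sees.
object PlanCapture {
  private var armed = false
  private var captured: Option[String] = None

  def arm(): Unit = { captured = None; armed = true }

  // Invoked for every finished query; ignored unless a capture is armed.
  def onQueryFinished(plan: String): Unit =
    if (armed) { captured = Some(plan); armed = false }

  def result: Option[String] = captured
}

// Toy session (an assumption, not Spark): finished-query callbacks are queued
// and delivered later, the way an asynchronous listener would deliver them.
final class ToySession {
  private val pending = mutable.Queue.empty[String]

  def runQuery(plan: String): Unit = pending.enqueue(plan)

  // Deliver the oldest pending callback, if any.
  def deliverOne(): Unit =
    if (pending.nonEmpty) PlanCapture.onQueryFinished(pending.dequeue())

  // Stopping drains every pending callback before returning.
  def stop(): Unit = while (pending.nonEmpty) deliverOne()
}

object LateCaptureDemo {
  // Arm the capture, run one query, and take whichever callback arrives first.
  private def captureRun(session: ToySession, plan: String): String = {
    PlanCapture.arm()
    session.runQuery(plan)
    session.deliverOne()
    PlanCapture.result.getOrElse("<nothing captured>")
  }

  // Returns the (CPU, GPU) plans captured by a test that follows a non-capturing test.
  def run(freshSession: Boolean): (String, String) = {
    var session = new ToySession
    session.runQuery("previous test GPU plan") // its callback has not been delivered yet

    if (freshSession) {
      session.stop()           // drains the late callback while nothing is armed
      session = new ToySession // hard boundary before the capturing test
    }

    val cpu = captureRun(session, "CPU plan")
    val gpu = captureRun(session, "GPU plan")
    (cpu, gpu)
  }

  def main(args: Array[String]): Unit = {
    println(run(freshSession = false))
    println(run(freshSession = true))
  }
}
```

In this toy model, `run(freshSession = false)` returns the previous test's GPU plan in the CPU slot and the CPU plan in the GPU slot, which matches the symptom described above. `run(freshSession = true)` returns the CPU and GPU plans in their proper slots, because the late callback is delivered while no capture is armed.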

jlowe · 2020-08-10T22:32:58Z

build

Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>

Use fresh SparkSession when capturing to avoid late capture of previo…

9ee7591

…us query Signed-off-by: Jason Lowe <jlowe@nvidia.com>

jlowe added the bug (Something isn't working) and test (Only impacts tests) labels Aug 10, 2020

jlowe added this to the Aug 3 - Aug 14 milestone Aug 10, 2020

jlowe self-assigned this Aug 10, 2020

kuhushukla approved these changes Aug 10, 2020

revans2 approved these changes Aug 11, 2020

revans2 merged commit f7e8536 into NVIDIA:branch-0.2 Aug 11, 2020

nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021

Use fresh SparkSession when capturing to avoid late capture of previo…

e355d0c

…us query (NVIDIA#537) Signed-off-by: Jason Lowe <jlowe@nvidia.com>

nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021

Use fresh SparkSession when capturing to avoid late capture of previo…

9a787b2

…us query (NVIDIA#537) Signed-off-by: Jason Lowe <jlowe@nvidia.com>

jlowe deleted the capture-bug branch September 10, 2021 15:31

pxLi mentioned this pull request Jul 20, 2022

[BUG] Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec failure #6032

Closed
