Enabling AQE on [databricks] #6461
Conversation
…rs next Signed-off-by: Navin Kumar <navink@nvidia.com>
…Stats fix Signed-off-by: Navin Kumar <navink@nvidia.com>
… to AQE optimizations Signed-off-by: Navin Kumar <navink@nvidia.com>
build
Looks like I found a new failure when using SQL UNION, seems to crash the system:
Might be caused by the NPE here:
… is enabled on Databricks Signed-off-by: Navin Kumar <navink@nvidia.com>
This is actually an issue in GpuShuffleExchangeExec sending an incorrect job id (maybe stage id?) to the scheduler. The current workaround is to fall back ShuffleExchangeExec to the CPU when using Databricks and AQE.
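The workaround being described might be sketched as a simple predicate. This is a hypothetical stand-in: `isDatabricks` and `aqeEnabled` are illustrative names, not the plugin's actual API, and the real shim layer makes this decision through its own tagging infrastructure.

```scala
// Hypothetical sketch of the fallback decision described above; the real
// plugin decides this through its shim layer, not through these names.
object ShuffleFallback {
  // On Databricks with AQE enabled, keep ShuffleExchangeExec on the CPU to
  // avoid sending an incorrect job/stage id to the scheduler.
  def shouldFallbackToCpu(isDatabricks: Boolean, aqeEnabled: Boolean): Boolean =
    isDatabricks && aqeEnabled
}
```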
…s to using original Spark implementation to fix concurrency bug in shim Signed-off-by: Navin Kumar <navink@nvidia.com>
Signed-off-by: Navin Kumar <navink@nvidia.com>
build
Fixed the issue in GpuShuffleExchangeExec on Databricks 10.4. For Databricks 9.1, ShuffleExchangeExec will fall back to CPU when AQE is enabled in that environment, due to the complexities of how it handles the submission of the map stage. Updated the lead comment to reflect these updates.
Have we measured the performance impact of this? In many cases AQE is not that big of a performance win, but sending the data to the CPU and back again is a really big performance hit.
Note that it is still on the GPU in Databricks 10.4. We can measure the performance impact for the older version, which has this fallback, if that makes sense.
Overall seems fine. I thought at one point you had mentioned implementing computeStats per exec and trying to get a realistic size; did that not work out?
sql-plugin/src/main/311+-db/scala/com/nvidia/spark/rapids/shims/ShimLeafExecNode.scala
sql-plugin/src/main/321db/scala/com/nvidia/spark/rapids/shims/AQEUtils.scala
If we wanted to do it correctly and most accurately, it is a bit of an undertaking to implement the PlanVisitor for not just the leaf execs but also some of the operations. In particular, we would potentially have to implement our own join estimation. Also, when I started looking into it, I got some strangely different numbers in our Parquet reading execs vs. what Databricks is reporting. I figure that can be done separately as a task in the future (if there is any upside to the computation).
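As a rough illustration of what a per-exec computeStats could produce, here is a minimal sketch. The `SizeStats` type is a stand-in for Spark's Statistics (which carries sizeInBytes and an optional rowCount), and the naive rows-times-bytes estimate is an assumption for illustration, not the plugin's code.

```scala
// Stand-in for Spark's Statistics (sizeInBytes plus an optional rowCount).
case class SizeStats(sizeInBytes: BigInt, rowCount: Option[BigInt])

object StatsEstimator {
  // Naive leaf-node estimate: rows * estimated bytes per row. A full
  // PlanVisitor would propagate and refine such estimates through joins and
  // other operators, which is the larger undertaking described above.
  def estimate(rowCount: Long, bytesPerRow: Long): SizeStats =
    SizeStats(BigInt(rowCount) * BigInt(bytesPerRow), Some(BigInt(rowCount)))
}
```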
Actually figured out how to get GPU Shuffle back in 9.1, will push an update shortly.
…logic Signed-off-by: Navin Kumar <navink@nvidia.com>
build
sql-plugin/src/main/312db/scala/com/nvidia/spark/rapids/shims/AQEUtils.scala
sql-plugin/src/main/321db/scala/com/nvidia/spark/rapids/shims/AQEUtils.scala
// stage, replacing them with aliases. Also, sometimes children are not provided in the
// initial list of expressions after optimizations, so we add them here, and they will
// be deduped anyways in the other passes
val newChildren = wf.children.map(ce =>
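The dedup idea the comment above describes could be sketched like this, using plain strings as a stand-in for Spark expressions. This is hypothetical: the real code operates on Expression/Alias nodes, and the alias naming scheme here is invented for illustration.

```scala
// Simplified model of the dedup described above: each distinct child
// expression gets one alias, and repeated occurrences are replaced by a
// reference to the first aliased instance, so each is computed once.
object ChildDedup {
  def dedup(children: Seq[String]): (Seq[String], Map[String, String]) = {
    // Preserve first-seen order so alias numbering is deterministic.
    val aliases = scala.collection.mutable.LinkedHashMap[String, String]()
    val newChildren = children.map { ce =>
      aliases.getOrElseUpdate(ce, s"alias_${aliases.size}")
    }
    (newChildren, aliases.toMap)
  }
}
```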
@mythrocks, mind taking a look at this?
Sorry for the delay. I'm not particularly familiar with the logic behind isPreNeeded, etc. But on discussion with @NVnavkumar, one wonders if we should check why isPreNeeded is turning up false on Databricks with AQE turned on. We might adjust how isPreNeeded is calculated.
So, isPreNeeded is false on Databricks, but it's a bit of a red herring in this case. We actually need the window function children in the GpuWindowExec itself; it looks like the extra GpuProject does not help in this case.
After some exploration, I have determined that the bug this fixes is not AQE-specific and I'm not 100% confident that this fix is currently the right approach. I have reverted this fix and filed a new issue #6531 to track the bug here.
build
1 similar comment
build
… an AQE-specific bug Signed-off-by: Navin Kumar <navink@nvidia.com>
build
…k in CI Signed-off-by: Navin Kumar <navink@nvidia.com>
build
build
Fixes #1059
This branch fixes a couple of issues with enabling Spark Adaptive Query Execution (AQE) on the Databricks Spark environment. Currently this is marked as WIP since I'm still investigating whether there are any more obvious ongoing issues with enabling adaptive execution in Databricks systems. Here are the issues fixed so far:
- Implemented certain method calls that are currently required by the Databricks environment.
- Implemented handling of a logical-plan window optimization that happens in Databricks distributions but not currently in Apache Spark.
- Fixed the Databricks shim for GpuShuffleExchangeExec for 10.4 to fix the issue with a duplicate submission of the map stage, which causes a missing job id reference.
- Added to the AQEUtils shim to handle 9.1 so that ShuffleExchangeExec will fall back to CPU when AQE is enabled.
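For reference, AQE itself is controlled by a standard Spark configuration key. A minimal sketch of turning it on in a session, assuming `spark` is an existing SparkSession with the RAPIDS plugin on the classpath:

```scala
// Enable adaptive query execution; this is the standard Spark config key.
spark.conf.set("spark.sql.adaptive.enabled", "true")
// Enable the RAPIDS SQL plugin's GPU execution path.
spark.conf.set("spark.rapids.sql.enabled", "true")
```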