Is your feature request related to a problem? Please describe.
Once we merge #462 we will lose the optimization of wrapping a GpuShuffleExchangeExec in a GpuCoalesceBatches, which we have when AQE is off.
We should explore ways to re-enable this, even though we are not seeing a performance degradation in the TPCxBB benchmarks. It is possible that other AQE optimizations are offsetting the impact of this.
Describe the solution you'd like
We have to return something that implements ShuffleExchangeLike when AQE creates new shuffle query stages, so we can't return GpuCoalesceBatchesExec.
We could potentially create some kind of wrapper for GpuCoalesceBatchesExec(GpuShuffleExchangeExec) that implements ShuffleExchangeLike. Another option would be to build the coalesce functionality directly into GpuShuffleExchangeExec.
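To make the wrapper option concrete, here is a minimal toy sketch. It does not use the real Spark or spark-rapids classes: `Batch`, `Exec`, `ShuffleLike`, `ToyShuffleExchange`, `ToyCoalesce`, and `ToyCoalescedShuffle` are all hypothetical stand-ins, and the greedy row-count merge is an assumed simplification of what coalescing does. It only illustrates the shape of the idea: a node that satisfies the shuffle interface AQE asks for while delegating to coalesce-over-shuffle internally.

```scala
// Toy model only: simplified stand-ins, NOT the real Spark or spark-rapids classes.
object CoalesceWrapperSketch {
  // Stand-in for a columnar batch: only a row count.
  final case class Batch(numRows: Int)

  // Stand-in for a physical plan node that produces batches.
  trait Exec { def execute(): Seq[Batch] }

  // Stand-in for the interface AQE requires on shuffle query stages.
  trait ShuffleLike extends Exec

  // Stand-in for the shuffle exchange: emits whatever batches it was given.
  final case class ToyShuffleExchange(batches: Seq[Batch]) extends ShuffleLike {
    def execute(): Seq[Batch] = batches
  }

  // Stand-in for coalescing: greedily merge consecutive batches
  // without exceeding targetRows per output batch.
  final case class ToyCoalesce(child: Exec, targetRows: Int) extends Exec {
    def execute(): Seq[Batch] = {
      val out = scala.collection.mutable.ArrayBuffer.empty[Batch]
      var pending = 0
      for (b <- child.execute()) {
        // Flush the pending batch if adding this one would exceed the target.
        if (pending > 0 && pending + b.numRows > targetRows) {
          out += Batch(pending)
          pending = 0
        }
        pending += b.numRows
      }
      if (pending > 0) out += Batch(pending)
      out.toSeq
    }
  }

  // The wrapper option: it is itself ShuffleLike, so it could be returned
  // where a shuffle stage is required, but it executes coalesce(shuffle).
  final case class ToyCoalescedShuffle(shuffle: ToyShuffleExchange, targetRows: Int)
      extends ShuffleLike {
    def execute(): Seq[Batch] = ToyCoalesce(shuffle, targetRows).execute()
  }
}
```

The second option from the description, building coalescing directly into the exchange, would correspond in this toy model to moving the merge loop into `ToyShuffleExchange.execute()` instead of adding a separate wrapper type.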
Describe alternatives you've considered
None.
Additional context
None.
Note that when we use compression for shuffle we will likely rely on GpuCoalesceBatches to batch-decompress a set of compressed ColumnarBatch instances. See #487 for details.
After spending time studying the code today, I now understand that GpuCoalesceBatches(GpuShuffleExchangeExec()) translates at execution time to GpuCoalesceBatches wrapping a ShuffledBatchRDD, which reads the output of the map-side shuffle operation.
The current AQE PR does something very similar because it creates GpuCoalesceBatches(GpuCustomShuffleReader()) which also just wraps ShuffledBatchRDD.
This means we did not lose any optimization after all, so I don't think anything actually needs to happen with this issue.
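The equivalence can be illustrated with a small toy model. None of these types are the real classes: `ToyShuffledBatchReader` stands in for ShuffledBatchRDD, `ToyShuffleExchange` and `ToyCustomShuffleReader` stand in for the two plan nodes, and `ToyCoalesce` assumes, for simplicity, that everything merges into a single batch.

```scala
// Toy model only: simplified stand-ins, NOT the real Spark or spark-rapids classes.
object ShuffleReadSketch {
  final case class Batch(numRows: Int)

  // Stand-in for ShuffledBatchRDD: reads what the map side wrote.
  final class ToyShuffledBatchReader(mapOutput: Seq[Batch]) {
    def read(): Seq[Batch] = mapOutput
  }

  trait Exec { def execute(): Seq[Batch] }

  // Plan node used when AQE is off.
  final case class ToyShuffleExchange(reader: ToyShuffledBatchReader) extends Exec {
    def execute(): Seq[Batch] = reader.read()
  }

  // Plan node created by the AQE path.
  final case class ToyCustomShuffleReader(reader: ToyShuffledBatchReader) extends Exec {
    def execute(): Seq[Batch] = reader.read()
  }

  // Simplified coalesce: merge all child batches into one.
  final case class ToyCoalesce(child: Exec) extends Exec {
    def execute(): Seq[Batch] = {
      val total = child.execute().map(_.numRows).sum
      if (total == 0) Seq.empty else Seq(Batch(total))
    }
  }
}
```

In this model both plan shapes end up coalescing the batches produced by the same reader, so wrapping either node in `ToyCoalesce` gives the same result.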