This is pure tech debt and is a follow-on from #3178 (comment). The aggregation code is horribly complicated and handles lots of special cases that the open source Spark code does not. I am not 100% sure why we have done all of this special-case work. A lot of it feels like setupReferences is part of the issue, and we need to do a deeper dive into it.
As a high-level explanation of how this is supposed to work:
Each aggregation comes with a mode. Each mode tells the aggregation what to do as a part of that stage. Originally the code assumed that there would only ever be one mode for all of the aggregations. I thought we had ripped that all out and each aggregation does the right thing.
To successfully do an aggregation there are a few steps:

1. Initial projection to get the input data set up properly.
2. Initial aggregation to produce partial result(s).
3. Merge aggregation to combine partial results (this requires that the input schema and the output schema be the same).
4. Final projection to take the combined partial results and produce a final result.
In general the steps follow the pattern 1, 2, 3*, 4: steps 1, 2 and 4 are required, and step 3 can be repeated as often as needed because its input and output schemas are the same.
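The four-step pipeline above can be modeled with a toy `sum(x)` grouped by `k`. This is an illustrative sketch, not the plugin's actual code; all function names here are hypothetical, and the key property to notice is that the merge step's input and output have the same shape, so it can be applied any number of times.

```python
# Illustrative model of the four-step aggregation pipeline for sum(x) group by k.
# All names are hypothetical; this is not the real plugin code.

def initial_projection(rows):
    # Step 1: shape the input for the aggregation (here, just cast x to float).
    return [(k, float(x)) for k, x in rows]

def initial_aggregation(rows):
    # Step 2: produce partial results per key within one batch.
    out = {}
    for k, x in rows:
        out[k] = out.get(k, 0.0) + x
    return list(out.items())

def merge_aggregation(partials_a, partials_b):
    # Step 3: combine partial results. Input and output schemas are the
    # same (key, partial_sum), so this step can be repeated as needed.
    out = dict(partials_a)
    for k, s in partials_b:
        out[k] = out.get(k, 0.0) + s
    return list(out.items())

def final_projection(partials):
    # Step 4: turn merged partials into the final result rows.
    return sorted(partials)

batch1 = [("a", 1), ("b", 2)]
batch2 = [("a", 3)]
p1 = initial_aggregation(initial_projection(batch1))
p2 = initial_aggregation(initial_projection(batch2))
merged = merge_aggregation(p1, p2)   # Step 3 could run again on more batches
result = final_projection(merged)
print(result)  # [('a', 4.0), ('b', 2.0)]
```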
Step 4 requires that all of the data for a given group-by key is on the same task and has been merged into a single output row. There are several different ways to do this, which is why we end up with several aggregation modes.
Partial mode means that we do Step 1 and Step 2. Then we can do Step 3 as many times as needed depending on how we are doing memory management, and how many batches are needed.
PartialMerge mode means we can do Step 3 at least once and possibly more times depending on how we are doing memory management and how many batches are needed.
Final mode means that we do the same steps as with PartialMerge but do Step 4 when we are done doing the partial merges.
Complete mode is something only Databricks does, but it essentially means we do Step 1, Step 2, Step 3* (depending on memory management requirements), and Step 4 all at once.
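The mode descriptions above can be summarized as a mapping from each Spark aggregate mode to the pipeline steps it runs. This table is a sketch derived from the text; the step names and the helper are illustrative, not the plugin's actual API.

```python
# Hypothetical summary of which pipeline steps each aggregate mode runs,
# per the descriptions above. "merge*" means Step 3 repeats as needed
# (memory management, number of batches).

MODE_STEPS = {
    "Partial":      ["project", "aggregate", "merge*"],
    "PartialMerge": ["merge", "merge*"],
    "Final":        ["merge", "merge*", "finalize"],
    # Complete (seen on Databricks): all four steps in one operator.
    "Complete":     ["project", "aggregate", "merge*", "finalize"],
}

def ends_with_final_result(mode):
    # Only modes whose last step is the final projection emit final rows.
    return MODE_STEPS[mode][-1] == "finalize"

print(ends_with_final_result("Final"))    # True
print(ends_with_final_result("Partial"))  # False
```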
The majority of the complexity appears to be in setupReferences where it tries to match up the data coming into the AggregateExec with what each aggregation wants at a given stage.
As @abellina put it: "I think the main ask is to not ... assum[e] ... that a hash aggregate exec has a certain shape. If this function could decide per aggregate expression mode what the right binding should be, it should be more robust to new aggregate exec setups that mix and match modes (if we encounter new ones). ..."
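One way to read that suggestion is as a per-expression dispatch instead of a per-exec assumption. The sketch below is hypothetical and heavily simplified (the real setupReferences deals with actual attribute binding in Scala); it only illustrates choosing a binding target from each aggregate expression's own mode, so an exec that mixes modes needs no special case.

```python
# Hypothetical, simplified sketch of per-expression binding: each aggregate
# expression's mode decides what its input is bound against at this stage.
# Schema names below are illustrative, not the plugin's real identifiers.

def bind_references(agg_exprs):
    """agg_exprs: list of (name, mode) pairs. Returns which schema each
    expression's input should be bound against."""
    bindings = []
    for name, mode in agg_exprs:
        if mode in ("Partial", "Complete"):
            # These modes start from raw input rows (Steps 1 and 2).
            bindings.append((name, "input-schema"))
        elif mode in ("PartialMerge", "Final"):
            # These modes consume partial-result buffers (Steps 3 and 4).
            bindings.append((name, "buffer-schema"))
        else:
            raise ValueError(f"unknown aggregate mode: {mode}")
    return bindings

# A mixed-mode exec no longer needs a special case:
print(bind_references([("sum(x)", "PartialMerge"), ("count(y)", "Partial")]))
# [('sum(x)', 'buffer-schema'), ('count(y)', 'input-schema')]
```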