Is your feature request related to a problem? Please describe.
I don't know how important this is. When I was doing performance testing I didn't see many cases where the cost of this had a significant impact on overall performance. But it is also not free, which is why I added a metric for it in #8618.
One such example is a simple SUM on an int where the key is small and has a decent amount of overlap. Because the SUM of an int is a long, and because there are only two columns, it looks like the size of the output is growing by more than 10%. In practice that is not true, and the cost of computing the distinct count might be more than the cost of doing the aggregation itself. I saw a 9% performance degradation in this one case. I don't know if my magic number of 10% should be closer to 200% or something, or if there is a better way to decide that the number of aggregations is just too small to even bother with the heuristic.
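To make the arithmetic behind the two-column case concrete, here is a rough sketch. It assumes a 4-byte int key, which the description above does not actually state, and `row_growth` is a hypothetical helper written only for illustration, not the heuristic's real code. It shows that widening the SUM from int to long is enough on its own to push a per-row size comparison well past 10%.

```python
# Assumed fixed widths in bytes; the key type is an assumption, not from the issue.
INT_BYTES = 4
LONG_BYTES = 8


def row_growth(key_bytes: int, input_value_bytes: int, output_value_bytes: int) -> float:
    """Fractional change in per-row width from aggregation input to output."""
    before = key_bytes + input_value_bytes
    after = key_bytes + output_value_bytes
    return (after - before) / before


# int key + int value in, int key + long SUM out: 8 bytes -> 12 bytes per row.
growth = row_growth(INT_BYTES, INT_BYTES, LONG_BYTES)
print(f"per-row growth: {growth:.0%}")
```

With more columns, or a wider key, the same 4 extra bytes would be a much smaller fraction of the row, which is why the effect stands out when there are only two columns.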