
Improve aggregation performance of average on DECIMAL128 columns [databricks] #4776

Merged
2 commits merged on Feb 16, 2022


jlowe
Member

@jlowe jlowe commented Feb 14, 2022

Fixes #4722.

Accelerates DECIMAL128 average aggregations using the same technique employed in #4688. This allows the use of the cudf hash-based algorithm instead of the slower sort-based algorithm for these types of aggregations.
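The description does not spell out the technique from #4688, so the following is only a sketch under a stated assumption: that each 128-bit value is split into four 32-bit chunks, each chunk is summed in a 64-bit accumulator (a type that hash-based aggregation can handle), and the partial sums are reassembled before dividing by the row count. All function names here are hypothetical and are not part of the plugin or of cudf.

```python
from fractions import Fraction

MASK32 = 0xFFFFFFFF
INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1


def split_chunks(value):
    """Split a signed 128-bit integer into four 32-bit chunks, least significant first.

    The three low chunks are unsigned; the top chunk keeps the sign.
    """
    assert -(1 << 127) <= value < (1 << 127), "value must fit in 128 bits"
    low = [(value >> (32 * i)) & MASK32 for i in range(3)]
    high = value >> 96  # arithmetic shift, so the sign is preserved
    return low + [high]


def chunked_sum(values):
    """Sum 128-bit integers by summing each 32-bit chunk position separately."""
    sums = [0, 0, 0, 0]
    for v in values:
        for i, chunk in enumerate(split_chunks(v)):
            sums[i] += chunk
    # Assumption: a real engine would hold each partial sum in a 64-bit integer.
    assert all(INT64_MIN <= s <= INT64_MAX for s in sums)
    # Reassemble: each partial sum is weighted by its chunk position.
    return sum(s << (32 * i) for i, s in enumerate(sums))


def chunked_average(unscaled_values, scale):
    """Average of decimals given as unscaled integers sharing one scale."""
    total = chunked_sum(unscaled_values)
    return Fraction(total, len(unscaled_values) * 10 ** scale)
```

For example, `chunked_average([12345, -67890, 10**37, -(10**37) + 5], 2)` gives the same result as dividing the plain sum by the count. The sketch returns an exact fraction and leaves out the overflow checks and result rounding that a real decimal average would need.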

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
jlowe added the performance label (a performance related task/issue) on Feb 14, 2022
jlowe added this to the Feb 14 - Feb 25 milestone on Feb 14, 2022
jlowe self-assigned this on Feb 14, 2022
@jlowe
Member Author

jlowe commented Feb 14, 2022

build

@jlowe
Member Author

jlowe commented Feb 15, 2022

CI failed due to recent cudf string split API breakage:

[2022-02-14T17:43:11.360Z] [ERROR] [Error] /home/jenkins/agent/workspace/jenkins-rapids_premerge-github-3972/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/stringFunctions.scala:1371: overloaded method value stringSplitRecord with alternatives:
[2022-02-14T17:43:11.361Z]   (x$1: String,x$2: Int)ai.rapids.cudf.ColumnVector <and>
[2022-02-14T17:43:11.361Z]   (x$1: String,x$2: Boolean)ai.rapids.cudf.ColumnVector
[2022-02-14T17:43:11.361Z]  cannot be applied to (ai.rapids.cudf.Scalar, Int)

@jlowe
Member Author

jlowe commented Feb 15, 2022

build

@jlowe
Member Author

jlowe commented Feb 15, 2022

CI couldn't find resources:

[2022-02-15T15:28:14.617Z] Waited 60 times already, stopping

@jlowe
Member Author

jlowe commented Feb 15, 2022

build

@jlowe jlowe merged commit 342cbca into NVIDIA:branch-22.04 Feb 16, 2022
@jlowe jlowe deleted the dec128-avg-perf branch February 16, 2022 00:22
Labels
performance A performance related task/issue
Development

Successfully merging this pull request may close these issues.

Optimize DECIMAL128 average aggregations
3 participants