Add in unbounded to unbounded optimization for min/max #9228

Merged: revans2 merged 1 commit into NVIDIA:branch-23.10 from the u_to_u_min_n_max branch on Sep 19, 2023

@revans2 revans2 commented Sep 12, 2023

This fixes #9057

I know using the GPU for min/max on two scalar values is problematic, but it is by far the simplest way to make it happen.

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
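
A minimal sketch of the general idea behind an unbounded-to-unbounded min/max, written in plain Python rather than the plugin's actual code. It assumes an unpartitioned window whose input arrives as a list of batches that can be iterated twice, and it ignores nulls. The names `combine`, `reduce_batches`, and `unbounded_min_max` are hypothetical and do not come from the PR. The point it illustrates is that every row receives the same min and max, so each batch can be reduced to a pair of scalars, the scalars combined across batches, and the result attached to each row without ever holding the whole partition in a single batch.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Batch = List[int]


def combine(a: Optional[int], b: Optional[int],
            op: Callable[[int, int], int]) -> Optional[int]:
    """Combine two scalar partial results, treating None as 'no value yet'."""
    if a is None:
        return b
    if b is None:
        return a
    return op(a, b)


def reduce_batches(batches: Iterable[Batch]) -> Tuple[Optional[int], Optional[int]]:
    """Pass 1: reduce each batch to (min, max) and fold the scalars together."""
    running_min: Optional[int] = None
    running_max: Optional[int] = None
    for batch in batches:
        if not batch:
            continue  # an empty batch contributes nothing
        running_min = combine(running_min, min(batch), min)
        running_max = combine(running_max, max(batch), max)
    return running_min, running_max


def unbounded_min_max(batches: List[Batch]) -> Iterator[List[Tuple[int, Optional[int], Optional[int]]]]:
    """Pass 2: emit each batch again with the partition-wide min and max attached."""
    lo, hi = reduce_batches(batches)
    for batch in batches:
        yield [(value, lo, hi) for value in batch]


if __name__ == "__main__":
    for out in unbounded_min_max([[5, 3], [9], [], [1, 7]]):
        print(out)
```

The `combine` step is the "min/max on two scalar values" operation mentioned above; in this sketch it runs on the CPU, which is a simplification and not a claim about how the patch does it.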
@revans2 revans2 self-assigned this Sep 12, 2023

revans2 commented Sep 12, 2023

build


revans2 commented Sep 12, 2023

After this patch

spark.range(1, 10000000000L, 1, 1).selectExpr("id", "MIN(id) OVER (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING) as min_id", "MAX(id) OVER (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING) as max_id").show()

passes. Before it failed very quickly with

Caused by: java.lang.IllegalStateException: A single batch is required for this operation, but cuDF only supports 2147483647 rows. At least 2147483648 are in this partition. Please try increasing your partition count.
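
The numbers in that message can be checked with a little arithmetic. The sketch below assumes only what the query and the error state: `spark.range(1, 10000000000L, 1, 1)` produces the values 1 through 9,999,999,999 in one partition, and the reported per-batch limit is 2147483647 rows. The constant and function names are illustrative.

```python
# Row limit quoted in the error message; it equals the largest signed 32-bit integer.
CUDF_MAX_ROWS = 2**31 - 1


def min_batches(rows: int, limit: int = CUDF_MAX_ROWS) -> int:
    """Smallest number of batches needed to hold `rows` rows (integer ceiling division)."""
    return -(-rows // limit)


# range(1, 10_000_000_000) with step 1 excludes the end value.
ROWS_IN_PARTITION = 10_000_000_000 - 1

if __name__ == "__main__":
    print(CUDF_MAX_ROWS)
    print(ROWS_IN_PARTITION)
    print(min_batches(ROWS_IN_PARTITION))
```

With one partition the input needs at least five batches, so any operator that insists on a single batch has to fail on this query.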

@sameerz sameerz added the "task" label (Work required that improves the product but is not user facing) Sep 18, 2023
@mythrocks mythrocks self-requested a review September 18, 2023 16:55
Comment on lines +455 to +456
query_parts = ['min(b) over (order by a rows between UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as min_col',
               'max(b) over (order by a rows between UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as max_col']
Collaborator

That's a neat way of phrasing the query. I'll note this for next time.

Collaborator Author

I stole it from test_window_running_no_part just below this on line 492
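
For readers unfamiliar with the frame used in the query parts under review, here is a small sketch of what it computes. It uses Python's built-in `sqlite3` module (which needs SQLite 3.25 or newer for window functions) instead of Spark, and the table `t` and its three rows are made up for illustration. With `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` every row's frame is the whole result set, so `min_col` and `max_col` are the same on every row.

```python
import sqlite3

# Hypothetical sample data: columns a (ordering key) and b (aggregated value).
SAMPLE_ROWS = [(1, 40), (2, 10), (3, 30)]

QUERY = """
SELECT a, b,
       min(b) OVER (ORDER BY a ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS min_col,
       max(b) OVER (ORDER BY a ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_col
FROM t
ORDER BY a
"""


def run_query(rows):
    """Load rows into an in-memory table and evaluate the windowed min/max."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query(SAMPLE_ROWS):
        print(row)
```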

@revans2 revans2 merged commit d7f58ea into NVIDIA:branch-23.10 Sep 19, 2023
29 checks passed
@revans2 revans2 deleted the u_to_u_min_n_max branch September 19, 2023 15:42

Successfully merging this pull request may close these issues.

[FEA] Add unbounded to unbounded fixers for min and max