Fixed a few issues with out of core sort #2209

Merged: revans2 merged 3 commits into NVIDIA:branch-0.5 from out_of_core_sort_fix on Apr 21, 2021

@revans2 revans2 commented Apr 20, 2021

This fixes an off-by-one error when an entire batch is already sorted. It also fixes an issue when inserting batches into the pending queue, and an issue when sorting only rows, no columns. Any of these could lead to data corruption.

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
@revans2 revans2 added the bug Something isn't working label Apr 20, 2021
@revans2 revans2 added this to the Apr 12 - Apr 23 milestone Apr 20, 2021
@revans2 revans2 self-assigned this Apr 20, 2021

revans2 commented Apr 20, 2021

build

revans2 commented Apr 20, 2021

build

@abellina abellina left a comment

small comments

revans2 commented Apr 20, 2021

build

@@ -284,7 +283,7 @@ case class GpuOutOfCoreSortIterator(
// Protect ourselves from large rows when there are small targetSizes
val targetRowCount = Math.max((targetBatchSize/averageRowSize).toInt, 1024)

- if (sortedOffset == rows - 1) {
+ if (sortedOffset == rows) {
Collaborator

I think the code would read easier if we renamed sortedOffset to sortedRows or numSortedRows

Collaborator Author

I personally think of it in terms of offsets instead of number of rows. The fact that they end up being equal is just because the sorted values are at the first part of the batch.

Collaborator

Not a blocking issue.

Just to explain my thinking: I think of an offset as the start position of an array, or of a range the way it's used on L300, [sortedOffset, rows). Since it describes the unsorted range, if we wanted to use the term offset, I'd call it unsortedOffset. On the other hand, we can view it as the definition of the sorted area [0, sortedOffset), in which case sortedRows works better for me.
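
For readers following the naming discussion, here is a minimal sketch of the two half-open ranges and of why the comparison in the diff matters. It assumes a plain `Vector[Int]` as a stand-in for a batch; `SortedPrefixSketch`, `splitBatch`, `isFullySorted`, and `isFullySortedOld` are hypothetical names for illustration and are not taken from `GpuOutOfCoreSortIterator`, which works on GPU batches.

```scala
// Illustrative only: plain Ints stand in for the rows of a GPU batch.
object SortedPrefixSketch {
  // Split a batch into the already-sorted prefix [0, sortedOffset)
  // and the remaining range [sortedOffset, rows).
  def splitBatch(batch: Vector[Int], sortedOffset: Int): (Vector[Int], Vector[Int]) = {
    require(sortedOffset >= 0 && sortedOffset <= batch.length, "offset out of range")
    batch.splitAt(sortedOffset)
  }

  // sortedOffset is the exclusive end of the sorted prefix, so the whole
  // batch is sorted only when it equals the row count.
  def isFullySorted(sortedOffset: Int, rows: Int): Boolean =
    sortedOffset == rows

  // The pre-fix comparison from the diff, kept here only for contrast.
  def isFullySortedOld(sortedOffset: Int, rows: Int): Boolean =
    sortedOffset == rows - 1
}
```

With five rows, `isFullySortedOld(5, 5)` is false even though every row falls in the sorted prefix, and `isFullySortedOld(4, 5)` is true even though the last row is still outside it. Because the offset is an exclusive end, it coincides with the number of sorted rows, which is the point both reviewers are making from different angles.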

pxLi commented Apr 21, 2021

build

revans2 commented Apr 21, 2021

@gerashegalov is it OK if I merge this? or do you want me to make changes because the memory model is not OK

@gerashegalov gerashegalov left a comment

it's more of my mental model. no problem either way.

@revans2 revans2 merged commit f60d11d into NVIDIA:branch-0.5 Apr 21, 2021
@revans2 revans2 deleted the out_of_core_sort_fix branch April 21, 2021 18:31
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
Labels
bug Something isn't working
5 participants