Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retry to RoundRobin Partitioner and Range Partitioner #9419

Merged
merged 10 commits into from
Oct 23, 2023

Conversation

winningsix
Copy link
Collaborator

Currently there're three partitioners in general: round robin, range and single partitioner. This PR will cover round robin and range partitioner. And will create a follow-up PR for single partitioner.

It's tested by newly added UTs.

Related issue is 8502.

@winningsix
Copy link
Collaborator Author

build

1 similar comment
@res-life
Copy link
Collaborator

build

@res-life
Copy link
Collaborator

build

@winningsix
Copy link
Collaborator Author

build

1 similar comment
@res-life
Copy link
Collaborator

build

@sameerz sameerz added the reliability Features to improve reliability or bugs that severly impact the reliability of the plugin label Oct 16, 2023
@firestarman firestarman linked an issue Oct 16, 2023 that may be closed by this pull request
@winningsix
Copy link
Collaborator Author

build

firestarman
firestarman previously approved these changes Oct 18, 2023
(parts, columns)
} else {
// Increase ref count since the caller will close the batch also.
val spillableBatch = SpillableColumnarBatch(GpuColumnVector.incRefCounts(batch),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same leak question here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This batch will be close in its caller.

@winningsix
Copy link
Collaborator Author

build

1 similar comment
@firestarman
Copy link
Collaborator

build

@winningsix
Copy link
Collaborator Author

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits from me.

new Random(TaskContext.get().partitionId())
} else {
// For unit test purpose where task context does not exist
new Random
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should we have a hard coded seed here instead? just so that we get deterministic results for unit tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it could be helpful when we need to validate result. We can touch this part when we have such needs.

TestUtils.withGpuSparkSession(new SparkConf()) { _ =>
val rrp = GpuRoundRobinPartitioning(partNum)
// batch will be closed within columnarEvalAny
val batch = buildBatch
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it might be nice to build the batch right when we call columnarEvalAny just so there is less code between where it is created and where it is consumed. Just because it could leak if there is an exception inbetween.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's separated due to buildBatch method implement didn't have a retry in this test suite. We can do it and since it's just a test, so I skipped it.


val rp = GpuRangePartitioner(Array.apply(bounds), gpuSorter)
// batch will be closed within columnarEvalAny
val batch = buildBatch
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nit here about how long it lives for. Very minor, but a nice to have.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@winningsix winningsix merged commit ed37af8 into NVIDIA:branch-23.12 Oct 23, 2023
29 of 30 checks passed
@winningsix winningsix deleted the 8502 branch October 23, 2023 00:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reliability Features to improve reliability or bugs that severly impact the reliability of the plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add retry to GpuShuffleExchangeExec
6 participants