
Allow skipping host spill for a direct device->disk spill #9211

Merged
merged 12 commits into from
Sep 15, 2023

Conversation

@abellina (Collaborator) commented Sep 8, 2023

Contributes to #8881.

This is a work in progress that allows spill to skip the host tier. I still need to run it under a workload (by enabling the "skip host" mode locally), check for leaks, and make sure all the accounting is working correctly. The mode is left disabled by default because I think this is something that will be integrated later against the new HostAlloc API, but I can try integrating that in this PR if needed.

I did some minor refactoring because I didn't want to force a particular order between calling free and getting the size of the spilled RapidsBuffer in the source tier. I made that method a val, so we are guaranteed it is set until GC.

This also fixes #9223, an issue with 0-byte RapidsBuffers as they spill to disk. mmap on a 0-byte file segment is not valid, yet this can very well happen given, for example, a ColumnarBatch with 0 rows. The code in this PR will not allow these 0-byte buffers to be marked as spillable, so they will never reach disk.
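The 0-byte guard described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual API: `BufferSketch` and `isSpillable` are hypothetical names standing in for the real RapidsBuffer bookkeeping.

```scala
// Hypothetical sketch: a 0-byte buffer must never be marked spillable,
// because mmap-ing a 0-length file segment is invalid. A ColumnarBatch
// with 0 rows can legitimately produce such a buffer.
case class BufferSketch(id: Long, sizeInBytes: Long)

def isSpillable(buf: BufferSketch): Boolean =
  // 0-byte buffers stay out of the spill framework, so they never reach disk
  buf.sizeInBytes > 0
```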

    throw new IllegalStateException("copying from buffer without device memory")

    stream: Cuda.Stream): Option[RapidsBufferBase] = {
      val wouldFit = trySpillToMaximumSize(other, stream)
      // TODO: this is disabled for now since subsequent work will tie this into
@abellina (Author):
@revans2 I think this is an integration point with the host allocator, so I left it as a TODO, disabled by default, so the code does what it used to do if we decide to merge this. Let me know if I should look into the APIs and try to integrate things, or if this is what you needed.
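The decision being gated by the TODO above can be sketched as follows. This is a simplified illustration under assumed names (`chooseSpillTarget`, `skipHostEnabled`, `wouldFitInHost` are not the real API): when skip-host mode is on, or when the buffer would not fit in the host store, the spill goes device-to-disk directly.

```scala
// Hypothetical sketch of the skip-host spill decision.
sealed trait SpillTier
case object HostTier extends SpillTier
case object DiskTier extends SpillTier

def chooseSpillTarget(skipHostEnabled: Boolean, wouldFitInHost: Boolean): SpillTier =
  if (skipHostEnabled || !wouldFitInHost) DiskTier // direct device -> disk spill
  else HostTier                                    // normal device -> host spill
```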

@abellina abellina changed the title Host spill skip to disk Allow skipping host spill for a direct device->disk spill Sep 8, 2023
@revans2 (Collaborator) left a comment

I did a quick pass and it is looking good. I want to spend some more time on it later to dig in a little deeper.

@sameerz added the reliability label (Features to improve reliability or bugs that severely impact the reliability of the plugin) Sep 9, 2023
@abellina (Author):
I have an issue with the posted code where I will lose a RapidsBuffer if it spills while I am spilling. This is because of how I grouped the buffers together and updated tiers all at once. I am testing a fix and will push today.
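The fix direction hinted at above can be sketched roughly like this: record each buffer's new tier one at a time under a lock, rather than batching all tier updates after the copies finish, so a concurrent spill of the same buffer is not lost. `TierCatalog` and its methods are illustrative names, not the plugin's real catalog.

```scala
import scala.collection.mutable

// Hypothetical sketch: per-buffer tier updates under a lock close the window
// in which a buffer spilled by another thread could be dropped from the
// bookkeeping.
class TierCatalog {
  private val tiers = mutable.Map.empty[Long, String]
  // record a buffer's tier immediately, not as part of a deferred batch
  def record(id: Long, tier: String): Unit = synchronized { tiers(id) = tier }
  def tierOf(id: Long): Option[String] = synchronized { tiers.get(id) }
}
```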

@abellina abellina marked this pull request as ready for review September 13, 2023 16:35
@abellina (Author):
@revans2 this is ready for another pass

@@ -659,38 +642,19 @@ class RapidsBufferCatalog(
logDebug(s"Skipping spilling $buffer ${buffer.id} to ${spillStore.name} as it is " +
s"already stored in multiple tiers")
}
>>>>>>> 740d1175060555ad373f30ed780a58215b6c3419
@revans2 (Collaborator):
I don't think the upmerge was done properly.

@abellina (Author) replied Sep 13, 2023:
You are right, this method needs to go away (it's now in the store). Fixed in 4916333.

@abellina (Author):
build

    // try one last time after locking the catalog (slow path)
    // if there is a lot of contention here, I would rather lock the world than
    // have tasks error out with "Unable to acquire"
    synchronized {
@abellina (Author):
This is not strictly necessary, but while debugging another issue I thought it wouldn't hurt. Before this change we would perform the attempts and throw, without locking the catalog and trying one last time. I don't have evidence of this saving a task that would otherwise fail, however.
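The retry-then-lock slow path described above can be sketched as follows. This is a simplified illustration under assumed names (`acquireWithRetry`, `catalogLock`, and the attempt count are not the real API; only the "Unable to acquire" message comes from the comment above).

```scala
// Hypothetical sketch: try the normal path a few times, then take the
// catalog-wide lock for one final attempt before failing the task.
def acquireWithRetry[T](catalogLock: AnyRef, attempts: Int)(
    tryAcquire: () => Option[T]): T = {
  var result: Option[T] = None
  var remaining = attempts
  while (result.isEmpty && remaining > 0) {
    result = tryAcquire()
    remaining -= 1
  }
  result.getOrElse {
    // slow path: lock the world rather than error out immediately
    catalogLock.synchronized {
      tryAcquire().getOrElse(
        throw new IllegalStateException("Unable to acquire"))
    }
  }
}
```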

@abellina (Author):
My IntelliJ was running without assertions, so I didn't see the test failures reported by CI. Fixed.

@abellina (Author):
build

@abellina abellina merged commit 34d615d into NVIDIA:branch-23.10 Sep 15, 2023
28 checks passed
@abellina abellina deleted the host_spill_skip_to_disk branch September 15, 2023 14:16
Labels
reliability Features to improve reliability or bugs that severely impact the reliability of the plugin

Successfully merging this pull request may close these issues.

[BUG] Failed to create memory map on query14_part1 at 100TB with spark.executor.cores=64