[FEA] Trigger host memory spilling when more host memory is needed #8881

Closed
revans2 opened this issue Jul 31, 2023 · 0 comments · Fixed by #9257
Assignees
Labels
reliability Features to improve reliability or bugs that severly impact the reliability of the plugin task Work required that improves the product but is not user facing


revans2 commented Jul 31, 2023

Is your feature request related to a problem? Please describe.
The HostMemoryStore can spill memory to disk, but right now that only happens when a spill from the GPU needs more host memory to complete. The goal here is to expose an API so that the new allocation APIs from #8879 can call into it when memory is needed.

At the same time we need to tie the two APIs together, which gets a little complicated. The spill storage would need to decide whether to spill pinned or pageable memory to satisfy a request. We should favor spilling pageable memory over pinned memory unless we have no other choice. In fact, spilling pinned memory can be a follow-on issue if we need it.
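The pageable-before-pinned preference could be sketched as below. This is a minimal illustration in Python, not the plugin's code: `HostBuffer`, `pick_spill_victims`, and the `allow_pinned` switch are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class HostBuffer:
    # Hypothetical record describing one spillable host buffer.
    name: str
    size: int
    pinned: bool


def pick_spill_victims(buffers, needed, allow_pinned=False):
    """Pick buffers to spill so that at least `needed` bytes are freed.

    Pageable buffers are always considered first. Pinned buffers are only
    considered when `allow_pinned` is True and pageable memory alone is
    not enough. Returns None if the request cannot be satisfied.
    """
    pageable = [b for b in buffers if not b.pinned]
    pinned = [b for b in buffers if b.pinned] if allow_pinned else []
    victims, freed = [], 0
    for buf in pageable + pinned:
        if freed >= needed:
            break
        victims.append(buf)
        freed += buf.size
    return victims if freed >= needed else None
```

Leaving `allow_pinned` off by default mirrors the idea that pinned spilling can be deferred to a follow-on issue.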

The host memory spill storage would be updated to use the new allocation APIs, and the new allocation APIs would be updated to call into the host spill storage's spill API if an allocation would violate a limit but there is enough spillable memory to cover it. Ideally reservations will not trigger a spill, but because memory can be made non-spillable dynamically, it might be simplest to spill as much as the reservation needs at the time it is allocated. If we do this we should have a follow-on issue to understand what it would take to avoid it. To avoid deadlocks, the spill storage will use the maxPriority flag when doing host memory allocations.
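A toy model of the allocation path described above, assuming a single byte limit and a store that can report and free spillable bytes. `HostPool` and all of its methods are hypothetical; priorities (including maxPriority) and reservations are not modeled.

```python
class HostPool:
    """Toy model of a limited host pool backed by a spillable store."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0
        self.spillable = {}  # buffer id -> size, eligible to be spilled
        self.spilled = []    # buffer ids moved to "disk", in spill order

    def add_spillable(self, buf_id, size):
        # Track an in-memory buffer that the store may spill later.
        if self.used + size > self.limit:
            raise MemoryError("pool is full")
        self.used += size
        self.spillable[buf_id] = size

    def spill(self, needed):
        # Spill whole buffers until at least `needed` bytes are freed.
        freed = 0
        for buf_id in list(self.spillable):
            if freed >= needed:
                break
            freed += self.spillable.pop(buf_id)
            self.spilled.append(buf_id)
        self.used -= freed
        return freed

    def allocate(self, size):
        # Returns True on success, False if the caller must fall back.
        if size > self.limit:
            return False  # can never fit in the pool
        shortfall = self.used + size - self.limit
        if shortfall > 0:
            if sum(self.spillable.values()) < shortfall:
                return False  # not enough spillable memory to cover it
            self.spill(shortfall)
        self.used += size
        return True
```

The spill is triggered only when the limit would be violated and only for the shortfall, although whole buffers are spilled, so more than the shortfall may be freed.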

In addition, if the spill storage code is informed that an allocation is too large to ever fit in the pool, then instead of allocating it anyway, which is what happens today, we will need to spill the data from GPU memory to disk using bounce buffers (possibly on-heap buffers) and bypass a single large CPU allocation altogether.
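The bounce-buffer idea amounts to a chunked copy through one small reusable buffer, so no host allocation of the full size is ever made. In this sketch `copy_via_bounce_buffer` is a hypothetical helper, and ordinary Python file-like objects stand in for the GPU source and the disk destination.

```python
import io


def copy_via_bounce_buffer(src, dst, total, bounce_size):
    """Copy `total` bytes from src to dst through one fixed-size buffer."""
    bounce = bytearray(bounce_size)
    view = memoryview(bounce)
    copied = 0
    while copied < total:
        want = min(bounce_size, total - copied)
        n = src.readinto(view[:want])  # fill (part of) the bounce buffer
        if not n:
            raise EOFError("source ended before all bytes were copied")
        dst.write(view[:n])            # drain it to the destination
        copied += n
    return copied


# Usage: a 2560-byte payload moved through a 100-byte bounce buffer.
payload = bytes(range(256)) * 10
out = io.BytesIO()
copy_via_bounce_buffer(io.BytesIO(payload), out, len(payload), 100)
```

Peak extra host memory is bounded by `bounce_size` regardless of the payload size. The same loop shape would presumably serve the disk-to-GPU direction as well.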

We should also update the code that reads data out of the disk spill storage and puts it on the GPU. It should use the new allocation APIs, and if an allocation would never work it will also need to move the data back to the GPU in a way that is guaranteed to work. We still need this code to operate at super-priority levels.

Ideally this should expose callbacks to the allocation APIs so that blocked threads can be woken up as more memory is made spillable. If this needs to wait for #8882, a follow-on issue needs to be filed so we don't drop this.
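One way such a callback could wake blocked threads is with a condition variable. This is only a sketch: `SpillableTracker`, `on_made_spillable`, and `wait_for_spillable` are hypothetical names, and the real integration with the allocation APIs may look quite different.

```python
import threading


class SpillableTracker:
    """Tracks spillable bytes and wakes threads waiting for more."""

    def __init__(self):
        self._cond = threading.Condition()
        self._spillable = 0

    def on_made_spillable(self, size):
        # Callback the spill store would invoke when a buffer becomes spillable.
        with self._cond:
            self._spillable += size
            self._cond.notify_all()

    def wait_for_spillable(self, needed, timeout=None):
        # Block until at least `needed` bytes are spillable.
        # Returns False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._spillable >= needed, timeout
            )
```

A blocked allocator thread would call `wait_for_spillable` and then retry its allocation once it returns True.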

@revans2 revans2 added ? - Needs Triage Need team to review and classify task Work required that improves the product but is not user facing reliability Features to improve reliability or bugs that severly impact the reliability of the plugin labels Jul 31, 2023
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Aug 8, 2023

3 participants