Is your feature request related to a problem? Please describe.
When GDS spill is enabled, we would like to perform aligned IO, i.e. reads and writes between GPU memory and NVMe drives should be aligned to 4 KiB. However, the current implementation in the shuffle server (specifically in `BufferSendState`) packs shuffle buffers back to back into the UCX bounce buffer, so the reads that fill it generally cannot be aligned.
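To illustrate the problem, here is a minimal sketch of what back-to-back packing does to offsets. It is not the `BufferSendState` code; the function names and the buffer sizes are hypothetical and chosen only to show the arithmetic.

```python
ALIGNMENT = 4096  # 4 KiB, the alignment named in this issue


def packed_offsets(buffer_sizes):
    """Return the start offset of each buffer when packed back to back."""
    offsets = []
    cursor = 0
    for size in buffer_sizes:
        offsets.append(cursor)
        cursor += size
    return offsets


def is_aligned(value, alignment=ALIGNMENT):
    """True when value is a multiple of alignment."""
    return value % alignment == 0


# Hypothetical shuffle buffer sizes in bytes (assumption for illustration).
sizes = [5000, 8192, 100]
offsets = packed_offsets(sizes)
for size, offset in zip(sizes, offsets):
    print(f"size={size} offset={offset} aligned={is_aligned(offset)}")
```

Only the first buffer starts on a 4 KiB boundary. Every buffer after one whose size is not a multiple of 4 KiB lands at an unaligned offset in the bounce buffer.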
Describe the solution you'd like
An option in the shuffle server to allow 4 KiB aligned reads of shuffle buffers.
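One possible shape for such an option, sketched under assumptions: round each requested range out to the enclosing 4 KiB window, read that window, and skip the leading padding when handing bytes to the sender. The helper names below are hypothetical and this is only the offset arithmetic, not a proposal for the actual shuffle server API.

```python
ALIGNMENT = 4096  # 4 KiB, the alignment named in this issue


def align_down(value, alignment=ALIGNMENT):
    """Round value down to the nearest multiple of alignment."""
    return value - (value % alignment)


def align_up(value, alignment=ALIGNMENT):
    """Round value up to the nearest multiple of alignment."""
    return align_down(value + alignment - 1, alignment)


def aligned_read_window(offset, length, alignment=ALIGNMENT):
    """Return (start, window_length, skip) for the aligned window
    that covers the byte range [offset, offset + length).

    skip is the number of leading bytes in the window that precede
    the requested range and must be discarded by the caller.
    """
    start = align_down(offset, alignment)
    end = align_up(offset + length, alignment)
    return start, end - start, offset - start


# Hypothetical request: 8192 bytes starting at byte 5000 of a spilled buffer.
start, window_length, skip = aligned_read_window(5000, 8192)
print(start, window_length, skip)  # 4096 12288 904
```

The trade-off is that the destination must hold the whole window, so each read can need up to two extra 4 KiB blocks of space beyond the requested length.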
Describe alternatives you've considered
An alternative is to unspill whole shuffle buffers back into GPU memory before streaming them to the UCX bounce buffer, but for large shuffle buffers this might cause more spilling and thus be less efficient.
Additional context
The GPUDirect Storage best practices guide talks about aligned vs. unaligned IO (https://docs.nvidia.com/gpudirect-storage/best-practices-guide/index.html).
@abellina @jlowe