[WIP] Pack/unpack cuDF frames during serialization #4803
Can one of the admins verify this patch?
f35dbc5 to 2b40691
Dask typically sets this value itself, so it may overwrite what we set here. Still, make sure we set it correctly ourselves. That makes it a little easier to experiment with the serialization and deserialization functions directly, and if Dask doesn't set it for some reason, we have handled it ourselves.
In the `"cuda"` case, we really want to make sure Dask does not try to compress our data. This shouldn't be happening anyway, but this provides some protection against it. In the `"dask"` case, allow compression, since the data is on host.
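A minimal sketch of the idea above, not the code from this PR. It assumes the serialization header is a plain `dict`, that the relevant key is named `"compression"`, and that it holds one entry per frame, with `None` meaning "do not compress". The helper name `set_compression` is hypothetical.

```python
def set_compression(header, frames, serializer):
    """Return a copy of `header` with per-frame compression settings.

    Hypothetical helper: the "compression" key and its one-entry-per-frame
    layout are assumptions made for illustration.
    """
    header = dict(header)  # do not mutate the caller's header
    if serializer == "cuda":
        # Device data: explicitly mark every frame as uncompressed.
        header["compression"] = (None,) * len(frames)
    else:
        # Host data ("dask" case): leave the choice to Dask.
        header.pop("compression", None)
    return header
```

Setting the value explicitly in the `"cuda"` case is purely defensive: it costs nothing if Dask already skips compression for device frames.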
To cut down on the number of frames that need to be transmitted over the wire or transferred to/from host, pack all `Buffer`s into one `DeviceBuffer`, using CuPy to facilitate this. In deserialization, use CuPy to unpack them into a series of separate `DeviceBuffer` allocations. Since cuDF already makes sure CuPy uses RMM for allocations, we need not worry about this when creating `ndarray`s. However, it is worth noting that we pay for a copy and a larger allocation during both serialization and deserialization. It would be nice to avoid that, but doing so probably requires some C++/CUDA code, so we have held off on it for now.
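The pack/unpack scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: it uses NumPy on host memory as a stand-in for CuPy so that it runs without a GPU, and the function names `pack_buffers` and `unpack_buffers` are hypothetical. The two copies the description mentions are visible here: one into the packed buffer, and one back out into separate allocations.

```python
import numpy as np


def pack_buffers(buffers):
    """Copy several byte buffers into one contiguous uint8 array.

    Returns the packed array and the per-buffer lengths, which the
    receiver needs in order to split the data again.
    """
    arrays = [np.frombuffer(b, dtype="u1") for b in buffers]
    lengths = [a.size for a in arrays]
    packed = np.empty(sum(lengths), dtype="u1")  # one larger allocation
    offset = 0
    for a in arrays:
        packed[offset:offset + a.size] = a  # copy on serialization
        offset += a.size
    return packed, lengths


def unpack_buffers(packed, lengths):
    """Split a packed array back into separately allocated arrays."""
    out = []
    offset = 0
    for n in lengths:
        out.append(packed[offset:offset + n].copy())  # copy on deserialization
        offset += n
    return out
```

With this layout only one frame plus the list of lengths has to travel, instead of one frame per buffer.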
Looks like Dask is doing something funky. The gist is it appears to be trying to call
Codecov Report
@@             Coverage Diff              @@
##           branch-0.14    #4803   +/-   ##
===============================================
- Coverage       88.42%   88.34%   -0.09%
===============================================
  Files              51       51
  Lines            9734     9743       +9
===============================================
  Hits             8607     8607
- Misses           1127     1136       +9
Closing as this is more thoroughly addressed by PR #5025.
Related to issues #3793, https://github.com/rapidsai/rmm/issues/318, and rapidsai/dask-cuda#250.
To cut down on the number of frames that need to be transmitted over the wire or transferred to/from host, pack all `Buffer`s into one `DeviceBuffer`, using CuPy to facilitate this. In deserialization, use CuPy to unpack them into a series of separate `DeviceBuffer` allocations.

Since cuDF already makes sure CuPy uses RMM for allocations, we need not worry about this when creating `ndarray`s. However, it is worth noting that we pay for a copy and a larger allocation during both serialization and deserialization. It would be nice to avoid that, but doing so probably requires some C++/CUDA code, so we have held off on it for now.

cc @quasiben