cudf should spill to main memory when running out of GPU memory #129
Comments
Solving this issue allows cudf to compute medium-size data (5 GB). AFAIK 50 GB was failing due to OOM (main memory), thus I filed a new FR in cudf to handle such cases: rapidsai/cudf#3740 |
@datametrician I would appreciate any clues about what the problem might be. Since I switched to using managed memory, I started to get the following error
It causes the CUDA driver to hang (I assume); trying to use cudf in another session hangs that session as well. I cannot even kill the process (from
The only way out seems to be a hard reboot, which is not an option at the moment. |
more complete output @datametrician
|
@jangorecki, can you join the RAPIDS Go-AI Slack channel? We do have a feature for this, dask_cudf, and I can show you how to use dask_cudf to get around this. Is there a reason why using our |
@jangorecki this issue looks like managed memory eating up the system memory to the point that the driver context is corrupted, where unfortunately the only option is to restart the machine. UVM only supports spilling to host memory because the migration from host to GPU occurs via a page-fault mechanism that won't work with disks. As Taurean pointed out, |
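For readers following along, here is a minimal sketch of what switching cudf to managed memory could look like. It assumes the RMM Python package (`rmm`) backs cudf allocations and uses `rmm.reinitialize`; the helper name `enable_managed_memory` is hypothetical and does not come from this thread. The guard only covers a missing package or an unavailable GPU. It does not protect against the host-memory exhaustion described above.

```python
def enable_managed_memory():
    """Try to switch RMM to CUDA managed (unified) memory.

    Returns True on success, False if rmm is missing or cannot be initialized.
    Hypothetical helper for illustration only.
    """
    try:
        import rmm  # assumption: RMM is the allocator behind cudf

        # Managed memory lets allocations oversubscribe the GPU and page to host.
        rmm.reinitialize(managed_memory=True, pool_allocator=False)
    except Exception as exc:  # rmm not installed, no GPU, or a driver error
        print(f"managed memory not enabled: {exc}")
        return False
    return True


enabled = enable_managed_memory()
print("managed memory enabled:", enabled)
```

With managed memory on, host RAM becomes the backing store for whatever does not fit on the GPU, so it seems prudent to watch free system memory while a benchmark runs.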
@taureandyernv Thanks for your comment. It is not that dask-cudf is insufficient; I want to use dask-cudf in benchmarks. The problem is that I found the documentation lacking for my use case (see "dask_cudf.read_csv docstring").
@kkraus14 Thanks for your comment. It helps a lot. It is quite bad that it is so easy to corrupt the driver context. IMO it is a good reason to warn users before they use managed memory with cudf alone, but of course not with dask-cudf, as you explained. Hopefully I will move to dask-cudf soon. |
Spilling to main memory cannot be done reliably without using dask-cudf. The currently implemented spilling was rolled back so we can still run cudf benchmarks. Re-opening this issue to wait for dask-cudf support |
Hey @jangorecki, we use |
@taureandyernv Thanks for trying to help. Although spilling cudf to main memory works, it is not reliable because it can corrupt the driver context, and then the whole machine has to be rebooted. So I agree it only makes sense to use it with dask_cudf, which AFAIU is not affected by that issue. |
According to the comment in rapidsai/cudf#2288 (comment), one could spill to main memory without actually using dask-cudf.
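A rough sketch of the dask-cudf route discussed in this thread, assuming the `dask_cuda` and `dask_cudf` packages are installed. `LocalCUDACluster` accepts a `device_memory_limit`, and its workers move data from GPU memory to host memory once that limit is reached. The helper name, the file path, and the 10GB limit are placeholders, not values taken from this issue.

```python
def read_csv_with_spilling(path, device_memory_limit="10GB"):
    """Start a single-node CUDA cluster that spills to host, then read a CSV lazily.

    Hypothetical helper for illustration; the limit is a placeholder value.
    """
    # Imports are local so this file can be loaded on a machine without RAPIDS.
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster
    import dask_cudf

    # Each worker starts spilling device buffers to host RAM above this limit.
    cluster = LocalCUDACluster(device_memory_limit=device_memory_limit)
    client = Client(cluster)

    # Lazy, partitioned read; nothing is loaded until a computation is requested.
    ddf = dask_cudf.read_csv(path)
    return client, ddf


# Example usage on a machine with a GPU (placeholder path):
#   client, ddf = read_csv_with_spilling("data.csv")
#   print(ddf.head())
#   client.close()
```

Because the data is split into partitions, only part of it needs to be resident on the GPU at a time, which is what lets this approach avoid relying on managed memory.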
related #126