[BUG] cudaErrorInvalidValue when creating cudf.Series from float16 CuPy Series #9065
Comments
Not saying a CUDA runtime error should happen here, but does cuDF support |
Also, why the heck would your code result in a call to |
The problem is coming from these changes in the PR you pointed to. The issue is that we're now calling @shwina @galipremsagar I'm not 100% sure what the right solution is here. My two cents: we probably want
In cases where we're creating cuDF columns from objects supporting the |
I added some print statements in Of note, the error is on the
@harrism to clarify my previous comment, the problem is that previously cuDF was converting the |
The fix to this issue is here: #9069 The root cause was that because of using @vyasr I thought of two approaches:
I don't like it that an error at the Python level can result in a CUDA runtime error in RMM which is hard to diagnose. Instead, it should result in a libcudf exception thrown before getting to RMM. This indicates a hole in our C++ test coverage. That's why I am trying to understand this at the libcudf level (I still don't, because I don't know how to hit the C++ debugger from Python code). |
@galipremsagar discussing the tradeoffs between these two approaches is exactly what I was thinking about in my previous comment. I agree that having some sort of @dantegd how urgent is it to get this fixed? @shwina is out until next week and it would be nice to get his take as well. If we need this soon we can always move forward with #9069 as is but leave this issue unresolved until we've answered the deeper questions about how |
@harrism Agreed we should definitely be catching this somewhere in |
@vyasr was hoping someone who is already working on debugging it could take this. To save waiting for a debug build, could someone figure out what inputs Python is/was passing that resulted in the runtime error? Then we could create a gtest which would repro the problem and then fix the test so it throws before calling whatever is getting the CUDA runtime error. |
I would agree. We probably don't want |
@galipremsagar would you be able to update #9069 to also reflect that change? It's possible that change will reveal other cases where this float16->float32 conversion inside |
@galipremsagar if you see a place where libcudf could provide a more helpful error and fail gracefully rather than crashing, please file an issue! |
Fixes: #9065
This PR enables using `np.dtype` only for the `__cuda_array_interface__` scenario in `as_column`. The dtype in this array interface is guaranteed to be numeric, which `np.dtype` can handle. There is also `float16` dtype upcasting logic already in place below, i.e., at line 1760.
Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)
Approvers:
- Ashwin Srinath (https://github.com/shwina)
URL: #9069
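The idea in the PR description can be sketched roughly as follows. This is only an illustration, not cuDF's actual `as_column` code: `FakeDeviceArray` and `resolve_column_dtype` are hypothetical names, and the sketch assumes only that the `typestr` entry of `__cuda_array_interface__` is a NumPy-style type string that `np.dtype` can parse.

```python
import numpy as np


class FakeDeviceArray:
    """Hypothetical stand-in for a GPU array; it only carries the interface dict."""

    def __init__(self, typestr, shape):
        # Field names follow the CUDA Array Interface; the pointer is a dummy value.
        self.__cuda_array_interface__ = {
            "typestr": typestr,
            "shape": shape,
            "data": (0, False),
            "version": 3,
        }


def resolve_column_dtype(obj):
    """Hypothetical helper: choose a column dtype for an interface-exposing object."""
    # The typestr is numeric by construction, so np.dtype can parse it directly.
    dtype = np.dtype(obj.__cuda_array_interface__["typestr"])
    # Upcast float16 to float32, mirroring the upcasting the PR text mentions.
    if dtype == np.float16:
        dtype = np.dtype(np.float32)
    return dtype
```

With this sketch, an object advertising `"<f2"` resolves to `float32`, while other numeric type strings pass through unchanged.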
Describe the bug
I get an error when creating a Series from float16 CuPy objects. I believe this error did not occur before #8949.
Steps/Code to reproduce bug
Expected behavior
The call should not fail or, if float16 is not supported, it should fail gracefully rather than with a CUDA error.
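One way to picture "failing gracefully" is an up-front check that raises a Python exception before any device memory is touched. The sketch below is an assumption-laden illustration, not cuDF behavior: `check_typestr` is a hypothetical helper, and `SUPPORTED` is an illustrative set, since the dtypes cuDF actually accepts are not listed in this thread.

```python
# Illustrative (kind, itemsize) pairs only; not cuDF's real list of supported dtypes.
SUPPORTED = {("f", 4), ("f", 8), ("i", 4), ("i", 8)}


def check_typestr(typestr):
    """Hypothetical check: reject unsupported type strings with a TypeError.

    A NumPy-style typestr such as "<f2" is a byte-order character,
    a kind character, and the item size in bytes.
    """
    kind, itemsize = typestr[1], int(typestr[2:])
    if (kind, itemsize) not in SUPPORTED:
        # Raise a readable Python error instead of letting the request reach the GPU.
        raise TypeError(f"unsupported dtype {typestr!r}; cast the input first")
    return kind, itemsize
```

A caller passing `"<f2"` would then see a `TypeError` naming the offending dtype rather than a CUDA runtime error.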
Environment overview (please complete the following information)
Environment details