GPU build broken with CUDA SDK 12.0 #13932
Comments
If your system doesn't have libcublasLt.so.11, then the answer is: "yes". |
Thanks. What I meant was: in future releases, would it be possible to link only against libcublasLt.so rather than libcublasLt.so.11, and then check versions at runtime using the proper API calls? Regards, |
@tufei, would you mind showing me an example? We didn't explicitly put the name "libcublasLt.so.11" in the link command. We put "-lcublasLt" there and the linker resolved it to "libcublasLt.so.11". Most Linux shared libraries work this way. I don't know how to change it. |
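As an illustrative aside (not from the original comment; the library path is an example and may differ on your system), you can inspect which SONAMEs the provider library records and how the loader resolves them:
# List the NEEDED (SONAME) entries baked in at link time; the official CUDA 11 build records libcublasLt.so.11 here
readelf -d libonnxruntime_providers_cuda.so | grep NEEDED
# Show how the dynamic loader resolves those entries on this machine
ldd libonnxruntime_providers_cuda.so | grep cublas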
terminate called after throwing an instance of 'Ort::Exception'
ldd libonnxruntime_providers_cuda.so |
Can I also ask when ONNX Runtime will support CUDA 12, or at least support building with CUDA 12? |
I think #14659 needs to be merged into the latest release for the CUDA 12 build fix. |
+1 For the time being, I resolved this as follows:
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora35/x86_64/cuda-fedora35.repo
sudo dnf clean all
If you're using the RPM Fusion repositories for your display drivers:
sudo dnf module disable nvidia-driver
sudo dnf install cuda-11-8
sudo dnf install https://developer.download.nvidia.com/compute/machine-learning/repos/rhel8/x86_64/nvidia-machine-learning-repo-rhel8-1.0.0-1.x86_64.rpm
and, e.g.:
sudo dnf install libcudnn8 libcudnn8-devel libnccl libnccl-devel
You can browse the packages from the rhel8 repo here. After the above I can successfully run inference. I'm using the ort crate for Rust with models I converted (mostly) from PyTorch to ONNX. |
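As a sanity check after the steps above (an added suggestion, not part of the original comment), you can confirm the loader now sees the CUDA 11 cuBLASLt library:
# Lists the libraries in the loader cache; libcublasLt.so.11 should appear after installing cuda-11-8
ldconfig -p | grep libcublasLt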
Right now, each of our packages only works with a specific CUDA minor version. For example, the last one only works with CUDA 11.6 and the next one will only work with CUDA 11.8. At some point it will become CUDA 12 point something. If you have more questions about the project's future plans, you can ask @pranavsharma . |
I also had a similar issue; building onnxruntime from source helped! |
The latest code should work fine on Windows with CUDA 12.2. I am adding a build pipeline for it: #17231 |
I had a similar problem when using CLion to compile the onnxruntime-cpp example in yolov8:
[DCSP_ONNX]:/onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1131 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory
But I can find libcublasLt.so.11 under /usr/local/cuda11.8/lib64. Please help! |
@wongwenxin See https://man7.org/linux/man-pages/man8/ld.so.8.html for how Linux finds dynamic libraries. You might need to run ldconfig to add the directory to the operating system's cache, or set the LD_LIBRARY_PATH environment variable. I will close this issue now because it is working as designed: all our prebuilt packages were built with CUDA 11.x, and they are not compatible with CUDA 12.x. However, you can build ONNX Runtime from source with CUDA 12.x if you need to use that version of CUDA. Feel free to open a new issue if you hit any build error with that. |
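A minimal sketch of the two options mentioned above (not part of the original reply; the CUDA path is an example and may differ on your system):
# Option 1: point the dynamic loader at the CUDA 11.8 libraries for the current shell
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
# Option 2: register the directory system-wide and rebuild the loader cache
echo "/usr/local/cuda-11.8/lib64" | sudo tee /etc/ld.so.conf.d/cuda-11-8.conf
sudo ldconfig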
Has anyone built ONNX Runtime with CUDA 12.x successfully? What would be the best instructions to follow? |
The latest code should work fine with CUDA 12.2. And we have a nightly package for it. https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ort-cuda-12-nightly/PyPI/ort-nightly-gpu |
Thank you for the comment and link @snnn! Much appreciated 🙏 |
I'm having this error (screenshot), thank you so much |
Tried with
working! |
Tried this:
But saw this error: ERROR: Could not find a version that satisfies the requirement ort-nightly-gpu==1.17.0.dev20231205004 (from versions: none) |
look at my message above, try with
|
With this command you installed ort-nightly-gpu==1.15.dev |
I have CUDA 12 and it works 🤔 |
For people who come after me: you can download the whl here. |
this works. thanks |
I followed the solution in Fannovel16/comfyui_controlnet_aux#75 (comment).
It works for me, but with a warning: |
Has anyone got this working with |
I'm facing the same issue too. It seems like it's not fixed in the newer version of the package. |
I see the same error:
More info: cc @pranavsharma, can we open a new issue, or is there a solution to this? |
Did you get the package from https://pkgs.dev.azure.com/onnxruntime/onnxruntime/_packaging/onnxruntime-cuda-12/pypi/simple/ ? |
@snnn thanks for the quick reply! No, it's from: https://pypi.org/project/onnxruntime-gpu/
Also, what is ort-nightly-gpu? @snnn, there is no 1.18.0 version? |
Sorry I gave you the wrong URL. The URL should be
|
What's the difference between the two URLs? Does the new URL have onnxruntime-gpu==1.18.0? Also, what is ort-nightly-gpu? |
Thanks, that helped fix this problem for now. |
https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/onnxruntime-cuda-12 is the place where we host our CUDA 12 python, nuget and maven packages. You can click the "Connect to feed" button to see instructions. https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ort-cuda-12-nightly is similar to above, but it only hosts nightly packages, and currently it doesn't have nightly packages for python. (We are working on it). |
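For illustration only: installing from such a feed with pip looks roughly like the line below, where <FEED_INDEX_URL> stands in for the pip index URL shown under "Connect to feed" (a placeholder, not a real URL):
# Install the CUDA 12 build of onnxruntime-gpu from the feed's pip index (placeholder URL)
pip install onnxruntime-gpu --index-url <FEED_INDEX_URL>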
Describe the issue
It seems ORT has a hard dependency on CUDA SDK 11.x?
[dnn_onnxruntime @ 0x3f7ccc0] SessionOptionsAppendExecutionProvider_CUDA(): /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1069 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory
To reproduce
On Fedora 36, update to the latest CUDA SDK 12.0, then try some examples.
Urgency
No response
Platform
Linux
OS Version
Fedora 36
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.13.1
ONNX Runtime API
C
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.0