Update: the root cause was apparently that quantization with --avx2 was not supported on the target machine. I made it work by explicitly setting the quantization parameters in the ORTConfig.json file.
Describe the issue
I exported a Mistral model to ONNX format using the optimum-cli tool:
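(The original command was not included in the report; a representative invocation would look like the following, where the model ID and output directory are assumptions:)

```shell
# Representative export command; model ID and output directory are illustrative.
optimum-cli export onnx --model mistralai/Mistral-7B-v0.1 mistral_onnx/
```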
I was able to load and run the exported model with onnxruntime. Then I quantized it:
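(The exact quantization command was also omitted; given the --avx2 mentioned in the update above, it was presumably something like this, with illustrative paths:)

```shell
# Presumed quantization step; --avx2 selects the AVX2 preset of
# optimum's ONNX Runtime quantizer. Paths are illustrative.
optimum-cli onnxruntime quantize --avx2 --onnx_model mistral_onnx/ -o mistral_onnx_quantized/
```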
Attempting to load the quantized model failed with the error "Protobuf parsing failed".
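For context, loading through the C++ API looks roughly like the sketch below (the file name is illustrative, not from the report); the "Protobuf parsing failed" exception is thrown by the Ort::Session constructor, which parses the model file:

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "mistral");
  Ort::SessionOptions options;
  // The session constructor parses the ONNX protobuf; this is where
  // "Protobuf parsing failed" is raised for the quantized model.
  // The path below is illustrative.
  Ort::Session session(env, "mistral_onnx_quantized/model_quantized.onnx", options);
  return 0;
}
```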
What have I missed?
P.S. The quantized model opens successfully in Netron.
To reproduce
See the description above.
Urgency
No response
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.17.1
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response