
"Protobuf parsing failed" error when loading a quantized Mistral model #20113

Closed
idruker-cerence opened this issue Mar 27, 2024 · 1 comment

@idruker-cerence

Describe the issue

I exported the Mistral model to ONNX format using the optimum-cli tool:

optimum-cli export onnx --task text-generation-with-past -m "mistralai/Mistral-7B-v0.1" <path/to/output-onnx-model>

I was able to load and run the model using onnxruntime. Then I quantized it:

optimum-cli onnxruntime quantize --avx2 --onnx_model <path/to/output-onnx-model>/model.onnx --output <path/to/output-onnx-quantized-model>

Attempting to load the quantized model fails with the error "Protobuf parsing failed".

What have I missed?

P.S. The quantized model is successfully read by Netron.

To reproduce

See the description
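For reference, the failure shows up at session creation time. Below is a minimal loading sketch, shown with the Python API for brevity (the C++ Ort::Session constructor fails at the same point); the file name model_quantized.onnx is an assumption based on optimum's default output name, so adjust the path as needed:

import onnxruntime as ort

# Creating the session is where "Protobuf parsing failed" (INVALID_PROTOBUF) is raised.
session = ort.InferenceSession(
    "<path/to/output-onnx-quantized-model>/model_quantized.onnx",
    providers=["CPUExecutionProvider"],
)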

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.17.1

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@idruker-cerence
Author

Apparently, the root cause was that quantization with the --avx2 preset was not supported on the target machine. I made it work by explicitly setting the quantization parameters in the ORTConfig.json file.
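For anyone running into the same thing, here is a rough sketch of setting the quantization parameters explicitly through the optimum Python API instead of the --avx2 preset; the values below are illustrative assumptions, not necessarily what ended up in my ORTConfig.json:

from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import QuantizationConfig
from onnxruntime.quantization import QuantFormat, QuantizationMode, QuantType

# Explicit dynamic int8 quantization settings instead of the --avx2 hardware preset.
qconfig = QuantizationConfig(
    is_static=False,
    format=QuantFormat.QOperator,
    mode=QuantizationMode.IntegerOps,
    activations_dtype=QuantType.QUInt8,
    weights_dtype=QuantType.QInt8,
    per_channel=False,
    reduce_range=False,
    operators_to_quantize=["MatMul"],
)

quantizer = ORTQuantizer.from_pretrained("<path/to/output-onnx-model>", file_name="model.onnx")
quantizer.quantize(save_dir="<path/to/output-onnx-quantized-model>", quantization_config=qconfig)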

Closing.
