
Error running quantize_dynamic: Failed to find proper ai.onnx domain #15563

Open
bogedy opened this issue Apr 18, 2023 · 7 comments
Labels
ep:CUDA issues related to the CUDA execution provider model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. quantization issues related to quantization

Comments

@bogedy

bogedy commented Apr 18, 2023

Describe the issue

When running this:

import onnx
from onnxruntime.quantization import quantize_dynamic, QuantType

def quantize_onnx_model(onnx_model_path, quantized_model_path):
    # quantize_dynamic reads the model from the path itself,
    # so this load is not strictly needed.
    onnx_opt_model = onnx.load(onnx_model_path)
    quantize_dynamic(onnx_model_path,
                     quantized_model_path,
                     weight_type=QuantType.QInt8)
    print(f"quantized model saved to: {quantized_model_path}")

input_path = '../models/onnx_quantize_test/model.onnx'
output_path = '../models/onnx_quantize_test/model_quantized.onnx'
quantize_onnx_model(input_path, output_path)

I get ValueError: Failed to find proper ai.onnx domain.

The culprit seems to be this code in onnxruntime/quantization/onnx_quantizer.py:

def check_opset_version(self):
    ai_onnx_domain = [
        opset for opset in self.model.model.opset_import if not opset.domain or opset.domain == "ai.onnx"
    ]
    if len(ai_onnx_domain) != 1:
        raise ValueError("Failed to find proper ai.onnx domain")

When I load my model myself and build the ai_onnx_domain list, I get this:

[version: 12,
 domain: ""
 version: 12]

Am I missing something about quantizing here? Why does len(ai_onnx_domain) need to be 1?
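The failing filter can be reproduced without onnxruntime. Here is a minimal stand-in sketch (the namedtuple standing in for ONNX's OperatorSetIdProto is an assumption for illustration only) showing why two default-domain entries make the filtered list's length 2:

```python
from collections import namedtuple

# Stand-in for onnx's OperatorSetIdProto (illustrative only).
OpsetId = namedtuple("OpsetId", ["domain", "version"])

# Mirrors the opset_import printed above: a duplicate default-domain
# entry (an unset domain and "" are equivalent in protobuf3).
opset_import = [OpsetId("", 12), OpsetId("", 12), OpsetId("ai.onnx.ml", 1)]

# The same filter used by check_opset_version:
ai_onnx_domain = [op for op in opset_import
                  if not op.domain or op.domain == "ai.onnx"]
print(len(ai_onnx_domain))  # 2, so the != 1 check raises ValueError
```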

To reproduce

My model is from the huggingface/setfit library, so it's a PyTorch sentence_transformers embedding model plus a scikit-learn logistic regression head. It was trained as follows:

from setfit import SetFitTrainer
from setfit import SetFitModel

model_id = "intfloat/e5-small"
model = SetFitModel.from_pretrained(model_id, multi_target_strategy="one-vs-rest")

trainer = SetFitTrainer(
    train_dataset=train_ds,
    eval_dataset=test_ds,
    model=model,
    metric=lambda preds, refs: evaluator.compute(predictions=preds, references=refs, average='samples'),
    column_mapping={"text": "text", "labels": "label"},
    num_epochs=1,
)

import torch
torch.backends.cuda.matmul.allow_tf32 = True
trainer.train()

I exported to ONNX using this script here https://github.com/huggingface/setfit/blob/main/src/setfit/exporters/onnx.py, which makes use of the ONNX exporter in torch and skl2onnx.

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.14.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@github-actions github-actions bot added ep:CUDA issues related to the CUDA execution provider model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. quantization issues related to quantization labels Apr 18, 2023
@yufenglee
Member

yufenglee commented Apr 19, 2023

[version: 12,
 domain: ""
 version: 12]

This looks abnormal: the first opset only has a version but no domain. We check the opset version because quantization is only supported from ONNX opset 10 onward.

@bogedy
Author

bogedy commented Apr 19, 2023

Thanks for the response.

I get why it checks the opset, but why does the length of the list need to be exactly 1? And if not having a domain is abnormal, why does the code snippet accept opsets where not opset.domain is true?

My model is supposed to have 3 subcomponents: BERT, mean pooling, and a logistic regression head. The output above was the filtered list [opset for opset in self.model.model.opset_import if not opset.domain or opset.domain == "ai.onnx"] that I recreated from the onnxruntime code snippet. If I load the model and just run model.opset_import I get

[version: 12
, domain: ""
version: 12
, domain: "ai.onnx.ml"
version: 1
]

If these represent the three components that I think they do and the last opset is 1 this is also weird because it should be 12 like the other components... Hmm...

@tianleiwu
Contributor

It is fine that {domain: "", version: 12} and {domain: "ai.onnx.ml", version: 1} both exist, since they are different domains. But if there are {domain: "", version: 12} and {domain: "ai.onnx", version: 1}, then they conflict, since "" and "ai.onnx" are the same domain. The first entry, {version: 12} without a domain, is invalid.
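Given that, one possible workaround (an assumption, not a confirmed fix for the SetFit export) is to collapse opset_import so each domain appears exactly once, treating "" and "ai.onnx" as the same default domain. A sketch of the dedupe logic on plain (domain, version) pairs rather than the protobuf objects:

```python
# Hypothetical cleanup: keep one entry per domain, at the highest version.
# To apply this to a real model, one would rebuild model.opset_import
# from the result using onnx.helper.make_opsetid (not shown here).
def dedupe_opsets(opsets):
    best = {}
    for domain, version in opsets:
        key = "" if domain == "ai.onnx" else domain  # "" and "ai.onnx" are the same
        best[key] = max(best.get(key, 0), version)
    return sorted(best.items())

# The opset_import the reporter saw, written as pairs:
print(dedupe_opsets([("", 12), ("", 12), ("ai.onnx.ml", 1)]))
# [('', 12), ('ai.onnx.ml', 1)] -> one default-domain entry, so the check passes
```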

@yixzhou

yixzhou commented Jun 25, 2024

Hi @bogedy, did you ever find a solution to this issue? I am seeing it as well when running quantization, and I wonder whether you were able to get it to work.

@rawinkler

I have the same issue in the exact same SetFit context, also with a scikit-learn classification head. I would be very interested in a solution, too.

@yugaljain1999

@rawinkler @yixzhou Have you been able to resolve this issue?

@rawinkler

@yugaljain1999 : Unfortunately not, I am sorry.
