
Fix quantization issue with transformers >= 4.36.0 #264

Merged
casper-hansen merged 3 commits into main from fix-issue-transformers-2 on Dec 16, 2023

Conversation

younesbelkada
Collaborator

Fixes #260

For some models, mainly those that use the code-on-the-Hub feature such as the Qwen architecture, some target modules do not properly handle arguments such as past_key_value. I still need to dig into why this happens only on transformers >= 4.36.0, but this seems to work fine as a quick hotfix:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'Qwen/Qwen-7B-Chat'
quant_path = 'qwen-7b-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

# Load model
# NOTE: pass safetensors=True to load safetensors
model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, safetensors=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)

print(f'Model is quantized and saved at "{quant_path}"')
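
One way to implement this kind of hotfix is to filter the calibration kwargs against each target module's forward() signature before the module is called, so that arguments a remote-code module does not declare (such as past_key_value) are dropped. Below is a minimal sketch of that pattern; the helper name and the call site are illustrative, not necessarily the exact code merged in this PR:

import inspect

def sanitize_module_kwargs(module_kwargs, module):
    # Inspect the module's forward() signature and keep only the kwargs
    # it actually declares, so remote-code modules (e.g. Qwen) never
    # receive unexpected arguments such as past_key_value.
    accepted = inspect.signature(module.forward).parameters
    return {k: v for k, v in module_kwargs.items() if k in accepted}

# Illustrative call site inside a calibration loop (hypothetical):
# out = module(hidden_states, **sanitize_module_kwargs(module_kwargs, module))

Note that this also drops extra kwargs when forward() accepts **kwargs; for a quick hotfix that trade-off is acceptable, since the problematic modules are precisely the ones that do not declare the argument.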

cc @casper-hansen

@dongkuang

I now get a new error: Token indices sequence length is longer than the specified maximum sequence length for this model (57053 > 32768). Running this sequence through the model will result in indexing errors

@casper-hansen
Owner

> I now get a new error: Token indices sequence length is longer than the specified maximum sequence length for this model (57053 > 32768). Running this sequence through the model will result in indexing errors

This is a warning, not an error. It is the intended usage of the tokenizer at the moment.

@dongkuang

OK, thank you! It processed successfully.

@casper-hansen
Owner

This is a great fix @younesbelkada and very clean code! LGTM

@casper-hansen casper-hansen merged commit 2350a4d into main Dec 16, 2023
@casper-hansen casper-hansen deleted the fix-issue-transformers-2 branch December 23, 2023 14:04
Development

Successfully merging this pull request may close these issues.

awq int4 error: got an unexpected keyword argument 'past_key_values'