
[BUG] Latest version cannot load Qwen2-VL model config correctly. #33401

fyabc opened this issue Sep 10, 2024 · 15 comments · May be fixed by #33753
Labels
bug · Multimodal · Should Fix · Vision

Comments

@fyabc

fyabc commented Sep 10, 2024

System Info

  • transformers version: 4.45.0.dev0
  • Platform: Linux-5.10.134-16.101.al8.x86_64-x86_64-with-glibc2.35
  • Python version: 3.10.14
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.3
  • Accelerate version: 0.32.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA L20Y

Who can help?

@amyeroberts @qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Download config.json from the Qwen2-VL-7B-Instruct HF main repo to /tmp/Qwen2-VL-7B-Instruct/config.json.
  • The downloaded config file content should be:
{
  "architectures": [
    "Qwen2VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "vision_start_token_id": 151652,
  "vision_end_token_id": 151653,
  "vision_token_id": 151654,
  "image_token_id": 151655,
  "video_token_id": 151656,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2_vl",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.41.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vision_config": {
    "depth": 32,
    "embed_dim": 1280,
    "mlp_ratio": 4,
    "num_heads": 16,
    "in_chans": 3,
    "hidden_size": 3584,
    "patch_size": 14,
    "spatial_merge_size": 2,
    "spatial_patch_size": 14,
    "temporal_patch_size": 2
  },
  "rope_scaling": {
    "type": "mrope",
    "mrope_section": [
      16,
      24,
      24
    ]
  },
  "vocab_size": 152064
}
  2. Install the latest transformers version via pip install git+https://github.com/huggingface/transformers@main
  3. Run the following script:
from transformers import AutoConfig
config = AutoConfig.from_pretrained('/tmp/Qwen2-VL-7B-Instruct/')
print(config)
  4. The result is:
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}

Qwen2VLConfig {
  "_name_or_path": "/tmp/Qwen2-VL-7B-Instruct/",
  "architectures": [
    "Qwen2VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2_vl",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "in_chans": 3,
    "model_type": "qwen2_vl",
    "spatial_patch_size": 14
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

It prints a warning message, and the output rope_scaling.type and rope_scaling.rope_type are set to default, but mrope is expected.
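
The same check in script form, assuming the config was downloaded to the path from step 1:

from transformers import AutoConfig

config = AutoConfig.from_pretrained('/tmp/Qwen2-VL-7B-Instruct/')
print(config.rope_scaling)
# affected versions print {'type': 'default', 'rope_type': 'default', 'mrope_section': [16, 24, 24]}
# while the expected value is {'type': 'mrope', 'mrope_section': [16, 24, 24]}
assert config.rope_scaling.get("type") == "mrope", f"rope type was rewritten: {config.rope_scaling}"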

Expected behavior

This bug seems to have been introduced in a recent version of transformers.
When I switch to an older version via git+https://github.com/huggingface/transformers@21fac7abba2a37fae86106f87fcf9974fd1e3830, the output is correct:

Qwen2VLConfig {
  "_name_or_path": "/tmp/Qwen2-VL-7B-Instruct/",
  "architectures": [
    "Qwen2VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2_vl",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "type": "mrope"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "in_chans": 3,
    "model_type": "qwen2_vl",
    "spatial_patch_size": 14
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}
@wangaocheng

Yes, the same error.

@LysandreJik
Member

cc @zucchini-nlp as well, I believe

@LysandreJik added the Should Fix and bug labels on Sep 10, 2024
@zucchini-nlp
Member

Hey! Yes, the warning is currently misleading: the RoPE implementation was recently standardized, and Qwen2-VL has quite a different rope-scaling dict compared to other models. Generation quality shouldn't be affected by this, though; as of my last interaction with the model, everything was the same as before the standardization.

cc @gante as well; since you're working on uniform RoPE, this might be something we want to fix.

@gante
Member

gante commented Sep 10, 2024

@zucchini-nlp if it is an expected argument, then we shouldn't throw a warning.

Perhaps we could add an extra_ignore_key argument to rope_config_validation, to define additional keys to ignore? I'm expecting this pattern (updating keys but wanting to keep the originals in the config instance for BC) to come up again in the future.
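
Roughly the shape of what I mean, as a standalone sketch (not the real rope_config_validation; the ignore_keys name here is just a placeholder):

# Standalone illustration of the idea; the real validation lives in transformers' rope utils.
EXPECTED_KEYS = {"default": {"type", "rope_type"}}

def validate_rope_scaling(rope_scaling: dict, ignore_keys=frozenset()) -> None:
    rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", "default"))
    unrecognized = set(rope_scaling) - EXPECTED_KEYS.get(rope_type, set()) - set(ignore_keys)
    if unrecognized:
        print(f"Unrecognized keys in `rope_scaling` for 'rope_type'='{rope_type}': {unrecognized}")

# Qwen2-VL could then declare its model-specific key instead of triggering the warning:
validate_rope_scaling(
    {"type": "default", "rope_type": "default", "mrope_section": [16, 24, 24]},
    ignore_keys={"mrope_section"},
)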

@zucchini-nlp
Member

@gante yes, that sounds good. I believe this will be part of your RoPE standardization PR, since it's not very urgent and generation is not broken

@monkeywl2020

monkeywl2020 commented Sep 12, 2024

In the initialization function of class Qwen2VLConfig in src/transformers/models/qwen2_vl/configuration_qwen2_vl.py, I found this code:

if self.rope_scaling is not None and "type" in self.rope_scaling:
    if self.rope_scaling["type"] == "mrope":
        # "mrope" is silently rewritten to "default" here
        self.rope_scaling["type"] = "default"
    # and "rope_type" is then copied from the (possibly rewritten) "type"
    self.rope_scaling["rope_type"] = self.rope_scaling["type"]

This is where the configuration gets modified: both rope_scaling["type"] and rope_scaling["rope_type"] end up set to "default".

@zucchini-nlp
Member

@monkeywl2020 yes, that was a hack to enable uniform RoPE, which currently doesn't accept an mrope type. mrope is the same as the default rope, with the only difference that the position ids have an extra dimension for the height/width/temporal axes.

We'll handle this in a better way soon, so that non-standard rope kwargs are accepted.
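
For intuition, a rough standalone sketch (simplified, not the actual transformers code) of what that extra dimension means: mrope_section = [16, 24, 24] splits the 64 rotary frequency slots of a 128-dim head between the temporal, height, and width position ids.

import torch

head_dim = 128                         # hidden_size 3584 / 28 attention heads
mrope_section = [16, 24, 24]           # sums to head_dim // 2
inv_freq = 1.0 / (1e6 ** (torch.arange(0, head_dim, 2).float() / head_dim))

# mrope position ids carry one row per axis: (temporal, height, width) x seq_len
position_ids = torch.arange(8).repeat(3, 1)           # (3, seq_len)

freqs = position_ids[..., None].float() * inv_freq    # (3, seq_len, 64)
# take the first 16 frequency slots from the temporal row, the next 24 from the
# height row and the last 24 from the width row, then stitch them back together
chunks = torch.split(freqs, mrope_section, dim=-1)
merged = torch.cat([chunks[i][i] for i in range(3)], dim=-1)
print(merged.shape)                                    # torch.Size([8, 64])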

@monkeywl2020


OK

@fyabc
Author

fyabc commented Sep 13, 2024

@zucchini-nlp Hi, can you give an approximate timeline for when this bug will be fixed?

@zucchini-nlp
Member

@gante will you add this to your general RoPE PR, or should we fix it separately?

@exceedzhang

[screenshot] The same error!

@RANYABING

Same error!

Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
Traceback (most recent call last):
......

@IvanZidov

Same here!

@niaoyu

niaoyu commented Sep 25, 2024

Just running

pip install git+https://github.com/huggingface/transformers@21fac7abba2a37fae86106f87fcf9974fd1e3830

works for me.

PR #32617 seems to have broken the logic around the Qwen rope_scaling parameter.
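
If you prefer pinning it in a project instead of installing ad hoc, the equivalent requirements.txt entry should look something like this (standard pip direct-reference syntax, untested here):

transformers @ git+https://github.com/huggingface/transformers@21fac7abba2a37fae86106f87fcf9974fd1e3830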

@xuyue1112

Same problem. If I have already trained with the latest master, do I need to retrain with 21fac7abba2a37fae86106f87fcf9974fd1e3830, or is it enough to use that version only for inference?
