
Commit

Add internvl2 awq models (modelscope#1846)
tastelikefeet committed Aug 29, 2024
1 parent 17a3209 commit a533877
Showing 3 changed files with 90 additions and 0 deletions.
5 changes: 5 additions & 0 deletions docs/source/LLM/支持的模型和数据集.md
@@ -409,6 +409,11 @@
|internvl2-26b|[OpenGVLab/InternVL2-26B](https://modelscope.cn/models/OpenGVLab/InternVL2-26B/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✔|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-26B](https://huggingface.co/OpenGVLab/InternVL2-26B)|
|internvl2-40b|[OpenGVLab/InternVL2-40B](https://modelscope.cn/models/OpenGVLab/InternVL2-40B/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✔|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-40B](https://huggingface.co/OpenGVLab/InternVL2-40B)|
|internvl2-llama3-76b|[OpenGVLab/InternVL2-Llama3-76B](https://modelscope.cn/models/OpenGVLab/InternVL2-Llama3-76B/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✔|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-Llama3-76B](https://huggingface.co/OpenGVLab/InternVL2-Llama3-76B)|
|internvl2-2b-awq|[OpenGVLab/InternVL2-2B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-2B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-2B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-2B-AWQ)|
|internvl2-8b-awq|[OpenGVLab/InternVL2-8B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-8B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-8B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-8B-AWQ)|
|internvl2-26b-awq|[OpenGVLab/InternVL2-26B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-26B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-26B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-26B-AWQ)|
|internvl2-40b-awq|[OpenGVLab/InternVL2-40B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-40B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-40B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-40B-AWQ)|
|internvl2-llama3-76b-awq|[OpenGVLab/InternVL2-Llama3-76B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-Llama3-76B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-Llama3-76B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-Llama3-76B-AWQ)|
|deepseek-vl-1_3b-chat|[deepseek-ai/deepseek-vl-1.3b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-1.3b-chat/summary)|^(language_model\|aligner)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|deepseek-vl|✔|✘|✔|✘||vision|[deepseek-ai/deepseek-vl-1.3b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-1.3b-chat)|
|deepseek-vl-7b-chat|[deepseek-ai/deepseek-vl-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-7b-chat/summary)|^(language_model\|aligner)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|deepseek-vl|✔|✘|✔|✘||vision|[deepseek-ai/deepseek-vl-7b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-7b-chat)|
|paligemma-3b-pt-224|[AI-ModelScope/paligemma-3b-pt-224](https://modelscope.cn/models/AI-ModelScope/paligemma-3b-pt-224/summary)|^(language_model\|multi_modal_projector)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|paligemma|✔|✔|✘|✘|transformers>=4.41|vision|[google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224)|
5 changes: 5 additions & 0 deletions docs/source_en/LLM/Supported-models-datasets.md
@@ -409,6 +409,11 @@ The table below introduces all models supported by SWIFT:
|internvl2-26b|[OpenGVLab/InternVL2-26B](https://modelscope.cn/models/OpenGVLab/InternVL2-26B/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✔|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-26B](https://huggingface.co/OpenGVLab/InternVL2-26B)|
|internvl2-40b|[OpenGVLab/InternVL2-40B](https://modelscope.cn/models/OpenGVLab/InternVL2-40B/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✔|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-40B](https://huggingface.co/OpenGVLab/InternVL2-40B)|
|internvl2-llama3-76b|[OpenGVLab/InternVL2-Llama3-76B](https://modelscope.cn/models/OpenGVLab/InternVL2-Llama3-76B/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✔|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-Llama3-76B](https://huggingface.co/OpenGVLab/InternVL2-Llama3-76B)|
|internvl2-2b-awq|[OpenGVLab/InternVL2-2B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-2B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-2B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-2B-AWQ)|
|internvl2-8b-awq|[OpenGVLab/InternVL2-8B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-8B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-8B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-8B-AWQ)|
|internvl2-26b-awq|[OpenGVLab/InternVL2-26B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-26B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-26B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-26B-AWQ)|
|internvl2-40b-awq|[OpenGVLab/InternVL2-40B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-40B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-40B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-40B-AWQ)|
|internvl2-llama3-76b-awq|[OpenGVLab/InternVL2-Llama3-76B-AWQ](https://modelscope.cn/models/OpenGVLab/InternVL2-Llama3-76B-AWQ/summary)|^(language_model\|mlp1)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|internvl2|✔|✘|✔|✘|transformers>=4.36, timm|vision, video|[OpenGVLab/InternVL2-Llama3-76B-AWQ](https://huggingface.co/OpenGVLab/InternVL2-Llama3-76B-AWQ)|
|deepseek-vl-1_3b-chat|[deepseek-ai/deepseek-vl-1.3b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-1.3b-chat/summary)|^(language_model\|aligner)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|deepseek-vl|✔|✘|✔|✘||vision|[deepseek-ai/deepseek-vl-1.3b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-1.3b-chat)|
|deepseek-vl-7b-chat|[deepseek-ai/deepseek-vl-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-7b-chat/summary)|^(language_model\|aligner)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|deepseek-vl|✔|✘|✔|✘||vision|[deepseek-ai/deepseek-vl-7b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-7b-chat)|
|paligemma-3b-pt-224|[AI-ModelScope/paligemma-3b-pt-224](https://modelscope.cn/models/AI-ModelScope/paligemma-3b-pt-224/summary)|^(language_model\|multi_modal_projector)(?!.\*(lm_head\|output\|emb\|wte\|shared)).\*|paligemma|✔|✔|✘|✘|transformers>=4.41|vision|[google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224)|
80 changes: 80 additions & 0 deletions swift/llm/utils/model.py
@@ -328,6 +328,11 @@ class ModelType:
internvl2_26b = 'internvl2-26b'
internvl2_40b = 'internvl2-40b'
internvl2_llama3_76b = 'internvl2-llama3-76b'
internvl2_2b_awq = 'internvl2-2b-awq'
internvl2_8b_awq = 'internvl2-8b-awq'
internvl2_26b_awq = 'internvl2-26b-awq'
internvl2_40b_awq = 'internvl2-40b-awq'
internvl2_llama3_76b_awq = 'internvl2-llama3-76b-awq'
# deepseek
deepseek_7b = 'deepseek-7b'
deepseek_7b_chat = 'deepseek-7b-chat'
@@ -4166,6 +4171,81 @@ def _new_forward(hidden_states, *, __old_forward) -> Tensor:
placeholder_tokens=['<IMG_CONTEXT>'],
tags=['multi-modal', 'vision', 'video'],
hf_model_id='OpenGVLab/InternVL2-Llama3-76B')
@register_model(
ModelType.internvl2_2b_awq,
'OpenGVLab/InternVL2-2B-AWQ',
LoRATM.internvl,
TemplateType.internvl2,
requires=['transformers>=4.36', 'timm'],
ignore_file_pattern=[r'.+\.zip$'],
support_flash_attn=True,
support_lmdeploy=True,
support_vllm=False,
torch_dtype=torch.float16,
function_kwargs={'is_awq': True},
placeholder_tokens=['<IMG_CONTEXT>'],
tags=['multi-modal', 'vision', 'video'],
hf_model_id='OpenGVLab/InternVL2-2B-AWQ')
@register_model(
ModelType.internvl2_8b_awq,
'OpenGVLab/InternVL2-8B-AWQ',
LoRATM.internvl,
TemplateType.internvl2,
requires=['transformers>=4.36', 'timm'],
ignore_file_pattern=[r'.+\.zip$'],
support_flash_attn=True,
support_lmdeploy=True,
support_vllm=False,
torch_dtype=torch.float16,
function_kwargs={'is_awq': True},
placeholder_tokens=['<IMG_CONTEXT>'],
tags=['multi-modal', 'vision', 'video'],
hf_model_id='OpenGVLab/InternVL2-8B-AWQ')
@register_model(
ModelType.internvl2_26b_awq,
'OpenGVLab/InternVL2-26B-AWQ',
LoRATM.internvl,
TemplateType.internvl2,
requires=['transformers>=4.36', 'timm'],
ignore_file_pattern=[r'.+\.zip$'],
support_flash_attn=True,
support_lmdeploy=True,
support_vllm=False,
torch_dtype=torch.float16,
function_kwargs={'is_awq': True},
placeholder_tokens=['<IMG_CONTEXT>'],
tags=['multi-modal', 'vision', 'video'],
hf_model_id='OpenGVLab/InternVL2-26B-AWQ')
@register_model(
ModelType.internvl2_40b_awq,
'OpenGVLab/InternVL2-40B-AWQ',
LoRATM.internvl,
TemplateType.internvl2,
requires=['transformers>=4.36', 'timm'],
ignore_file_pattern=[r'.+\.zip$'],
support_flash_attn=True,
support_lmdeploy=True,
support_vllm=False,
torch_dtype=torch.float16,
function_kwargs={'is_awq': True},
placeholder_tokens=['<IMG_CONTEXT>'],
tags=['multi-modal', 'vision', 'video'],
hf_model_id='OpenGVLab/InternVL2-40B-AWQ')
@register_model(
ModelType.internvl2_llama3_76b_awq,
'OpenGVLab/InternVL2-Llama3-76B-AWQ',
LoRATM.internvl,
TemplateType.internvl2,
requires=['transformers>=4.36', 'timm'],
ignore_file_pattern=[r'.+\.zip$'],
support_flash_attn=True,
support_lmdeploy=True,
support_vllm=False,
torch_dtype=torch.float16,
function_kwargs={'is_awq': True},
placeholder_tokens=['<IMG_CONTEXT>'],
tags=['multi-modal', 'vision', 'video'],
hf_model_id='OpenGVLab/InternVL2-Llama3-76B-AWQ')
def get_model_tokenizer_internvl(model_dir: str,
torch_dtype: Dtype,
model_kwargs: Dict[str, Any],
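
For reference, a minimal inference sketch using one of the newly registered AWQ entries. This is not part of the commit; it assumes the existing `swift.llm` helpers (`get_model_tokenizer`, `get_default_template_type`, `get_template`, `inference`) handle these model types the same way they handle the non-AWQ InternVL2 entries, and the query and image URL are placeholders.

```python
# Minimal sketch (assumptions noted inline): load one of the newly
# registered InternVL2 AWQ model types and run a single multimodal query.
import torch

from swift.llm import (ModelType, get_default_template_type,
                       get_model_tokenizer, get_template, inference)

model_type = ModelType.internvl2_2b_awq
template_type = get_default_template_type(model_type)

# float16 matches the torch_dtype used in the registrations above;
# device_map='auto' is an assumption for convenience.
model, tokenizer = get_model_tokenizer(
    model_type, torch.float16, model_kwargs={'device_map': 'auto'})
template = get_template(template_type, tokenizer)

# Assumes images are passed via the `images` argument, as in the
# existing InternVL2 examples; the URL is a placeholder.
query = 'Describe this image.'
images = ['https://example.com/sample.jpg']
response, _ = inference(model, template, query, images=images)
print(response)
```

The equivalent CLI path would select the same entry via its model type string (e.g. `internvl2-2b-awq`), since the registrations above only add new `ModelType` constants and `@register_model` metadata rather than new loading logic.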
