Commit 8de6765

hanlv15 committed Jul 17, 2024
2 parents 4635e2a + df7020b
Showing 29 changed files with 564 additions and 298 deletions.
203 changes: 103 additions & 100 deletions README.md

Large diffs are not rendered by default.

162 changes: 82 additions & 80 deletions README_CN.md

Large diffs are not rendered by default.

4 changes: 3 additions & 1 deletion docs/source/GetStarted/界面训练推理.md
@@ -8,7 +8,7 @@ swift web-ui

Starts the interface for training and inference.

The web-ui command takes no arguments; everything is controlled from within the interface. However, a few environment variables are available:
The web-ui behaviour can be controlled through environment variables or through arguments. The environment variables are as follows:

> WEBUI_SHARE=1/0: Default is 0. Controls whether gradio is in share mode.
>
@@ -19,3 +19,5 @@
> WEBUI_PORT: The port number for web-ui.
>
> USE_INFERENCE=1/0: Default is 0. Controls whether the gradio inference page loads the model directly for inference or uses deployment (USE_INFERENCE=0).
To use arguments instead, see [Command-line arguments](../LLM/命令行参数.md#web-ui-参数).
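For example, a minimal launch sketch using the variables above (the port value is illustrative):

```bash
# Publish a gradio share link and pin the web-ui to a fixed port.
WEBUI_SHARE=1 WEBUI_PORT=7860 swift web-ui
```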
11 changes: 10 additions & 1 deletion docs/source/LLM/命令行参数.md
@@ -312,7 +312,7 @@ export parameters inherit from infer parameters, with the following added parameters:
- `--to_peft_format`: Default is `False`. Convert LoRA weights from swift format (`--tuner_backend swift`) to peft format.
- `--merge_lora`: Default is `False`. This argument is already defined in InferArguments and is not newly added. Whether to merge the LoRA weights into the base model and save the full weights. The weights are saved in a directory at the same level as `ckpt_dir`, e.g. the `'/path/to/your/vx-xxx/checkpoint-xxx-merged'` directory.
- `--quant_bits`: Number of bits for quantization. Default is `0`, i.e. no quantization. If you set `--quant_method awq`, you can set this to `4` for 4-bit quantization. If you set `--quant_method gptq`, you can set this to `2`, `3`, `4` or `8` for the corresponding bit width. When quantizing the original model, the weights are saved in the `f'{args.model_type}-{args.quant_method}-int{args.quant_bits}'` directory. When quantizing a fine-tuned model, the weights are saved in a directory at the same level as `ckpt_dir`, e.g. the `f'/path/to/your/vx-xxx/checkpoint-xxx-{args.quant_method}-int{args.quant_bits}'` directory.
- `--quant_method`: Quantization method, default is `'awq'`. Options: 'awq', 'gptq'.
- `--quant_method`: Quantization method, default is `'awq'`. Options: 'awq', 'gptq', 'bnb'.
- `--dataset`: This argument is already defined in InferArguments; for export it means the quantization dataset. Default is `[]`. For more details, including how to customize the quantization dataset, see the [LLM quantization documentation](LLM量化文档.md).
- `--quant_n_samples`: Quantization argument, default is `256`. With `--quant_method awq`, if OOM occurs during quantization, you can moderately reduce `--quant_n_samples` and `--quant_seqlen`. `--quant_method gptq` usually does not run into quantization OOM.
- `--quant_seqlen`: Quantization argument, default is `2048`.
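As an illustration, a sketch of a 4-bit AWQ export of an original model; the model type comes from the supported-models list, and the dataset name is an assumption, not a prescription:

```bash
# Quantize glm4-9b-chat to 4-bit AWQ; output goes to
# glm4-9b-chat-awq-int4. Lower --quant_n_samples if quantization OOMs.
swift export \
    --model_type glm4-9b-chat \
    --quant_method awq \
    --quant_bits 4 \
    --quant_n_samples 128 \
    --quant_seqlen 2048 \
    --dataset alpaca-zh
```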
@@ -323,6 +323,8 @@
- `--hub_token`: Default is `None`. See `sft.sh command-line arguments` for details.
- `--hub_private_repo`: Default is `False`. See `sft.sh command-line arguments` for details.
- `--commit_message`: Default is `'update files'`.
- `--to_ollama`: Export the model in ollama format.
- `--ollama_output_dir`: Output directory for the ollama export. Defaults to the `<model_type>-ollama` folder in the current directory.
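A sketch of the ollama export, assuming a fine-tuned checkpoint path (the path is a placeholder):

```bash
# Convert a fine-tuned checkpoint into ollama format; without
# --ollama_output_dir the result lands in <model_type>-ollama.
swift export \
    --ckpt_dir /path/to/your/vx-xxx/checkpoint-xxx \
    --to_ollama true \
    --ollama_output_dir ./my-model-ollama
```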

## eval parameters

@@ -364,3 +366,10 @@ deploy parameters inherit from infer parameters, with the following added parameters:
- `--port`: Default is `8000`.
- `--ssl_keyfile`: Default is `None`.
- `--ssl_certfile`: Default is `None`.
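For instance, a sketch of serving over TLS; the key and certificate paths are placeholders:

```bash
# Serve on the default port 8000 with TLS enabled;
# both files must already exist.
swift deploy \
    --model_type glm4-9b-chat \
    --ssl_keyfile ./server.key \
    --ssl_certfile ./server.crt
```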
## web-ui parameters
- `--host`: Default is `'127.0.0.1'`.
- `--port`: Default is `None`.
- `--lang`: Default is `'zh'`.
- `--share`: Default is `False`.
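For example, a sketch of pinning the port and publishing a share link via arguments rather than environment variables:

```bash
# Equivalent in spirit to WEBUI_SHARE=1 WEBUI_PORT=7860 swift web-ui.
swift web-ui --port 7860 --share true
```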
11 changes: 6 additions & 5 deletions docs/source/LLM/支持的模型和数据集.md
@@ -110,9 +110,9 @@
|chatglm3-6b-32k|[ZhipuAI/chatglm3-6b-32k](https://modelscope.cn/models/ZhipuAI/chatglm3-6b-32k/summary)|query_key_value|chatglm3|&#x2718;|&#x2714;|transformers<4.42|-|[THUDM/chatglm3-6b-32k](https://huggingface.co/THUDM/chatglm3-6b-32k)|
|chatglm3-6b-128k|[ZhipuAI/chatglm3-6b-128k](https://modelscope.cn/models/ZhipuAI/chatglm3-6b-128k/summary)|query_key_value|chatglm3|&#x2718;|&#x2714;|transformers<4.42|-|[THUDM/chatglm3-6b-128k](https://huggingface.co/THUDM/chatglm3-6b-128k)|
|codegeex2-6b|[ZhipuAI/codegeex2-6b](https://modelscope.cn/models/ZhipuAI/codegeex2-6b/summary)|query_key_value|chatglm-generation|&#x2718;|&#x2714;|transformers<4.34|coding|[THUDM/codegeex2-6b](https://huggingface.co/THUDM/codegeex2-6b)|
|glm4-9b|[ZhipuAI/glm-4-9b](https://modelscope.cn/models/ZhipuAI/glm-4-9b/summary)|query_key_value|chatglm-generation|&#x2714;|&#x2714;|transformers<4.42|-|[THUDM/glm-4-9b](https://huggingface.co/THUDM/glm-4-9b)|
|glm4-9b-chat|[ZhipuAI/glm-4-9b-chat](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat/summary)|query_key_value|chatglm3|&#x2714;|&#x2714;|transformers<4.42|-|[THUDM/glm-4-9b-chat](https://huggingface.co/THUDM/glm-4-9b-chat)|
|glm4-9b-chat-1m|[ZhipuAI/glm-4-9b-chat-1m](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-1m/summary)|query_key_value|chatglm3|&#x2714;|&#x2714;|transformers<4.42|-|[THUDM/glm-4-9b-chat-1m](https://huggingface.co/THUDM/glm-4-9b-chat-1m)|
|glm4-9b|[ZhipuAI/glm-4-9b](https://modelscope.cn/models/ZhipuAI/glm-4-9b/summary)|query_key_value|chatglm-generation|&#x2714;|&#x2714;||-|[THUDM/glm-4-9b](https://huggingface.co/THUDM/glm-4-9b)|
|glm4-9b-chat|[ZhipuAI/glm-4-9b-chat](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat/summary)|query_key_value|chatglm3|&#x2714;|&#x2714;||-|[THUDM/glm-4-9b-chat](https://huggingface.co/THUDM/glm-4-9b-chat)|
|glm4-9b-chat-1m|[ZhipuAI/glm-4-9b-chat-1m](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-1m/summary)|query_key_value|chatglm3|&#x2714;|&#x2714;||-|[THUDM/glm-4-9b-chat-1m](https://huggingface.co/THUDM/glm-4-9b-chat-1m)|
|codegeex4-9b-chat|[ZhipuAI/codegeex4-all-9b](https://modelscope.cn/models/ZhipuAI/codegeex4-all-9b/summary)|query_key_value|codegeex4|&#x2714;|&#x2714;|transformers<4.42|coding|[THUDM/codegeex4-all-9b](https://huggingface.co/THUDM/codegeex4-all-9b)|
|llama2-7b|[modelscope/Llama-2-7b-ms](https://modelscope.cn/models/modelscope/Llama-2-7b-ms/summary)|q_proj, k_proj, v_proj|default-generation|&#x2714;|&#x2714;||-|[meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)|
|llama2-7b-chat|[modelscope/Llama-2-7b-chat-ms](https://modelscope.cn/models/modelscope/Llama-2-7b-chat-ms/summary)|q_proj, k_proj, v_proj|llama|&#x2714;|&#x2714;||-|[meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)|
@@ -213,6 +213,7 @@
|deepseek-math-7b|[deepseek-ai/deepseek-math-7b-base](https://modelscope.cn/models/deepseek-ai/deepseek-math-7b-base/summary)|q_proj, k_proj, v_proj|default-generation|&#x2714;|&#x2714;||math|[deepseek-ai/deepseek-math-7b-base](https://huggingface.co/deepseek-ai/deepseek-math-7b-base)|
|deepseek-math-7b-instruct|[deepseek-ai/deepseek-math-7b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-math-7b-instruct/summary)|q_proj, k_proj, v_proj|deepseek|&#x2714;|&#x2714;||math|[deepseek-ai/deepseek-math-7b-instruct](https://huggingface.co/deepseek-ai/deepseek-math-7b-instruct)|
|deepseek-math-7b-chat|[deepseek-ai/deepseek-math-7b-rl](https://modelscope.cn/models/deepseek-ai/deepseek-math-7b-rl/summary)|q_proj, k_proj, v_proj|deepseek|&#x2714;|&#x2714;||math|[deepseek-ai/deepseek-math-7b-rl](https://huggingface.co/deepseek-ai/deepseek-math-7b-rl)|
|numina-math-7b|[AI-ModelScope/NuminaMath-7B-TIR](https://modelscope.cn/models/AI-ModelScope/NuminaMath-7B-TIR/summary)|q_proj, k_proj, v_proj|numina-math|&#x2714;|&#x2714;||math|[AI-MO/NuminaMath-7B-TIR](https://huggingface.co/AI-MO/NuminaMath-7B-TIR)|
|deepseek-v2-chat|[deepseek-ai/DeepSeek-V2-Chat](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Chat/summary)|q_a_proj, q_b_proj, kv_a_proj_with_mqa, kv_b_proj, o_proj|deepseek2|&#x2714;|&#x2714;|transformers>=4.39.3|-|[deepseek-ai/DeepSeek-V2-Chat](https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat)|
|deepseek-v2-lite|[deepseek-ai/DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite/summary)|q_a_proj, q_b_proj, kv_a_proj_with_mqa, kv_b_proj, o_proj|default-generation|&#x2714;|&#x2714;|transformers>=4.39.3|-|[deepseek-ai/DeepSeek-V2-Lite](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite)|
|deepseek-v2-lite-chat|[deepseek-ai/DeepSeek-V2-Lite-Chat](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite-Chat/summary)|q_a_proj, q_b_proj, kv_a_proj_with_mqa, kv_b_proj, o_proj|deepseek2|&#x2714;|&#x2714;|transformers>=4.39.3|-|[deepseek-ai/DeepSeek-V2-Lite-Chat](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite-Chat)|
@@ -330,7 +331,7 @@
|qwen-vl-chat-int4|[qwen/Qwen-VL-Chat-Int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)|c_attn|qwen|&#x2714;|&#x2718;|auto_gptq>=0.5|vision|[Qwen/Qwen-VL-Chat-Int4](https://huggingface.co/Qwen/Qwen-VL-Chat-Int4)|
|qwen-audio|[qwen/Qwen-Audio](https://modelscope.cn/models/qwen/Qwen-Audio/summary)|c_attn|qwen-audio-generation|&#x2714;|&#x2718;||audio|[Qwen/Qwen-Audio](https://huggingface.co/Qwen/Qwen-Audio)|
|qwen-audio-chat|[qwen/Qwen-Audio-Chat](https://modelscope.cn/models/qwen/Qwen-Audio-Chat/summary)|c_attn|qwen-audio|&#x2714;|&#x2718;||audio|[Qwen/Qwen-Audio-Chat](https://huggingface.co/Qwen/Qwen-Audio-Chat)|
|glm4v-9b-chat|[ZhipuAI/glm-4v-9b](https://modelscope.cn/models/ZhipuAI/glm-4v-9b/summary)|self_attention.query_key_value|glm4v|&#x2718;|&#x2718;|transformers<4.42|vision|[THUDM/glm-4v-9b](https://huggingface.co/THUDM/glm-4v-9b)|
|glm4v-9b-chat|[ZhipuAI/glm-4v-9b](https://modelscope.cn/models/ZhipuAI/glm-4v-9b/summary)|self_attention.query_key_value|glm4v|&#x2718;|&#x2718;||vision|[THUDM/glm-4v-9b](https://huggingface.co/THUDM/glm-4v-9b)|
|llava1_5-7b-instruct|[swift/llava-1.5-7b-hf](https://modelscope.cn/models/swift/llava-1.5-7b-hf/summary)|q_proj, k_proj, v_proj|llava1_5|&#x2714;|&#x2714;|transformers>=4.36|vision|[llava-hf/llava-1.5-7b-hf](https://huggingface.co/llava-hf/llava-1.5-7b-hf)|
|llava1_5-13b-instruct|[swift/llava-1.5-13b-hf](https://modelscope.cn/models/swift/llava-1.5-13b-hf/summary)|q_proj, k_proj, v_proj|llava1_5|&#x2714;|&#x2714;|transformers>=4.36|vision|[llava-hf/llava-1.5-13b-hf](https://huggingface.co/llava-hf/llava-1.5-13b-hf)|
|llava1_6-mistral-7b-instruct|[swift/llava-v1.6-mistral-7b-hf](https://modelscope.cn/models/swift/llava-v1.6-mistral-7b-hf/summary)|q_proj, k_proj, v_proj|llava-mistral|&#x2714;|&#x2714;|transformers>=4.39|vision|[llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf)|
@@ -354,7 +355,7 @@
|mini-internvl-chat-2b-v1_5|[OpenGVLab/Mini-InternVL-Chat-2B-V1-5](https://modelscope.cn/models/OpenGVLab/Mini-InternVL-Chat-2B-V1-5/summary)|wqkv|internvl|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/Mini-InternVL-Chat-2B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-2B-V1-5)|
|mini-internvl-chat-4b-v1_5|[OpenGVLab/Mini-InternVL-Chat-4B-V1-5](https://modelscope.cn/models/OpenGVLab/Mini-InternVL-Chat-4B-V1-5/summary)|qkv_proj|internvl-phi3|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/Mini-InternVL-Chat-4B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-4B-V1-5)|
|internvl2-2b|[OpenGVLab/InternVL2-2B](https://modelscope.cn/models/OpenGVLab/InternVL2-2B/summary)|wqkv|internvl2|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL2-2B](https://huggingface.co/OpenGVLab/InternVL2-2B)|
|internvl2-4b|[OpenGVLab/InternVL2-4B](https://modelscope.cn/models/OpenGVLab/InternVL2-4B/summary)|wqkv|internvl2|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL2-4B](https://huggingface.co/OpenGVLab/InternVL2-4B)|
|internvl2-4b|[OpenGVLab/InternVL2-4B](https://modelscope.cn/models/OpenGVLab/InternVL2-4B/summary)|qkv_proj|internvl2-phi3|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL2-4B](https://huggingface.co/OpenGVLab/InternVL2-4B)|
|internvl2-8b|[OpenGVLab/InternVL2-8B](https://modelscope.cn/models/OpenGVLab/InternVL2-8B/summary)|wqkv|internvl2|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL2-8B](https://huggingface.co/OpenGVLab/InternVL2-8B)|
|internvl2-26b|[OpenGVLab/InternVL2-26B](https://modelscope.cn/models/OpenGVLab/InternVL2-26B/summary)|wqkv|internvl2|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL2-26B](https://huggingface.co/OpenGVLab/InternVL2-26B)|
|internvl2-40b|[OpenGVLab/InternVL2-40B](https://modelscope.cn/models/OpenGVLab/InternVL2-40B/summary)|q_proj, k_proj, v_proj|internvl2|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL2-40B](https://huggingface.co/OpenGVLab/InternVL2-40B)|
4 changes: 3 additions & 1 deletion docs/source_en/GetStarted/Web-ui.md
@@ -8,7 +8,7 @@ swift web-ui

This command starts the interface for training and inference.

The web-ui command doesn't accept parameters; all controllable parts are handled within the interface. However, there are a few environment variables that can be used:
The web-ui command supports two ways of controlling UI behaviour. The first is through environment variables:

> WEBUI_SHARE=1/0: Default is 0. Controls whether gradio is in share mode.
>
@@ -19,3 +19,5 @@
> WEBUI_PORT: The port number for web-ui.
>
> USE_INFERENCE=1/0: Default is 0. Controls whether the gradio inference page directly loads the model for inference or deployment (USE_INFERENCE=0).
The second is through command-line [parameters](../LLM/Command-line-parameters.md#web-ui-parameters).
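For example, a minimal sketch mixing both mechanisms (the flag value is illustrative):

```bash
# Load the model directly on the inference page and publish a share link.
USE_INFERENCE=1 swift web-ui --share true
```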
11 changes: 10 additions & 1 deletion docs/source_en/LLM/Command-line-parameters.md
@@ -312,7 +312,7 @@ export parameters inherit from infer parameters, with the following added parameters:
- `--to_peft_format`: Default is `False`. Convert the swift format of LoRA (`--tuner_backend swift`) to peft format.
- `--merge_lora`: Default is `False`. This parameter is already defined in InferArguments, not a new parameter. Whether to merge lora weights into base model and save full weights. Weights will be saved in the same level directory as `ckpt_dir`, e.g. `'/path/to/your/vx-xxx/checkpoint-xxx-merged'` directory.
- `--quant_bits`: Number of bits for quantization. Default is `0`, i.e. no quantization. If you set `--quant_method awq`, you can set this to `4` for 4bits quantization. If you set `--quant_method gptq`, you can set this to `2`,`3`,`4`,`8` for corresponding bits quantization. If quantizing original model, weights will be saved in `f'{args.model_type}-{args.quant_method}-int{args.quant_bits}'` directory. If quantizing fine-tuned model, weights will be saved in the same level directory as `ckpt_dir`, e.g. `f'/path/to/your/vx-xxx/checkpoint-xxx-{args.quant_method}-int{args.quant_bits}'` directory.
- `--quant_method`: Quantization method, default is `'awq'`. Options are 'awq', 'gptq'.
- `--quant_method`: Quantization method, default is `'awq'`. Options are 'awq', 'gptq', 'bnb'.
- `--dataset`: This parameter is already defined in InferArguments, for export it means quantization dataset. Default is `[]`. More details: including how to customize quantization dataset, can be found in [LLM Quantization Documentation](LLM-quantization.md).
- `--quant_n_samples`: Quantization parameter, default is `256`. When set to `--quant_method awq`, if OOM occurs during quantization, you can moderately reduce `--quant_n_samples` and `--quant_seqlen`. `--quant_method gptq` generally does not encounter quantization OOM.
- `--quant_seqlen`: Quantization parameter, default is `2048`.
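As a sketch, an 8-bit GPTQ export of a fine-tuned checkpoint; the checkpoint path and dataset name are assumptions:

```bash
# GPTQ-quantize a fine-tuned model; output is written next to ckpt_dir,
# e.g. /path/to/your/vx-xxx/checkpoint-xxx-gptq-int8.
swift export \
    --ckpt_dir /path/to/your/vx-xxx/checkpoint-xxx \
    --quant_method gptq \
    --quant_bits 8 \
    --dataset alpaca-en
```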
@@ -323,6 +323,8 @@
- `--hub_token`: Default is `None`. See `sft.sh command line arguments` for parameter details.
- `--hub_private_repo`: Default is `False`. See `sft.sh command line arguments` for parameter details.
- `--commit_message`: Default is `'update files'`.
- `--to_ollama`: Export the model as an ollama Modelfile.
- `--ollama_output_dir`: Output directory for the ollama export. Defaults to `<model_type>-ollama` in the current directory.
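An end-to-end sketch: export, then register the result with a local ollama instance. The model name and the Modelfile location inside the output directory are assumptions:

```bash
# 1. Export in ollama format; the output directory defaults to
#    llama2-7b-chat-ollama.
swift export --model_type llama2-7b-chat --to_ollama true

# 2. Register the exported Modelfile with ollama (assumed filename).
ollama create my-llama2 -f llama2-7b-chat-ollama/Modelfile
```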

## eval parameters

@@ -363,3 +365,10 @@ deploy parameters inherit from infer parameters, with the following added parameters:
- `--port`: Default is `8000`.
- `--ssl_keyfile`: Default is `None`.
- `--ssl_certfile`: Default is `None`.
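A minimal sketch of moving the service to a non-default port (the model type is illustrative):

```bash
# Clients then reach the service at http://<host>:8001.
swift deploy --model_type llama2-7b-chat --port 8001
```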
## web-ui Parameters
- `--host`: Default is `'127.0.0.1'`.
- `--port`: Default is `None`.
- `--lang`: Default is `'zh'`.
- `--share`: Default is `False`.
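For example, a sketch exposing the English UI to other machines on the local network:

```bash
# 0.0.0.0 makes the UI reachable beyond localhost.
swift web-ui --host 0.0.0.0 --lang en
```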
