
Ernie3.0 python deploy update dev #2145

Merged · 13 commits · May 15, 2022
Conversation

yeliang2258 (Contributor):

PR types: Others

PR changes: Others

Description:

  1. Update the Python deployment doc
  2. Update the Python benchmark script

@ZeyuChen ZeyuChen requested review from ZeyuChen and LiuChiachi and removed request for ZeyuChen May 15, 2022 02:34
@ZeyuChen ZeyuChen self-assigned this May 15, 2022
@ZeyuChen ZeyuChen self-requested a review May 15, 2022 02:34
For CPU deployment, install the required dependencies with the following command
```
pip install -r requirements_cpu.txt
```
### GPU
### 1.2 GPU
Before deploying on GPU, first make sure the machine has CUDA >= 11.2 and cuDNN >= 8.2 installed, then install the required dependencies with the following command
Member:

For the best inference performance and stability on GPU, first make sure the machine has the relevant NVIDIA drivers and base software correctly installed, with CUDA >= 11.2 and cuDNN >= 8.2, then install the required dependencies with the following command

Member:

On GPU hardware with Compute Capability greater than 7.0, such as the T4, FP16 or INT8 quantized inference acceleration additionally requires installing TensorRT and Paddle Inference. For Compute Capability and precision support, see: the GPU compute capability and supported-precision table


Adjust the wording here, and write FP16 and INT8 in full caps, matching NVIDIA's tables. The accurate term is CUDA Compute Capability.

To deploy with half precision (FP16) or quantization (INT8), make sure the GPU's CUDA Compute Capability is greater than 7.0; typical devices include the V100, T4, A10, A100, and GTX 20- and 30-series cards. TensorRT and Paddle Inference also need to be installed.
For more on CUDA Compute Capability and precision support, see the NVIDIA documentation: GPU hardware and supported-precision table

Contributor Author:

Done
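The capability requirement discussed above can be expressed as a small check (an illustrative sketch; the function name is hypothetical, and in practice the capability string would come from a tool such as nvidia-smi):

```python
def supports_fp16_int8(compute_capability: str) -> bool:
    """Return True if a GPU's CUDA Compute Capability meets the FP16/INT8
    requirement described above (7.0 and up; V100 is 7.0, T4 is 7.5)."""
    major, minor = (int(part) for part in compute_capability.split("."))
    return (major, minor) >= (7, 0)

print(supports_fp16_int8("7.5"))  # T4
print(supports_fp16_int8("6.1"))  # e.g. GTX 1080, too old for this path
```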



### GPU
### 2.3 GPU inference example
On GPU, deploy with the following command
Member:

指令 is mostly translated as "instruction", which in computing leans toward instruction sets.
These shell invocations are conventionally rendered in Chinese as 命令, corresponding to "command line".

Member:

I suggest using the Chinese translation 命令 ("command").

Contributor Author:

Done

model_zoo/ernie-3.0/deploy/python/README.md
@@ -87,18 +90,19 @@ entity: 华夏 label: LOC pos: [14, 15]
If FP16 acceleration is needed, enable the use_fp16 switch; the specific commands are
```
# Step 1: enable the set_dynamic_shape switch to auto-configure dynamic shapes
python infer_gpu.py --task_name token_cls --model_path ./ner_model/infer --use_fp16 --set_dynamic_shape
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --use_fp16 --set_dynamic_shape
```

Member:
After this command runs, does it produce anything the user needs to notice and then pass in at step 2?

Contributor Author:
Done

If int8 quantized acceleration is needed, the trained FP32 model must first be quantized with the quantization script, and the quantized model is then deployed. For model quantization, see: [模型量化脚本使用说明](./../../README.md). The deployment command for the quantized model is
If int8 quantized acceleration is needed, the trained FP32 model must first be quantized with the quantization script, and the quantized model is then deployed. For model quantization, see: [模型量化脚本使用说明](./../../README.md#模型压缩). The deployment command for the quantized model is
Member:

Change int8 to INT8 throughout (in the README text portions).
The command-line portions don't need to change.

Member:

Change 指令 to 命令 throughout.

Contributor Author:

Done

```
input data: 未来自动驾驶真的会让酒驾和疲劳驾驶成历史吗?
seq cls result:
label: 6 confidence: 4.563379287719727
```

Member:
The script should also include the mapping from label id to its plain-text name; ideally the output would print the actual plain-text class alongside the label, which makes for a better experience.

Contributor Author:
Done

```
-----------------------------
input data: 黄磊接受华少快问快答,不光智商逆天,情商也不逊黄渤
seq cls result:
label: 2 confidence: 5.694031238555908
```

Member:
Also output the corresponding plain-text label.

Contributor Author:
Done
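The plain-text mapping requested above could look like the following sketch (a hypothetical helper; only the two label ids visible in this thread's outputs are mapped, 6 -> news_car and 2 -> news_entertainment):

```python
# Illustrative id-to-name table; the real TNEWS label set is larger.
LABEL_NAMES = {2: "news_entertainment", 6: "news_car"}

def format_seq_cls_result(label_id: int, confidence: float) -> str:
    # Fall back to the raw id when the name is unknown.
    name = LABEL_NAMES.get(label_id, str(label_id))
    return f"label: {name} confidence: {confidence}"

print(format_seq_cls_result(6, 4.563379287719727))
```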

@@ -59,7 +59,7 @@ def parse_args():
help="The directory or name of model.")
parser.add_argument(
"--model_path",
default='tnews_quant_models/mse4/int8',
default='paddle_model/model',
Member:

I suggest this argument not have a default value; this directory is something the user should set explicitly and deliberately.
It should also be required, with an error raised if it isn't set.

Contributor Author:

Done
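The change requested above can be sketched with argparse's required flag (a minimal example, not the repository's full parser):

```python
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # No default value: argparse exits with an error if --model_path is missing.
    parser.add_argument(
        "--model_path",
        type=str,
        required=True,
        help="The directory or prefix of the inference model.")
    return parser.parse_args(argv)

args = parse_args(["--model_path", "msra_ner_pruned_infer_model/float32"])
print(args.model_path)
```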

model_zoo/ernie-3.0/infer.py
ZeyuChen (Member):

Overall changes:
int8 -> INT8
指令 -> 命令
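A mechanical pass for the rename above could look like this (an illustrative sketch; as noted, occurrences inside command lines are not supposed to change, and a plain substitution would still need manual review for those):

```python
def normalize_terms(text: str) -> str:
    # Simple global substitution; occurrences inside command lines
    # (e.g. model paths like ./msra_ner_quant_infer_model/int8) are
    # caught too and must be reverted by hand, per the review note.
    return text.replace("int8", "INT8").replace("指令", "命令")

print(normalize_terms("如果需要进行int8量化加速,部署指令为"))
```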

@yeliang2258 yeliang2258 requested a review from ZeyuChen May 15, 2022 04:29
```
python infer_gpu.py --task_name token_cls --model_path ./ner_model/infer
python infer_cpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --enable_quantize
```

Member:
cpu?

Member:
Here quantize uses "enable"? Elsewhere it's "use", right?

Member:
Is this difference intentional?

Contributor Author:
Done

```
# Step 1: enable the set_dynamic_shape switch to auto-configure dynamic shapes; this generates a shape_info.txt file in the current directory
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --use_fp16 --set_dynamic_shape
# Step 2: read the shape_info.txt generated in step 1 and run inference
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --use_fp16
```
Member:

If the shape_info.txt file is a dependency, I suggest calling it out explicitly in the second command, so users are more aware of it while running.

Contributor Author:

Done

```
# Step 1: enable the set_dynamic_shape switch to auto-configure dynamic shapes; this generates a shape_info.txt file in the current directory
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_quant_infer_model/int8 --set_dynamic_shape
# Step 2: read the shape_info.txt generated in step 1 and run inference
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_quant_infer_model/int8
```
Member:

Call it out explicitly: --xxxx shape_info.txt

Command lines are not better the shorter they are; required arguments, and anything users particularly need to notice, should be spelled out in the documentation.
If brevity were the goal, every argument could be defaulted, but that is not the best experience.

Contributor Author:

Done
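One way to make the shape_info.txt dependency from the two-step commands above explicit is to fail fast when the file is missing (a hypothetical sketch, not the repository's code):

```python
import os

def require_shape_file(path="shape_info.txt"):
    # Produced by the --set_dynamic_shape run; needed by the inference run.
    # Fail fast with an actionable message instead of a cryptic runtime error.
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"{path} not found; run infer_gpu.py with --set_dynamic_shape "
            "first to generate it.")
    return path
```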

```
input data: 未来自动驾驶真的会让酒驾和疲劳驾驶成历史吗?
seq cls result:
label: news_car confidence: 4.515341281890869
```

Member:
Was this confidence not passed through softmax? Why is the confidence a value greater than 1?

Member:
Could you compare against how the original model handles this?

Contributor Author:
Done
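The normalization requested above is an ordinary softmax over the classifier logits (a plain-Python sketch; the repository's actual post-processing may differ, e.g. it may use numpy):

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability, then normalize
    # so the scores sum to 1 and each lies in (0, 1].
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.5, 5.694031238555908, 1.1]   # raw scores, can exceed 1
probs = softmax(logits)
confidence = max(probs)                  # a proper probability now
```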

```
-----------------------------
input data: 黄磊接受华少快问快答,不光智商逆天,情商也不逊黄渤
seq cls result:
label: news_entertainment confidence: 5.694031238555908
```

Member:
confidence > 1

Member:
Normalize the confidence.

Contributor Author:
Done

@yeliang2258 yeliang2258 requested a review from ZeyuChen May 15, 2022 05:39
@ZeyuChen ZeyuChen merged commit d58a221 into PaddlePaddle:develop May 15, 2022