Ernie3.0 python deploy update dev #2145
Conversation
…deploy_update_dev
For CPU deployment, install the required dependencies with the following command:
```
pip install -r requirements_cpu.txt
```
### GPU端
### 1.2 GPU端
Before deploying on GPU, make sure CUDA >= 11.2 and cuDNN >= 8.2 are installed on the machine, then install the required dependencies with the following command:
For the best inference performance and stability on GPU, first make sure the NVIDIA driver and base software stack are correctly installed, with CUDA >= 11.2 and cuDNN >= 8.2, then install the required dependencies with the following command:
On GPU hardware with Compute Capability greater than 7.0, such as T4, FP16 or Int8 quantized inference acceleration additionally requires installing TensorRT and Paddle Inference. For Compute Capability and supported precisions, see: GPU compute capability and supported precision table.
Adjust the wording here. Write FP16 and INT8 in all caps, matching NVIDIA's tables. The accurate term is CUDA Compute Capability.
To deploy with half precision (FP16) or quantization (INT8), make sure the GPU's CUDA Compute Capability is greater than 7.0; typical devices include V100, T4, A10, A100, and GTX 20- and 30-series cards. TensorRT and Paddle Inference must also be installed.
For more on CUDA Compute Capability and supported precisions, see the NVIDIA documentation: GPU hardware and supported precision table.
Done
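To make the Compute Capability requirement above concrete, here is a minimal Python sketch. The helper name and device selection are illustrative; the capability numbers come from NVIDIA's public specifications.

```python
# Illustrative (not exhaustive) CUDA Compute Capability values for
# the devices named above, from NVIDIA's public specifications.
COMPUTE_CAPABILITY = {
    "V100": 7.0,
    "T4": 7.5,
    "A10": 8.6,
    "A100": 8.0,
}

def supports_fast_fp16_int8(device: str) -> bool:
    """FP16/INT8 TensorRT acceleration needs Compute Capability >= 7.0;
    unknown devices conservatively return False."""
    return COMPUTE_CAPABILITY.get(device, 0.0) >= 7.0
```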
### GPU端
### 2.3 GPU端 inference example
On GPU, deploy with the following command:
指令 usually translates to "instruction", which in computing connotes instruction sets.
These shell invocations are conventionally called 命令 in Chinese, corresponding to "command line".
Suggest using 命令 as the Chinese translation.
Done
@@ -87,18 +90,19 @@ entity: 华夏 label: LOC pos: [14, 15]
To accelerate with FP16, enable the use_fp16 switch; the specific command is:
```
# Step 1: enable the set_dynamic_shape switch to configure dynamic shapes automatically
python infer_gpu.py --task_name token_cls --model_path ./ner_model/infer --use_fp16 --set_dynamic_shape
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --use_fp16 --set_dynamic_shape
After this command finishes, does it produce anything the user needs to be aware of and pass into step 2?
Done
```
If you need int8 quantized acceleration, first quantize the trained FP32 model with the quantization script, then deploy the quantized model; for model quantization, see: [model quantization script usage](./../../README.md). The deployment command for the quantized model is:
If you need int8 quantized acceleration, first quantize the trained FP32 model with the quantization script, then deploy the quantized model; for model quantization, see: [model quantization script usage](./../../README.md#模型压缩). The deployment command for the quantized model is:
Change int8 to INT8 globally (in the README text).
The command-line parts do not need changing.
Change 指令 to 命令 globally.
Done
```
input data: 未来自动驾驶真的会让酒驾和疲劳驾驶成历史吗?
seq cls result:
label: 6 confidence: 4.563379287719727
The script should include the mapping from label id to plaintext class name; ideally the output carries the human-readable class alongside the label id, which makes for a much better experience.
Done
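The suggestion above can be sketched roughly as follows. The dict covers only the two label ids that appear in this PR's sample output, and the helper name is hypothetical.

```python
# Hypothetical id-to-name mapping for the sequence classification
# labels; only ids 2 and 6 are grounded in the sample output shown
# in this PR.
LABEL_NAMES = {
    2: "news_entertainment",
    6: "news_car",
}

def readable_label(label_id: int) -> str:
    """Return the plaintext class name for a label id, falling back
    to the raw id when the name is unknown."""
    return LABEL_NAMES.get(label_id, str(label_id))
```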
-----------------------------
input data: 黄磊接受华少快问快答,不光智商逆天,情商也不逊黄渤
seq cls result:
label: 2 confidence: 5.694031238555908
Also output the corresponding plaintext label.
Done
model_zoo/ernie-3.0/infer.py
Outdated
@@ -59,7 +59,7 @@ def parse_args():
help="The directory or name of model.")
parser.add_argument(
"--model_path",
default='tnews_quant_models/mse4/int8',
default='paddle_model/model',
Suggest not giving this argument a default value; this directory is something the user should set clearly and explicitly.
It should also be required, and error out if unset.
Done
Revised throughout.
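A minimal sketch of the suggested change, assuming standard argparse; the help text is illustrative:

```python
import argparse

parser = argparse.ArgumentParser()
# No default value: the user must point at their own exported model,
# and argparse errors out when the flag is omitted.
parser.add_argument(
    "--model_path",
    type=str,
    required=True,
    help="Path prefix of the exported inference model.")
```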
```
python infer_gpu.py --task_name token_cls --model_path ./ner_model/infer
python infer_cpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --enable_quantize
cpu?
Is it enable_quantize here, but use_* elsewhere?
Is this difference intentional?
Done
# Step 1: enable the set_dynamic_shape switch to configure dynamic shapes automatically; generates shape_info.txt in the current directory
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --use_fp16 --set_dynamic_shape
# Step 2: read the shape_info.txt generated in step 1 and start prediction
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32 --use_fp16
If the second command depends on the shape_info.txt file, call it out explicitly in that command, so the dependency is visible to users while they run it.
Done
# Step 1: enable the set_dynamic_shape switch to configure dynamic shapes automatically; generates shape_info.txt in the current directory
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_quant_infer_model/int8 --set_dynamic_shape
# Step 2: read the shape_info.txt generated in step 1 and start prediction
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_quant_infer_model/int8
Explicitly mark --xxxx shape_info.txt.
A command line is not better just because it is shorter: required arguments and anything that needs the user's attention should be spelled out in the docs.
Otherwise, in the name of brevity every argument could be defaulted, but that is not the best experience.
Done
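One way to make the shape_info.txt dependency visible, as suggested above, is a fail-fast check before step 2. The file name follows the comments in the commands above; the helper itself is hypothetical.

```python
import os

def shape_info_ready(path: str = "shape_info.txt") -> bool:
    """The first run with --set_dynamic_shape writes this file in the
    current directory; the second run needs it, so check that it
    exists before starting prediction instead of failing deep inside
    the predictor."""
    return os.path.isfile(path)
```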
```
input data: 未来自动驾驶真的会让酒驾和疲劳驾驶成历史吗?
seq cls result:
label: news_car confidence: 4.515341281890869
Did this confidence skip the softmax? Why is confidence a value greater than 1?
Could you compare this against how the original model handles it?
Done
-----------------------------
input data: 黄磊接受华少快问快答,不光智商逆天,情商也不逊黄渤
seq cls result:
label: news_entertainment confidence: 5.694031238555908
confidence > 1
Normalize the confidence.
Done
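A minimal sketch of the normalization the reviewers ask for, assuming the printed confidence is a raw logit: applying a softmax over the class logits makes every reported confidence fall in [0, 1].

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# A raw score like 5.69 is a logit, not a probability; after softmax,
# each class's confidence lies between 0 and 1.
```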
PR types
Others
PR changes
Others
Description