Update ERNIE 3.0 README #2142

Merged
merged 8 commits into from
May 16, 2022

@LiuChiachi (Contributor) commented May 14, 2022

PR types

Others

PR changes

Docs

Description

Update ERNIE 3.0 README

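The document's wording is still being revised
Self-review and revise the logical structure
Compression metrics on the QA task are on the low side; optimization is in progress and pruning training is running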

@LiuChiachi LiuChiachi marked this pull request as draft May 14, 2022 05:49

4. PaddlePaddle version: 2.3-RC0

Member

Write this as the official release version; do not call out the RC.


4. PaddlePaddle version: 2.3-RC0

5. PaddleNLP version: 2.3-RC0

Member

Use 2.3 for both; drop the RC suffix.

| ERNIE-3.0-medium+FP32 | 311.95(1.0X) | 57.45 | 90.91(1.0x) | 93.04 | 33.74(1.0x) | 66.95 |
| ERNIE-3.0-medium+INT8 | 600.35(1.9x) | 56.57(-0.88) | 141.00(1.6x) | 92.64(-0.40) | 56.51(1.7x) | 66.23(-0.72) |
| ERNIE-3.0-medium+pruning+FP32 | 408.65(1.3x) | 57.31(-0.14) | 122.13(1.3x) | 93.27(+0.23) | 48.47(1.4x) | 65.55(-1.40) |
| ERNIE-3.0-medium+pruning+INT8 | 704.42(2.3x) | 56.69(-0.76) | 215.58(2.4x) | 92.39(-0.65) | 75.23(2.2x) | 63.47(-3.48) |

Member

Capitalize the M in every "medium" consistently; the same goes for "Base".

| ERNIE-3.0-medium+ INT8 | 600.35 | 141.00 | 56.51 |
| ERNIE-3.0-medium+ pruning+FP32 | 408.65 | 122.13 | 48.47 |
| ERNIE-3.0-medium + pruning+INT8 | 704.42 | 215.58 | 75.23 |
After dynamic quantization (onnx), the speedups reach 1.9x, 1.6x, and 1.7x, with accuracy drops of 0.88, 0.40, and 0.72, respectively.

Member

No need to highlight the words "onnx" and "dynamic".


After pruning, the speedups reach 1.3x, 1.3x, and 1.4x; accuracy drops by 0.14, rises by 0.23, and drops by 1.40, respectively.

After pruning + quantization, the speedups reach 2.3x, 2.4x, and 2.2x, with accuracy drops of 0.76, 0.65, and 3.48, respectively.

Member

Present these three descriptions as a table.
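One way to produce such a table is to derive the speedup and accuracy-change columns from the raw measurements rather than typing them by hand. The sketch below uses the ERNIE-3.0-medium rows quoted earlier in this thread and assumes the columns alternate performance and accuracy for three tasks in a fixed order. The names `RAW`, `summarize`, and `to_markdown` are illustrative and are not part of the README or of PaddleNLP.

```python
# Raw (performance, accuracy) pairs per task, copied from the rows quoted in
# this review. Assumption: the three pairs follow the same task order in
# every row.
BASELINE = "ERNIE-3.0-medium+FP32"

RAW = {
    "ERNIE-3.0-medium+FP32": [(311.95, 57.45), (90.91, 93.04), (33.74, 66.95)],
    "ERNIE-3.0-medium+INT8": [(600.35, 56.57), (141.00, 92.64), (56.51, 66.23)],
    "ERNIE-3.0-medium+pruning+FP32": [(408.65, 57.31), (122.13, 93.27), (48.47, 65.55)],
    "ERNIE-3.0-medium+pruning+INT8": [(704.42, 56.69), (215.58, 92.39), (75.23, 63.47)],
}


def summarize(raw, baseline):
    """Return, per model, a list of (speedup, accuracy change) against the baseline."""
    base = raw[baseline]
    summary = {}
    for name, cells in raw.items():
        summary[name] = [
            (round(perf / base_perf, 1), round(acc - base_acc, 2))
            for (perf, acc), (base_perf, base_acc) in zip(cells, base)
        ]
    return summary


def to_markdown(summary):
    """Render the summary as a Markdown table, one row per model."""
    lines = ["| Model | Speedup | Accuracy change |", "| --- | --- | --- |"]
    for name, cells in summary.items():
        speedups = " / ".join(f"{s:.1f}x" for s, _ in cells)
        deltas = " / ".join(f"{d:+.2f}" for _, d in cells)
        lines.append(f"| {name} | {speedups} | {deltas} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_markdown(summarize(RAW, BASELINE)))
```

Computing the columns this way keeps the prose figures and the table in sync whenever a measurement is rerun.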


Test environment & conclusions:
After pruning, the speedups reach 1.3x, 1.2x, and 1.3x; accuracy drops by 0.14, rises by 0.23, and drops by 1.40, respectively;

Member

Present this as a table. Refer to all quantization simply as "quantization"; do not distinguish dynamic from static.


##### CPU performance

With 12 threads, the test data are as follows:

Member

List the detailed CPU hardware information.
CPU hardware information, XX threads,
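A minimal sketch of how the hardware line could be gathered next to the benchmark, using only the Python standard library. The helper name `describe_cpu` is hypothetical. The standard library reports only coarse facts, so the exact CPU model string would usually still be copied from a system tool such as `lscpu`.

```python
import os
import platform


def describe_cpu(num_threads):
    """Collect basic CPU facts to print alongside benchmark numbers."""
    return {
        # platform.processor() can be empty on some systems, hence the fallback.
        "processor": platform.processor() or "unknown",
        "machine": platform.machine(),
        "system": platform.system(),
        "logical_cores": os.cpu_count(),
        # Thread count used for the benchmark run, supplied by the caller.
        "num_threads": num_threads,
    }


if __name__ == "__main__":
    for key, value in describe_cpu(num_threads=12).items():
        print(f"{key}: {value}")
```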


6. Performance is reported in QPS. QPS measurement method: set a batch size large enough to fill GPU memory, then fix that batch_size for the test. QPS = batch_size / mean_time

Member

"Set a batch size large enough to fill GPU memory, then fix that batch_size for the test."
Doesn't this wording come across as not rigorous enough?
Would it be better to state the specific hardware and then list the batch size used for each task?
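The formula `QPS = batch_size / mean_time` can be made concrete with a small timing harness that records the batch size explicitly. This is a sketch under assumptions: `measure_qps` and `run_batch` are hypothetical names, and `run_batch` stands in for one forward pass over a fixed-size batch. It is not the script used for the README numbers.

```python
import time
from statistics import mean


def measure_qps(run_batch, batch_size, warmup=2, repeats=10, timer=time.perf_counter):
    """Return batch_size / mean_time, where mean_time is the mean latency of run_batch."""
    # Warm-up iterations are excluded from timing (caches, lazy initialization).
    for _ in range(warmup):
        run_batch()

    latencies = []
    for _ in range(repeats):
        start = timer()
        run_batch()
        latencies.append(timer() - start)

    mean_time = mean(latencies)
    return batch_size / mean_time


if __name__ == "__main__":
    # Stand-in workload; replace with a real inference call on a fixed batch.
    qps = measure_qps(lambda: sum(range(100_000)), batch_size=32)
    print(f"batch_size=32, QPS={qps:.1f}")
```

Reporting the `batch_size` passed here for each task, together with the hardware, addresses the rigor concern raised in the comment.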


TODO: complete the data and conclusions
As the table shows, after pruning and quantization the `ERNIE 3.0 Medium` model loses 0.46 accuracy on average; pruning alone loses 0.17, and quantization alone loses 0.77 on average.

Member

ERNIE 3.0 Medium -> ERNIE 3.0-Medium



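Test environment & conclusions:

For the three task types (classification, NER, reading comprehension), pruning and quantization yield a speedup of about 3x, and average accuracy across all tasks drops by 0.46.

Member

For the three task types (text classification, sequence labeling, reading comprehension), pruning + quantization yields a speedup of about 3x, and the average accuracy loss across all tasks can be kept within 0.5 (0.46).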


6. Performance is reported in QPS. QPS measurement method: set a batch size large enough to fill GPU memory, then fix that batch_size for the test. QPS = batch_size / mean_time

Member

Also, this section is about CPU performance, yet it talks about filling GPU memory, which is not rigorous either. The surrounding text will not match.


##### CPU performance
1. Datasets: TNEWS (classification), MSRA_NER (named entity recognition, hereafter NER), CMRC2018 (reading comprehension)

Member

TNEWS (text classification), MSRA_NER (sequence labeling), CMRC2018 (reading comprehension)


6. Performance is reported in QPS. QPS measurement method: set a batch size large enough to fill GPU memory, then fix that batch_size for the test. QPS = batch_size / mean_time

7. Accuracy unit: for NER it is the f1 value,

Member

Always capitalize F1.
f1 value -> F1-Score
For evaluation terms like these, please follow the conventions people use when writing papers; do not arbitrarily lowercase them.

E.g. Accuracy/Precision/Recall/F1-Score

@ZeyuChen ZeyuChen self-assigned this May 15, 2022
pred = paddle.argmax(logits, axis=1).numpy().tolist()
preds += pred
for idx, pred in enumerate(preds):
# import pdb; pdb.set_trace()
Member

Remove the unused comment.

@@ -440,7 +440,7 @@ qa_model = AutoModelForQuestionAnswering.from_pretrained("ernie-3.0-base-zh")

```

ERNIE 3.0 provides fine-tuning examples for three scenarios: classification, named entity recognition, and reading comprehension. See the scripts `run_seq_cls.py`, `run_token_cls.py`, and `run_qa.py` respectively; they are launched as follows:

Member

Text classification, sequence labeling, reading comprehension.
Let's use these three technical terms consistently.

@@ -527,74 +533,87 @@ python infer.py --task_name tnews --model_path best_models/TNEWS/compress/0.75/h

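**Tips for using the compression API:**

1. The compression API provides two stages, pruning and quantization; using both is recommended. Pruning requires training, and the training time depends on the size of the downstream task's data and is on the same order as fine-tuning. Quantization requires no training and is faster, so you may also choose quantization alone
1. The compression API provides two stages, pruning and quantization; if the hardware supports deploying quantized models, using both is recommended. The pruning strategy currently supported requires training; the training time depends on the size of the downstream task's data and is on the same order as the fine-tuning time. Static offline quantization requires no training, is faster, and gives a more pronounced speedup than pruning, but quantization alone may also lose more accuracy

2. Pruning resembles a distillation process, so for convenience you can reuse the fine-tuning hyperparameters directly. To improve accuracy further, run a grid search over hyperparameters such as `batch_size`, `learning_rate`, `epoch`, and `max_seq_length`;

3. Model compression is mainly intended for inference deployment, so compressed models are all static-graph models; they can only be used for prediction and can no longer be loaded via `from_pretrained` for further training.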
Member

"can only be used for prediction" ->
"can only be used for inference deployment"

Prediction and inference deployment are not quite the same; an ordinary Python script can also run prediction.


With 12 threads, the test data are as follows:

| | Classification performance | Classification accuracy | NER performance | NER accuracy | Reading comprehension performance | Reading comprehension accuracy |

Member

If this performance was measured on a specific dataset, wouldn't it be more accurate to name the dataset explicitly?
E.g. TNEWS performance/TNEWS accuracy/MSRA_NER performance/MSRA_NER accuracy ...

@LiuChiachi LiuChiachi marked this pull request as ready for review May 16, 2022 05:08
@ZeyuChen ZeyuChen merged commit 6f378fd into PaddlePaddle:develop May 16, 2022