Fix TinyBERT readme bug #3535

Merged 1 commit on Oct 24, 2022
fix tinybert readme bug
LiuChiachi committed Oct 21, 2022
commit 036e5de78f6b7c32b1c6ec90dae1e4bd4e7adbbc
12 changes: 6 additions & 6 deletions model_zoo/tinybert/README.md
@@ -58,7 +58,7 @@ python -u ./run_glue.py \

```

- After training completes, save the best-performing model under `pretrained_models/$TASK_NAME/` in this project. The model directory contains the files `model_config.json`, `model_state.pdparams`, `tokenizer_config.json`, and `vocab.txt`.
+ After training completes, save the best-performing model under `$TEACHER_DIR` in this project. The model directory contains the files `model_config.json`, `model_state.pdparams`, `tokenizer_config.json`, and `vocab.txt`.
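The README line above lists four files the teacher directory must contain. A hypothetical sanity check could look like the following sketch; the `teacher_models` default is an assumption taken from this same diff, not part of the original README:

```shell
# Hypothetical check: report any of the four expected teacher files that are
# missing from $TEACHER_DIR (defaults to teacher_models, as set in this diff).
TEACHER_DIR=${TEACHER_DIR:-teacher_models}
for f in model_config.json model_state.pdparams tokenizer_config.json vocab.txt; do
  [ -f "$TEACHER_DIR/$f" ] || echo "missing: $TEACHER_DIR/$f"
done
```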


### Distilling TinyBERT on a specific task
@@ -68,7 +68,10 @@ python -u ./run_glue.py \
```shell
export CUDA_VISIBLE_DEVICES=0
export TASK_NAME=SST-2
- export TEACHER_DIR=../pretrained_models/SST-2/best_model_610
+ export TEACHER_DIR=teacher_models

+ # Move the best model to $TEACHER_DIR
+ mv ../../examples/benchmark/glue/tmp/SST-2/sst-2_ft_model_xx.pdparams/ $TEACHER_DIR

python task_distill.py \
--model_type tinybert \
@@ -103,12 +106,9 @@ python task_distill.py \
Then distill the prediction layer:

```shell

- export TEACHER_DIR=../pretrained_models/SST-2/best_model_610

python task_distill.py \
--model_type tinybert \
- --student_model_name_or_path tmp/TASK_NAME best_inter_model \
+ --student_model_name_or_path tmp/$TASK_NAME/intermediate_distill_model_final.pdparams \
--task_name $TASK_NAME \
--max_seq_length 64 \
--batch_size 32 \
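The fix in the last hunk replaces two space-separated tokens (and a missing `$`) with a single joined checkpoint path. A minimal shell sketch, with `SST-2` assumed as the task name from this diff, shows the value the corrected flag receives:

```shell
# The corrected --student_model_name_or_path is one joined path, not the two
# tokens ("tmp/TASK_NAME" and "best_inter_model") the old README showed.
TASK_NAME=SST-2
STUDENT_PATH="tmp/$TASK_NAME/intermediate_distill_model_final.pdparams"
echo "$STUDENT_PATH"   # prints tmp/SST-2/intermediate_distill_model_final.pdparams
```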