This repository contains the implementation of our paper "BinaryBERT: Pushing the Limit of BERT Quantization" in ACL 2021. The overall workflow of training BinaryBERT is shown below. We first train a half-sized ternary BERT model, then apply ternary weight splitting to initialize the full-sized BinaryBERT, and finally fine-tune BinaryBERT for further refinement.
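To make the splitting step concrete, below is a minimal, illustrative sketch of the idea: a ternary weight tensor (values in {-α, 0, +α}) is split into two binary tensors whose element-wise sum reproduces it. The function name split_ternary and the equal-scale choice are assumptions for illustration only; the actual routine in this repo follows the paper's ternary weight splitting equations, which may also involve the latent full-precision weights and per-layer scales, so treat this as a sketch rather than the repo's implementation.

```python
import torch

def split_ternary(w_ternary: torch.Tensor, alpha: float = 1.0):
    """Split a ternary tensor (values in {-alpha, 0, +alpha}) into two binary
    tensors b1, b2 (values in {-alpha/2, +alpha/2}) with b1 + b2 == w_ternary.
    Illustrative only; the repo's actual splitting may differ in details."""
    half = alpha / 2.0
    sign = torch.sign(w_ternary)  # -1, 0, or +1 per element
    # Non-zero entries: both halves carry the sign of the ternary weight.
    # Zero entries: the two halves cancel out (+alpha/2 and -alpha/2).
    b1 = torch.where(sign != 0, sign * half, torch.full_like(w_ternary, half))
    b2 = torch.where(sign != 0, sign * half, torch.full_like(w_ternary, -half))
    return b1, b2

# Toy check: the two binary halves sum back to the ternary weights.
w = torch.tensor([[-1.0, 0.0, 1.0], [0.0, 1.0, -1.0]])
b1, b2 = split_ternary(w)
assert torch.equal(b1 + b2, w)
```

In the full model, the two halves obtained this way initialize the full-sized BinaryBERT from the half-sized ternary model, which is then fine-tuned (step two below).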
pip install -r requirements.txt
We train and test BinaryBERT on the GLUE and SQuAD benchmarks. Both datasets are available online:
For data augmentation on GLUE, please follow the instructions in TinyBERT.
Our experiments are based on the fine-tuned full-precision DynaBERT, which can be found here.
Complete running scripts and more detailed tips are provided in ./scripts.
Execution consists of two steps, which we illustrate by training BinaryBERT with 4-bit activations on MRPC.
The first step, training the half-sized ternary BERT, corresponds to scripts/ternary_glue.sh. For example:
sh scripts/ternary_glue.sh mrpc data/mrpc/ models/dynabert_model/mrpc/width_0.5_depth_1.0/ models/dynabert_model/mrpc/width_0.5_depth_1.0/ 2 4
The second step, ternary weight splitting followed by fine-tuning BinaryBERT, corresponds to scripts/tws_glue.sh. Based on the model checkpoint of the ternary BERT, execute:
sh scripts/tws_glue.sh mrpc data/mrpc/ models/dynabert_model/mrpc/width_0.5_depth_1.0/ output/Ternary_W2A8/mrpc/kd_stage2/ 1 4
Go through each script for more detail.
If you find this repo helpful for your research, please cite our paper:
@inproceedings{bai2021binarybert,
title={BinaryBERT: Pushing the Limit of BERT Quantization},
author={Bai, H. and Zhang, W. and Hou, L. and Shang, L. and Jin, J. and Jiang, X. and Liu, Q. and Lyu, M. and King, I.},
booktitle={Annual Meeting of the Association for Computational Linguistics},
year={2021}
}