Fine-tuning the 🤖ChatGLM-6B model with 🤗PEFT.
[ English | 中文 ]
Our script now supports the following datasets:
- Stanford Alpaca
- Stanford Alpaca (Chinese)
- GPT-4 Generated Data
- BELLE 2M
- BELLE 1M
- BELLE 0.5M
- BELLE Dialogue 0.4M
- BELLE School Math 0.25M
- BELLE Multiturn Chat 0.8M
- Guanaco Dataset
- Firefly 1.1M
Please refer to `config_data.py` for details.
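As a rough illustration, registering a dataset there amounts to mapping a dataset name to its source and to the columns that hold the prompt and the response. The snippet below is only a hypothetical sketch; the actual field names in `config_data.py` may differ.

```python
# Hypothetical sketch of a dataset entry; the field names are illustrative,
# not necessarily the ones used in config_data.py.
DATASETS = {
    "alpaca_zh": {
        "hf_hub_url": "some/hub-dataset-id",  # or a local file via "file_name"
        "columns": {
            "prompt": "instruction",   # instruction text
            "query": "input",          # optional extra input
            "response": "output",      # target answer
        },
    },
}
```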
[23/04/11] Now we support training with combined datasets! Pass a comma-separated list such as `--dataset dataset1,dataset2` to train on multiple datasets at once.
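For example, reusing the flags from the training command below (`dataset1` and `dataset2` are placeholders for any names registered in `config_data.py`):

```bash
# Replace dataset1,dataset2 with dataset names registered in config_data.py.
CUDA_VISIBLE_DEVICES=0 python finetune_chatglm.py \
    --do_train \
    --dataset dataset1,dataset2 \
    --finetuning_type lora \
    --output_dir output \
    --fp16
```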
Our script currently implements the LoRA fine-tuning method (Freeze-Tuning and P-Tuning are still on the TODO list below). To run it, you will need:
- Python 3.10 and PyTorch 2.0.0
- 🤗Transformers, Datasets, and PEFT
- protobuf, cpm_kernels, sentencepiece
- jieba, rouge_chinese, nltk
And powerful GPUs!
```bash
git clone https://github.com/hiyouga/ChatGLM-Efficient-Tuning.git
cd ChatGLM-Efficient-Tuning
conda create -n cet python=3.10
conda activate cet
pip install -r requirements.txt
```
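Optionally, check that PyTorch was installed with CUDA support and can see your GPU:

```bash
# Should print the PyTorch version and "True" if the GPU is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```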
To fine-tune the model with LoRA on a single GPU, run:

```bash
CUDA_VISIBLE_DEVICES=0 python finetune_chatglm.py \
    --do_train \
    --dataset alpaca_zh \
    --finetuning_type lora \
    --output_dir output \
    --overwrite_cache \
    --overwrite_output_dir \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --warmup_steps 100 \
    --max_train_samples 10000 \
    --learning_rate 5e-5 \
    --num_train_epochs 1.0 \
    --fp16
```
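With these settings, the effective batch size is per_device_train_batch_size × gradient_accumulation_steps = 4 × 4 = 16 examples per optimizer step, and at most max_train_samples = 10,000 examples are used for the single training epoch.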
To run evaluation, use:

```bash
CUDA_VISIBLE_DEVICES=0 python finetune_chatglm.py \
    --do_eval \
    --dataset alpaca_zh \
    --output_dir eval \
    --overwrite_cache \
    --overwrite_output_dir \
    --per_device_eval_batch_size 1 \
    --max_eval_samples 20 \
    --predict_with_generate
```
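The jieba, rouge_chinese and nltk packages listed in the requirements are used for generation metrics. Below is a minimal sketch of how a Chinese ROUGE score can be computed with them; it is an illustration only, not the repository's exact evaluation code.

```python
# Minimal sketch: ROUGE for Chinese text needs word segmentation first.
import jieba
from rouge_chinese import Rouge

prediction = "今天天气很好"
reference = "今天天气不错"

# rouge_chinese expects whitespace-separated tokens.
hypothesis = " ".join(jieba.cut(prediction))
target = " ".join(jieba.cut(reference))

scores = Rouge().get_scores(hypothesis, target)
print(scores[0]["rouge-l"]["f"])  # ROUGE-L F1
```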
To run inference, use:

```bash
CUDA_VISIBLE_DEVICES=0 python infer_chatglm.py
```
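Under the hood, inference amounts to loading the base model and attaching the saved LoRA weights. Below is a minimal sketch using 🤗Transformers and PEFT, assuming the checkpoint from the training step above was saved to `output/`; it is an illustration, not the exact contents of `infer_chatglm.py`.

```python
# Minimal sketch: load ChatGLM-6B and attach the LoRA adapter saved in ./output.
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = PeftModel.from_pretrained(model, "output")  # path passed as --output_dir above
model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```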
Approximate GPU memory requirement for fine-tuning:

| Batch size | LoRA r | Mode | GRAM |
| --- | --- | --- | --- |
| 8 | 8 | FP16 | 24GB |
Compared with existing implementations:

- THUDM/ChatGLM-6B
  - Official implementation of fine-tuning ChatGLM with P-Tuning v2 on the ADGEN dataset.
  - Our fine-tuning script largely depends on it. We further implement the LoRA tuning method. Additionally, we dynamically pad the inputs to the longest sequence in the batch instead of the maximum length, to accelerate fine-tuning (see the padding sketch after this list).
- mymusise/ChatGLM-Tuning
  - An unofficial implementation of fine-tuning ChatGLM with LoRA on the Stanford Alpaca dataset.
  - We borrowed some ideas from it. Our fine-tuning script integrates the data pre-processing part into the training procedure, so we do not need to generate a pre-processed dataset before training.
- ssbuild/chatglm_finetuning
  - An unofficial implementation of fine-tuning ChatGLM with several PEFT methods on the Stanford Alpaca dataset.
  - Our fine-tuning script is implemented purely with Huggingface transformers and is independent of the deep_training framework.
- lich99/ChatGLM-finetune-LoRA
  - An unofficial implementation of fine-tuning ChatGLM with LoRA on the Stanford Alpaca dataset.
  - We use Huggingface PEFT to provide the state-of-the-art PEFT methods.
- liucongg/ChatGLM-Finetuning
  - An unofficial implementation of fine-tuning ChatGLM with several methods, including Freeze, LoRA and P-Tuning, on an industrial dataset.
  - We aim to incorporate more instruction-following datasets for fine-tuning the ChatGLM model.
- yanqiangmiffy/InstructGLM
  - An unofficial implementation of fine-tuning ChatGLM that explores ChatGLM's ability on instruction-following datasets.
  - Our fine-tuning script integrates the data pre-processing part into the training procedure.
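As mentioned above, we pad each batch only to its longest sequence instead of a fixed maximum length. Below is a minimal sketch of this kind of dynamic padding using a 🤗Transformers data collator; it is illustrative, and the repository's collator may be implemented differently.

```python
# Minimal sketch of dynamic padding: each batch is padded only to its own longest sequence.
from transformers import AutoTokenizer, DataCollatorForSeq2Seq

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    padding=True,              # pad to the longest sequence in the batch, not a global max length
    label_pad_token_id=-100,   # padded label positions are ignored by the loss
)
```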
TODO:
- Incorporating Chinese datasets into the training sets.
- Incorporating ChatGPT & GPT-4 self-chat data into the training sets.
- Implementing the Freeze-Tuning and P-Tuning methods.
- Supporting multi-GPU fine-tuning.
- Adding a script for evaluation. (but it appears very slow)
This repository is licensed under the Apache-2.0 License.
If this work is helpful, please cite as:
@Misc{chatglm-efficient-tuning,
title = {ChatGLM Efficient Tuning},
author = {hiyouga},
howpublished = {\url{https://github.com/hiyouga/ChatGLM-Efficient-Tuning}},
year = {2023}
}
This repo benefits from ChatGLM-6B, ChatGLM-Tuning and yuanzhoulvpi2017/zero_nlp. Thanks for their wonderful work.