update readme

GeekCodingX · Jul 21, 2023 · fda94d5
1 parent dbe68ec
commit fda94d5
Showing 2 changed files with 4 additions and 4 deletions.
diff --git a/README.md b/README.md
@@ -324,9 +324,9 @@ We select 100 instances in the `alpaca_gpt4_zh` dataset to evaluate the fine-tun
 
 > FZ: freeze tuning, PT: P-Tuning V2 (we use `pre_seq_len=16` for fair comparison with LoRA), Params: the percentage of trainable parameters.
 
-### RLHF Labeling
+## Projects
 
-In the RLHF stage, it is necessary to manually rank the k response generated by the LLM. If there is no good labeling tool, you can choose [SupritYoung/RLHF-Label-Tool](https://github.com/SupritYoung/RLHF-Label-Tool/tree/master).
+- [SupritYoung/RLHF-Label-Tool](https://github.com/SupritYoung/RLHF-Label-Tool/tree/master): A tool for ranking the responses of LLMs to generate annotated samples used in RLHF training.
 
 ## Compared with Existing Implementations
 

diff --git a/README_zh.md b/README_zh.md
@@ -329,9 +329,9 @@ python src/export_model.py \
 
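 > FZ: Freeze tuning, PT: P-Tuning V2 (we use `pre_seq_len=16` for a fair comparison with LoRA), Params: the percentage of trainable parameters among all parameters.
 
-### RLHF Labeling Tool
+## Related Links
 
-In the RLHF stage, the k outputs generated by the model must be ranked manually. If you do not have a good labeling tool, you can use [SupritYoung/RLHF-Label-Tool](https://github.com/SupritYoung/RLHF-Label-Tool/tree/master) for labeling.
+- [SupritYoung/RLHF-Label-Tool](https://github.com/SupritYoung/RLHF-Label-Tool/tree/master): A platform for ranking the outputs generated by large models to obtain annotated data for RLHF training.
 
 ## Comparison with Existing Similar Projects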