Build a Llama3 model by hand and train it with RLHF.

Runtime environment:
torch==1.13.1+cu117
transformers==4.38.2
datasets==2.18.0
accelerate==0.28.0

Video course: https://www.bilibili.com/video/BV1mZ421M7Wm
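Because the code targets exact package versions, it can help to confirm that the local environment matches the pins before running anything. The following is a minimal sketch, not part of the original project: the `PINNED` table and the `check_environment` helper are hypothetical names introduced here, and the check relies only on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pins copied from the runtime environment listed for this project.
PINNED = {
    "torch": "1.13.1+cu117",
    "transformers": "4.38.2",
    "datasets": "2.18.0",
    "accelerate": "0.28.0",
}


def check_environment(pinned=PINNED, get_version=version):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    `get_version` is injectable so the check can be exercised without
    the real packages being present (hypothetical convenience).
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_environment()
    if not problems:
        print("Environment matches the pinned versions.")
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every pinned package is installed at the listed version. Other versions may also work, but that is untested here.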