qwen2-vl fine-tuning with flash_attn raises an error #1887
The CUDA version is 12.1, and the GPUs are 64 A800s.

Try removing flash_attn.

There is a bug with flash attn & qwen2-vl.

See #1857 for a possible solution.
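The workaround suggested above (disabling FlashAttention) can be sketched as rerunning the same entrypoint with the flag flipped. This is a minimal sketch, assuming the other flags from the reporter's command stay unchanged; it is not a verified fix:

```shell
# Workaround sketch: turn off FlashAttention so swift falls back to the
# default attention implementation (all other flags as in the original command).
torchrun --nproc_per_node ${num_gpu_per_node} examples/pytorch/llm/llm_sft.py \
    --model_type qwen2-vl-7b-instruct \
    --use_flash_attn false
```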
Describe the bug
Your hardware and system info
```shell
torchrun --nproc_per_node ${num_gpu_per_node} --master_port $MASTER_PORT --master_addr $MASTER_ADDR --node_rank $RANK --nnodes $WORLD_SIZE examples/pytorch/llm/llm_sft.py \
    --model_cache_dir models/Qwen/Qwen2-VL-7B-Instruct \
    --model_type qwen2-vl-7b-instruct \
    --sft_type full \
    --freeze_vit true \
    --tuner_backend swift \
    --template_type AUTO \
    --output_dir output/-correction-0830 \
    --ddp_backend nccl \
    --custom_train_dataset_path homework_correction_train2.jsonl \
    --system "你是一位小学数学作业批改专家。" \
    --dataset_test_ratio 0.01 \
    --self_cognition_sample -1 \
    --preprocess_num_proc 60 \
    --dataloader_num_workers 60 \
    --train_dataset_sample -1 \
    --save_strategy epoch \
    --lr_scheduler_type cosine \
    --save_total_limit 5 \
    --num_train_epochs 5 \
    --eval_steps 50 \
    --logging_steps 10 \
    --max_length 2048 \
    --check_dataset_strategy warning \
    --gradient_checkpointing true \
    --batch_size 4 \
    --gradient_accumulation_steps 1 \
    --deepspeed_config_path ds_z2_config.json \
    --weight_decay 0.01 \
    --learning_rate 1e-5 \
    --max_grad_norm 0.5 \
    --warmup_ratio 0.03 \
    --use_flash_attn true \
    --save_only_model false \
    --save_on_each_node false \
    --lazy_tokenize true \
    --neftune_noise_alpha 5 \
    --dtype AUTO
```
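Since the error only appears when FlashAttention is enabled, one defensive pattern is to select the attention backend at runtime and fall back to PyTorch SDPA when the `flash_attn` package is unavailable. The helper name `pick_attn_implementation` below is hypothetical; the `attn_implementation` values mirror the strings transformers' `from_pretrained` accepts. A minimal sketch:

```python
import importlib.util

def pick_attn_implementation() -> str:
    """Return an attention-backend string, preferring FlashAttention 2
    when the flash_attn package is importable, else falling back to
    PyTorch's scaled-dot-product attention ("sdpa")."""
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"

# Usage sketch (model load shown as a comment, not executed here):
# model = Qwen2VLForConditionalGeneration.from_pretrained(
#     "Qwen/Qwen2-VL-7B-Instruct",
#     attn_implementation=pick_attn_implementation(),
# )
print(pick_attn_implementation())
```

This keeps one training script usable on nodes with and without a working flash-attn build, at the cost of silently running the slower kernel when the fallback triggers.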