
qwen2-vl fine-tuning with flash_attn raises an error #1887

Open
zhangfan-algo opened this issue Sep 2, 2024 · 6 comments

Comments

@zhangfan-algo

Describe the bug
(screenshots of the error traceback)

Your hardware and system info
torchrun --nproc_per_node ${num_gpu_per_node} --master_port $MASTER_PORT --master_addr $MASTER_ADDR --node_rank $RANK --nnodes $WORLD_SIZE examples/pytorch/llm/llm_sft.py \
    --model_cache_dir models/Qwen/Qwen2-VL-7B-Instruct \
    --model_type qwen2-vl-7b-instruct \
    --sft_type full \
    --freeze_vit true \
    --tuner_backend swift \
    --template_type AUTO \
    --output_dir output/-correction-0830 \
    --ddp_backend nccl \
    --custom_train_dataset_path homework_correction_train2.jsonl \
    --system "你是一位小学数学作业批改专家。" \
    --dataset_test_ratio 0.01 \
    --self_cognition_sample -1 \
    --preprocess_num_proc 60 \
    --dataloader_num_workers 60 \
    --train_dataset_sample -1 \
    --save_strategy epoch \
    --lr_scheduler_type cosine \
    --save_total_limit 5 \
    --num_train_epochs 5 \
    --eval_steps 50 \
    --logging_steps 10 \
    --max_length 2048 \
    --check_dataset_strategy warning \
    --gradient_checkpointing true \
    --batch_size 4 \
    --gradient_accumulation_steps 1 \
    --deepspeed_config_path ds_z2_config.json \
    --weight_decay 0.01 \
    --learning_rate 1e-5 \
    --max_grad_norm 0.5 \
    --warmup_ratio 0.03 \
    --use_flash_attn true \
    --save_only_model false \
    --save_on_each_node false \
    --lazy_tokenize true \
    --neftune_noise_alpha 5 \
    --dtype AUTO
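The command points DeepSpeed at a ZeRO stage-2 config via `--deepspeed_config_path ds_z2_config.json`; the file itself is not shown in the issue. A typical minimal ZeRO-2 config for use with the HF/Swift trainer integration looks like the following sketch (assumed contents, not the reporter's actual file; the "auto" values let the trainer fill in matching hyperparameters):

```json
{
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "bf16": { "enabled": "auto" },
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto"
}
```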

@zhangfan-algo
Author

The CUDA version is 12.1, running on 64× A800 GPUs.

@Jintao-Huang
Collaborator

Remove flash_attn.
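Following this advice with the launch command the reporter posted would mean flipping the `--use_flash_attn` flag and leaving everything else unchanged. A minimal sketch (assuming the same environment variables and script path as above; the `...` stands for the remaining arguments exactly as posted):

```shell
# Rerun the reported llm_sft.py launch with flash attention disabled.
# Only --use_flash_attn changes; all other arguments stay as in the original command.
torchrun --nproc_per_node ${num_gpu_per_node} --master_port $MASTER_PORT \
    --master_addr $MASTER_ADDR --node_rank $RANK --nnodes $WORLD_SIZE \
    examples/pytorch/llm/llm_sft.py \
    --model_type qwen2-vl-7b-instruct \
    --use_flash_attn false \
    ...
```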

@Jintao-Huang Jintao-Huang changed the title from "qwen2-vl fine-tuning error" to "qwen2-vl fine-tuning with flash_attn error" on Sep 2, 2024
@zhangfan-algo
Author

> Remove flash_attn.

Is flash attn unsupported even when fine-tuning only the LLM part?

@Jintao-Huang
Collaborator

There is a bug with flash attn & qwen2-vl.

@zhangfan-algo
Author

After turning off flash attn, a new error appeared:
(screenshot of the new error traceback)

@tastelikefeet
Collaborator

See #1857 for a solution to this.
