fix do_sample #1946

Merged · 1 commit merged into modelscope:main on Sep 5, 2024
Conversation

Jintao-Huang (Collaborator)

PR type

  • Bug Fix

Fixes issue: #1943

@Jintao-Huang Jintao-Huang merged commit a1f9393 into modelscope:main Sep 5, 2024
2 checks passed
baibaiw5 commented Sep 5, 2024

@Jintao-Huang @tastelikefeet
I tested the latest branch and the problem is still there:
UserWarning: do_sample is set to False. However, temperature is set to 0.0 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.

Run command: /data/xxx/anaconda3/envs/ms-swift/bin/python /home/xxx/projects/swift/swift/cli/infer.py --do_sample True --temperature 0 --model_type qwen2-vl-2b-instruct --sft_type full --ckpt_dir /home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532 --val_dataset /home/xxx/data/datasets/boygirl/diff_20240813_qwenvl2.jsonl
[INFO:swift] Successfully registered /home/xxx/projects/swift/swift/llm/data/dataset_info.json
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'
[INFO:swift] Start time of running main: 2024-09-05 21:33:22.186028
[INFO:swift] Using val_dataset, ignoring dataset_test_ratio
[INFO:swift] ckpt_dir: /home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532
[INFO:swift] Setting model_info['revision']: master
[INFO:swift] Setting self.eval_human: False
[INFO:swift] args: InferArguments(model_type='qwen2-vl-2b-instruct', model_id_or_path='/home/xxx/models/qwen/Qwen2-VL-2B-Instruct', model_revision='master', sft_type='full', template_type='qwen2-vl', infer_backend='pt', ckpt_dir='/home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532', result_dir=None, load_args_from_ckpt_dir=True, load_dataset_config=False, eval_human=False, seed=42, dtype='bf16', model_kwargs=None, dataset=[], val_dataset=['/home/xxx/data/datasets/boygirl/diff_20240813_qwenvl2.jsonl'], dataset_seed=42, dataset_test_ratio=0.0, show_dataset_sample=-1, save_result=True, system='You are a helpful assistant.', tools_prompt='react_en', max_length=None, truncation_strategy='delete', check_dataset_strategy='none', model_name=[None, None], model_author=[None, None], quant_method=None, quantization_bit=0, hqq_axis=0, hqq_dynamic_config_path=None, bnb_4bit_comp_dtype='bf16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, max_new_tokens=2048, do_sample=False, temperature=0.0, top_k=None, top_p=None, repetition_penalty=None, num_beams=1, stop_words=[], rope_scaling=None, use_flash_attn=None, ignore_args_error=False, stream=True, merge_lora=False, merge_device_map='cpu', save_safetensors=True, overwrite_generation_config=False, verbose=None, local_repo_path=None, custom_register_path=None, custom_dataset_info=None, device_map_config=None, device_max_memory=[], hub_token=None, gpu_memory_utilization=0.9, tensor_parallel_size=1, max_num_seqs=256, max_model_len=None, disable_custom_all_reduce=True, enforce_eager=False, vllm_enable_lora=False, vllm_max_lora_rank=16, lora_modules=[], max_logprobs=20, tp=1, cache_max_entry_count=0.8, quant_policy=0, vision_batch_size=1, self_cognition_sample=0, train_dataset_sample=-1, val_dataset_sample=None, safe_serialization=None, model_cache_dir=None, merge_lora_and_save=None, custom_train_dataset_path=[], custom_val_dataset_path=[], vllm_lora_modules=None, device_map_config_path=None)
[INFO:swift] Global seed set to 42
[INFO:swift] device_count: 4
[INFO:swift] Loading the model using model_dir: /home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532
[INFO:swift] model_kwargs: {'low_cpu_mem_usage': True, 'device_map': 'auto'}
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
We've detected an older driver with an RTX 4000 series GPU. These drivers have issues with P2P. This can affect the multi-gpu inference when using accelerate device_map.Please make sure to update your driver to the latest version which resolves this.
[INFO:swift] model.max_model_len: 32768
[INFO:swift] model_config: Qwen2VLConfig {
  "_name_or_path": "/home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532",
  "architectures": [
    "Qwen2VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 1536,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 8960,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2_vl",
  "num_attention_heads": 12,
  "num_hidden_layers": 28,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "type": "mrope"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.0.dev0",
  "use_cache": false,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1536,
    "in_chans": 3,
    "model_type": "qwen2_vl",
    "spatial_patch_size": 14
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 151936
}

/data/xxx/anaconda3/envs/ms-swift/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: do_sample is set to False. However, temperature is set to 0.0 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
[INFO:swift] model.generation_config: GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "max_new_tokens": 2048,
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.0,
  "top_k": 1,
  "top_p": 0.001
}

[INFO:swift] [visual.patch_embed.proj.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm1.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm1.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm2.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm2.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.qkv.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.qkv.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.proj.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.proj.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc1.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc1.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc2.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc2.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm1.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm1.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm2.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm2.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.attn.qkv.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.attn.qkv.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.attn.proj.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] ...
[INFO:swift] Qwen2VLForConditionalGeneration(
  (visual): Qwen2VisionTransformerPretrainedModel(
    (patch_embed): PatchEmbed(
      (proj): Conv3d(3, 1280, kernel_size=(2, 14, 14), stride=(2, 14, 14), bias=False)
    )
    (rotary_pos_emb): VisionRotaryEmbedding()
    (blocks): ModuleList(
      (0-31): 32 x Qwen2VLVisionBlock(
        (norm1): LayerNorm((1280,), eps=1e-06, elementwise_affine=True)
        (norm2): LayerNorm((1280,), eps=1e-06, elementwise_affine=True)
        (attn): VisionSdpaAttention(
          (qkv): Linear(in_features=1280, out_features=3840, bias=True)
          (proj): Linear(in_features=1280, out_features=1280, bias=True)
        )
        (mlp): VisionMlp(
          (fc1): Linear(in_features=1280, out_features=5120, bias=True)
          (act): QuickGELUActivation()
          (fc2): Linear(in_features=5120, out_features=1280, bias=True)
        )
      )
    )
    (merger): PatchMerger(
      (ln_q): LayerNorm((1280,), eps=1e-06, elementwise_affine=True)
      (mlp): Sequential(
        (0): Linear(in_features=5120, out_features=5120, bias=True)
        (1): GELU(approximate='none')
        (2): Linear(in_features=5120, out_features=1536, bias=True)
      )
    )
  )
  (model): Qwen2VLModel(
    (embed_tokens): Embedding(151936, 1536)
    (layers): ModuleList(
      (0-27): 28 x Qwen2VLDecoderLayer(
        (self_attn): Qwen2VLSdpaAttention(
          (q_proj): Linear(in_features=1536, out_features=1536, bias=True)
          (k_proj): Linear(in_features=1536, out_features=256, bias=True)
          (v_proj): Linear(in_features=1536, out_features=256, bias=True)
          (o_proj): Linear(in_features=1536, out_features=1536, bias=False)
          (rotary_emb): Qwen2RotaryEmbedding()
        )
        (mlp): Qwen2MLP(
          (gate_proj): Linear(in_features=1536, out_features=8960, bias=False)
          (up_proj): Linear(in_features=1536, out_features=8960, bias=False)
          (down_proj): Linear(in_features=8960, out_features=1536, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): Qwen2RMSNorm((1536,), eps=1e-06)
        (post_attention_layernorm): Qwen2RMSNorm((1536,), eps=1e-06)
      )
    )
    (norm): Qwen2RMSNorm((1536,), eps=1e-06)
  )
  (lm_head): Linear(in_features=1536, out_features=151936, bias=False)
)
[INFO:swift] Qwen2VLForConditionalGeneration: 2208.9856M Params (0.0000M Trainable [0.0000%]), 234.8828M Buffers.
[INFO:swift] system: You are a helpful assistant.
[INFO:swift] val_dataset: Dataset({
features: ['system', 'query', 'response'],
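For reference, the generation_config printed above (temperature=0.0, top_k=1, top_p=0.001) is effectively greedy decoding already. A minimal sketch, not swift's own code, of how one could inspect and repair a checkpoint's saved generation_config.json with the transformers GenerationConfig API (the path is a placeholder):

```python
from transformers import GenerationConfig

# Placeholder path; substitute the real checkpoint directory.
ckpt_dir = "/path/to/checkpoint-2532"

gen_cfg = GenerationConfig.from_pretrained(ckpt_dir)

# temperature=0.0 alongside do_sample=False is what triggers the
# UserWarning above. Resetting the sampling knobs to their transformers
# defaults keeps greedy decoding and silences the warning.
gen_cfg.do_sample = False
gen_cfg.temperature = 1.0  # default
gen_cfg.top_k = 50         # default
gen_cfg.top_p = 1.0        # default
gen_cfg.save_pretrained(ckpt_dir)
```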

Jintao-Huang (Collaborator, Author)

I can't reproduce this. Did you modify the code?

baibaiw5 commented Sep 5, 2024

@Jintao-Huang
No code changes; I'm using transformers 4.45.0.dev0.

baibaiw5 commented Sep 5, 2024

With the previous version the program crashed outright; with this version it runs through, but the setting still doesn't actually take effect.

Jintao-Huang (Collaborator, Author)

--do_sample True and --temperature 0 cannot be used together.
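To illustrate the two self-consistent combinations, a sketch against the transformers generate() API (model and inputs are placeholders):

```python
# Greedy decoding: deterministic, no sampling knobs needed.
out = model.generate(**inputs, do_sample=False)

# Sampling: do_sample=True with a strictly positive temperature.
out = model.generate(**inputs, do_sample=True, temperature=0.7)
```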

Jintao-Huang (Collaborator, Author)

temperature == 0 implies do_sample=False (greedy decoding).
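A minimal sketch of that rule (hypothetical helper; the name and signature are illustrative, not swift's actual implementation):

```python
def normalize_sampling_args(do_sample: bool, temperature: float):
    """temperature == 0 means deterministic decoding, so do_sample is
    forced to False regardless of what the user passed on the CLI."""
    if temperature == 0:
        # Greedy path: temperature is meaningless here; reset to default.
        return False, 1.0
    return do_sample, temperature
```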

baibaiw5 commented Sep 5, 2024

OK
