fix do_sample #1946

Merged · 1 commit merged into modelscope:main on Sep 5, 2024
Conversation

Jintao-Huang (Collaborator)

PR type

  • Bug Fix

Fixes issue: #1943

@Jintao-Huang Jintao-Huang merged commit a1f9393 into modelscope:main Sep 5, 2024
2 checks passed
baibaiw5 commented Sep 5, 2024

@Jintao-Huang @tastelikefeet
I tested the latest branch and the problem is still there:
UserWarning: do_sample is set to False. However, temperature is set to 0.0 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.

Run command: /data/xxx/anaconda3/envs/ms-swift/bin/python /home/xxx/projects/swift/swift/cli/infer.py --do_sample True --temperature 0 --model_type qwen2-vl-2b-instruct --sft_type full --ckpt_dir /home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532 --val_dataset /home/xxx/data/datasets/boygirl/diff_20240813_qwenvl2.jsonl
[INFO:swift] Successfully registered /home/xxx/projects/swift/swift/llm/data/dataset_info.json
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'
[INFO:swift] Start time of running main: 2024-09-05 21:33:22.186028
[INFO:swift] Using val_dataset, ignoring dataset_test_ratio
[INFO:swift] ckpt_dir: /home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532
[INFO:swift] Setting model_info['revision']: master
[INFO:swift] Setting self.eval_human: False
[INFO:swift] args: InferArguments(model_type='qwen2-vl-2b-instruct', model_id_or_path='/home/xxx/models/qwen/Qwen2-VL-2B-Instruct', model_revision='master', sft_type='full', template_type='qwen2-vl', infer_backend='pt', ckpt_dir='/home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532', result_dir=None, load_args_from_ckpt_dir=True, load_dataset_config=False, eval_human=False, seed=42, dtype='bf16', model_kwargs=None, dataset=[], val_dataset=['/home/xxx/data/datasets/boygirl/diff_20240813_qwenvl2.jsonl'], dataset_seed=42, dataset_test_ratio=0.0, show_dataset_sample=-1, save_result=True, system='You are a helpful assistant.', tools_prompt='react_en', max_length=None, truncation_strategy='delete', check_dataset_strategy='none', model_name=[None, None], model_author=[None, None], quant_method=None, quantization_bit=0, hqq_axis=0, hqq_dynamic_config_path=None, bnb_4bit_comp_dtype='bf16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, max_new_tokens=2048, do_sample=False, temperature=0.0, top_k=None, top_p=None, repetition_penalty=None, num_beams=1, stop_words=[], rope_scaling=None, use_flash_attn=None, ignore_args_error=False, stream=True, merge_lora=False, merge_device_map='cpu', save_safetensors=True, overwrite_generation_config=False, verbose=None, local_repo_path=None, custom_register_path=None, custom_dataset_info=None, device_map_config=None, device_max_memory=[], hub_token=None, gpu_memory_utilization=0.9, tensor_parallel_size=1, max_num_seqs=256, max_model_len=None, disable_custom_all_reduce=True, enforce_eager=False, vllm_enable_lora=False, vllm_max_lora_rank=16, lora_modules=[], max_logprobs=20, tp=1, cache_max_entry_count=0.8, quant_policy=0, vision_batch_size=1, self_cognition_sample=0, train_dataset_sample=-1, val_dataset_sample=None, safe_serialization=None, model_cache_dir=None, merge_lora_and_save=None, custom_train_dataset_path=[], custom_val_dataset_path=[], vllm_lora_modules=None, device_map_config_path=None)
[INFO:swift] Global seed set to 42
[INFO:swift] device_count: 4
[INFO:swift] Loading the model using model_dir: /home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532
[INFO:swift] model_kwargs: {'low_cpu_mem_usage': True, 'device_map': 'auto'}
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
We've detected an older driver with an RTX 4000 series GPU. These drivers have issues with P2P. This can affect the multi-gpu inference when using accelerate device_map.Please make sure to update your driver to the latest version which resolves this.
[INFO:swift] model.max_model_len: 32768
[INFO:swift] model_config: Qwen2VLConfig {
  "_name_or_path": "/home/xxx/checkpoints/bc_boygirl_version8/qwen2-vl-2b-instruct/v5-20240903-202620/checkpoint-2532",
  "architectures": [
    "Qwen2VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 1536,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 8960,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2_vl",
  "num_attention_heads": 12,
  "num_hidden_layers": 28,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "type": "mrope"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.0.dev0",
  "use_cache": false,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1536,
    "in_chans": 3,
    "model_type": "qwen2_vl",
    "spatial_patch_size": 14
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 151936
}

/data/xxx/anaconda3/envs/ms-swift/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: do_sample is set to False. However, temperature is set to 0.0 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
[INFO:swift] model.generation_config: GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "max_new_tokens": 2048,
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.0,
  "top_k": 1,
  "top_p": 0.001
}

[INFO:swift] [visual.patch_embed.proj.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm1.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm1.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm2.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.norm2.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.qkv.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.qkv.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.proj.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.attn.proj.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc1.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc1.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc2.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.0.mlp.fc2.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm1.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm1.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm2.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.norm2.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.attn.qkv.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.attn.qkv.bias]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] [visual.blocks.1.attn.proj.weight]: requires_grad=False, dtype=torch.bfloat16, device=cuda:0
[INFO:swift] ...
[INFO:swift] Qwen2VLForConditionalGeneration(
  (visual): Qwen2VisionTransformerPretrainedModel(
    (patch_embed): PatchEmbed(
      (proj): Conv3d(3, 1280, kernel_size=(2, 14, 14), stride=(2, 14, 14), bias=False)
    )
    (rotary_pos_emb): VisionRotaryEmbedding()
    (blocks): ModuleList(
      (0-31): 32 x Qwen2VLVisionBlock(
        (norm1): LayerNorm((1280,), eps=1e-06, elementwise_affine=True)
        (norm2): LayerNorm((1280,), eps=1e-06, elementwise_affine=True)
        (attn): VisionSdpaAttention(
          (qkv): Linear(in_features=1280, out_features=3840, bias=True)
          (proj): Linear(in_features=1280, out_features=1280, bias=True)
        )
        (mlp): VisionMlp(
          (fc1): Linear(in_features=1280, out_features=5120, bias=True)
          (act): QuickGELUActivation()
          (fc2): Linear(in_features=5120, out_features=1280, bias=True)
        )
      )
    )
    (merger): PatchMerger(
      (ln_q): LayerNorm((1280,), eps=1e-06, elementwise_affine=True)
      (mlp): Sequential(
        (0): Linear(in_features=5120, out_features=5120, bias=True)
        (1): GELU(approximate='none')
        (2): Linear(in_features=5120, out_features=1536, bias=True)
      )
    )
  )
  (model): Qwen2VLModel(
    (embed_tokens): Embedding(151936, 1536)
    (layers): ModuleList(
      (0-27): 28 x Qwen2VLDecoderLayer(
        (self_attn): Qwen2VLSdpaAttention(
          (q_proj): Linear(in_features=1536, out_features=1536, bias=True)
          (k_proj): Linear(in_features=1536, out_features=256, bias=True)
          (v_proj): Linear(in_features=1536, out_features=256, bias=True)
          (o_proj): Linear(in_features=1536, out_features=1536, bias=False)
          (rotary_emb): Qwen2RotaryEmbedding()
        )
        (mlp): Qwen2MLP(
          (gate_proj): Linear(in_features=1536, out_features=8960, bias=False)
          (up_proj): Linear(in_features=1536, out_features=8960, bias=False)
          (down_proj): Linear(in_features=8960, out_features=1536, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): Qwen2RMSNorm((1536,), eps=1e-06)
        (post_attention_layernorm): Qwen2RMSNorm((1536,), eps=1e-06)
      )
    )
    (norm): Qwen2RMSNorm((1536,), eps=1e-06)
  )
  (lm_head): Linear(in_features=1536, out_features=151936, bias=False)
)
[INFO:swift] Qwen2VLForConditionalGeneration: 2208.9856M Params (0.0000M Trainable [0.0000%]), 234.8828M Buffers.
[INFO:swift] system: You are a helpful assistant.
[INFO:swift] val_dataset: Dataset({
features: ['system', 'query', 'response'],
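For reference, the generation_config printed above (temperature=0.0, top_k=1, top_p=0.001) is effectively greedy decoding already. A minimal sketch, not swift's own code, of how one could inspect and repair a checkpoint's saved generation_config.json with the transformers GenerationConfig API (the path is a placeholder):

```python
from transformers import GenerationConfig

# Placeholder path; substitute the real checkpoint directory.
ckpt_dir = "/path/to/checkpoint-2532"

gen_cfg = GenerationConfig.from_pretrained(ckpt_dir)

# temperature=0.0 alongside do_sample=False is what triggers the
# UserWarning above. Resetting the sampling knobs to their transformers
# defaults keeps greedy decoding and silences the warning.
gen_cfg.do_sample = False
gen_cfg.temperature = 1.0  # default
gen_cfg.top_k = 50         # default
gen_cfg.top_p = 1.0        # default
gen_cfg.save_pretrained(ckpt_dir)
```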

Jintao-Huang (Collaborator, Author)

I can't reproduce this. Did you modify the code?

baibaiw5 commented Sep 5, 2024

@Jintao-Huang
No code changes; I'm using transformers 4.45.0.dev0.

baibaiw5 commented Sep 5, 2024

With the previous version the program crashed outright; with this version it runs through, but the setting still doesn't actually take effect.

Jintao-Huang (Collaborator, Author)

--do_sample True and --temperature 0 cannot be used together.
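To illustrate the two self-consistent combinations, a sketch against the transformers generate() API (model and inputs are placeholders):

```python
# Greedy decoding: deterministic, no sampling knobs needed.
out = model.generate(**inputs, do_sample=False)

# Sampling: do_sample=True with a strictly positive temperature.
out = model.generate(**inputs, do_sample=True, temperature=0.7)
```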

Jintao-Huang (Collaborator, Author)

temperature == 0 implies do_sample=False (greedy decoding).
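A minimal sketch of that rule (hypothetical helper; the name and signature are illustrative, not swift's actual implementation):

```python
def normalize_sampling_args(do_sample: bool, temperature: float):
    """temperature == 0 means deterministic decoding, so do_sample is
    forced to False regardless of what the user passed on the CLI."""
    if temperature == 0:
        # Greedy path: temperature is meaningless here; reset to default.
        return False, 1.0
    return do_sample, temperature
```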

baibaiw5 commented Sep 5, 2024

OK
