
[Bug] HuggingFacewithChatTemplate #1557

Open · 2 tasks done
ZCzzzzzz opened this issue Sep 24, 2024 · 2 comments

@ZCzzzzzz
Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

{'CUDA available': True,
'GCC': 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0',
'MMEngine': '0.10.4',
'MUSA available': False,
'NVCC': 'Not Available',
'OpenCV': '4.10.0',
'PyTorch': '2.1.0',
'Python': '3.10.12 (main, May 26 2024, 00:14:02) [GCC 9.4.0]',
'TorchVision': '0.16.0',
'lmdeploy': '0.2.6',
'numpy_random_seed': 2147483648,
'opencompass': '0.3.2.post1+ee058e2',
'sys.platform': 'linux',
'transformers': '4.44.2'}

Reproduces the problem - code/configuration sample

The command I run is: python run.py configs/eval_hf_llama3-70b.py

The contents of configs/eval_hf_llama3-70b.py are:

from mmengine.config import read_base

with read_base():
    from .datasets.ceval.ceval_gen import ceval_datasets
    from .models.hf_llama.hf_llama3_70b_instruct_awq import models
    from .summarizers.example import summarizer

datasets = sum([v for k, v in locals().items() if k.endswith('_datasets') or k == 'datasets'], [])
work_dir = './outputs/llama3/'

The contents of hf_llama3_70b_instruct_awq.py are:

from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='llama-3-70b-instruct-hf',
        path='/data/Llama-3-70B-Instruct-AWQ/',
        max_out_len=1024,
        batch_size=8,
        run_cfg=dict(num_gpus=8),
        stop_words=['<|end_of_text|>', '<|eot_id|>'],
    )
]
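
Since HuggingFacewithChatTemplate formats prompts via the tokenizer's own chat template, a quick sanity check on the local checkpoint (a minimal sketch, using the path from the config above) is:

```python
from transformers import AutoTokenizer

# Load only the tokenizer from the local AWQ checkpoint referenced in the config.
tokenizer = AutoTokenizer.from_pretrained('/data/Llama-3-70B-Instruct-AWQ/')

# HuggingFacewithChatTemplate ends up calling tokenizer.apply_chat_template(),
# which needs tokenizer.chat_template to be set (usually via tokenizer_config.json).
print(tokenizer.chat_template)  # None here would reproduce the ValueError below
```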

Reproduces the problem - command or script

python run.py configs/eval_hf_llama3-70b.py

Reproduces the problem - error message

09/23 20:31:34 - OpenCompass - INFO - Task [llama-3-70b-instruct-hf/ceval-computer_network,llama-3-70b-instruct-hf/ceval-operating_system,llama-3-70b-instruct-hf/ceval-computer_architecture,llama-3-70b-instruct-hf/ceval-college_programming,llama-3-70b-instruct-hf/ceval-college_physics,llama-3-70b-instruct-hf/ceval-college_chemistry,llama-3-70b-instruct-hf/ceval-advanced_mathematics,llama-3-70b-instruct-hf/ceval-probability_and_statistics,llama-3-70b-instruct-hf/ceval-discrete_mathematics,llama-3-70b-instruct-hf/ceval-electrical_engineer,llama-3-70b-instruct-hf/ceval-metrology_engineer,llama-3-70b-instruct-hf/ceval-high_school_mathematics,llama-3-70b-instruct-hf/ceval-high_school_physics,llama-3-70b-instruct-hf/ceval-high_school_chemistry,llama-3-70b-instruct-hf/ceval-high_school_biology,llama-3-70b-instruct-hf/ceval-middle_school_mathematics,llama-3-70b-instruct-hf/ceval-middle_school_biology,llama-3-70b-instruct-hf/ceval-middle_school_physics,llama-3-70b-instruct-hf/ceval-middle_school_chemistry,llama-3-70b-instruct-hf/ceval-veterinary_medicine,llama-3-70b-instruct-hf/ceval-college_economics,llama-3-70b-instruct-hf/ceval-business_administration,llama-3-70b-instruct-hf/ceval-marxism,llama-3-70b-instruct-hf/ceval-mao_zedong_thought,llama-3-70b-instruct-hf/ceval-education_science,llama-3-70b-instruct-hf/ceval-teacher_qualification,llama-3-70b-instruct-hf/ceval-high_school_politics,llama-3-70b-instruct-hf/ceval-high_school_geography,llama-3-70b-instruct-hf/ceval-middle_school_politics,llama-3-70b-instruct-hf/ceval-middle_school_geography,llama-3-70b-instruct-hf/ceval-modern_chinese_history,llama-3-70b-instruct-hf/ceval-ideological_and_moral_cultivation,llama-3-70b-instruct-hf/ceval-logic,llama-3-70b-instruct-hf/ceval-law,llama-3-70b-instruct-hf/ceval-chinese_language_and_literature,llama-3-70b-instruct-hf/ceval-art_studies,llama-3-70b-instruct-hf/ceval-professional_tour_guide,llama-3-70b-instruct-hf/ceval-legal_professional,llama-3-70b-instruct-hf/ceval-high_school_chinese,llama-3-70b-instruct-hf/ceval-high_school_history,llama-3-70b-instruct-hf/ceval-middle_school_history,llama-3-70b-instruct-hf/ceval-civil_servant,llama-3-70b-instruct-hf/ceval-sports_science,llama-3-70b-instruct-hf/ceval-plant_protection,llama-3-70b-instruct-hf/ceval-basic_medicine,llama-3-70b-instruct-hf/ceval-clinical_medicine,llama-3-70b-instruct-hf/ceval-urban_and_rural_planner,llama-3-70b-instruct-hf/ceval-accountant,llama-3-70b-instruct-hf/ceval-fire_engineer,llama-3-70b-instruct-hf/ceval-environmental_impact_assessment_engineer,llama-3-70b-instruct-hf/ceval-tax_accountant,llama-3-70b-instruct-hf/ceval-physician]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
/usr/local/lib/python3.10/site-packages/awq/modules/linear/exllama.py:12: UserWarning: AutoAWQ could not load ExLlama kernels extension. Details: No module named 'exl_ext'
warnings.warn(f"AutoAWQ could not load ExLlama kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/exllamav2.py:13: UserWarning: AutoAWQ could not load ExLlamaV2 kernels extension. Details: No module named 'exlv2_ext'
warnings.warn(f"AutoAWQ could not load ExLlamaV2 kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/gemm.py:14: UserWarning: AutoAWQ could not load GEMM kernels extension. Details: No module named 'awq_ext'
warnings.warn(f"AutoAWQ could not load GEMM kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/gemv.py:11: UserWarning: AutoAWQ could not load GEMV kernels extension. Details: No module named 'awq_ext'
warnings.warn(f"AutoAWQ could not load GEMV kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/gemv_fast.py:10: UserWarning: AutoAWQ could not load GEMVFast kernels extension. Details: No module named 'awq_v2_ext'
warnings.warn(f"AutoAWQ could not load GEMVFast kernels extension. Details: {ex}")
Loading checkpoint shards: 100%|██████████| 9/9 [00:10<00:00, 1.17s/it]
09/23 20:31:55 - OpenCompass - INFO - using stop words: ['<|eot_id|>', '<|end_of_text|>']
09/23 20:31:55 - OpenCompass - INFO - Start inferencing [llama-3-70b-instruct-hf/ceval-computer_network]
100%|██████████| 19/19 [00:00<00:00, 463324.28it/s]
[2024-09-23 20:31:56,017] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting build dataloader
[2024-09-23 20:31:56,017] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/workspace/opencompass/opencompass/tasks/openicl_infer.py", line 161, in
inferencer.run()
File "/workspace/opencompass/opencompass/tasks/openicl_infer.py", line 89, in run
self._inference()
File "/workspace/opencompass/opencompass/tasks/openicl_infer.py", line 139, in _inference
inferencer.inference(retriever,
File "/workspace/opencompass/opencompass/openicl/icl_inferencer/icl_gen_inferencer.py", line 153, in inference
results = self.model.generate_from_template(
File "/workspace/opencompass/opencompass/models/base.py", line 201, in generate_from_template
return self.generate(inputs, max_out_len=max_out_len, **kwargs)
File "/workspace/opencompass/opencompass/models/huggingface_above_v4_33.py", line 440, in generate
messages = [self.tokenizer.apply_chat_template(m, add_generation_prompt=True, tokenize=False) for m in messages]
File "/workspace/opencompass/opencompass/models/huggingface_above_v4_33.py", line 440, in
messages = [self.tokenizer.apply_chat_template(m, add_generation_prompt=True, tokenize=False) for m in messages]
File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1786, in apply_chat_template
chat_template = self.get_chat_template(chat_template, tools)
File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2025, in get_chat_template
raise ValueError(
ValueError: Cannot use apply_chat_template() because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating
[2024-09-23 20:32:05,799] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 299549) of binary: /usr/local/bin/python3.10
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in
sys.exit(main())
File "/usr/local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 346, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/usr/local/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "/usr/local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

/workspace/opencompass/opencompass/tasks/openicl_infer.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2024-09-23_20:32:05
host : localhost.localdomain
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 299549)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Other information

No response

@tonysy (Collaborator) commented Sep 24, 2024

Could you please ensure that the model can be loaded successfully directly using transformers?
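
For reference, a direct-load check along those lines might look like this (a sketch; the path comes from the config above, and loading the AWQ checkpoint additionally requires autoawq and accelerate to be installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = '/data/Llama-3-70B-Instruct-AWQ/'

tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, device_map='auto')

# If both loads succeed, the checkpoint itself is fine; the remaining question
# is whether the tokenizer ships a chat template.
print(type(model).__name__, tokenizer.chat_template is not None)
```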

@daidaiershidi commented:

> Could you please ensure that the model can be loaded successfully directly using transformers?

This error occurs because transformers removed the default chat template in v4.43 and later, but it seems many of OpenCompass's earlier evaluations relied on that default chat template. Will support for this be added to the code?
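
Until that is handled in OpenCompass, one workaround is to attach a chat template to the tokenizer yourself, either by copying the chat_template field from the base meta-llama/Meta-Llama-3-70B-Instruct tokenizer_config.json into the AWQ checkpoint, or programmatically (a sketch; the Jinja template below is a deliberately minimal Llama-3-style stand-in, not the official template):

```python
from transformers import AutoTokenizer

path = '/data/Llama-3-70B-Instruct-AWQ/'
tokenizer = AutoTokenizer.from_pretrained(path)

# Hypothetical minimal Llama-3-style template; the official one ships with the
# base model's tokenizer_config.json and should be preferred if available.
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' "
    "+ message['content'] + '<|eot_id|>' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}"
    "{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}"
    "{% endif %}"
)

tokenizer.save_pretrained(path)  # writes the template into tokenizer_config.json
```

Alternatively, a template string can be passed per call via the chat_template= argument of apply_chat_template(), which avoids modifying the checkpoint on disk.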
