
Bump up transformers version & Remove MistralConfig #1254

Merged · 4 commits · Oct 13, 2023
Conversation

WoosukKwon (Collaborator)

Now that MistralConfig is officially supported by the stable release of HF transformers, we can remove our MistralConfig.
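For context, a minimal sketch (not part of the PR diff) of what the stable release enables, assuming transformers >= 4.34.0, the first stable release that ships MistralConfig:

    from transformers import AutoConfig, MistralConfig

    # With transformers >= 4.34.0, AutoConfig resolves Mistral checkpoints to
    # transformers' own MistralConfig, so vLLM no longer needs a vendored copy.
    config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
    assert isinstance(config, MistralConfig)
    print(config.model_type)  # "mistral"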

zhuohan123 (Member) left a comment

The PR LGTM! However, I think we need to make sure that the tokenizer performance is good before we merge this PR.
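(A minimal editorial sketch of one way to sanity-check tokenizer throughput after the bump; the model name is taken from this thread, and the text and loop length are arbitrary:)

    import time
    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
    # A slow (pure-Python) tokenizer is a common throughput culprit.
    print("fast tokenizer:", tok.is_fast)

    text = "what is the capital of Germany? " * 50
    n = 1000
    start = time.perf_counter()
    for _ in range(n):
        tok(text)
    print(f"{n / (time.perf_counter() - start):.1f} encodes/sec")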

TensorTemplar commented Oct 4, 2023

Hey, I built this from source and ran into issues with the OpenAI server:

1. Please consider pinning the pydantic version to avoid:
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/src/vllm/vllm/entrypoints/openai/api_server.py", line 21, in <module>
    from vllm.entrypoints.openai.protocol import (
  File "/usr/src/vllm/vllm/entrypoints/openai/protocol.py", line 111, in <module>
    class CompletionResponseChoice(BaseModel):
  File "pydantic/main.py", line 198, in pydantic.main.ModelMetaclass.__new__
  File "pydantic/fields.py", line 506, in pydantic.fields.ModelField.infer
  File "pydantic/fields.py", line 436, in pydantic.fields.ModelField.__init__
  File "pydantic/fields.py", line 552, in pydantic.fields.ModelField.prepare
  File "pydantic/fields.py", line 661, in pydantic.fields.ModelField._type_analysis
  File "pydantic/fields.py", line 668, in pydantic.fields.ModelField._type_analysis
  File "/usr/lib/python3.8/typing.py", line 774, in __subclasscheck__
    return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class

Pinning pydantic in requirements.txt worked for me: pydantic >= 1.10.12, < 2
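(Equivalent one-off install command, as a sketch; the version bounds are the ones suggested above:)

    pip install 'pydantic>=1.10.12,<2'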

2. The next issue is a missing fschat dependency, which shows up when calling the chat endpoint:
curl -X 'POST' \
  'http://0.0.0.0:8000/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "mistralai/Mistral-7B-Instruct-v0.1",
  "messages": [{"role": "user", "content": "what is the capital of Germany?"}],
  "temperature": 0.7,
  "top_p": 1,
  "n": 1,
  "max_tokens": 20
}'
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.8/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/cors.py", line 91, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/cors.py", line 146, in simple_response
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.8/dist-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/usr/local/lib/python3.8/dist-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/usr/local/lib/python3.8/dist-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.8/dist-packages/fastapi/routing.py", line 190, in run_endpoint_function
    return await dependant.call(**values)
  File "/usr/src/vllm/vllm/entrypoints/openai/api_server.py", line 206, in create_chat_completion
    prompt = await get_gen_prompt(request)
  File "/usr/src/vllm/vllm/entrypoints/openai/api_server.py", line 74, in get_gen_prompt
    raise ModuleNotFoundError(
ModuleNotFoundError: fastchat is not installed. Please install fastchat to use the chat completion and conversation APIs: `$ pip install fschat`

This can be fixed by uninstalling transformer-engine and adding fschat[model_worker]==0.2.24 to the dependencies.
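(A sketch of the equivalent commands; the package names and pinned version are as described above:)

    pip uninstall -y transformer-engine
    pip install 'fschat[model_worker]==0.2.24'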
After this I get a correct response:

{
  "id": "cmpl-798b021c1adc46c987dd0f01160937b8",
  "object": "chat.completion",
  "created": 211402,
  "model": "mistralai/Mistral-7B-Instruct-v0.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": " The capital city of Germany is Berlin. It is a vibrant and multicultural city with a"
      },
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 543,
    "total_tokens": 563,
    "completion_tokens": 20
  }
}

WoosukKwon (Collaborator, Author)

> The PR LGTM! However, I think we need to make sure that the tokenizer performance is good before we merge this PR.

@zhuohan123 I think the performance issue is orthogonal to the PR, since new users will install the newest version of transformers and experience the performance issue anyway.

WoosukKwon merged commit e7c8555 into main on Oct 13, 2023 (2 checks passed).
WoosukKwon deleted the remove-mistral branch on October 13, 2023 at 17:05.
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024