
[Bug] MiniCPM-llama3-V2_5: no reply when the image url is passed as base64 #1819

Open
weiminw opened this issue Jun 21, 2024 · 13 comments

@weiminw

weiminw commented Jun 21, 2024

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.

Describe the bug

After starting MiniCPM-llama3-V2_5, requests that pass the image url as base64 get no reply.

Reproduction

Launch: lmdeploy serve api_server /workspace/vlm/MiniCPM-Llama3-V-2_5 --model-name mini --backend pytorch --server-port 8000

api call:

{
  "model": "mini",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe the image you see"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,........."}}
      ]
    }
  ]
}

The message field in the response is empty.
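For reference, a minimal Python sketch of sending an equivalent request (assuming the server exposes the OpenAI-compatible /v1/chat/completions route on port 8000; the file name and prompt are placeholders):

import base64
import requests

# Hypothetical local test image; replace with a real file.
with open("test.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "model": "mini",
    "max_tokens": 1024,
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the image you see"},
            {"type": "image_url",
             "image_url": {"url": "data:image/jpeg;base64," + image_b64}},
        ],
    }],
}

# Assumes the api_server started above is listening on localhost:8000.
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(resp.json())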

Environment

sys.platform: linux
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr/local/cuda
NVCC: Not Available
GCC: x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PyTorch: 2.2.2+cu121
PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201703
Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 12.1
NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
CuDNN 8.9.2
Magma 2.6.1
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,
TorchVision: 0.17.2+cu121
LMDeploy: 0.4.2+
transformers: 4.41.2
gradio: Not Found
fastapi: 0.111.0
pydantic: 2.7.4
triton: 2.2.0

Error traceback

No response

@irexyc
Collaborator

irexyc commented Jun 21, 2024

You can add --log-level INFO to the launch command and then check the server-side log.

Please also post the complete API request. You can write it to a text file and upload it to this issue.
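For example, keeping the launch command from the report and only adding the logging flag:

lmdeploy serve api_server /workspace/vlm/MiniCPM-Llama3-V-2_5 --model-name mini --backend pytorch --server-port 8000 --log-level INFO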

@weiminw
Author

weiminw commented Jun 24, 2024

The file is too large; I'll try to upload it.
out.txt

@irexyc
Collaborator

irexyc commented Jun 24, 2024

MiniCPM-Llama3-V-2_5 does not support the pytorch backend yet. Try removing --backend pytorch.
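That is, roughly (same path and port as the original command, with the pytorch backend flag dropped so the default backend is used):

lmdeploy serve api_server /workspace/vlm/MiniCPM-Llama3-V-2_5 --model-name mini --server-port 8000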

@weiminw
Author

weiminw commented Jul 4, 2024

After updating to the latest lmdeploy, I started MiniCPM-Llama3-V-2_5 with lmdeploy serve api_server /workspace/models/vlm/MiniCPM-Llama3-V-2_5 --server-port 8000 --model-name test, then called it through the API. It got stuck and never returned anything. Does this version not support multimodal?

@irexyc
Collaborator

irexyc commented Jul 4, 2024

It works fine on my side.

You can add --log-level INFO when starting the server and watch the logs it prints.

@weiminw
Author

weiminw commented Jul 6, 2024

...
[TM][INFO] NCCL group_id = 0
[TM][INFO] [BlockManager] block_size = 8 MB
[TM][INFO] [BlockManager] max_block_count = 454
[TM][INFO] [BlockManager] chunk_size = 454
[TM][INFO] LlamaBatch<T>::Start()
HINT:    Please open http://0.0.0.0:8000 in a browser for detailed api usage!!!
HINT:    Please open http://0.0.0.0:8000 in a browser for detailed api usage!!!
HINT:    Please open http://0.0.0.0:8000 in a browser for detailed api usage!!!
INFO:     Started server process [597]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2024-07-06 14:40:01,761 - lmdeploy - INFO - start ImageEncoder._forward_loop
2024-07-06 14:40:01,761 - lmdeploy - INFO - ImageEncoder received 1 images, left 1 images.
2024-07-06 14:40:01,761 - lmdeploy - INFO - ImageEncoder process 1 images, left 0 images.

The last three lines are the output after the request; after that it just hangs with no response, even after ten-odd minutes.
Is this normal?

@irexyc
Collaborator

irexyc commented Jul 7, 2024

Is there an lmdeploy folder under the working directory where you start the server? If there is, the imports may pick up stale code instead of the installed package.

v0.5.0 has been released. I suggest installing v0.5.0 directly, and before running the launch command, creating an empty folder and switching into it.
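A rough sketch of that workflow (the directory name is arbitrary; the final command reuses the model path from this thread):

pip install lmdeploy==0.5.0
mkdir /tmp/lmdeploy_run && cd /tmp/lmdeploy_run
# confirm the imported copy points into site-packages, not a local ./lmdeploy folder
python -c "import lmdeploy; print(lmdeploy.__file__)"
lmdeploy serve api_server /workspace/models/vlm/MiniCPM-Llama3-V-2_5 --model-name test --server-port 8000 --log-level INFO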

@irexyc irexyc self-assigned this Jul 7, 2024
@weiminw
Author

weiminw commented Jul 8, 2024

Following your suggestion, I installed the latest v0.5.0 in a newly created environment, then created an lmdeploy folder under the working path and started from there.

drwxr-xr-x 2 root root 4096 Jul  8 07:19 ./
drwxr-xr-x 1 root root 4096 Jul  8 07:14 ../
(vision2) root@b8cc0fe3fc89:~/lmdeploy# pwd
/workspace/lmdeploy
(vision2) root@b8cc0fe3fc89:~/lmdeploy# lmdeploy serve api_server /workspace/models/vlm/MiniCPM-Llama3-V-2_5 --model-name test --server-port 8000 --log-level INFO

After starting, the call still hangs.

@weiminw
Author

weiminw commented Jul 9, 2024

Re-tested today. Launch arguments: lmdeploy serve api_server /workspace/models/vlm/MiniCPM-Llama3-V-2_5 --model-name test --server-port 8000 --log-level INFO --session-len 10240

No response after the request; the log is shown below. The image file is around 1024x1024. In the earlier hangs there was no response and no log either; that image was around 2048x2048.

Output after the request:

2024-07-09 01:49:54,167 - lmdeploy - INFO - session_id=1, history_tokens=0, input_tokens=553, max_new_tokens=None, seq_start=True, seq_end=True, step=0, prep=True
2024-07-09 01:49:54,167 - lmdeploy - INFO - Register stream callback for 1
[TM][INFO] [forward] Enqueue requests
[TM][INFO] [forward] Wait for requests to complete ...
[TM][INFO] [ProcessInferRequests] Request for 1 received.
[TM][WARNING] [ProcessInferRequests] [1] total sequence length (553 + 9687) exceeds `session_len` (10240), `request_output_len` is truncated to 9686
[TM][INFO] [Forward] [0, 1), dc_bsz = 0, pf_bsz = 1, n_tok = 553, max_q = 553, max_k = 553
[TM][INFO] ------------------------- step = 560 -------------------------
[TM][INFO] ------------------------- step = 570 -------------------------
[TM][INFO] ------------------------- step = 580 -------------------------
[TM][INFO] ------------------------- step = 590 -------------------------
[TM][INFO] ------------------------- step = 600 -------------------------
[TM][INFO] ------------------------- step = 610 -------------------------
[TM][INFO] ------------------------- step = 620 -------------------------
[TM][INFO] ------------------------- step = 630 -------------------------
[TM][INFO] ------------------------- step = 640 -------------------------
[TM][INFO] ------------------------- step = 650 -------------------------
[TM][INFO] ------------------------- step = 660 -------------------------
[TM][INFO] ------------------------- step = 670 -------------------------
[TM][INFO] ------------------------- step = 680 -------------------------
[TM][INFO] ------------------------- step = 690 -------------------------
[TM][INFO] ------------------------- step = 700 -------------------------
[TM][INFO] ------------------------- step = 710 -------------------------
[TM][INFO] ------------------------- step = 720 -------------------------
[TM][INFO] ------------------------- step = 730 -------------------------
[TM][INFO] ------------------------- step = 740 -------------------------
^CINFO:     Shutting down
[TM][INFO] ------------------------- step = 750 -------------------------
INFO:     Waiting for connections to close. (CTRL+C to force quit)
[TM][INFO] ------------------------- step = 760 -------------------------
[TM][INFO] ------------------------- step = 770 -------------------------
[TM][INFO] ------------------------- step = 780 -------------------------
[TM][INFO] ------------------------- step = 790 -------------------------
[TM][INFO] ------------------------- step = 800 -------------------------
[TM][INFO] ------------------------- step = 810 -------------------------
[TM][INFO] ------------------------- step = 820 -------------------------
[TM][INFO] ------------------------- step = 830 -------------------------
[TM][INFO] ------------------------- step = 840 -------------------------
[TM][INFO] ------------------------- step = 850 -------------------------
^C[TM][INFO] ------------------------- step = 860 -------------------------
INFO:     Finished server process [1850]
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "uvloop/loop.pyx", line 1511, in uvloop.loop.Loop.run_until_complete
  File "uvloop/loop.pyx", line 1504, in uvloop.loop.Loop.run_until_complete
  File "uvloop/loop.pyx", line 1377, in uvloop.loop.Loop.run_forever
  File "uvloop/loop.pyx", line 555, in uvloop.loop.Loop._run
  File "uvloop/loop.pyx", line 474, in uvloop.loop.Loop._on_idle
  File "uvloop/cbhandles.pyx", line 83, in uvloop.loop.Handle._run
  File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
  File "/workspace/vision2/lib/python3.10/site-packages/uvicorn/server.py", line 68, in serve
    with self.capture_signals():

@weiminw
Author

weiminw commented Jul 9, 2024

With the tiger image you provide it returns normally, but with my own image files it doesn't. I'm using a 4090 24G.

@weiminw
Author

weiminw commented Jul 9, 2024

Also, testing this morning found that InternVL2-8B works fine with the pytorch backend, but not with TurboMind.

@irexyc
Collaborator

irexyc commented Jul 9, 2024

Could you upload the image here?

Also, could you try the docker pull openmmlab/lmdeploy:v0.5.0 image? That way our environments are identical and it will be easier to locate the problem.
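A rough sketch of running that image (standard docker flags; it assumes the lmdeploy CLI is on PATH inside the image and mounts the model directory at the same path used in this thread):

docker pull openmmlab/lmdeploy:v0.5.0
docker run --rm --gpus all -p 8000:8000 \
    -v /workspace/models:/workspace/models \
    openmmlab/lmdeploy:v0.5.0 \
    lmdeploy serve api_server /workspace/models/vlm/MiniCPM-Llama3-V-2_5 \
        --model-name test --server-port 8000 --log-level INFO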

@weiminw
Author

weiminw commented Jul 9, 2024

I'll try that image tomorrow. The picture contains customer-private information, so I can't upload it here, but the original is about 4500x4500 pixels and roughly 4 MB. Also, is the image size related to session-len? And how should GPU memory usage roughly be estimated?
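On the payload side only: base64 inflates data by roughly 4/3, so a ~4 MB JPEG becomes a ~5.3 MB string. A small Python sketch for checking that, with an optional downscale purely as a debugging experiment (the file name is a stand-in; downscaling is an assumption for narrowing down the hang, not a documented requirement):

import base64
from io import BytesIO
from PIL import Image

img = Image.open("private.jpg")      # stand-in for the real 4500x4500 image
print("original size:", img.size)

img.thumbnail((1024, 1024))          # optional: shrink only for this test
buf = BytesIO()
img.save(buf, format="JPEG")
b64 = base64.b64encode(buf.getvalue()).decode()
print("base64 payload length:", len(b64))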
