[Bug] lmdeploy - ERROR - Truncate max_new_tokens to 221 #1841
Comments
I also encountered this problem and hope to get an official answer.
This is not a bug.
The default session_len is 2056, meaning the max sequence length of a session, including the input and output tokens. In your example, the number of input tokens is It indicates that |
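To make the arithmetic concrete, here is a minimal sketch of how the 221 in the error message falls out of the numbers in the log (session_len=2056, history_tokens=0, input_tokens=1835, max_new_tokens=1024). The function name `truncated_max_new_tokens` is hypothetical, and this is not lmdeploy's actual implementation, only the budget calculation that the log output implies.

```python
def truncated_max_new_tokens(session_len: int,
                             history_tokens: int,
                             input_tokens: int,
                             requested_new_tokens: int) -> int:
    """Return how many new tokens fit in the session (hypothetical helper).

    Assumption: the session budget covers history, input and output tokens,
    as described in the comment above.
    """
    # Tokens still available after history and the current prompt are counted.
    remaining = max(session_len - history_tokens - input_tokens, 0)
    # The request is capped at whatever budget is left.
    return min(requested_new_tokens, remaining)


# Values taken from the log in this issue.
print(truncated_max_new_tokens(2056, 0, 1835, 1024))  # 221
```

Under the same assumption, a session budget of at least 1835 + 1024 = 2859 tokens would leave room for the full 1024 requested output tokens.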
Hi @lvhan028, @zhyncs, and @AllentDan, Thank you very much for your quick reply and for all your earlier help. Even though this is not a bug, I do not know why it appears in the wheel versions If I using ======================================= [TM][WARNING] Device 3 peer access Device 0 is not available. ======================================= So I guess there is some minor difference between your previous version and the new version that causes this difference in results. Could you please double-check the difference, or point me to the changes so that I can apply them in my local fork and run the workflow myself? Thank you again!
You may ignore the log, it does not influence the usage. We will change that log level from error to warning.
Thank you for your quick response. Yeah, I wanted to ignore the error directly, but for my large and dense chart, the model's outputs are truncated due to the error mentioned earlier. However, the lmdeploy version installed via pip install lmdeploy provides a complete output without the truncation issue. If the same input causes truncation in the wheels version, why does it not cause the same error in the pip-installed version? Thank you.
I see. Seems in the current branch,
Sure, thanks a lot.
Checklist
Describe the bug
Hi all,
Thank you for your good work!
As suggested in the issue, I tried the latest lmdeploy wheels (lmdeploy-0.4.2+cu121+da439df-cp39-cp39-manylinux2014_x86_64.whl and lmdeploy-0.4.2+cu118+da439df-cp39-cp39-manylinux2014_x86_64.whl) to get deterministic output, but I hit the error below. Apart from the error, the results are deterministic, but for very dense input images the results are truncated, as the ERROR shows.
However, if I install lmdeploy using "pip install lmdeploy", I do not get this error and the results are not truncated even for dense input images, but the results are NOT deterministic.
========================================
[TM][WARNING] Device 2 peer access Device 3 is not available.
<IMAGE_TOKEN>\nPlease inference this chart into a detailed table<|im_end|>\n<|im_start|>assistant\n', gen_config=EngineGenerationConfig(n=1, max_new_tokens=1024, top_p=0.8, top_k=40, temperature=0, repetition_penalty=1.0, ignore_eos=False, random_seed=6725412376424003715, stop_words=[92542, 92540], bad_words=None, min_new_tokens=None, skip_special_tokens=True, logprobs=None), prompt_token_id=[1, 92543, 9081, 364, 2770, 657, 589, 15358, 17993, 6843, 963, 505, 4576, 11146, 451, 60628, 60384, 60721, 62442, 60752, 699, 92542, 364, 92543, 1008, 364, 92544, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 92545, 364, 5658, 43929, 550, 9617, 1263, 395, 11832, 2115, 
92542, 364, 92543, 525, 11353, 364], adapter_name=None.
[TM][WARNING] Device 3 peer access Device 0 is not available.
[TM][WARNING] Device 3 peer access Device 1 is not available.
[TM][WARNING] Device 3 peer access Device 2 is not available.
test image is: Meta_2022_13_0_1659551667_stacked_bar_chart_plus_legend.png
2024-06-24 18:30:26,329 - lmdeploy - INFO - start ImageEncoder._forward_loop
2024-06-24 18:30:26,329 - lmdeploy - INFO - ImageEncoder received 1 images, left 1 images.
2024-06-24 18:30:26,329 - lmdeploy - INFO - ImageEncoder process 1 images, left 0 images.
2024-06-24 18:30:34,239 - lmdeploy - INFO - ImageEncoder forward 1 images, cost 7.910s
2024-06-24 18:30:34,240 - lmdeploy - INFO - ImageEncoder done 1 images, left 0 images.
2024-06-24 18:30:34,241 - lmdeploy - INFO - prompt='<|im_start|>system\nYou are an AI assistant whose name is InternLM (书生·浦语).<|im_end|>\n<|im_start|>user\n
2024-06-24 18:30:34,241 - lmdeploy - INFO - session_id=0, history_tokens=0, input_tokens=1835, max_new_tokens=1024, seq_start=True, seq_end=True, step=0, prep=True
2024-06-24 18:30:34,241 - lmdeploy - ERROR - Truncate max_new_tokens to 221
[TM][INFO] [forward] Enqueue requests
[TM][INFO] [forward] Wait for requests to complete ...
[TM][INFO] [ProcessInferRequests] Request for 0 received.
[TM][WARNING] [ProcessInferRequests] [0] total sequence length (1835 + 221) exceeds `session_len` (2056), `request_output_len` is truncated to 220
[TM][INFO] [Forward] [0, 1), dc_bsz = 0, pf_bsz = 1, n_tok = 1835, max_q = 1835, max_k = 1835
[TM][INFO] ------------------------- step = 1840 -------------------------
[TM][INFO] ------------------------- step = 1850 -------------------------
[TM][INFO] ------------------------- step = 1860 -------------------------
[TM][INFO] ------------------------- step = 1870 -------------------------
[TM][INFO] ------------------------- step = 1880 -------------------------
[TM][INFO] ------------------------- step = 1890 -------------------------
[TM][INFO] ------------------------- step = 1900 -------------------------
[TM][INFO] ------------------------- step = 1910 -------------------------
[TM][INFO] ------------------------- step = 1920 -------------------------
[TM][INFO] ------------------------- step = 1930 -------------------------
[TM][INFO] ------------------------- step = 1940 -------------------------
[TM][INFO] ------------------------- step = 1950 -------------------------
[TM][INFO] ------------------------- step = 1960 -------------------------
[TM][INFO] ------------------------- step = 1970 -------------------------
[TM][INFO] ------------------------- step = 1980 -------------------------
[TM][INFO] ------------------------- step = 1990 -------------------------
[TM][INFO] ------------------------- step = 2000 -------------------------
[TM][INFO] [Interrupt] slot = 0, id = 0
[TM][INFO] [forward] Request completed for 0
====> The question is: Please inference this chart into a detailed table
========================================
The test input image is:
![gettyimages-182495865-2048x2048](https://private-user-images.githubusercontent.com/32938376/342455249-ece4bf69-967a-48cf-812f-c0c9848776a8.jpg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjEwNTc5OTAsIm5iZiI6MTcyMTA1NzY5MCwicGF0aCI6Ii8zMjkzODM3Ni8zNDI0NTUyNDktZWNlNGJmNjktOTY3YS00OGNmLTgxMmYtYzBjOTg0ODc3NmE4LmpwZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzE1VDE1MzQ1MFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTljZjY1MDE1OWM4MTU1YjU2NGExOWViYTRkODdmODQwMDZkZTU2YmUwZDhjNTRkOGQ3MTQ2MmE2NzI3MWQ4Y2QmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.EVHSYpuI7eNwqKRTSwsDXhnysr1eMrxBgqgSHvqaMk4)
Reproduction
from lmdeploy import pipeline, GenerationConfig
from lmdeploy.messages import TurbomindEngineConfig
from lmdeploy.vl import load_image
model = 'OpenGVLab/InternVL-Chat-V1-5-AWQ'
image = load_image("/app/342455249-ece4bf69-967a-48cf-812f-c0c9848776a8.jpg")
backend_config = TurbomindEngineConfig(model_format='awq', tp=4, cache_max_entry_count=0.1)
pipe = pipeline(model, backend_config=backend_config, log_level='INFO')
gen_config = GenerationConfig(top_p=0.8,
                              top_k=40,
                              temperature=0,
                              max_new_tokens=1024)
sel_question = "Please inference this chart into a detailed table"
response = pipe((sel_question, image), gen_config=gen_config)
print(response.text)
Environment
Error traceback
No response