[Bug]: Error: Start the triton server #2673
Comments
This problem was solved after I restarted the container, but a new error occurred when executing the program. Traceback (most recent call last):
It seems that access to the Triton server timed out. Are there any logs on the server?
docker logs shows:
NVIDIA Release 22.07 (build 41737377)
Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
I1109 06:53:09.532688 1 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f6a4e000000' with size 268435456
I1109 06:53:30.082215 1 server.cc:586] +-------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+
I1109 06:53:30.082348 1 server.cc:629]
I1109 06:53:30.135753 1 metrics.cc:650] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3090
Check that the server is available: curl http://0.0.0.0:8000/v2/models/stats
I mapped the server to local port 8010, so I get the following result. What may be the cause of the error in this case? Thank you for your help.
(base) eg@eg-HP-Z8-G4-Workstation:~$ curl http://0.0.0.0:8010/v2/models/stats
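For reference, here is a minimal sketch of how a host port maps to the container's HTTP port and how to probe the server's readiness endpoint. The docker command, image tag, and paths are illustrative assumptions, not taken from this issue; only the `/v2/health/ready` endpoint is standard Triton.

```shell
# Assumption: Triton's HTTP port 8000 inside the container was published
# to host port 8010 with something like:
#
#   docker run --gpus all -p 8010:8000 -p 8011:8001 -p 8012:8002 \
#     -v $(pwd)/models:/models nvcr.io/nvidia/tritonserver:22.07-py3 \
#     tritonserver --model-repository=/models
#
# Health checks on the host must then use the host side of the mapping.

HOST_PORT=8010   # host side of the -p 8010:8000 mapping
READY_URL="http://0.0.0.0:${HOST_PORT}/v2/health/ready"

echo "probing ${READY_URL}"
# Uncomment when a server is actually running on that port:
# curl -sf "${READY_URL}" && echo "server ready" || echo "server not ready"
```

If the readiness probe fails but the container logs look healthy, the mismatch is usually in the `-p host:container` mapping rather than in the server itself.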
You can optimize performance by adjusting parameters such as the number of model instances and the batch size. For more information, please refer to the Triton documentation: https://github.com/triton-inference-server/server
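As a hedged illustration of those knobs: in Triton, instance count and batching are typically declared in a model's config.pbtxt. The model name and the specific values below are illustrative assumptions, not taken from this issue.

```
# models/my_model/config.pbtxt (illustrative sketch)
name: "my_model"
max_batch_size: 8
instance_group [
  { count: 2, kind: KIND_GPU }   # run two copies of the model on the GPU
]
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]          # batch sizes the scheduler aims for
  max_queue_delay_microseconds: 100       # how long to wait to form a batch
}
```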
Thank you very much for your help. I think my problem has been resolved.
Is there an existing issue for this?
Current Behavior
root@85d70c862b32:/opt/tritonserver# tritonserver --model-repository `pwd`/models
W1109 05:31:06.568839 124 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I1109 05:31:06.568981 124 cuda_memory_manager.cc:115] CUDA memory pool disabled
I1109 05:31:06.569292 124 tritonserver.cc:2176]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.24.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tens |
| | or_data statistics trace |
| model_repository_path[0] | /opt/tritonserver/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 0 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1109 05:31:06.569348 124 server.cc:257] No server context available. Exiting immediately.
error: creating server: Internal - failed to stat file /opt/tritonserver/models
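The "failed to stat file /opt/tritonserver/models" error usually means the path passed to --model-repository does not exist (or is not mounted) inside the container. A minimal sketch of creating the layout Triton expects before launching; the model name, path, and config contents here are placeholders, not from this issue.

```shell
# Triton expects the layout <repo>/<model-name>/<version>/ plus a
# config.pbtxt per model. "my_model" and /tmp/models are placeholders.
REPO=/tmp/models
mkdir -p "${REPO}/my_model/1"

# A minimal config.pbtxt; the real contents depend on the model backend.
cat > "${REPO}/my_model/config.pbtxt" <<'EOF'
name: "my_model"
max_batch_size: 0
EOF

ls -R "${REPO}"
# Then point the server at it (run inside the Triton container):
# tritonserver --model-repository="${REPO}"
```

When running the NGC container, also make sure the host directory holding this layout is bind-mounted (e.g. with `-v`) to the same path you pass to --model-repository.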
Expected Behavior
I'm following the official documentation to deploy Triton Server and use Towhee to speed up encoding.
I got an error at the "Start the Triton server" step after entering the server container.
However, I can use Towhee for encoding in my local environment if I don't go through the Triton server. The error message mentions that the CUDA driver version is insufficient for the CUDA runtime version; could that be why it cannot run? How can I continue?
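The "CUDA driver version is insufficient" warning can be checked by comparing the host driver against the container's requirement. A hedged sketch follows: the 515 minimum is an assumption based on the CUDA 11.7 toolkit shipped in the 22.07 container, so verify it against NVIDIA's release notes; the hard-coded version string is a placeholder for illustration.

```shell
# Sketch: compare the installed NVIDIA driver's major version against an
# assumed minimum for the 22.07 (CUDA 11.7) container.
min_driver=515
driver_major() { echo "$1" | cut -d. -f1; }

# On a real host you would query the driver like this:
# installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
installed="470.82.01"   # placeholder value, not from this issue

if [ "$(driver_major "$installed")" -lt "$min_driver" ]; then
  echo "driver too old for this container"
else
  echo "driver OK"
fi
```

If the driver is too old, the usual options are upgrading the host driver or using an older Triton container tag that matches the installed driver.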
Steps To Reproduce
Environment
Anything else?
No response