
How to deploy multimodal models on multiple GPUs #1236

Closed
AlbertBJ opened this issue Jun 26, 2024 · 2 comments

@AlbertBJ
Looking at the docs, the deployment examples for multimodal models all target a single GPU. If a single GPU is too small, how can I deploy with tensor parallelism across multiple GPUs?

@AlbertBJ
Author

Looking at the docs, the deployment examples for multimodal models all target a single GPU. If a single GPU is too small, how can I deploy with tensor parallelism across multiple GPUs?

I tested this with qwen-vl-chat: with two GPUs visible, the model runs, but the GPU memory usage on the two cards is uneven. Is that because of the ViT?

@Jintao-Huang Jintao-Huang self-assigned this Jun 28, 2024
@tastelikefeet
Collaborator

Consider deploying with lmdeploy:
--infer_backend lmdeploy
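For reference, a minimal sketch of what such a deployment command might look like with the swift CLI. The `--tp` flag (tensor-parallel degree) follows lmdeploy's usual convention, and the model type and subcommand shown here are assumptions, not confirmed by this thread; check the current swift and lmdeploy docs for the exact parameter names:

```shell
# Hypothetical sketch: serve a multimodal model across 2 GPUs using the
# lmdeploy inference backend. CUDA_VISIBLE_DEVICES limits which GPUs the
# process sees; --tp 2 asks lmdeploy to shard the model over both of them.
CUDA_VISIBLE_DEVICES=0,1 swift deploy \
    --model_type qwen-vl-chat \
    --infer_backend lmdeploy \
    --tp 2
```

With tensor parallelism the language-model weights are split evenly across the cards, which should also address the uneven memory usage seen when a non-sharded component such as the ViT encoder is placed on a single GPU by `device_map="auto"`-style placement.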
