Long Inference Time #142

Open
debasishaimonk opened this issue Jul 4, 2023 · 7 comments

Comments

@debasishaimonk

The VALL-E model is taking a very long time to generate voices. Are there any ongoing issues or PRs to address this? Has there been any discussion of how to speed it up?

@lifeiteng
Owner

Implementing a KV cache can speed things up 10-20x.
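
For context: a KV cache stores each layer's attention keys and values from previous decoding steps, so each new token only computes attention against the cached tensors instead of re-running the whole prefix through the model. A minimal single-head PyTorch sketch of the idea (the `CachedSelfAttention` class and all names are illustrative, not from this repo):

```python
import torch
from torch import nn

class CachedSelfAttention(nn.Module):
    """Single-head self-attention with an optional KV cache.

    During autoregressive decoding, keys/values for the prefix are
    reused from the cache instead of being recomputed at every step,
    which is where the large speedup comes from.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.scale = d_model ** -0.5

    def forward(self, x, cache=None):
        # x: (batch, new_tokens, d_model) -- only the newly generated token(s)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        if cache is not None:
            past_k, past_v = cache
            k = torch.cat([past_k, k], dim=1)  # append to cached keys
            v = torch.cat([past_v, v], dim=1)  # append to cached values
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v, (k, v)  # output plus the updated cache

# Decoding loop: feed one token per step and carry the cache forward.
layer = CachedSelfAttention(d_model=256)
cache = None
for step in range(10):
    new_tok = torch.randn(1, 1, 256)  # embedding of the latest token
    out, cache = layer(new_tok, cache)
```

Per step this does attention for one query against the cached prefix, O(n) work, instead of recomputing the full O(n^2) prefix attention, so the saving grows with sequence length.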

@debasishaimonk
Author

Has it been implemented?

@RahulBhalley

The VALL-E model is taking a very long time to generate voices.

How long does it take?

@RuntimeRacer
Contributor

I think the inference times on an RTX 3090 / 4090 are actually quite acceptable. Not perfect, but acceptable. What hardware are you using?

Implementing a KV cache can speed things up 10-20x.

@lifeiteng Is this planned to be added to the repo? A speedup of this degree would be incredible.

@lifeiteng
Owner

@RuntimeRacer I don't have time to do it.

@RahulBhalley

I think the inference times on an RTX 3090 / 4090 are actually quite acceptable. Not perfect, but acceptable. What hardware are you using?

I'll probably use a 4090. Do you know how much time it takes? I haven't run the code yet; just exploring my options right now.

@bank010

bank010 commented Sep 13, 2023

I think the inference times on an RTX 3090 / 4090 are actually quite acceptable. Not perfect, but acceptable. What hardware are you using?

Implementing a KV cache can speed things up 10-20x.

Is this planned to be added to the repo? A speedup of this degree would be incredible.

Hello, do you have any experience in this field? It would be incredible if you did.
