Stars
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 …
🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation with AI everywhere (mobile, web and terminal) using LLM OpenAI …
An open-source chatbot architecture for voice/vision (and multimodal) assistants that can run locally or remotely; if you run achatbot yourself, you can learn more and fork it to contribute
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Anim-400K: A dataset designed from the ground up for automated dubbing of video
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
MARS5 speech model (TTS) from CAMB.AI
GUI for a Vocal Remover that uses Deep Neural Networks.
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
🍦 ChatTTS-Forge is a project developed around a TTS generation model, implementing an API server and a Gradio-based WebUI.
A generative speech model for daily dialogue.
Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context