To keep pace with the rapid progress of LLMs and VLMs, this project is dedicated to research on large speech models. We are building an up-to-date pipeline for fine-tuning and inference of large models, along with benchmarks for testing downstream tasks. Simple tasks include speech classification, speech recognition, and voiceprint recognition. Complex tasks include speech production, semantic understanding, voice continuation, multimodal tasks, and inference speed. Why emphasize inference speed? We believe that real-time speech understanding is the goal of speech development: GPT-4o showed us a GPT that can listen and speak at the same time. To promote speech research, we hope interested friends will join us in studying large speech models and making progress together!

wntg/ALM
