
wntg/ALM



To keep pace with LLMs and VLMs, this project is dedicated to research on large speech models. It covers the following work: we are building an up-to-date pipeline for fine-tuning and inference of large models, along with benchmarks for testing downstream tasks. Simple tasks include speech classification, speech recognition, and voiceprint recognition. Complex tasks include speech generation, semantic understanding, voice continuation, multimodal tasks, and inference speed.

Why emphasize inference speed? We believe that real-time speech understanding is the goal of speech development. GPT-4o shows us a GPT that can listen and speak at the same time.

To promote speech research, we hope that interested friends will join us in studying large speech models and making progress together!

About

Audio large model study
