- The Chinese University of Hong Kong
- Shatin, N.T., Hong Kong SAR
- https://lixucuhk.github.io
Highlights
- Pro
Stars
Foundational Models for State-of-the-Art Speech and Text Translation
Generative models for conditional audio generation
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Bark Voice Cloning and Voice Cloning for Chinese Speech
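MNBVC (Massive Never-ending BT Vast Chinese corpus): a very-large-scale Chinese corpus, intended to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing stories, chat logs, and more.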
🔊 Text-Prompted Generative Audio Model
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
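AI chat features implemented in Unity. This library currently includes code for calling the APIs of large language models such as ChatGPT and ChatGLM, and implements the speech services of Microsoft Azure and Baidu AI. The speech services all use web APIs and support Windows, WebGL, Android, and other platforms.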
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Awesome-LLM: a curated list of Large Language Models
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Instant voice cloning by MIT and MyShell.
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
A family of diffusion models for text-to-audio generation.
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
ImageBind: One Embedding Space to Bind Them All
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Official Implementation of "Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models"
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/