LI Xu lixucuhk

🐬

Growing up

Mr. LI Xu obtained his Ph.D. degree at CUHK. His research interests include automatic speaker verification, anti-spoofing countermeasures, and language learning.

67 followers · 93 following

The Chinese University of Hong Kong
Shatin, N.T., Hong Kong SAR
https://lixucuhk.github.io

Achievements

Highlights

Pro

Stars

facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,804 1,054 Updated Aug 15, 2024

mdeff / fma

FMA: A Dataset For Music Analysis

Jupyter Notebook 2,212 432 Updated Jan 5, 2023

Stability-AI / stable-audio-tools

Generative models for conditional audio generation

Python 2,557 240 Updated Jul 15, 2024

openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 11,974 816 Updated Oct 3, 2024

zhvng / open-musiclm

Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.

Python 514 58 Updated Jun 3, 2023

descriptinc / descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,144 106 Updated Jul 11, 2024

KevinWang676 / Bark-Voice-Cloning

Bark Voice Cloning and Voice Cloning for Chinese Speech

Jupyter Notebook 2,741 396 Updated Aug 8, 2024

jitwxs / 163MusicLyrics

Windows tool for fetching cloud-music lyrics (NetEase Cloud Music, QQ Music)

C# 2,023 107 Updated Aug 25, 2024

esbatmop / MNBVC

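MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.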

3,417 233 Updated Oct 3, 2024

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 35,526 4,175 Updated Aug 19, 2024

HumanAIGC / EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

7,437 901 Updated Aug 21, 2024

zhangliwei7758 / unity-AI-Chat-Toolkit

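Implements AI chat features in Unity. The library currently includes code for calling the APIs of large language models such as ChatGPT and ChatGLM, as well as speech services from Microsoft Azure and Baidu AI. The speech services are all implemented via web APIs and support platforms such as Windows, WebGL, and Android.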

443 63 Updated Sep 29, 2024

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 33,467 3,840 Updated Oct 2, 2024

TencentARC / PhotoMaker

PhotoMaker [CVPR 2024]

Jupyter Notebook 9,396 750 Updated Aug 15, 2024

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

18,005 1,453 Updated Oct 2, 2024

mlabonne / llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 37,716 3,966 Updated Jul 28, 2024

myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell.

Python 28,966 2,824 Updated Aug 21, 2024

lllyasviel / ControlNet

Let us control diffusion models!

Python 29,938 2,703 Updated Feb 25, 2024

thuhcsi / NeuCoSVC

Python 252 37 Updated May 22, 2024

EmulationAI / awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

558 31 Updated Aug 3, 2024

guyyariv / TempoTokens

This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

Python 101 11 Updated Apr 23, 2024

declare-lab / tango

A family of diffusion models for text-to-audio generation.

Python 991 79 Updated Jul 3, 2024

NExT-GPT / NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Python 3,226 320 Updated Sep 29, 2024

haoheliu / AudioLDM2

Text-to-Audio/Music Generation

Python 2,250 177 Updated Sep 29, 2024

facebookresearch / ImageBind

ImageBind: One Embedding Space to Bind Them All

Python 8,250 758 Updated Jul 31, 2024

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,332 85 Updated Sep 23, 2024

PhonemeHallucinator / Phoneme_Hallucinator

Jupyter Notebook 43 8 Updated Aug 16, 2023

AILab-CVC / VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Python 4,486 333 Updated Jul 10, 2024

Weifeng-Chen / control-a-video

Official Implementation of "Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models"

Python 368 26 Updated Jul 4, 2023

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,582 756 Updated Feb 11, 2024