🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 54,476 5,623 Updated Aug 24, 2024

allenai / OLMo

Modeling, training, eval, and inference code for OLMo

Python 4,482 449 Updated Oct 4, 2024

cwx-worst-one / EAT

[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

Python 102 3 Updated Apr 19, 2024

collabora / WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Jupyter Notebook 3,818 207 Updated Jun 18, 2024

ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 589 42 Updated Sep 9, 2024

thuhcsi / SECap

Python 131 11 Updated Jul 9, 2024

zeroQiaoba / gpt4v-emotion

GPT-4V with Emotion

Python 83 6 Updated Dec 8, 2023

bytedance / SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Python 996 78 Updated Sep 24, 2024

X-LANCE / VoiceFlow-TTS

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 301 21 Updated Sep 3, 2024

YuanGongND / whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Python 318 25 Updated Feb 21, 2024

yangdongchao / UniAudio

The Open Source Code of UniAudio

Python 509 31 Updated Jul 22, 2024

zhijing-jin / nlp-phd-global-equality

A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP

848 72 Updated Sep 22, 2024

facebookresearch / fairseq2

FAIR Sequence Modeling Toolkit 2

Python 682 78 Updated Oct 4, 2024

Ziyang Ma ddlBoJack

Stars

kyutai-labs / moshi

wangtianrui / PM-EVC

OpenT2S / LlamaVoice

Glaciohound / LM-Steer

FunAudioLLM / SenseVoice

FunAudioLLM / CosyVoice

Ereboas / TacoLM

SpeechColab / GigaSpeech2

emo-box / EmoBox

X-LANCE / SLAM-LLM

2noise / ChatTTS

liutaocode / TTS-arxiv-daily

multimodal-art-projection / MAP-NEO

theodorblackbird / lina-speech

zszheng147 / Spatial-AST

Chinese-Tiny-LLM / Chinese-Tiny-LLM

EmulationAI / awesome-large-audio-models

labmlai / annotated_deep_learning_paper_implementations