Stars
Robust Speech Recognition via Large-Scale Weak Supervision
A resource for learning about machine learning and deep learning
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
CommonMark spec, with reference implementations in C and JavaScript
Source code complementing our paper on acoustic event classification using convolutional neural networks.
Grapheme to phoneme conversion with deep learning.
Audio super resolution using neural networks
The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
Large, modern dataset for speech recognition
Self-Supervised Speech Pre-training and Representation Learning Toolkit
A series of convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.