- Netflix, Inc.
- Los Gatos
Stars
HAAQI-Net is a novel DNN-based non-intrusive method for assessing music audio quality in hearing aid users.
Code accompanying the paper "Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning" (CVPR 2024)
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system in 275+ supported cars.
Learning audio concepts from natural language supervision
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook
An efficient loudness meter with support for anchoring, median, and multithreading
A PyTorch package for non-negative matrix factorization.
Differentiable dynamic range controller in PyTorch.
Robust Speech Recognition via Large-Scale Weak Supervision
Machine Learning applied to sound
FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.
Self-supervised learning for fast pitch estimation
Code for the ICASSP 2024 paper "BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer", an online beat tracking system based on a streaming Transformer.
Code for the paper "Soft Dynamic Time Warping With Variable Step Weights", ICASSP 2024
A simple library for Fréchet Audio Distance (FAD) calculation
PAM is a no-reference audio quality metric for audio generation tasks
VBAP & Define Loudspeaker from Ville Pulkki - updated and adapted by Christophe B.
AudioLDM training, finetuning, evaluation and inference.
Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.