Stars
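Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.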
RobustSAM: Segment Anything Robustly on Degraded Images (CVPR 2024)
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
A list of tools, papers and code related to Deepfake Detection.
A 1D optimal-transport-inspired loss function in the spectral domain. Can be used for improving frequency localization/estimation in differentiable digital signal processing. Experiments from pape…
A large synthetic dataset of spatial audio with multiple labels
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
Solve forward and inverse problems related to partial differential equations using finite basis physics-informed neural networks (FBPINNs)
"Brian Hears" auditory modelling toolbox for the brian2 simulator
Scaling Out-of-Distribution Detection for Multiple Modalities
DeepFake Detection using Siamese Neural Networks
openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for 250+ supported car makes and models.
The LAP Challenge aims at advancing spatial audio technologies through the personalization of HRTFs.
Audio Diarization Annotation tool
Inspect: A framework for large language model evaluations
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)
Community list of startups working with AI in audio and music technology
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Implementation of "A conformer-based classifier for variable-length utterance processing in anti-spoofing" published in Interspeech 2023.