Starred repositories
An open-source RAG-based tool for chatting with your documents.
A powerful tool that translates ComfyUI workflows into executable Python code - now as a UI button.
Real-time face swap and one-click video deepfake with only a single image
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
Various AI scripts. Mostly Stable Diffusion stuff.
Run PyTorch LLMs locally on servers, desktop and mobile
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Stable-Hair: Real-World Hair Transfer via Diffusion Model
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
[ECCV 2024] IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing
#1 Locally hosted web application that allows you to perform various operations on PDF files
Official Implementation of "The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval"
Synchronized Translation for Videos. Video dubbing
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
A suite for modeling video with Mamba
🌀 R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024)
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 …
[NeurIPS 2024 D&B Track] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
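A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. Text recognition runs locally, with no need to apply for a third-party API. A deep-learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction.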