Starred repositories
The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
COLMAP - Structure-from-Motion and Multi-View Stereo
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
[ICLR2024] The official implementation of paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.
👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing
Official JAX implementation of MAGVIT: Masked Generative Video Transformer
Official Pytorch Implementation of Our CVPR2023 Paper: "Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation"
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
Implementation of MagViT2 Tokenizer in Pytorch
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
[CVPR 2022--Oral] Restormer: Efficient Transformer for High-Resolution Image Restoration. SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.
V-Express aims to generate a talking head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.
Infinite Photorealistic Worlds using Procedural Generation
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
Official implementation of FIFO-Diffusion
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Official PyTorch implementation of TrackDiffusion (https://arxiv.org/abs/2312.00651)
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Pandora: Towards General World Model with Natural Language Actions and Video States
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Fine-Grained Open Domain Image Animation with Motion Guidance
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding