Stars
World modeling challenge for humanoid robots
Official implementation of HumanVid, NeurIPS D&B Track 2024
This is the official reproduction of FancyVideo.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Hiera: A fast, powerful, and simple hierarchical vision transformer.
A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model
[LMM + codec] A new paradigm of visual signal compression!
Code for "Hierarchical World Models as Visual Whole-Body Humanoid Controllers"
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV 2022)
Official JAX implementation of MAGVIT: Masked Generative Video Transformer
Reference implementation for DPO (Direct Preference Optimization)
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
A quick guide (especially) for trending instruction finetuning datasets
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.