[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …

Python 3,772 285 Updated Apr 30, 2024

Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.

Python 1,430 146 Updated Jun 20, 2024

NUS-HPC-AI-Lab / OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Python 1,249 74 Updated Jul 1, 2024

GaussianCube / GaussianCube

GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling

Python 252 12 Updated Jun 25, 2024

vislearn / ControlNet-XS

Python 411 12 Updated Jan 31, 2024

GaParmar / img2img-turbo

One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more

Python 1,310 141 Updated Apr 30, 2024

xai-org / grok-1

Grok open release

Python 49,127 8,313 Updated May 29, 2024

gnobitab / RectifiedFlow

Official Implementation of Rectified Flow (ICLR2023 Spotlight)

Python 665 40 Updated Jun 30, 2024

THUDM / CogVLM

a state-of-the-art-level open visual language model | multimodal pre-trained model

Python 5,576 389 Updated May 29, 2024

layerdiffusion / sd-forge-layerdiffuse

[WIP] Layer Diffusion for WebUI (via Forge)

Python 3,633 323 Updated Jun 12, 2024

PRIV-Creation / Awesome-Controllable-T2I-Diffusion-Models

A collection of resources on controllable generation with text-to-image diffusion models.

734 20 Updated Jun 10, 2024

whlzy / FiT

[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model

339 7 Updated Feb 20, 2024

chuanyangjin / fast-DiT

Fast Diffusion Models with Transformers

Python 602 83 Updated Oct 7, 2023

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 5,590 498 Updated May 31, 2024

mit-han-lab / fastcomposer

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

Python 621 34 Updated Dec 9, 2023

cloneofsimo / minSDXL

Huggingface-compatible SDXL Unet implementation that is readily hackable

Jupyter Notebook 364 29 Updated Aug 9, 2023

openai / weak-to-strong

Python 2,454 296 Updated May 19, 2024

ytongbai / LVM

Python 1,682 51 Updated Jun 28, 2024

MhLiao / MaskTextSpotterV3

The code of "Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting"

Python 617 121 Updated Jan 20, 2022

rom1504 / img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,422 322 Updated Jun 16, 2024

tgxs002 / align_sd

Better Aligning Text-to-Image Models with Human Preference. ICCV 2023

Python 255 8 Updated Jul 14, 2023

AIGText / GlyphControl-release

[NeurIPS2023] This is the official code of the paper "GlyphControl: Glyph Conditional Control for Visual Text Generation"

Python 189 12 Updated Feb 12, 2024

OFA-Sys / Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Python 3,925 415 Updated Nov 29, 2023

openai / consistencydecoder

Consistency Distilled Diff VAE

Python 2,096 76 Updated Nov 7, 2023

Zhendong Wang ZhendongWang6

Lists (24)

chatgpt

clip

controlnet

dataset

diffusion model

face-anti-spoofing

face-forgery-detection

flow

gan

img2img

knowledge distillation

large language models

large vision model

ocr

pretrain

SAM series

score metrics

segmentation

subject driven generation

survey

tools

vae

vision_language

visual text generation

Stars