# Awesome Diffusion Transformers

| Title | Initial Date | Venue | Task | Resource |
| --- | --- | --- | --- | --- |
| MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model | 31 Aug 2022 | TPAMI'2024 | | |
| All are Worth Words: A ViT Backbone for Diffusion Models | 25 Sep 2022 | CVPR'2023 | | |
| Learning to Learn with Generative Models of Neural Network Checkpoints | 26 Sep 2022 | arXiv | | |
| Scalable Diffusion Models with Transformers | 19 Dec 2022 | ICCV'2023 | | |
| Exploring Vision Transformers as Diffusion Learners | 28 Dec 2022 | arXiv | | |
| DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer | 07 Mar 2023 | ICCV'2023 | | |
| Masked Diffusion Transformer is a Strong Image Synthesizer | 25 Mar 2023 | ICCV'2023 | | |
| Diffusion Transformer for Adaptive Text-to-Speech | 03 May 2023 | Interspeech'2023 | | |
| VDT: General-purpose Video Diffusion Transformers via Mask Modeling | 22 May 2023 | ICLR'2024 | | |
| ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer | 22 May 2023 | EMNLP'2023 | | |
| U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech | 22 May 2023 | arXiv | | |
| Fast Training of Diffusion Models with Masked Transformers | 15 Jun 2023 | TMLR | | |
| DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation | 04 Jul 2023 | NeurIPS'2023 | | |
| Large-Vocabulary 3D Diffusion Model with Transformer | 14 Sep 2023 | ICLR'2024 | | |
| Cartoondiff: Training-free Cartoon Image Generation with Diffusion Transformer Models | 15 Sep 2023 | arXiv | | |
| PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | 30 Sep 2023 | ICLR'2024 | | |
| Dolfin: Diffusion Layout Transformers without Autoencoder | 25 Oct 2023 | arXiv | | |
| Mapache: Masked parallel transformer for advanced speech editing and synthesis | 03 Dec 2023 | ICASSP'2024 | | |
| DiffiT: Diffusion Vision Transformers for Image Generation | 04 Dec 2023 | arXiv | | |
| GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | 07 Dec 2023 | CVPR'2024 | | |
| Photorealistic Video Generation with Diffusion Models | 11 Dec 2023 | arXiv | | |
| DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers | 11 Dec 2023 | arXiv | | |
| Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation | 12 Dec 2023 | arXiv | | |
| NViST: In the Wild New View Synthesis from a Single Image with Transformers | 13 Dec 2023 | arXiv | | |
| TransDDPM: Transformer-Based Denoising Diffusion Probabilistic Model for Image Restoration | 28 Dec 2023 | PRCV'2023 | | |
| Latte: Latent Diffusion Transformer for Video Generation | 05 Jan 2024 | arXiv | | |
| PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | 10 Jan 2024 | arXiv | | |
| SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers | 16 Jan 2024 | arXiv | | |
| Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | 21 Jan 2024 | arXiv | | |
| Cross-view Masked Diffusion Transformers for Person Image Synthesis | 02 Feb 2024 | arXiv | | |
| DiffsFormer: A Diffusion Transformer on Stock Factor Augmentation | 05 Feb 2024 | arXiv | | |
| Sora | 15 Feb 2024 | OpenAI | | |
| SDiT: Spiking Diffusion Model with Transformer | 18 Feb 2024 | arXiv | | |
| FiT: Flexible Vision Transformer for Diffusion Model | 19 Feb 2024 | arXiv | | |
| Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | 22 Feb 2024 | arXiv | | |
| OpenDiT | 26 Feb 2024 | GitHub | | |
| FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes | 28 Feb 2024 | arXiv | | |
| Open-Sora-Plan | 01 Mar 2024 | GitHub | | |
| Stable Diffusion 3: Research Paper | 05 Mar 2024 | Stability AI | | |

## Contributing

Your contributions are always welcome! Feel free to add/update contents in the `data.json` file. This README and the website will be updated automatically, powered by GitHub Actions. 🚀 🚀 🚀
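For contributors curious how a table like this could be generated from `data.json`, here is a minimal sketch. The field names (`title`, `date`, `venue`), the `render_table` helper, and the three-column layout are assumptions made for illustration; the actual `data.json` schema and the script run by GitHub Actions may differ, so check the existing entries in the file before adding yours.

```python
import json
from datetime import datetime

# Hypothetical schema: the real data.json may use different field names.
SAMPLE = """
[
  {"title": "Scalable Diffusion Models with Transformers",
   "date": "19 Dec 2022", "venue": "ICCV'2023"},
  {"title": "All are Worth Words: A ViT Backbone for Diffusion Models",
   "date": "25 Sep 2022", "venue": "CVPR'2023"}
]
"""


def render_table(entries):
    """Return a Markdown table with rows sorted by initial date, oldest first."""
    ordered = sorted(
        entries,
        key=lambda e: datetime.strptime(e["date"], "%d %b %Y"),
    )
    lines = ["| Title | Initial Date | Venue |", "| --- | --- | --- |"]
    for e in ordered:
        lines.append(f"| {e['title']} | {e['date']} | {e['venue']} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table(json.loads(SAMPLE)))
```

Sorting by the parsed date rather than the raw string keeps entries in chronological order, since a string such as "19 Dec 2022" would otherwise sort before "25 Sep 2022".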