[![Contributors][contributors-shield]][contributors-url] [![Forks][forks-shield]][forks-url] [![Stargazers][stars-shield]][stars-url] [![Issues][issues-shield]][issues-url]
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-07-09 | ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction | Shaozhe Hao et.al. | 2407.07077v1 | link |
2024-07-09 | RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models | Bowen Zhang et.al. | 2407.06938v1 | null |
2024-07-09 | HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance | Guian Fang et.al. | 2407.06937v1 | link |
2024-07-09 | A reaction-diffusion model for relapsing-remitting multiple sclerosis with a treatment term | Romina Travaglini et.al. | 2407.06802v1 | null |
2024-07-09 | Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning | Fanyue Wei et.al. | 2407.06642v1 | link |
2024-07-09 | Mobius: An High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task | Yiran Yang et.al. | 2407.06617v1 | null |
2024-07-09 | VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving | Yibo Liu et.al. | 2407.06516v1 | null |
2024-07-09 | Sketch-Guided Scene Image Generation | Tianyu Zhang et.al. | 2407.06469v1 | null |
2024-07-08 | Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation | Jianuo Huang et.al. | 2407.06317v1 | null |
2024-07-08 | VIMI: Grounding Video Generation through Multi-modal Instruction | Yuwei Fang et.al. | 2407.06304v1 | null |
2024-07-08 | Beyond theory driven discovery: hot random search and datum derived structures | Chris J. Pickard et.al. | 2407.06294v1 | null |
2024-07-08 | JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation | Yu Zeng et.al. | 2407.06187v1 | null |
2024-07-08 | The Tug-of-War Between Deepfake Generation and Detection | Hannah Lee et.al. | 2407.06174v1 | null |
2024-07-08 | ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Ethan Chern et.al. | 2407.06135v1 | link |
2024-07-08 | Structured Generations: Using Hierarchical Clusters to guide Diffusion Models | Jorge da Silva Goncalves et.al. | 2407.06124v1 | null |
2024-07-08 | PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models | Jinhua Zhang et.al. | 2407.06109v1 | link |
2024-07-08 | Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation | Xinyu Bai et.al. | 2407.06095v1 | null |
2024-07-08 | Layered Diffusion Model for One-Shot High Resolution Text-to-Image Synthesis | Emaad Khwaja et.al. | 2407.06079v1 | null |
2024-07-08 | Analysis and finite element approximation of a diffuse interface approach to the Stokes-Biot coupling | Francis R. A. Aznaran et.al. | 2407.05949v1 | null |
2024-07-08 | Minutes to Seconds: Speeded-up DDPM-based Image Inpainting with Coarse-to-Fine Sampling | Lintao Zhang et.al. | 2407.05875v1 | link |
2024-07-08 | RadiomicsFill-Mammo: Synthetic Mammogram Mass Manipulation with Radiomics Features | Inye Na et.al. | 2407.05683v1 | link |
2024-07-08 | BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space | Yumeng Zhang et.al. | 2407.05679v1 | link |
2024-07-08 | Ada-adapter:Fast Few-shot Style Personlization of Diffusion Model with Pre-trained Image Encoder | Jia Liu et.al. | 2407.05552v1 | null |
2024-07-08 | Read, Watch and Scream! Sound Generation from Text and Video | Yujin Jeong et.al. | 2407.05551v1 | null |
2024-07-08 | LaSe-E2V: Towards Language-guided Semantic-Aware Event-to-Video Reconstruction | Kanghao Chen et.al. | 2407.05547v1 | null |
2024-07-07 | Diffusion as Sound Propagation: Physics-inspired Model for Ultrasound Image Generation | Marina Domínguez et.al. | 2407.05428v1 | link |
2024-07-07 | BiRoDiff: Diffusion policies for bipedal robot locomotion on unseen terrains | GVS Mothish et.al. | 2407.05424v1 | null |
2024-07-07 | Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model | Danni Yang et.al. | 2407.05352v1 | null |
2024-07-07 | Enhancing Label-efficient Medical Image Segmentation with Text-guided Diffusion Models | Chun-Mei Feng et.al. | 2407.05323v1 | null |
2024-07-07 | An Improved Method for Personalizing Diffusion Models | Yan Zeng et.al. | 2407.05312v1 | null |
2024-07-07 | DM-MIMO: Diffusion Models for Robust Semantic Communications over MIMO Channels | Yiheng Duan et.al. | 2407.05289v1 | null |
2024-07-07 | Gradient Diffusion: A Perturbation-Resilient Gradient Leakage Attack | Xuan Liu et.al. | 2407.05285v1 | null |
2024-07-07 | Multi-scale Conditional Generative Modeling for Microscopic Image Restoration | Luzhe Huang et.al. | 2407.05259v1 | null |
2024-07-06 | FedTSA: A Cluster-based Two-Stage Aggregation Method for Model-heterogeneous Federated Learning | Boyu Fan et.al. | 2407.05098v1 | null |
2024-07-06 | Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Bridge Diffusion Model | Kyobin Choo et.al. | 2407.05059v1 | link |
2024-07-06 | Laminar-Turbulent Patterns in Shear Flows : Evasion of Tipping, Saddle-Loop Bifurcation and Log scaling of the Turbulent Fraction | Pavan V. Kashyap et.al. | 2407.04993v1 | null |
2024-07-06 | FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior | Zhekai Chen et.al. | 2407.04947v1 | null |
2024-07-05 | Improving ensemble extreme precipitation forecasts using generative artificial intelligence | Yingkai Sha et.al. | 2407.04882v1 | null |
2024-07-05 | Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates | Ryotaro Okabe et.al. | 2407.04557v1 | null |
2024-07-05 | Unified continuous-time q-learning for mean-field game and mean-field control problems | Xiaoli Wei et.al. | 2407.04521v1 | null |
2024-07-08 | Speed-accuracy trade-off for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport | Kotaro Ikeda et.al. | 2407.04495v2 | null |
2024-07-05 | PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation | Yinghua Yao et.al. | 2407.04493v1 | null |
2024-07-05 | VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing | Shang Liu et.al. | 2407.04461v1 | null |
2024-07-05 | Comparing metallicity correlations in nearby non-AGN and AGN-host galaxies | Song-lin Li et.al. | 2407.04252v1 | null |
2024-07-05 | GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction | Yuxuan Mu et.al. | 2407.04237v1 | null |
2024-07-05 | T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models | Zhongqi Wang et.al. | 2407.04215v1 | link |
2024-07-05 | TimeLDM: Latent Diffusion Model for Unconditional Time Series Generation | Jian Qian et.al. | 2407.04211v1 | null |
2024-07-04 | Advances in Diffusion Models for Image Data Augmentation: A Review of Methods, Models, Evaluation Metrics and Future Research Directions | Panagiotis Alimisis et.al. | 2407.04103v1 | null |
2024-07-04 | Leveraging Latent Diffusion Models for Training-Free In-Distribution Data Augmentation for Surface Defect Detection | Federico Girella et.al. | 2407.03961v1 | link |
2024-07-04 | The second-order Esscher martingale densities for continuous-time market models | Tahir Choulli et.al. | 2407.03960v1 | null |
2024-07-04 | Timestep-Aware Correction for Quantized Diffusion Models | Yuzhe Yao et.al. | 2407.03917v1 | null |
2024-07-04 | Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy | Lijun Bo et.al. | 2407.03888v1 | null |
2024-07-04 | Generative Technology for Human Emotion Recognition: A Scope Review | Fei Ma et.al. | 2407.03640v1 | null |
2024-07-04 | Diff-Restorer: Unleashing Visual Prompts for Diffusion-based Universal Image Restoration | Yuhong Zhang et.al. | 2407.03636v1 | null |
2024-07-04 | MRIR: Integrating Multimodal Insights for Diffusion-based Realistic Image Restoration | Yuhong Zhang et.al. | 2407.03635v1 | null |
2024-07-04 | Feedback-guided Domain Synthesis with Multi-Source Conditional Diffusion Models for Domain Generalization | Mehrdad Noori et.al. | 2407.03588v1 | link |
2024-07-03 | HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation | Tao Chen et.al. | 2407.03548v1 | link |
2024-07-03 | BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement | Ruirui Lin et.al. | 2407.03535v1 | null |
2024-07-03 | DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents | Yilun Xu et.al. | 2407.03300v1 | null |
2024-07-03 | Improved Noise Schedule for Diffusion Training | Tiankai Hang et.al. | 2407.03297v1 | null |
2024-07-04 | Spatio-Temporal Adaptive Diffusion Models for EEG Super-Resolution in Epilepsy Diagnosis | Tong Zhou et.al. | 2407.03089v2 | null |
2024-07-03 | Electromagnetic Property Sensing Based on Diffusion Model in ISAC System | Yuhua Jiang et.al. | 2407.03075v1 | null |
2024-07-03 | Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models | Chunmei Xu et.al. | 2407.03050v1 | null |
2024-07-03 | SlerpFace: Face Template Protection via Spherical Linear Interpolation | Zhizhou Zhong et.al. | 2407.03043v1 | null |
2024-07-03 | Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation | Xiang Gao et.al. | 2407.03006v1 | link |
2024-07-04 | VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors | Sungwon Hwang et.al. | 2407.02945v2 | null |
2024-07-03 | Single Image Rolling Shutter Removal with Diffusion Models | Zhanglei Yang et.al. | 2407.02906v1 | null |
2024-07-03 | Robot Shape and Location Retention in Video Generation Using Diffusion Models | Peng Wang et.al. | 2407.02873v1 | null |
2024-07-03 | Mirage Sources and Large TeV Halo-Pulsar Offsets: Exploring the Parameter Space | Yiwei Bao et.al. | 2407.02829v1 | null |
2024-07-03 | Highly Accelerated MRI via Implicit Neural Representation Guided Posterior Sampling of Diffusion Models | Jiayue Chu et.al. | 2407.02744v1 | null |
2024-07-02 | No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models | Seyedmorteza Sadat et.al. | 2407.02687v1 | null |
2024-07-02 | Diffusion Models for Tabular Data Imputation and Synthetic Data Generation | Mario Villaizán-Vallelado et.al. | 2407.02549v1 | null |
2024-07-02 | Magic Insert: Style-Aware Drag-and-Drop | Nataniel Ruiz et.al. | 2407.02489v1 | null |
2024-07-03 | Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models | Fei Shen et.al. | 2407.02482v2 | null |
2024-07-02 | GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models | Jian Ma et.al. | 2407.02252v1 | link |
2024-07-02 | LaMoD: Latent Motion Diffusion Model For Myocardial Strain Generation | Jiarui Xing et.al. | 2407.02229v1 | null |
2024-07-04 | UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks | Jingjing Ren et.al. | 2407.02158v2 | null |
2024-07-02 | Counterfactual Data Augmentation with Denoising Diffusion for Graph Anomaly Detection | Chunjing Xiao et.al. | 2407.02143v1 | link |
2024-07-04 | Latent Diffusion Model for Generating Ensembles of Climate Simulations | Johannes Meuer et.al. | 2407.02070v2 | null |
2024-07-02 | Accompanied Singing Voice Synthesis with Fully Text-controlled Melody | Ruiqi Li et.al. | 2407.02049v1 | null |
2024-07-02 | ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation | Zhiyuan Ma et.al. | 2407.02040v1 | link |
2024-07-02 | SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules | Suyi Li et.al. | 2407.02031v1 | null |
2024-07-02 | Zero-shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model | Cong Cao et.al. | 2407.01960v1 | null |
2024-07-02 | LDP: A Local Diffusion Planner for Efficient Robot Navigation and Collision Avoidance | Wenhao Yu et.al. | 2407.01950v1 | null |
2024-07-04 | GVDIFF: Grounded Text-to-Video Generation with Diffusion Models | Huanzhang Dou et.al. | 2407.01921v2 | null |
2024-07-02 | Enhancing Multi-Class Anomaly Detection via Diffusion Refinement with Dual Conditioning | Jiawei Zhan et.al. | 2407.01905v1 | null |
2024-07-02 | Text-Aware Diffusion for Policy Learning | Calvin Luo et.al. | 2407.01903v1 | null |
2024-07-01 | Equivariant Diffusion Policy | Dian Wang et.al. | 2407.01812v1 | null |
2024-07-01 | Label-free Neural Semantic Image Synthesis | Jiayi Wang et.al. | 2407.01790v1 | null |
2024-07-01 | Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization | Siyi Gu et.al. | 2407.01648v1 | null |
2024-06-29 | Guided Trajectory Generation with Diffusion Models for Offline Model-based Optimization | Taeyoung Yun et.al. | 2407.01624v1 | null |
2024-07-01 | Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing | Bingliang Zhang et.al. | 2407.01521v1 | null |
2024-07-01 | DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models | Chang-Han Yeh et.al. | 2407.01519v1 | null |
2024-07-01 | EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning | Jingyun Yang et.al. | 2407.01479v1 | null |
2024-07-01 | FORA: Fast-Forward Caching in Diffusion Transformer Acceleration | Pratheba Selvaraju et.al. | 2407.01425v1 | null |
2024-07-04 | Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion | Boyuan Chen et.al. | 2407.01392v3 | link |
2024-07-01 | Learning data efficient coarse-grained molecular dynamics from forces and noise | Aleksander E. P. Durumeric et.al. | 2407.01286v1 | null |
2024-07-01 | Semantic-guided Adversarial Diffusion Model for Self-supervised Shadow Removal | Ziqi Zeng et.al. | 2407.01104v1 | null |
2024-07-01 | Blind Inversion using Latent Diffusion Priors | Weimin Bai et.al. | 2407.01027v1 | null |
2024-07-01 | An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations | Weimin Bai et.al. | 2407.01014v1 | null |
2024-07-01 | Hybrid RAG-empowered Multi-modal LLM for Secure Healthcare Data Management: A Diffusion-based Contract Theory Approach | Cheng Su et.al. | 2407.00978v1 | null |
2024-07-01 | Diffusion Transformer Model With Compact Prior for Low-dose PET Reconstruction | Bin Huang et.al. | 2407.00944v1 | null |
2024-07-01 | Mittag-Leffler stability of complete monotonicity-preserving schemes for time-dependent coefficients sub-diffusion equations | Wen Dong et.al. | 2407.00893v1 | null |
2024-06-30 | InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation | Haofan Wang et.al. | 2407.00788v1 | null |
2024-06-30 | Diffusion Models and Representation Learning: A Survey | Michael Fuest et.al. | 2407.00783v1 | link |
2024-06-30 | Chest-Diffusion: A Light-Weight Text-to-Image Model for Report-to-CXR Generation | Peng Huang et.al. | 2407.00752v1 | null |
2024-06-30 | Posterior Sampling with Denoising Oracles via Tilted Transport | Joan Bruna et.al. | 2407.00745v1 | null |
2024-07-03 | Diffusion Models for Offline Multi-agent Reinforcement Learning with Safety Constraints | Jianuo Huang et.al. | 2407.00741v2 | null |
2024-06-30 | LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation | Mushui Liu et.al. | 2407.00737v1 | null |
2024-06-30 | Generative prediction of flow field based on the diffusion model | Jiajun Hu et.al. | 2407.00735v1 | null |
2024-06-30 | Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models | Sangwoong Yoon et.al. | 2407.00626v1 | null |
2024-06-30 | Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness | Yiquan Li et.al. | 2407.00623v1 | null |
2024-06-30 | Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization | Dongxia Wu et.al. | 2407.00610v1 | null |
2024-06-30 | GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing | Yisong Xiao et.al. | 2407.00600v1 | null |
2024-06-29 | Accelerating Longitudinal MRI using Prior Informed Latent Diffusion | Yonatan Urman et.al. | 2407.00537v1 | null |
2024-06-29 | Toward a Diffusion-Based Generalist for Dense Vision Tasks | Yue Fan et.al. | 2407.00503v1 | null |
2024-06-29 | OccFusion: Rendering Occluded Humans with Generative Diffusion Priors | Adam Sun et.al. | 2407.00316v1 | null |
2024-06-29 | A new characterization of the dissipation structure and the relaxation limit for the compressible Euler-Maxwell system | Timothée Crin-Barat et.al. | 2407.00277v1 | null |
2024-06-28 | DiffuseDef: Improved Robustness to Adversarial Attacks | Zhenhao Li et.al. | 2407.00248v1 | null |
2024-06-28 | HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model | Hieu T. Nguyen et.al. | 2406.20077v1 | null |
2024-06-28 | Neural Differentiable Modeling with Diffusion-Based Super-resolution for Two-Dimensional Spatiotemporal Turbulence | Xiantao Fan et.al. | 2406.20047v1 | null |
2024-06-28 | HAITCH: A Framework for Distortion and Motion Correction in Fetal Multi-Shell Diffusion-Weighted MRI | Haykel Snoussi et.al. | 2406.20042v1 | null |
2024-06-28 | Deceptive Diffusion: Generating Synthetic Adversarial Examples | Lucas Beerens et.al. | 2406.19807v1 | null |
2024-06-28 | Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting | Wei Li et.al. | 2406.19796v1 | link |
2024-06-28 | Decision Transformer for IRS-Assisted Systems with Diffusion-Driven Generative Channels | Jie Zhang et.al. | 2406.19769v1 | null |
2024-06-28 | DISCO: Efficient Diffusion Solver for Large-Scale Combinatorial Optimization Problems | Kexiong Yu et.al. | 2406.19705v1 | null |
2024-06-28 | Network Bending of Diffusion Models for Audio-Visual Generation | Luke Dzwonczyk et.al. | 2406.19589v1 | null |
2024-06-27 | A Thermal Study of Terahertz Induced Protein Interactions | Hadeel Elayan et.al. | 2406.19521v1 | null |
2024-06-27 | pop-cosmos: Scaleable inference of galaxy properties and redshifts with a data-driven population model | Stephen Thorp et.al. | 2406.19437v1 | null |
2024-06-27 | Accelerating Multiphase Flow Simulations with Denoising Diffusion Model Driven Initializations | Jaehong Chung et.al. | 2406.19333v1 | null |
2024-06-27 | Subtractive Training for Music Stem Insertion using Latent Diffusion Models | Ivan Villa-Renteria et.al. | 2406.19328v1 | null |
2024-06-27 | Compositional Image Decomposition with Diffusion Models | Jocelin Su et.al. | 2406.19298v1 | null |
2024-06-27 | Using diffusion model as constraint: Empower Image Restoration Network Training with Diffusion Model | Jiangtong Tan et.al. | 2406.19030v1 | null |
2024-06-28 | AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation | Yanan Sun et.al. | 2406.18958v2 | null |
2024-06-27 | Investigating and Defending Shortcut Learning in Personalized Diffusion Models | Yixin Liu et.al. | 2406.18944v1 | null |
2024-06-28 | AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models | Aishwarya Agarwal et.al. | 2406.18893v2 | null |
2024-06-27 | Chemical Continuous Time Random Walks under Anomalous Diffusion | Hong Zhang et.al. | 2406.18869v1 | null |
2024-06-26 | MultiDiff: Consistent Novel View Synthesis from a Single Image | Norman Müller et.al. | 2406.18524v1 | null |
2024-06-26 | Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration | Kang Liao et.al. | 2406.18516v1 | link |
2024-06-26 | DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance | Younghyun Kim et.al. | 2406.18459v1 | null |
2024-06-26 | Towards diffusion models for large-scale sea-ice modelling | Tobias Sebastian Finn et.al. | 2406.18417v1 | null |
2024-06-27 | Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process | Tianyu Lin et.al. | 2406.18361v2 | link |
2024-06-26 | Molecular Diffusion Models with Virtual Receptors | Matan Halfon et.al. | 2406.18330v1 | null |
2024-06-26 | Galaxy spectroscopy without spectra: Galaxy properties from photometric images with conditional diffusion models | Lars Doorenbos et.al. | 2406.18175v1 | link |
2024-06-26 | Human-Aware 3D Scene Generation with Spatially-constrained Diffusion Models | Xiaolin Hong et.al. | 2406.18159v1 | null |
2024-06-26 | Leveraging Pre-trained Models for FF-to-FFPE Histopathological Image Translation | Qilai Zhang et.al. | 2406.18054v1 | link |
2024-06-25 | DiffusionPDE: Generative PDE-Solving Under Partial Observation | Jiahe Huang et.al. | 2406.17763v1 | link |
2024-06-25 | Unified Auto-Encoding with Masked Diffusion | Philippe Hansen-Estruch et.al. | 2406.17688v1 | link |
2024-06-25 | LaTable: Towards Large Tabular Models | Boris van Breugel et.al. | 2406.17673v1 | null |
2024-06-25 | Aligning Diffusion Models with Noise-Conditioned Perception | Alexander Gambashidze et.al. | 2406.17636v1 | null |
2024-06-25 | Diffusion-based Adversarial Purification for Intrusion Detection | Mohamed Amine Merzouk et.al. | 2406.17606v1 | null |
2024-06-25 | Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text | Xinyang Li et.al. | 2406.17601v1 | link |
2024-06-25 | Detection of Synthetic Face Images: Accuracy, Robustness, Generalization | Nela Petrzelkova et.al. | 2406.17547v1 | null |
2024-06-25 | Principal Component Clustering for Semantic Segmentation in Synthetic Data Generation | Felix Stillger et.al. | 2406.17541v1 | null |
2024-06-25 | The Tree of Diffusion Life: Evolutionary Embeddings to Understand the Generation Process of Diffusion Models | Vidya Prasad et.al. | 2406.17462v1 | null |
2024-06-25 | SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing | Ruihuang Li et.al. | 2406.17396v1 | null |
2024-06-25 | Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Lei Chen et.al. | 2406.17343v1 | link |
2024-06-25 | Generative Modelling of Structurally Constrained Graphs | Manuel Madeira et.al. | 2406.17341v1 | link |
2024-06-25 | Disentangled Motion Modeling for Video Frame Interpolation | Jaihyun Lew et.al. | 2406.17256v1 | link |
2024-06-25 | Expansive Synthesis: Generating Large-Scale Datasets from Minimal Samples | Vahid Jebraeeli et.al. | 2406.17238v1 | null |
2024-06-25 | LIPE: Learning Personalized Identity Prior for Non-rigid Image Editing | Aoyang Liu et.al. | 2406.17236v1 | null |
2024-06-26 | Diff3Dformer: Leveraging Slice Sequence Diffusion for Enhanced 3D CT Classification with Transformer Networks | Zihao Jin et.al. | 2406.17173v2 | null |
2024-06-24 | Fine-tuning Diffusion Models for Enhancing Face Quality in Text-to-image Generation | Zhenyi Liao et.al. | 2406.17100v1 | null |
2024-06-23 | On Instabilities of Unsupervised Denoising Diffusion Models in Magnetic Resonance Imaging Reconstruction | Tianyu Han et.al. | 2406.16983v1 | null |
2024-06-24 | FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Haonan Qiu et.al. | 2406.16863v1 | link |
2024-06-24 | Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Junbang Liang et.al. | 2406.16862v1 | null |
2024-06-24 | General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design | Yue Jian et.al. | 2406.16821v1 | null |
2024-06-24 | Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image | Jinkun Hao et.al. | 2406.16710v1 | null |
2024-07-01 | Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling | Min-Seop Kwak et.al. | 2406.16695v2 | null |
2024-06-24 | Repulsive Score Distillation for Diverse Sampling of Diffusion Models | Nicolas Zilberstein et.al. | 2406.16683v1 | link |
2024-06-24 | OAML: Outlier Aware Metric Learning for OOD Detection Enhancement | Heng Gao et.al. | 2406.16525v1 | link |
2024-06-24 | DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution | Aiwen Jiang et.al. | 2406.16477v1 | null |
2024-06-24 | ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance | Shuwei Shi et.al. | 2406.16476v1 | null |
2024-06-24 | Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models | Yichen Sun et.al. | 2406.16333v1 | null |
2024-06-24 | YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals | Sandeep Mishra et.al. | 2406.16273v1 | null |
2024-06-24 | Repairing Catastrophic-Neglect in Text-to-Image Diffusion Models via Attention-Guided Feature Enhancement | Zhiyuan Chang et.al. | 2406.16272v1 | null |
2024-06-24 | Video-Infinity: Distributed Long Video Generation | Zhenxiong Tan et.al. | 2406.16260v1 | null |
2024-06-23 | Provable Statistical Rates for Consistency Diffusion Models | Zehao Dou et.al. | 2406.16213v1 | null |
2024-06-23 | UDHF2-Net: An Uncertainty-diffusion-model-based High-Frequency TransFormer Network for High-accuracy Interpretation of Remotely Sensed Imagery | Pengfei Zhang et.al. | 2406.16129v1 | null |
2024-06-23 | Diffusion Spectral Representation for Reinforcement Learning | Dmitry Shribak et.al. | 2406.16121v1 | null |
2024-06-23 | Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification | Inès Hyeonsu Kim et.al. | 2406.16042v1 | null |
2024-06-23 | TimeAutoDiff: Combining Autoencoder and Diffusion model for time series tabular data synthesizing | Namjoon Suh et.al. | 2406.16028v1 | null |
2024-06-22 | PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection | Alvaro Lopez Pellcier et.al. | 2406.15921v1 | null |
2024-06-22 | Soft Masked Mamba Diffusion Model for CT to MRI Conversion | Zhenbin Wang et.al. | 2406.15910v1 | link |
2024-06-22 | EmoAttack: Emotion-to-Image Diffusion Models for Emotional Backdoor Generation | Tianyu Wei et.al. | 2406.15863v1 | null |
2024-06-22 | MVOC: a training-free multiple video object composition method with diffusion models | Wei Wang et.al. | 2406.15829v1 | null |
2024-06-22 | PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud by 2D Inpainting | Qiao Yu et.al. | 2406.15811v1 | link |
2024-06-22 | Rethinking the Diffusion Models for Numerical Tabular Data Imputation from the Perspective of Wasserstein Gradient Flow | Zhichao Chen et.al. | 2406.15762v1 | null |
2024-06-22 | Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model | Min Zhao et.al. | 2406.15735v1 | null |
2024-06-21 | Adaptive Self-Supervised Consistency-Guided Diffusion Model for Accelerated MRI Reconstruction | Mojtaba Safari et.al. | 2406.15656v1 | null |
2024-06-21 | Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild | Nadav Orzech et.al. | 2406.15331v1 | null |
2024-06-21 | You Only Acquire Sparse-channel (YOAS): A Unified Framework for Dense-channel EEG Generation | Hongyu Chen et.al. | 2406.15269v1 | null |
2024-06-21 | Unsupervised Bayesian Generation of Synthetic CT from CBCT Using Patient-Specific Score-Based Prior | Junbo Peng et.al. | 2406.15219v1 | null |
2024-06-21 | A3D: Does Diffusion Dream about 3D Alignment? | Savva Ignatyev et.al. | 2406.15020v1 | null |
2024-06-21 | Probabilistic and Differentiable Wireless Simulation with Geometric Transformers | Thomas Hehn et.al. | 2406.14995v1 | null |
2024-06-21 | VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation | Zixuan Chen et.al. | 2406.14964v1 | null |
2024-06-24 | LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multi-modal Foundation Models | Mengdan Zhu et.al. | 2406.14862v2 | null |
2024-06-21 | Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models | Jie Ren et.al. | 2406.14855v1 | null |
2024-06-21 | DExter: Learning and Controlling Performance Expression with Diffusion Models | Huan Zhang et.al. | 2406.14850v1 | null |
2024-06-21 | Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning | Xu Han et.al. | 2406.14847v1 | null |
2024-06-21 | Latent diffusion models for parameterization and data assimilation of facies-based geomodels | Guido Di Federico et.al. | 2406.14815v1 | null |
2024-06-21 | Probabilistic Emulation of a Global Climate Model with Spherical DYffusion | Salva Rühling Cachay et.al. | 2406.14798v1 | null |
2024-06-20 | Regularized Distribution Matching Distillation for One-step Unpaired Image-to-Image Translation | Denis Rakitin et.al. | 2406.14762v1 | null |
2024-06-20 | Diffusion-Based Failure Sampling for Cyber-Physical Systems | Harrison Delecki et.al. | 2406.14761v1 | link |
2024-06-20 | Computing Nonequilibrium Responses with Score-shifted Stochastic Differential Equations | Jérémie Klinger et.al. | 2406.14752v1 | null |
2024-06-20 | Stylebreeder: Exploring and Democratizing Artistic Styles through Text-to-Image Models | Matthew Zheng et.al. | 2406.14599v1 | null |
2024-06-20 | A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Xincheng Shuai et.al. | 2406.14555v1 | link |
2024-06-21 | Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation | Eyal Michaeli et.al. | 2406.14551v2 | link |
2024-06-20 | Consistency Models Made Easy | Zhengyang Geng et.al. | 2406.14548v1 | link |
2024-06-20 | Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps | Nikita Starodubcev et.al. | 2406.14539v1 | null |
2024-06-20 | V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data | Rotem Shalev-Arkushin et.al. | 2406.14510v1 | null |
2024-06-20 | SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset | Josef Dai et.al. | 2406.14477v1 | link |
2024-06-20 | CollaFuse: Collaborative Diffusion Models | Simeon Allmendinger et.al. | 2406.14429v1 | link |
2024-06-20 | Active Diffusion Subsampling | Oisin Nolan et.al. | 2406.14388v1 | null |
2024-06-20 | In Tree Structure Should Sentence Be Generated | Yaguang Li et.al. | 2406.14189v1 | link |
2024-06-20 | CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation | Tingwei Liu et.al. | 2406.14186v1 | link |
2024-06-20 | ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning | Zhongjie Duan et.al. | 2406.14130v1 | null |
2024-06-20 | HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models | Xinrui Zhou et.al. | 2406.14098v1 | null |
2024-06-20 | Bridging bulk and surface: An interacting particle system towards the field-road diffusion model | Matthieu Alfaro et.al. | 2406.14093v1 | null |
2024-06-20 | A Practical Diffusion Path for Sampling | Omar Chehab et.al. | 2406.14040v1 | null |
2024-06-20 | Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning | Tingyi Lin et.al. | 2406.13977v1 | null |
2024-06-20 | Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models | Yuan Zhong et.al. | 2406.13942v1 | null |
2024-06-20 | EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations | Jie Ren et.al. | 2406.13933v1 | null |
2024-06-19 | INFusion: Diffusion Regularized Implicit Neural Representations for 2D and 3D accelerated MRI reconstruction | Yamin Arefeen et.al. | 2406.13895v1 | null |
2024-06-19 | Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics | Weitong Zhang et.al. | 2406.13652v1 | null |
2024-06-19 | On AI-Inspired UI-Design | Jialiang Wei et.al. | 2406.13631v1 | null |
2024-06-19 | Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy | Elena Tomasi et.al. | 2406.13627v1 | null |
2024-06-19 | Enhance the Image: Super Resolution using Artificial Intelligence in MRI | Ziyu Li et.al. | 2406.13625v1 | null |
2024-06-19 | Image Distillation for Safe Data Sharing in Histopathology | Zhe Li et.al. | 2406.13536v1 | null |
2024-06-19 | Multi-messenger modeling of the Monogem pulsar halo | Youyou Li et.al. | 2406.13426v1 | null |
2024-06-24 | Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images | Haruo Fujiwara et.al. | 2406.13393v2 | null |
2024-06-19 | ARDuP: Active Region Video Diffusion for Universal Policies | Shuaiyi Huang et.al. | 2406.13301v1 | null |
2024-06-19 | AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models | Ken Chen et.al. | 2406.13272v1 | null |
2024-06-19 | Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction | Xinyang Wang et.al. | 2406.13252v1 | null |
2024-06-19 | Neural Residual Diffusion Models for Deep Scalable Vision Generation | Zhiyuan Ma et.al. | 2406.13215v1 | null |
2024-06-24 | Surgical Triplet Recognition via Diffusion Model | Daochang Liu et.al. | 2406.13210v2 | null |
2024-06-19 | Diffusion Model-based FOD Restoration from High Distortion in dMRI | Shuo Huang et.al. | 2406.13209v1 | null |
2024-06-21 | Conditional score-based diffusion models for solving inverse problems in mechanics | Agnimitra Dasgupta et.al. | 2406.13154v2 | null |
2024-06-19 | MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction | Jiaqi Cui et.al. | 2406.13150v1 | null |
2024-06-18 | Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models | Paul Henderson et.al. | 2406.13099v1 | null |
2024-06-18 | MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification | Harrison Gietz et.al. | 2406.13066v1 | link |
2024-06-18 | Evaluating the design space of diffusion-based generative models | Yuqing Wang et.al. | 2406.12839v1 | null |
2024-06-18 | Neural Approximate Mirror Maps for Constrained Diffusion Models | Berthy T. Feng et.al. | 2406.12816v1 | null |
2024-06-18 | Extracting Training Data from Unconditional Diffusion Models | Yunhao Chen et.al. | 2406.12752v1 | null |
2024-06-18 | Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation | Miseul Kim et.al. | 2406.12688v1 | null |
2024-06-21 | GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models | Yongtao Ge et.al. | 2406.12671v2 | link |
2024-06-18 | Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images | Shivank Garg et.al. | 2406.12592v1 | link |
2024-06-18 | Training Diffusion Models with Federated Learning | Matthijs de Goede et.al. | 2406.12575v1 | null |
2024-06-18 | Variational Distillation of Diffusion Policies into Mixture of Experts | Hongyi Zhou et.al. | 2406.12538v1 | null |
2024-06-18 | HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors | Panwang Pan et.al. | 2406.12459v1 | link |
2024-06-18 | Planning Using Schrödinger Bridge Diffusion Models | Adarsh Srivastava et.al. | 2406.12458v1 | link |
2024-06-18 | Deep Temporal Deaggregation: Large-Scale Spatio-Temporal Generative Models | David Bergström et.al. | 2406.12423v1 | null |
2024-06-18 | TADM: Temporally-Aware Diffusion Model for Neurodegenerative Progression on Brain MRI | Mattia Litrico et.al. | 2406.12411v1 | null |
2024-06-18 | Effective Generation of Feasible Solutions for Integer Programming via Guided Diffusion | Hao Zeng et.al. | 2406.12349v1 | null |
2024-06-18 | Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment | Yiheng Li et.al. | 2406.12303v1 | null |
2024-06-17 | COT Flow: Learning Optimal-Transport Image Sampling and Editing by Contrastive Pairs | Xinrui Zu et.al. | 2406.12140v1 | null |
2024-06-17 | Adding Conditional Control to Diffusion Models with Reinforcement Learning | Yulai Zhao et.al. | 2406.12120v1 | null |
2024-06-17 | Optimal withdrawals in a general diffusion model with control rates subject to a state-dependent upper bound | Hélène Guérin et.al. | 2406.12067v1 | null |
2024-06-17 | ARTIST: Improving the Generation of Text-rich Images by Disentanglement | Jianyi Zhang et.al. | 2406.12044v1 | null |
2024-06-17 | Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models | Alireza Ganjdanesh et.al. | 2406.12042v1 | null |
2024-06-17 | Decomposed evaluations of geographic disparities in text-to-image models | Abhishek Sureddy et.al. | 2406.11988v1 | null |
2024-06-17 | Crossfusor: A Cross-Attention Transformer Enhanced Conditional Diffusion Model for Car-Following Trajectory Prediction | Junwei You et.al. | 2406.11941v1 | null |
2024-06-17 | Bridging Design Gaps: A Parametric Data Completion Approach With Graph Guided Diffusion Models | Rui Zhou et.al. | 2406.11934v1 | null |
2024-06-16 | Mixture-of-Subspaces in Low-Rank Adaptation | Taiqiang Wu et.al. | 2406.11909v1 | null |
2024-06-17 | Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models | Bingqi Ma et.al. | 2406.11831v1 | null |
2024-06-17 | MegaScenes: Scene-Level View Synthesis at Scale | Joseph Tung et.al. | 2406.11819v1 | null |
2024-06-17 | DiffMM: Multi-Modal Diffusion Model for Recommendation | Yangqin Jiang et.al. | 2406.11781v1 | null |
2024-06-17 | Latent Denoising Diffusion GAN: Faster sampling, Higher image quality | Luan Thanh Trinh et.al. | 2406.11713v1 | link |
2024-06-17 | MusicScore: A Dataset for Music Score Modeling and Generation | Yuheng Lin et.al. | 2406.11462v1 | null |
2024-06-17 | AnyTrans: Translate AnyText in the Image with Large Scale Models | Zhipeng Qian et.al. | 2406.11432v1 | null |
2024-06-17 | DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer | Keon Lee et.al. | 2406.11427v1 | null |
2024-06-17 | Unfolding Time: Generative Modeling for Turbulent Flows in 4D | Abdullah Saydemir et.al. | 2406.11390v1 | null |
2024-06-17 | Diffusion Models in Low-Level Vision: A Survey | Chunming He et.al. | 2406.11138v1 | null |
2024-06-16 | Exploiting Diffusion Prior for Out-of-Distribution Detection | Armando Zhu et.al. | 2406.11105v1 | null |
2024-06-16 | An Analysis on Quantizing Diffusion Transformers | Yuewei Yang et.al. | 2406.11100v1 | null |
2024-06-16 | A Bayesian Drift-Diffusion Model of Schachter-Singer's Two Factor Theory of Emotion | Lance Ying et.al. | 2406.11086v1 | null |
2024-06-16 | ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models | Kaifeng Gao et.al. | 2406.10981v1 | null |
2024-06-16 | Graph Neural Reaction Diffusion Models | Moshe Eliasof et.al. | 2406.10871v1 | null |
2024-06-16 | Diffusion Model With Optimal Covariance Matching | Zijing Ou et.al. | 2406.10808v1 | null |
2024-06-16 | Diffusion Models Are Promising for Ab Initio Structure Solutions from Nanocrystalline Powder Diffraction Data | Gabe Guo et.al. | 2406.10796v1 | link |
2024-06-15 | Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft | Ian Vyse et.al. | 2406.10724v1 | link |
2024-06-18 | A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing | Ming Meng et.al. | 2406.10553v2 | null |
2024-06-15 | Self-Supervised Vision Transformer for Enhanced Virtual Clothes Try-On | Lingxiao Lu et.al. | 2406.10539v1 | null |
2024-06-15 | Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space | Mohamed Amine Ketata et.al. | 2406.10513v1 | null |
2024-06-14 | Consistency-diversity-realism Pareto fronts of conditional image generative models | Pietro Astolfi et.al. | 2406.10429v1 | null |
2024-06-14 | SigDiffusions: Score-Based Diffusion Models for Long Time Series via Log-Signature Embeddings | Barbora Barancikova et.al. | 2406.10354v1 | null |
2024-06-14 | SatDiffMoE: A Mixture of Estimation Method for Satellite Image Super-resolution with Latent Diffusion Models | Zhaoxu Luo et.al. | 2406.10225v1 | null |
2024-06-14 | DiffusionBlend: Learning 3D Image Prior through Position-aware Diffusion Score Blending for 3D Computed Tomography Reconstruction | Bowen Song et.al. | 2406.10211v1 | null |
2024-06-14 | Make It Count: Text-to-Image Generation with an Accurate Number of Objects | Lital Binyamin et.al. | 2406.10210v1 | null |
2024-06-14 | Crafting Parts for Expressive Object Composition | Harsh Rangwani et.al. | 2406.10197v1 | null |
2024-06-14 | Training-free Camera Control for Video Generation | Chen Hou et.al. | 2406.10126v1 | null |
2024-06-14 | Group and Shuffle: Efficient Structured Orthogonal Parametrization | Mikhail Gorbunov et.al. | 2406.10019v1 | null |
2024-06-14 | OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control | Yuzhong Huang et.al. | 2406.10000v1 | null |
2024-06-14 | InstructRL4Pix: Training Diffusion for Image Editing by Reinforcement Learning | Tiancheng Li et.al. | 2406.09973v1 | null |
2024-06-14 | GradeADreamer: Enhanced Text-to-3D Generation Using Gaussian Splatting and Multi-View Diffusion | Trapoom Ukarapol et.al. | 2406.09850v1 | link |
2024-06-14 | Unsupervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion | Runze Liu et.al. | 2406.09782v1 | null |
2024-06-14 | Bayesian Conditioned Diffusion Models for Inverse Problems | Alper Güngör et.al. | 2406.09768v1 | null |
2024-06-14 | Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting | Ce Hao et.al. | 2406.09767v1 | null |
2024-06-14 | ControlVAR: Exploring Controllable Visual Autoregressive Modeling | Xiang Li et.al. | 2406.09750v1 | null |
2024-06-14 | Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses | Seungwoo Yoo et.al. | 2406.09728v1 | null |
2024-06-14 | Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models | Changjiang Li et.al. | 2406.09669v1 | null |
2024-06-14 | New algorithms for sampling and diffusion models | Xicheng Zhang et.al. | 2406.09665v1 | null |
2024-06-13 | Turns Out I'm Not Real: Towards Robust Detection of AI-Generated Videos | Qingyuan Liu et.al. | 2406.09601v1 | null |
2024-06-13 | Improving Consistency Models with Generator-Induced Coupling | Thibaut Issenhuth et.al. | 2406.09570v1 | link |
2024-06-13 | e-COP : Episodic Constrained Optimization of Policies | Akhil Agnihotri et.al. | 2406.09563v1 | null |
2024-06-13 | My Body My Choice: Human-Centric Full-Body Anonymization | Umur Aybars Ciftci et.al. | 2406.09553v1 | null |
2024-06-13 | Between Randomness and Arbitrariness: Some Lessons for Reliable Machine Learning at Scale | A. Feder Cooper et.al. | 2406.09548v1 | null |
2024-06-13 | CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making | Zibin Dong et.al. | 2406.09509v1 | link |
2024-06-13 | Fair Data Generation via Score-based Diffusion Model | Yujie Lin et.al. | 2406.09495v1 | null |
2024-06-13 | Language-driven Grasp Detection | An Dinh Vuong et.al. | 2406.09489v1 | null |
2024-06-13 | Is Diffusion Model Safe? Severe Data Leakage via Gradient-Guided Diffusion Model | Jiayang Meng et.al. | 2406.09484v1 | null |
2024-06-13 | Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models | Qihao Liu et.al. | 2406.09416v1 | null |
2024-06-13 | An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels | Duy-Kien Nguyen et.al. | 2406.09415v1 | null |
2024-06-13 | Interpreting the Weight Space of Customized Diffusion Models | Amil Dravid et.al. | 2406.09413v1 | link |
2024-06-13 | ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing | Jun-Kun Chen et.al. | 2406.09404v1 | null |
2024-06-13 | Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion | Linzhan Mou et.al. | 2406.09402v1 | null |
2024-06-13 | OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation | Junke Wang et.al. | 2406.09399v1 | link |
2024-06-13 | SimGen: Simulator-conditioned Driving Scene Generation | Yunsong Zhou et.al. | 2406.09386v1 | null |
2024-06-13 | CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models | Yigit Ekin et.al. | 2406.09368v1 | null |
2024-06-13 | Understanding Hallucinations in Diffusion Models through Mode Interpolation | Sumukh K Aithal et.al. | 2406.09358v1 | link |
2024-06-13 | Advancing Graph Generation through Beta Diffusion | Yilin He et.al. | 2406.09357v1 | null |
2024-06-13 | StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning | Giuseppe Vecchio et.al. | 2406.09293v1 | null |
2024-06-13 | Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models | Ziyi Wu et.al. | 2406.09292v1 | null |
2024-06-14 | Generative Inverse Design of Crystal Structures via Diffusion Models with Transformers | Izumi Takahara et.al. | 2406.09263v2 | null |
2024-06-13 | EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts | Yucheng Han et.al. | 2406.09162v1 | null |
2024-06-13 | Complex Image-Generative Diffusion Transformer for Audio Denoising | Junhui Li et.al. | 2406.09161v1 | null |
2024-06-13 | Diffusion Gaussian Mixture Audio Denoise | Pu Wang et.al. | 2406.09154v1 | null |
2024-06-13 | Operator-informed score matching for Markov diffusion models | Zheyang Shen et.al. | 2406.09084v1 | null |
2024-06-13 | EquiPrompt: Debiasing Diffusion Models via Iterative Bootstrapping in Chain of Thoughts | Zahraa Al Sahili et.al. | 2406.09070v1 | null |
2024-06-13 | Preserving Identity with Variational Score for General-purpose 3D Editing | Duong H. Le et.al. | 2406.08953v1 | null |
2024-06-13 | Step-by-Step Diffusion: An Elementary Tutorial | Preetum Nakkiran et.al. | 2406.08929v1 | null |
2024-06-13 | Heuristics for Influence Maximization with Tiered Influence and Activation thresholds | Rahul Kumar Gautam et.al. | 2406.08876v1 | null |
2024-06-13 | COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing | Jiangshan Wang et.al. | 2406.08850v1 | null |
2024-06-13 | FouRA: Fourier Low Rank Adaptation | Shubhankar Borse et.al. | 2406.08798v1 | null |
2024-06-13 | Batch-Instructed Gradient for Prompt Evolution:Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis | Xinrui Yang et.al. | 2406.08713v1 | null |
2024-06-12 | Vivid-ZOO: Multi-View Video Generation with Diffusion Model | Bing Li et.al. | 2406.08659v1 | null |
2024-06-12 | How to Distinguish AI-Generated Images from Authentic Photographs | Negar Kamali et.al. | 2406.08651v1 | null |
2024-06-12 | FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion | George Cazenavette et.al. | 2406.08603v1 | null |
2024-06-12 | Predicting Cascading Failures with a Hyperparametric Diffusion Model | Bin Xiang et.al. | 2406.08522v1 | null |
2024-06-12 | Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation | Raphael Tang et.al. | 2406.08482v1 | null |
2024-06-12 | Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models | Yuxuan Xue et.al. | 2406.08475v1 | null |
2024-06-12 | | Pranath Reddy et.al. | 2406.08442v1 | null |
2024-06-12 | Diffusion Soup: Model Merging for Text-to-Image Diffusion Models | Benjamin Biggs et.al. | 2406.08431v1 | null |
2024-06-12 | FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation | Xinzhi Mu et.al. | 2406.08392v1 | null |
2024-06-12 | Diff-A-Riff: Musical Accompaniment Co-creation via Latent Diffusion Models | Javier Nistal et.al. | 2406.08384v1 | null |
2024-06-12 | 2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction | Tianqi Chen et.al. | 2406.08374v1 | null |
2024-06-12 | WMAdapter: Adding WaterMark Control to Latent Diffusion Models | Hai Ci et.al. | 2406.08337v1 | null |
2024-06-12 | Dataset Enhancement with Instance-Level Augmentations | Orest Kupyn et.al. | 2406.08249v1 | link |
2024-06-12 | Diffusion-Promoted HDR Video Reconstruction | Yuanshen Guan et.al. | 2406.08204v1 | null |
2024-06-12 | LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation | Wenhao Guan et.al. | 2406.08203v1 | null |
2024-06-14 | One-Step Effective Diffusion Network for Real-World Image Super-Resolution | Rongyuan Wu et.al. | 2406.08177v2 | link |
2024-06-12 | Defect-related Anomalous Mobility of Small polarons in Oxides: the Case of Congruent Lithium Niobate | Anton Pfannstiel et.al. | 2406.08123v1 | null |
2024-06-12 | Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement | Runyi Yu et.al. | 2406.08096v1 | null |
2024-06-12 | CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models | Hyungjin Chung et.al. | 2406.08070v1 | null |
2024-06-12 | Ablation Based Counterfactuals | Zheng Dai et.al. | 2406.07908v1 | null |
2024-06-12 | DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition | Jiacheng Liu et.al. | 2406.07852v1 | null |
2024-06-12 | Hierarchical Patch Diffusion Models for High-Resolution Video Generation | Ivan Skorokhodov et.al. | 2406.07792v1 | null |
2024-06-11 | HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness | Zihui Xue et.al. | 2406.07754v1 | null |
2024-06-11 | CUPID: Contextual Understanding of Prompt-conditioned Image Distributions | Yayan Zhao et.al. | 2406.07699v1 | null |
2024-06-11 | Treeffuser: Probabilistic Predictions via Conditional Diffusions with Gradient-Boosted Trees | Nicolas Beltran-Velez et.al. | 2406.07658v1 | link |
2024-06-11 | Pre-training Feature Guided Diffusion Model for Speech Enhancement | Yiyuan Yang et.al. | 2406.07646v1 | null |
2024-06-11 | An Image is Worth 32 Tokens for Reconstruction and Generation | Qihang Yu et.al. | 2406.07550v1 | null |
2024-06-11 | Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance | Kuan Heng Lin et.al. | 2406.07540v1 | null |
2024-06-11 | Simple and Effective Masked Diffusion Language Models | Subham Sekhar Sahoo et.al. | 2406.07524v1 | link |
2024-06-11 | Neural Gaffer: Relighting Any Object via Diffusion | Haian Jin et.al. | 2406.07520v1 | null |
2024-06-11 | Instant 3D Human Avatar Generation using Image Diffusion Models | Nikos Kolotouros et.al. | 2406.07516v1 | null |
2024-06-11 | Flow Map Matching | Nicholas M. Boffi et.al. | 2406.07507v1 | null |
2024-06-11 | GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection | Hang Yao et.al. | 2406.07487v1 | null |
2024-06-11 | Image Neural Field Diffusion Models | Yinbo Chen et.al. | 2406.07480v1 | null |
2024-06-11 | 4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models | Heng Yu et.al. | 2406.07472v1 | null |
2024-06-11 | Noise-robust Speech Separation with Fast Generative Correction | Helin Wang et.al. | 2406.07461v1 | null |
2024-06-11 | DiffCom: Channel Received Signal is a Natural Condition to Guide Diffusion Posterior Sampling | Sixian Wang et.al. | 2406.07390v1 | null |
2024-06-12 | Towards Realistic Data Generation for Real-World Super-Resolution | Long Peng et.al. | 2406.07255v2 | null |
2024-06-12 | Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models | Athanasios Tragakis et.al. | 2406.07251v2 | link |
2024-06-11 | Eye-for-an-eye: Appearance Transfer with Semantic Correspondence in Diffusion Models | Sooyeon Go et.al. | 2406.07008v1 | null |
2024-06-11 | DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach | Zhang Liu et.al. | 2406.06986v1 | null |
2024-06-11 | Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey | Ping Liu et.al. | 2406.06965v1 | null |
2024-06-11 | Unleashing the Denoising Capability of Diffusion Prior for Solving Inverse Problems | Jiawei Zhang et.al. | 2406.06959v1 | link |
2024-06-11 | AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising | Zigeng Chen et.al. | 2406.06911v1 | link |
2024-06-11 | Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation | Yuanhao Zhai et.al. | 2406.06890v1 | null |
2024-06-09 | Latent Diffusion Model-Enabled Real-Time Semantic Communication Considering Semantic Ambiguities and Channel Noises | Jianhua Pei et.al. | 2406.06644v1 | null |
2024-06-10 | IllumiNeRF: 3D Relighting without Inverse Rendering | Xiaoming Zhao et.al. | 2406.06527v1 | null |
2024-06-10 | Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | Peize Sun et.al. | 2406.06525v1 | link |
2024-06-10 | Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer | Sigal Raab et.al. | 2406.06508v1 | link |
2024-06-10 | AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction | Zhen Xing et.al. | 2406.06465v1 | null |
2024-06-10 | Cometh: A continuous-time discrete-state graph diffusion model | Antoine Siraudin et.al. | 2406.06449v1 | null |
2024-06-10 | Margin-aware Preference Optimization for Aligning Diffusion Models without Reference | Jiwoo Hong et.al. | 2406.06424v1 | null |
2024-06-10 | Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization | Yi Gu et.al. | 2406.06382v1 | link |
2024-06-10 | Improving Deep Learning-based Automatic Cranial Defect Reconstruction by Heavy Data Augmentation: From Image Registration to Latent Diffusion Models | Marek Wodzinski et.al. | 2406.06372v1 | null |
2024-06-10 | MVGamba: Unify 3D Content Generation as State Space Sequence Modeling | Xuanyu Yi et.al. | 2406.06367v1 | null |
2024-06-11 | Tuning-Free Visual Customization via View Iterative Self-Attention Control | Xiaojie Li et.al. | 2406.06258v2 | null |
2024-06-10 | Data Augmentation in Earth Observation: A Diffusion Model Approach | Tiago Sousa et.al. | 2406.06218v1 | null |
2024-06-10 | The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems | Philippe Gonzalez et.al. | 2406.06160v1 | null |
2024-06-10 | Thunder : Unified Regression-Diffusion Speech Enhancement with a Single Reverse Step using Brownian Bridge | Thanapat Trachu et.al. | 2406.06139v1 | null |
2024-06-10 | DiffInject: Revisiting Debias via Synthetic Data Generation using Diffusion-based Style Injection | Donggeun Ko et.al. | 2406.06134v1 | null |
2024-06-10 | ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models | Meng-Li Shih et.al. | 2406.06133v1 | null |
2024-06-10 | Latent Representation Matters: Human-like Sketches in One-shot Drawing Tasks | Victor Boutin et.al. | 2406.06079v1 | null |
2024-06-10 | Generalizable Human Gaussians from Single-View Image | Jinnan Chen et.al. | 2406.06050v1 | link |
2024-06-10 | Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training | Ke Niu et.al. | 2406.06045v1 | link |
2024-06-10 | FRAG: Frequency Adapting Group for Diffusion Video Editing | Sunjae Yoon et.al. | 2406.06044v1 | null |
2024-06-09 | Improving Antibody Design with Force-Guided Sampling in Diffusion Models | Paulina Kulytė et.al. | 2406.05832v1 | null |
2024-06-12 | MLCM: Multistep Consistency Distillation of Latent Diffusion Model | Qingsong Xie et.al. | 2406.05768v3 | null |
2024-06-09 | Binarized Diffusion Model for Image Super-Resolution | Zheng Chen et.al. | 2406.05723v1 | link |
2024-06-11 | Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling | Yuepeng Jiang et.al. | 2406.05681v2 | null |
2024-06-09 | PaRa: Personalizing Text-to-Image Diffusion via Parameter Rank Reduction | Shangyu Chen et.al. | 2406.05641v1 | null |
2024-06-08 | Autoregressive Diffusion Transformer for Text-to-Speech Synthesis | Zhijun Liu et.al. | 2406.05551v1 | null |
2024-06-08 | Exploring Bridges Between Creative Coding and Visual Generative AI | Jiaqi Wu et.al. | 2406.05508v1 | null |
2024-06-08 | Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis | Zanlin Ni et.al. | 2406.05478v1 | null |
2024-06-08 | 3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Regimes | Aghiles Kebaili et.al. | 2406.05421v1 | link |
2024-06-08 | Mean-field Chaos Diffusion Models | Sungwoo Park et.al. | 2406.05396v1 | null |
2024-06-12 | MotionClone: Training-Free Motion Cloning for Controllable Video Generation | Pengyang Ling et.al. | 2406.05338v2 | null |
2024-06-08 | LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance | Shihao Chen et.al. | 2406.05325v1 | null |
2024-06-08 | CoBL-Diffusion: Diffusion-Based Conditional Robot Planning in Dynamic Environments Using Control Barrier and Lyapunov Functions | Kazuki Mizuta et.al. | 2406.05309v1 | null |
2024-06-07 | Modelling effects of moisture on mechanical properties of crosslinked polyurethane adhesives | S. P. Josyula et.al. | 2406.05278v1 | null |
2024-06-07 | Efficient Differentially Private Fine-Tuning of Diffusion Models | Jing Liu et.al. | 2406.05257v1 | null |
2024-06-07 | DiffusionPID: Interpreting Diffusion via Partial Information Decomposition | Shaurya Dewan et.al. | 2406.05191v1 | null |
2024-06-07 | CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion | Xingrui Wang et.al. | 2406.05082v1 | null |
2024-06-07 | Generative diffusion models for synthetic trajectories of heavy and light particles in turbulence | Tianyi Li et.al. | 2406.05008v1 | null |
2024-06-07 | Learning Divergence Fields for Shift-Robust Graph Representations | Qitian Wu et.al. | 2406.04963v1 | link |
2024-06-07 | Combinatorial Complex Score-based Diffusion Modelling through Stochastic Differential Equations | Adrien Carrel et.al. | 2406.04916v1 | link |
2024-06-07 | Online Continual Learning of Video Diffusion Models From a Single Video Stream | Jason Yoo et.al. | 2406.04814v1 | null |
2024-06-07 | TEDi Policy: Temporally Entangled Diffusion for Robotic Control | Sigmund H. Høeg et.al. | 2406.04806v1 | null |
2024-06-07 | Diffusion-based Generative Image Outpainting for Recovery of FOV-Truncated CT Images | Michelle Espranita Liman et.al. | 2406.04769v1 | null |
2024-06-07 | PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction | Eduard Poesina et.al. | 2406.04746v1 | link |
2024-06-07 | FlowMM: Generating Materials with Riemannian Flow Matching | Benjamin Kurt Miller et.al. | 2406.04713v1 | null |
2024-06-07 | MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models | Sanjoy Chowdhury et.al. | 2406.04673v1 | null |
2024-06-07 | GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models | Diptanu De et.al. | 2406.04654v1 | null |
2024-06-07 | Boosting Diffusion Model for Spectrogram Up-sampling in Text-to-speech: An Empirical Study | Chong Zhang et.al. | 2406.04633v1 | null |
2024-06-07 | STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting | Zenghao Chai et.al. | 2406.04629v1 | link |
2024-06-07 | CTSyn: A Foundational Model for Cross Tabular Data Generation | Xiaofeng Lin et.al. | 2406.04619v1 | null |
2024-06-07 | Diverse Intra- and Inter-Domain Activity Style Fusion for Cross-Person Generalization in Activity Recognition | Junru Zhang et.al. | 2406.04609v1 | null |
2024-06-06 | Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance | Reyhane Askari Hemmat et.al. | 2406.04551v1 | null |
2024-06-06 | Single Exposure Quantitative Phase Imaging with a Conventional Microscope using Diffusion Models | Gabriel della Maggiora et.al. | 2406.04388v1 | null |
2024-06-07 | Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion | Fangfu Liu et.al. | 2406.04338v2 | null |
2024-06-08 | Coherent Zero-Shot Visual Instruction Generation | Quynh Phung et.al. | 2406.04337v2 | null |
2024-06-06 | BitsFusion: 1.99 bits Weight Quantization of Diffusion Model | Yang Sui et.al. | 2406.04333v1 | link |
2024-06-06 | Simplified and Generalized Masked Diffusion for Discrete Data | Jiaxin Shi et.al. | 2406.04329v1 | null |
2024-06-06 | SF-V: Single Forward Video Generation Model | Zhixing Zhang et.al. | 2406.04324v1 | null |
2024-06-06 | ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories | Qianlan Yang et.al. | 2406.04323v1 | null |
2024-06-07 | DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data | Qihao Liu et.al. | 2406.04322v2 | link |
2024-06-06 | Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step | Zhanhao Liang et.al. | 2406.04314v1 | null |
2024-06-06 | Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment | Jiayi Guo et.al. | 2406.04295v1 | link |
2024-06-06 | VideoTetris: Towards Compositional Text-to-Video Generation | Ye Tian et.al. | 2406.04277v1 | link |
2024-06-06 | A Survey on 3D Human Avatar Modeling -- From Reconstruction to Generation | Ruihe Wang et.al. | 2406.04253v1 | null |
2024-06-06 | Diffusion-based image inpainting with internal learning | Nicolas Cherel et.al. | 2406.04206v1 | null |
2024-06-06 | Multistep Distillation of Diffusion Models via Moment Matching | Tim Salimans et.al. | 2406.04103v1 | null |
2024-06-06 | Enhancing Weather Predictions: Super-Resolution via Deep Diffusion Models | Jan Martinů et.al. | 2406.04099v1 | null |
2024-06-06 | LDM-RSIC: Exploring Distortion Prior with Latent Diffusion Models for Remote Sensing Image Compression | Junhui Li et.al. | 2406.03961v1 | null |
2024-06-06 | LLplace: The 3D Indoor Scene Layout Generation and Editing via Large Language Model | Yixuan Yang et.al. | 2406.03866v1 | null |
2024-06-06 | Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data | Jingyang Ou et.al. | 2406.03736v1 | null |
2024-06-06 | JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits | Minzhou Pan et.al. | 2406.03720v1 | link |
2024-06-06 | Pi-fusion: Physics-informed diffusion model for learning fluid dynamics | Jing Qiu et.al. | 2406.03711v1 | null |
2024-06-06 | Mean-variance portfolio selection in jump-diffusion model under no-shorting constraint: A viscosity solution approach | Xiaomin Shi et.al. | 2406.03709v1 | null |
2024-06-06 | BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning | Artem Zholus et.al. | 2406.03686v1 | null |
2024-06-06 | Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models | Ding Huang et.al. | 2406.03683v1 | link |
2024-06-05 | Understanding the Limitations of Diffusion Concept Algebra Through Food | E. Zhixuan Zeng et.al. | 2406.03582v1 | null |
2024-06-05 | A Geometric View of Data Complexity: Efficient Local Intrinsic Dimension Estimation with Diffusion Models | Hamidreza Kamkari et.al. | 2406.03537v1 | null |
2024-06-05 | Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input | Joachim Ott et.al. | 2406.03439v1 | null |
2024-06-05 | Text-to-Image Rectified Flow as Plug-and-Play Priors | Xiaofeng Yang et.al. | 2406.03293v1 | link |
2024-06-05 | Generative Diffusion Models for Fast Simulations of Particle Collisions at CERN | Mikołaj Kita et.al. | 2406.03233v1 | null |
2024-06-05 | Searching Priors Makes Text-to-Video Synthesis Better | Haoran Cheng et.al. | 2406.03215v1 | null |
2024-06-05 | Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion | Hao Wen et.al. | 2406.03184v1 | link |
2024-06-05 | Tiny models from tiny data: Textual and null-text inversion for few-shot distillation | Erik Landolsi et.al. | 2406.03146v1 | link |
2024-06-05 | Floating Anchor Diffusion Model for Multi-motif Scaffolding | Ke Liu et.al. | 2406.03141v1 | link |
2024-06-05 | Phy-Diff: Physics-guided Hourglass Diffusion Model for Diffusion MRI Synthesis | Juanhua Zhang et.al. | 2406.03002v1 | null |
2024-06-05 | Exploring Data Efficiency in Zero-Shot Learning with Diffusion Models | Zihan Ye et.al. | 2406.02929v1 | null |
2024-06-06 | U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation | Chenxin Li et.al. | 2406.02918v2 | null |
2024-06-05 | TSPDiffuser: Diffusion Models as Learned Samplers for Traveling Salesperson Path Planning Problems | Ryo Yonetani et.al. | 2406.02858v1 | null |
2024-06-04 | ORACLE: Leveraging Mutual Information for Consistent Character Generation with LoRAs in Diffusion Models | Kiymet Akdemir et.al. | 2406.02820v1 | null |
2024-06-04 | Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following | Qiaomu Miao et.al. | 2406.02774v1 | null |
2024-06-04 | Neural Representations of Dynamic Visual Stimuli | Jacob Yeung et.al. | 2406.02659v1 | null |
2024-06-04 | Pancreatic Tumor Segmentation as Anomaly Detection in CT Images Using Denoising Diffusion Models | Reza Babaei et.al. | 2406.02653v1 | null |
2024-06-04 | Dreamguider: Improved Training free Diffusion-based Conditional Generation | Nithin Gopalakrishnan Nair et.al. | 2406.02549v1 | null |
2024-06-06 | Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting | Inkyu Shin et.al. | 2406.02541v3 | null |
2024-06-04 | CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation | Dejia Xu et.al. | 2406.02509v1 | null |
2024-06-04 | Guiding a Diffusion Model with a Bad Version of Itself | Tero Karras et.al. | 2406.02507v1 | null |
2024-06-04 | Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation | Jiajun Wang et.al. | 2406.02485v1 | link |
2024-06-04 | Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion | Colin Hansen et.al. | 2406.02477v1 | null |
2024-06-04 | Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems | Jason Hu et.al. | 2406.02462v1 | null |
2024-06-04 | RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting | Qi Wang et.al. | 2406.02461v1 | null |
2024-06-04 | Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models | Dominik Hintersdorf et.al. | 2406.02366v1 | link |
2024-06-05 | Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Clement Chadebec et.al. | 2406.02347v2 | link |
2024-06-05 | SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models | Dongchao Yang et.al. | 2406.02328v2 | null |
2024-06-04 | A Survey of Transformer Enabled Time Series Synthesis | Alexander Sommers et.al. | 2406.02322v1 | null |
2024-06-04 | Neural Thermodynamic Integration: Free Energies from Energy-based Diffusion Models | Bálint Máté et.al. | 2406.02313v1 | null |
2024-06-04 | I4VGen: Image as Stepping Stone for Text-to-Video Generation | Xiefan Guo et.al. | 2406.02230v1 | null |
2024-06-04 | GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon | Sanhita Pathak et.al. | 2406.02184v1 | null |
2024-06-04 | The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise | Yuanhao Ban et.al. | 2406.01970v1 | null |
2024-06-04 | Plug-and-Play Diffusion Distillation | Yi-Ting Hsiao et.al. | 2406.01954v1 | null |
2024-06-04 | Generating Synthetic Net Load Data with Physics-informed Diffusion Model | Shaorong Zhang et.al. | 2406.01913v1 | null |
2024-06-06 | Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation | Yue Ma et.al. | 2406.01900v2 | null |
2024-06-04 | Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models | Wenzhuo Tang et.al. | 2406.01899v1 | link |
2024-06-04 | MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training | Kengo Uchida et.al. | 2406.01867v1 | null |
2024-06-03 | L-MAGIC: Language Model Assisted Generation of Images with Coherence | Zhipeng Cai et.al. | 2406.01843v1 | link |
2024-06-03 | Diffusion Boosted Trees | Xizewen Han et.al. | 2406.01813v1 | null |
2024-06-03 | DEFT: Efficient Finetuning of Conditional Diffusion Models by Learning the Generalised | Alexander Denker et.al. | 2406.01781v1 | null |
2024-06-03 | A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization | Sebastian Sanokowski et.al. | 2406.01661v1 | link |
2024-06-03 | CoLa-DCE -- Concept-guided Latent Diffusion Counterfactual Explanations | Franz Motzkus et.al. | 2406.01649v1 | null |
2024-06-03 | DiffUHaul: A Training-Free Method for Object Dragging in Images | Omri Avrahami et.al. | 2406.01594v1 | null |
2024-06-03 | ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation | Guanxing Lu et.al. | 2406.01586v1 | null |
2024-06-03 | Long and Short Guidance in Score identity Distillation for One-Step Text-to-Image Generation | Mingyuan Zhou et.al. | 2406.01561v1 | null |
2024-06-03 | Robust Classification by Coupling Data Mollification with Label Smoothing | Markus Heinonen et.al. | 2406.01494v1 | null |
2024-06-04 | DA-HFNet: Progressive Fine-Grained Forgery Image Detection and Localization Based on Dual Attention | Yang Liu et.al. | 2406.01489v2 | null |
2024-06-03 | DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors | Tianyu Huang et.al. | 2406.01476v1 | link |
2024-06-03 | ED-SAM: An Efficient Diffusion Sampling Approach to Domain Generalization in Vision-Language Foundation Models | Thanh-Dat Truong et.al. | 2406.01432v1 | null |
2024-06-03 | Differentially Private Fine-Tuning of Diffusion Models | Yu-Lin Tsai et.al. | 2406.01355v1 | null |
2024-06-03 | Important node identification for complex networks based on improved Electre Multi-Attribute fusion | Qi Cao et.al. | 2406.01341v1 | null |
2024-06-03 | HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models | Mengcheng Li et.al. | 2406.01334v1 | null |
2024-06-03 | Report on Methods and Applications for Crafting 3D Humans | Lei Liu et.al. | 2406.01223v1 | null |
2024-06-03 | UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation | Xiang Wang et.al. | 2406.01188v1 | null |
2024-06-03 | Dimba: Transformer-Mamba Diffusion Models | Zhengcong Fei et.al. | 2406.01159v1 | null |
2024-06-04 | Towards Practical Single-shot Motion Synthesis | Konstantinos Roditakis et.al. | 2406.01136v2 | null |
2024-06-03 | ** | Pengtao Chen et.al. | 2406.01125v1 | null |
2024-06-03 | SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models | Qilong Zhangli et.al. | 2406.01062v1 | null |
2024-06-03 | Constraint-Aware Diffusion Models for Trajectory Optimization | Anjian Li et.al. | 2406.00990v1 | null |
2024-06-03 | MultiEdits: Simultaneous Multi-Aspect Editing with Text-to-Image Diffusion Models | Mingzhen Huang et.al. | 2406.00985v1 | null |
2024-06-03 | Faster Diffusion-based Sampling with Randomized Midpoints: Sequential and Parallel | Shivam Gupta et.al. | 2406.00924v1 | null |
2024-06-03 | Demystifying SGD with Doubly Stochastic Gradients | Kyurae Kim et.al. | 2406.00920v1 | null |
2024-06-03 | ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation | Shaoshu Yang et.al. | 2406.00908v1 | link |
2024-06-02 | DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection | Yewon Lim et.al. | 2406.00856v1 | link |
2024-06-02 | Diffusion-Inspired Quantum Noise Mitigation in Parameterized Quantum Circuits | Hoang-Quan Nguyen et.al. | 2406.00843v1 | null |
2024-06-02 | Invisible Backdoor Attacks on Diffusion Models | Sen Li et.al. | 2406.00816v1 | link |
2024-05-31 | Mixed Diffusion for 3D Indoor Scene Synthesis | Siyi Hu et.al. | 2405.21066v1 | null |
2024-05-31 | Unified Directly Denoising for Both Variance Preserving and Variance Exploding Diffusion Models | Jingjing Wang et.al. | 2405.21059v1 | null |
2024-05-31 | Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models | Xinxi Zhang et.al. | 2405.21050v1 | null |
2024-05-31 | Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling | Jiatao Gu et.al. | 2405.21048v1 | null |
2024-05-31 | Amortizing intractable inference in diffusion models for vision, language, and control | Siddarth Venkatraman et.al. | 2405.20971v1 | link |
2024-05-31 | Flow matching achieves minimax optimal convergence | Kenji Fukumizu et.al. | 2405.20879v1 | null |
2024-05-31 | MegActor: Harness the Power of Raw Video for Vivid Portrait Animation | Shurong Yang et.al. | 2405.20851v1 | link |
2024-06-03 | Stratified Avatar Generation from Sparse Observations | Han Feng et.al. | 2405.20786v2 | null |
2024-05-31 | Share Your Secrets for Privacy! Confidential Forecasting with Vertical Federated Learning | Aditya Shankar et.al. | 2405.20761v1 | link |
2024-05-31 | Information Theoretic Text-to-Image Alignment | Chao Wang et.al. | 2405.20759v1 | null |
2024-05-31 | Diffusion Models Are Innate One-Step Generators | Bowen Zheng et.al. | 2405.20750v1 | link |
2024-05-31 | Unleashing the Potential of Diffusion Models for Incomplete Data Imputation | Hengrui Zhang et.al. | 2405.20690v1 | null |
2024-05-31 | Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling | Kidist Amde Mekonnen et.al. | 2405.20675v1 | link |
2024-05-31 | 4Diffusion: Multi-view Video Diffusion Model for 4D Generation | Haiyu Zhang et.al. | 2405.20674v1 | null |
2024-05-31 | Fourier123: One Image to High-Quality 3D Object Generation with Hybrid Fourier Score Distillation | Shuzhou Yang et.al. | 2405.20669v1 | null |
2024-05-31 | GenMix: Combining Generative and Mixture Data Augmentation for Medical Image Classification | Hansang Lee et.al. | 2405.20650v1 | null |
2024-06-03 | Stochastic Optimal Control for Diffusion Bridges in Function Spaces | Byoungwoo Park et.al. | 2405.20630v2 | null |
2024-05-31 | Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization | Yisu Liu et.al. | 2405.20584v1 | null |
2024-05-31 | Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning | Linjiajie Fang et.al. | 2405.20555v1 | link |
2024-05-30 | Diffusion On Syntax Trees For Program Synthesis | Shreyas Kapur et.al. | 2405.20519v1 | null |
2024-05-30 | Slight Corruption in Pre-training Data Makes Better Diffusion Models | Hao Chen et.al. | 2405.20494v1 | null |
2024-05-30 | Is Synthetic Data all We Need? Benchmarking the Robustness of Models Trained with Synthetic Images | Krishnakant Singh et.al. | 2405.20469v1 | null |
2024-05-30 | P-MSDiff: Parallel Multi-Scale Diffusion for Remote Sensing Image Segmentation | Qi Zhang et.al. | 2405.20443v1 | null |
2024-05-30 | Gradient Inversion of Federated Diffusion Models | Jiyue Huang et.al. | 2405.20380v1 | null |
2024-05-30 | Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image | Kailu Wu et.al. | 2405.20343v1 | null |
2024-05-30 | VividDream: Generating 3D Scene with Ambient Dynamics | Yao-Chih Lee et.al. | 2405.20334v1 | null |
2024-05-30 | MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | Shuyuan Tu et.al. | 2405.20325v1 | null |
2024-05-30 | Don't drop your samples! Coherence-aware training benefits Conditional diffusion | Nicolas Dufour et.al. | 2405.20324v1 | null |
2024-05-30 | Improving the Training of Rectified Flows | Sangyun Lee et.al. | 2405.20320v1 | link |
2024-05-30 | DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation | Zachary Novack et.al. | 2405.20289v1 | null |
2024-06-02 | MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model | Muyao Niu et.al. | 2405.20222v2 | link |
2024-05-30 | Boost Your Own Human Image Generation Model via Direct Preference Optimization with AI Feedback | Sanghyeon Na et.al. | 2405.20216v1 | null |
2024-05-30 | MotionDreamer: Zero-Shot 3D Mesh Animation from Video Diffusion Models | Lukas Uzolas et.al. | 2405.20155v1 | null |
2024-06-03 | DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild | Honghao Fu et.al. | 2405.19996v3 | link |
2024-05-30 | DiffPhysBA: Diffusion-based Physical Backdoor Attack against Person Re-Identification in Real-World | Wenli Sun et.al. | 2405.19990v1 | null |
2024-06-04 | PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting | Qiaowei Miao et.al. | 2405.19957v2 | null |
2024-05-30 | Exploring Diffusion Models' Corruption Stage in Few-Shot Fine-tuning and Mitigating with Bayesian Neural Networks | Xiaoyu Wu et.al. | 2405.19931v1 | null |
2024-05-30 | Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models | Zeyu Fang et.al. | 2405.19878v1 | null |
2024-05-31 | HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization | Wenxuan Liu et.al. | 2405.19751v2 | null |
2024-05-30 | Streaming Video Diffusion: Online Video Editing with Diffusion Models | Feng Chen et.al. | 2405.19726v1 | link |
2024-05-30 | Text Guided Image Editing with Automatic Concept Locating and Forgetting | Jia Li et.al. | 2405.19708v1 | null |
2024-05-31 | Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Tianyu Chen et.al. | 2405.19690v2 | link |
2024-05-31 | Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models | Masatoshi Uehara et.al. | 2405.19673v2 | null |
2024-05-29 | Blind Image Restoration via Fast Diffusion Inversion | Hamadi Chihaoui et.al. | 2405.19572v1 | link |
2024-05-29 | Predicting Long-Term Human Behaviors in Discrete Representations via Physics-Guided Diffusion | Zhitian Zhang et.al. | 2405.19528v1 | null |
2024-05-29 | MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection | Raman Dutt et.al. | 2405.19458v1 | null |
2024-05-29 | Diffusion Policy Attacker: Crafting Adversarial Attacks for Diffusion-based Policies | Yipu Chen et.al. | 2405.19424v1 | null |
2024-05-29 | ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning | Ruchika Chavhan et.al. | 2405.19237v1 | link |
2024-05-30 | ** | Weitian Zhang et.al. | 2405.19203v2 | null |
2024-05-29 | Diffusion-based Dynamics Models for Long-Horizon Rollout in Offline Reinforcement Learning | Hanye Zhao et.al. | 2405.19189v1 | null |
2024-05-29 | Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization | Zhiwei Tang et.al. | 2405.18881v1 | null |
2024-05-29 | Principled Probabilistic Imaging using Diffusion Models as Plug-and-Play Priors | Zihui Wu et.al. | 2405.18782v1 | null |
2024-05-29 | RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching | Divya Nori et.al. | 2405.18768v1 | link |
2024-05-29 | Stationary distribution approximations of Two-island Wright-Fisher and seed-bank models using Stein's method | Han L. Gan et.al. | 2405.18763v1 | null |
2024-05-29 | Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | Tianle Zhang et.al. | 2405.18729v1 | null |
2024-05-29 | Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from fMRI | Che Liu et.al. | 2405.18726v1 | null |
2024-05-29 | Learning Diffeomorphism for Image Registration with Time-Continuous Networks using Semigroup Regularization | Mohammadjavad Matinkia et.al. | 2405.18684v1 | link |
2024-05-29 | Zero-to-Hero: Enhancing Zero-Shot Novel View Synthesis via Attention Map Filtering | Ido Sobol et.al. | 2405.18677v1 | null |
2024-05-28 | DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention | Lianghui Zhu et.al. | 2405.18428v1 | link |
2024-05-28 | Phased Consistency Model | Fu-Yun Wang et.al. | 2405.18407v1 | null |
2024-05-28 | RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives | Jaehong Yoon et.al. | 2405.18406v1 | link |
2024-05-28 | Multi-modal Generation via Cross-Modal In-Context Learning | Amandeep Kumar et.al. | 2405.18304v1 | link |
2024-05-28 | CT-based brain ventricle segmentation via diffusion Schrödinger Bridge without target domain ground truths | Reihaneh Teimouri et.al. | 2405.18267v1 | null |
2024-05-28 | EG4D: Explicit Generation of 4D Object without Score Distillation | Qi Sun et.al. | 2405.18132v1 | link |
2024-05-28 | Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers? | Zebin You et.al. | 2405.18029v1 | null |
2024-05-28 | Unveiling the Power of Diffusion Features For Personalized Segmentation and Retrieval | Dvir Samuel et.al. | 2405.18025v1 | null |
2024-05-28 | MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling | Bowen Zhang et.al. | 2405.18003v1 | link |
2024-05-28 | AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization | Junjie Shentu et.al. | 2405.17965v1 | null |
2024-05-28 | Improving Discrete Diffusion Models via Structured Preferential Generation | Severi Rissanen et.al. | 2405.17889v1 | null |
2024-05-28 | Diffusion Rejection Sampling | Byeonghu Na et.al. | 2405.17880v1 | link |
2024-05-30 | MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization | Tianchen Zhao et.al. | 2405.17873v2 | null |
2024-05-28 | Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation | Akio Hayakawa et.al. | 2405.17842v1 | null |
2024-05-28 | LDMol: Text-Conditioned Molecule Diffusion Model Leveraging Chemically Informative Latent Space | Jinho Chang et.al. | 2405.17829v1 | null |
2024-05-30 | Diffusion Model Patching via Mixture-of-Prompts | Seokil Ham et.al. | 2405.17825v2 | null |
2024-05-28 | ClavaDDPM: Multi-relational Data Synthesis with Cluster-guided Diffusion Models | Wei Pang et.al. | 2405.17724v1 | null |
2024-05-28 | MindFormer: A Transformer Architecture for Multi-Subject Brain Decoding via fMRI | Inhwa Han et.al. | 2405.17720v1 | null |
2024-05-27 | RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance | Jiaojiao Fan et.al. | 2405.17661v1 | null |
2024-05-27 | Alignment is Key for Applying Diffusion Models to Retrosynthesis | Najwa Laabid et.al. | 2405.17656v1 | null |
2024-05-27 | ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance | Jiannan Huang et.al. | 2405.17532v1 | null |
2024-05-27 | Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer | Ruizhi Shao et.al. | 2405.17405v1 | null |
2024-05-27 | A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training | Kai Wang et.al. | 2405.17403v1 | link |
2024-05-27 | RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control | Litu Rout et.al. | 2405.17401v1 | null |
2024-05-27 | EASI-Tex: Edge-Aware Mesh Texturing from Single Image | Sai Raj Kishore Perla et.al. | 2405.17393v1 | null |
2024-05-28 | Controllable Longer Image Animation with Diffusion Models | Qiang Wang et.al. | 2405.17306v2 | null |
2024-05-27 | Does Diffusion Beat GAN in Image Super Resolution? | Denis Kuznedelev et.al. | 2405.17261v1 | null |
2024-05-27 | DreamMat: High-quality PBR Material Generation with Geometry- and Light-aware Diffusion Models | Yuqing Zhang et.al. | 2405.17176v1 | null |
2024-05-27 | Partitioned Hankel-based Diffusion Models for Few-shot Low-dose CT Reconstruction | Wenhao Zhang et.al. | 2405.17167v1 | null |
2024-05-27 | PatchScaler: An Efficient Patch-independent Diffusion Model for Super-Resolution | Yong Liu et.al. | 2405.17158v1 | link |
2024-05-27 | Ensembling Diffusion Models via Adaptive Feature Aggregation | Cong Wang et.al. | 2405.17082v1 | null |
2024-05-27 | The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models | Saravanan Kandasamy et.al. | 2405.17068v1 | null |
2024-05-27 | Glauber Generative Model: Discrete Diffusion Models via Binary Classification | Harshit Varma et.al. | 2405.17035v1 | null |
2024-05-27 | ** | Weiquan Wang et.al. | 2405.17016v1 | null |
2024-05-28 | MotionLLM: Multimodal Motion-Language Learning with Large Language Models | Qi Wu et.al. | 2405.17013v2 | null |
2024-05-27 | A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition | Zilu Guo et.al. | 2405.16952v1 | null |
2024-05-27 | Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models | Qian Wang et.al. | 2405.16947v1 | null |
2024-05-27 | PASTA: Pathology-Aware MRI to PET Cross-Modal Translation with Diffusion Models | Yitong Li et.al. | 2405.16942v1 | null |
2024-05-28 | GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning | Jaewoo Lee et.al. | 2405.16907v2 | link |
2024-05-27 | Anonymization Prompt Learning for Facial Privacy-Preserving Text-to-Image Generation | Liang Shi et.al. | 2405.16895v1 | null |
2024-05-27 | Part123: Part-aware 3D Reconstruction from a Single-view Image | Anran Liu et.al. | 2405.16888v1 | null |
2024-05-28 | Transfer Learning for Diffusion Models | Yidong Ouyang et.al. | 2405.16876v2 | null |
2024-05-27 | CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild | Xingqun Qi et.al. | 2405.16874v1 | null |
2024-05-27 | NCIDiff: Non-covalent Interaction-generative Diffusion Model for Improving Reliability of 3D Molecule Generation Inside Protein Pocket | Joongwon Lee et.al. | 2405.16861v1 | null |
2024-05-27 | EM Distillation for One-step Diffusion Models | Sirui Xie et.al. | 2405.16852v1 | null |
2024-05-27 | Enhancing Accuracy in Generative Models via Knowledge Transfer | Xinyu Tian et.al. | 2405.16837v1 | null |
2024-05-27 | Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection | Gihyun Kwon et.al. | 2405.16823v1 | null |
2024-05-27 | Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model | Shoma Iwai et.al. | 2405.16817v1 | link |
2024-05-27 | TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing | Xinyu Zhang et.al. | 2405.16803v1 | null |
2024-05-27 | PromptFix: You Prompt and We Fix the Photo | Yongsheng Yu et.al. | 2405.16785v1 | null |
2024-05-27 | Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models | Cristina N. Vasconcelos et.al. | 2405.16759v1 | null |
2024-05-27 | DMPlug: A Plug-in Method for Solving Inverse Problems with Diffusion Models | Hengkang Wang et.al. | 2405.16749v1 | link |
2024-05-26 | Towards Multi-Task Multi-Modal Models: A Video Generative Perspective | Lijun Yu et.al. | 2405.16728v1 | null |
2024-05-26 | Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models | Hanwen Liang et.al. | 2405.16645v1 | null |
2024-05-26 | A Study on Unsupervised Anomaly Detection and Defect Localization using Generative Model in Ultrasonic Non-Destructive Testing | Yusaku Ando et.al. | 2405.16580v1 | null |
2024-05-28 | ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling | Francesca Babiloni et.al. | 2405.16570v2 | null |
2024-05-26 | I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models | Wenqi Ouyang et.al. | 2405.16537v1 | null |
2024-05-26 | Pruning for Robust Concept Erasing in Diffusion Models | Tianyun Yang et.al. | 2405.16534v1 | null |
2024-05-26 | Sp2360: Sparse-view 360 Scene Reconstruction using Cascaded 2D Diffusion Priors | Soumava Paul et.al. | 2405.16517v1 | null |
2024-05-26 | Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models | Kun Huang et.al. | 2405.16516v1 | null |
2024-05-26 | Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective | Jiuxiang Gu et.al. | 2405.16418v1 | null |
2024-05-28 | Disentangling Foreground and Background Motion for Enhanced Realism in Human Video Generation | Jinlin Liu et.al. | 2405.16393v2 | null |
2024-05-26 | Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference | Xunpeng Huang et.al. | 2405.16387v1 | null |
2024-05-25 | Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups | Yuchen Zhu et.al. | 2405.16381v1 | null |
2024-05-25 | R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model | Changhoon Kim et.al. | 2405.16341v1 | null |
2024-05-25 | ModelLock: Locking Your Model With a Spell | Yifeng Gao et.al. | 2405.16285v1 | null |
2024-05-25 | Enhancing Consistency-Based Image Generation via Adversarialy-Trained Classification and Energy-Based Discrimination | Shelly Golan et.al. | 2405.16260v1 | null |
2024-05-25 | Underwater Image Enhancement by Diffusion Model with Customized CLIP-Classifier | Shuaixin Liu et.al. | 2405.16214v1 | null |
2024-05-25 | Analytical photoresponses of Schottky contact MoS2 phototransistors | Jianyong Wei et.al. | 2405.16209v1 | null |
2024-05-25 | Diffusion-Reward Adversarial Imitation Learning | Chun-Mao Lai et.al. | 2405.16194v1 | null |
2024-05-25 | Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization | Shutong Ding et.al. | 2405.16173v1 | null |
2024-05-24 | Looking Backward: Streaming Video-to-Video Translation with Feature Banks | Feng Liang et.al. | 2405.15757v1 | link |
2024-05-24 | Taming Score-Based Diffusion Priors for Infinite-Dimensional Nonlinear Inverse Problems | Lorenzo Baldassari et.al. | 2405.15676v1 | null |
2024-05-24 | Reducing the cost of posterior sampling in linear inverse problems via task-dependent score learning | Fabian Schneider et.al. | 2405.15643v1 | null |
2024-05-24 | DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation | Xiankang He et.al. | 2405.15619v1 | null |
2024-05-24 | Learning to Discretize Denoising Diffusion ODEs | Vinh Tong et.al. | 2405.15506v1 | null |
2024-05-24 | Out of Many, One: Designing and Scaffolding Proteins at the Scale of the Structural Universe with Genie 2 | Yeqing Lin et.al. | 2405.15489v1 | null |
2024-05-24 | NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer | Meng You et.al. | 2405.15364v1 | link |
2024-05-24 | SoundLoCD: An Efficient Conditional Discrete Contrastive Latent Diffusion Model for Text-to-Sound Generation | Xinlei Niu et.al. | 2405.15338v1 | null |
2024-05-24 | Challenges and Opportunities in 3D Content Generation | Ke Zhao et.al. | 2405.15335v1 | null |
2024-05-24 | Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model | Mingyang Yi et.al. | 2405.15330v1 | null |
2024-05-24 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance | Guibao Shen et.al. | 2405.15321v1 | null |
2024-05-24 | Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion | Aoxue Li et.al. | 2405.15313v1 | null |
2024-05-24 | Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient | Yongliang Wu et.al. | 2405.15304v1 | null |
2024-05-24 | StyleMaster: Towards Flexible Stylized Image Generation with Diffusion Models | Chengming Xu et.al. | 2405.15287v1 | null |
2024-05-24 | Blaze3DM: Marry Triplane Representation with Diffusion for 3D Medical Inverse Problem Solving | Jia He et.al. | 2405.15241v1 | null |
2024-05-24 | Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models | Yimeng Zhang et.al. | 2405.15234v1 | link |
2024-05-24 | DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception | Run Luo et.al. | 2405.15232v1 | null |
2024-05-24 | NIVeL: Neural Implicit Vector Layers for Text-to-Vector Generation | Vikas Thamizharasan et.al. | 2405.15217v1 | null |
2024-05-24 | ODGEN: Domain-specific Object Detection Data Generation with Diffusion Models | Jingyuan Zhu et.al. | 2405.15199v1 | null |
2024-05-24 | Diffusion Actor-Critic with Entropy Regulator | Yinuo Wang et.al. | 2405.15177v1 | null |
2024-05-23 | AdjointDEIS: Efficient Gradients for Diffusion Models | Zander W. Blasingame et.al. | 2405.15020v1 | null |
2024-05-23 | CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner | Weiyu Li et.al. | 2405.14979v1 | link |
2024-05-23 | SFDDM: Single-fold Distillation for Diffusion models | Chi Hong et.al. | 2405.14961v1 | null |
2024-05-23 | PILOT: Equivariant diffusion for pocket conditioned de novo ligand generation with multi-objective guidance via importance sampling | Julian Cremer et.al. | 2405.14925v1 | null |
2024-05-24 | Improved Distribution Matching Distillation for Fast Image Synthesis | Tianwei Yin et.al. | 2405.14867v2 | null |
2024-05-23 | Video Diffusion Models are Training-free Motion Interpreter and Controller | Zeqi Xiao et.al. | 2405.14864v1 | null |
2024-05-23 | Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models | Gen Li et.al. | 2405.14861v1 | null |
2024-05-23 | Semantica: An Adaptable Image-Conditioned Diffusion Model | Manoj Kumar et.al. | 2405.14857v1 | null |
2024-05-23 | TerDiT: Ternary Diffusion Models with Transformers | Xudong Lu et.al. | 2405.14854v1 | link |
2024-05-23 | Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer | Shuang Wu et.al. | 2405.14832v1 | null |
2024-05-23 | Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models | Katherine Xu et.al. | 2405.14828v1 | null |
2024-05-23 | PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher | Dongjun Kim et.al. | 2405.14822v1 | null |
2024-05-24 | Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation | Hongxu Jiang et.al. | 2405.14802v2 | link |
2024-05-23 | Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy | Shengfang Zhai et.al. | 2405.14800v1 | null |
2024-05-23 | EditWorld: Simulating World Dynamics for Instruction-Following Image Editing | Ling Yang et.al. | 2405.14785v1 | null |
2024-05-23 | Physics-informed Score-based Diffusion Model for Limited-angle Reconstruction of Cardiac Computed Tomography | Shuo Han et.al. | 2405.14770v1 | null |
2024-05-23 | RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance | Zhicheng Sun et.al. | 2405.14677v1 | link |
2024-05-23 | Reinforcement Learning for Fine-tuning Text-to-speech Diffusion Models | Jingyi Chen et.al. | 2405.14632v1 | null |
2024-05-23 | Neuroexplicit Diffusion Models for Inpainting of Optical Flow Fields | Tom Fischer et.al. | 2405.14599v1 | null |
2024-05-24 | Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation | Shiqi Yang et.al. | 2405.14598v2 | null |
2024-05-23 | LDM: Large Tensorial SDF Model for Textured Mesh Generation | Rengan Xie et.al. | 2405.14580v1 | null |
2024-05-23 | Regressor-free Molecule Generation to Support Drug Response Prediction | Kun Li et.al. | 2405.14536v1 | null |
2024-05-23 | LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models | Seyedmorteza Sadat et.al. | 2405.14477v1 | null |
2024-05-23 | TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing | Teng Xu et.al. | 2405.14455v1 | null |
2024-05-23 | Adversarial Schrödinger Bridge Matching | Nikita Gushchin et.al. | 2405.14449v1 | null |
2024-05-23 | Reliable Trajectory Prediction and Uncertainty Quantification with Conditioned Diffusion Models | Marion Neumeier et.al. | 2405.14384v1 | null |
2024-05-24 | Autoregressive Image Diffusion: Generation of Image Sequence and Application in MRI | Guanxiong Luo et.al. | 2405.14327v2 | null |
2024-05-23 | Exposure Diffusion: HDR Image Generation by Consistent LDR denoising | Mojtaba Bemana et.al. | 2405.14304v1 | null |
2024-05-23 | Diffusion-based Quantum Error Mitigation using Stochastic Differential Equation | Joo Yong Shim et.al. | 2405.14283v1 | null |
2024-05-23 | Diffusion models for Gaussian distributions: Exact solutions and Wasserstein errors | Emile Pierret et.al. | 2405.14250v1 | null |
2024-05-23 | DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis | Yao Teng et.al. | 2405.14224v1 | null |
2024-05-23 | Survey on Visual Signal Coding and Processing with Generative Models: Technologies, Standards and Optimization | Zhibo Chen et.al. | 2405.14221v1 | null |
2024-05-23 | FreeTuner: Any Subject in Any Style with Training-free Diffusion | Youcan Xu et.al. | 2405.14201v1 | null |
2024-05-23 | The Disappearance of Timestep Embedding in Modern Time-Dependent Neural Networks | Bum Jun Kim et.al. | 2405.14126v1 | null |
2024-05-23 | Enhancing Image Layout Control with Loss-Guided Diffusion Models | Zakaria Patel et.al. | 2405.14101v1 | null |
2024-05-22 | Particle physics DL-simulation with control over generated data properties | Karol Rogoziński et.al. | 2405.14049v1 | null |
2024-05-22 | A Study of Posterior Stability for Time-Series Latent Diffusion | Yangming Li et.al. | 2405.14021v1 | null |
2024-05-22 | Design Editing for Offline Model-based Optimization | Ye Yuan et.al. | 2405.13964v1 | null |
2024-05-22 | Learning Latent Space Hierarchical EBM Diffusion Models | Jiali Cui et.al. | 2405.13910v1 | null |
2024-05-22 | ReVideo: Remake a Video with Motion and Content Control | Chong Mou et.al. | 2405.13865v1 | null |
2024-05-22 | Diffusion-Based Cloud-Edge-Device Collaborative Learning for Next POI Recommendations | Jing Long et.al. | 2405.13811v1 | null |
2024-05-22 | Conditioning diffusion models by explicit forward-backward bridging | Adrien Corenflos et.al. | 2405.13794v1 | null |
2024-05-22 | A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation | Gwanghyun Kim et.al. | 2405.13762v1 | null |
2024-05-22 | InstaDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos | Yujun Shi et.al. | 2405.13722v1 | null |
2024-05-22 | Learning Diffusion Priors from Observations by Expectation Maximization | François Rozet et.al. | 2405.13712v1 | null |
2024-05-22 | Prompt Mixing in Diffusion Models using the Black Scholes Algorithm | Divya Kothandaraman et.al. | 2405.13685v1 | null |
2024-05-22 | MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation | Zhiping Yu et.al. | 2405.13570v1 | null |
2024-05-22 | MotionCraft: Physics-based Zero-Shot Video Generation | Luca Savant Aira et.al. | 2405.13557v1 | null |
2024-05-22 | Directly Denoising Diffusion Model | Dan Zhang et.al. | 2405.13540v1 | null |
2024-05-22 | Class-Conditional self-reward mechanism for improved Text-to-Image models | Safouane El Ghazouali et.al. | 2405.13473v1 | link |
2024-05-22 | Enhanced Creativity and Ideation through Stable Video Synthesis | Elijah Miller et.al. | 2405.13357v1 | null |
2024-05-22 | SIGGesture: Generalized Co-Speech Gesture Synthesis via Semantic Injection with Large-Scale Pre-Training Diffusion Models | Qingrong Cheng et.al. | 2405.13336v1 | null |
2024-05-21 | TauAD: MRI-free Tau Anomaly Detection in PET Imaging via Conditioned Diffusion Models | Lujia Zhong et.al. | 2405.13199v1 | null |
2024-05-21 | Personalized Residuals for Concept-Driven Text-to-Image Generation | Cusuh Ham et.al. | 2405.12978v1 | null |
2024-05-21 | Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control | Yue Han et.al. | 2405.12970v1 | null |
2024-05-21 | Impact of inhomogeneous diffusion on secondary cosmic ray and antiproton local spectra | Álvaro Tovar-Pardo et.al. | 2405.12918v1 | null |
2024-05-21 | Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images | Xiaofei Yu et.al. | 2405.12875v1 | link |
2024-05-21 | Model Free Prediction with Uncertainty Assessment | Yuling Jiao et.al. | 2405.12684v1 | null |
2024-05-21 | CustomText: Customized Textual Image Generation using Diffusion Models | Shubham Paliwal et.al. | 2405.12531v1 | null |
2024-05-21 | Customize Your Own Paired Data via Few-shot Way | Jinshu Chen et.al. | 2405.12490v1 | null |
2024-05-21 | One-step data-driven generative model via Schrödinger Bridge | Hanwen Huang et.al. | 2405.12453v1 | null |
2024-05-20 | Diffusion for World Modeling: Visual Details Matter in Atari | Eloi Alonso et.al. | 2405.12399v1 | link |
2024-05-20 | Images that Sound: Composing Images and Sounds on a Single Canvas | Ziyang Chen et.al. | 2405.12221v1 | null |
2024-05-20 | Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices | Nathaniel Cohen et.al. | 2405.12211v1 | null |
2024-05-20 | Nonequilbrium physics of generative diffusion models | Zhendong Yu et.al. | 2405.11932v1 | null |
2024-05-20 | "Set It Up!": Functional Object Arrangement with Compositional Generative Models | Yiqing Xu et.al. | 2405.11928v1 | null |
2024-05-20 | Diff-BGM: A Diffusion Model for Video Background Music Generation | Sizhe Li et.al. | 2405.11913v1 | null |
2024-05-20 | Out-of-Distribution Detection with a Single Unconditional Diffusion Model | Alvin Heng et.al. | 2405.11881v1 | link |
2024-05-20 | Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models | Xiyu Wang et.al. | 2405.11852v1 | null |
2024-05-20 | Alternators For Sequence Modeling | Mohammad Reza Rezaei et.al. | 2405.11848v1 | null |
2024-05-20 | ViViD: Video Virtual Try-on using Diffusion Models | Zixun Fang et.al. | 2405.11794v1 | null |
2024-05-20 | Guided Multi-objective Generative AI to Enhance Structure-based Drug Design | Amit Kadan et.al. | 2405.11785v1 | null |
2024-05-20 | Diffusion Models for Generating Ballistic Spacecraft Trajectories | Tyler Presser et.al. | 2405.11738v1 | null |
2024-05-19 | InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios | Yinghao Huang et.al. | 2405.11690v1 | null |
2024-05-19 | Uncertainty-Aware PPG-2-ECG for Enhanced Cardiovascular Diagnosis using Diffusion Models | Omer Belhasin et.al. | 2405.11566v1 | null |
2024-05-19 | Diffusion-Based Hierarchical Image Steganography | Youmin Xu et.al. | 2405.11523v1 | null |
2024-05-19 | FIFO-Diffusion: Generating Infinite Videos from Text without Training | Jihwan Kim et.al. | 2405.11473v1 | null |
2024-05-19 | Discrete-state Continuous-time Diffusion for Graph Generation | Zhe Xu et.al. | 2405.11416v1 | null |
2024-05-18 | On the Trajectory Regularity of ODE-based Diffusion Sampling | Defang Chen et.al. | 2405.11326v1 | null |
2024-05-18 | Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification | Ming Hu et.al. | 2405.11289v1 | null |
2024-05-18 | HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos | Qifeng Chen et.al. | 2405.11270v1 | null |
2024-05-18 | AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA | Weitao Feng et.al. | 2405.11135v1 | null |
2024-05-17 | Flexible Motion In-betweening with Diffusion Models | Setareh Cohan et.al. | 2405.11126v1 | null |
2024-05-16 | Flow Score Distillation for Diverse Text-to-3D Generation | Runjie Yan et.al. | 2405.10988v1 | null |
2024-05-17 | Improving face generation quality and prompt following with synthetic captions | Michail Tarasiou et.al. | 2405.10864v1 | null |
2024-05-17 | Deep Data Consistency: a Fast and Robust Diffusion Model-based Solver for Inverse Problems | Hanyu Chen et.al. | 2405.10748v1 | link |
2024-05-17 | Numerical Recovery of the Diffusion Coefficient in Diffusion Equations from Terminal Measurement | Bangti Jin et.al. | 2405.10708v1 | null |
2024-05-17 | LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion | Zihao Zhu et.al. | 2405.10691v1 | null |
2024-05-17 | LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion | Tong Chen et.al. | 2405.10550v1 | link |
2024-05-17 | ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation | Pengzhi Li et.al. | 2405.10508v1 | null |
2024-05-20 | Text-to-Vector Generation with Neural Path Representation | Peiying Zhang et.al. | 2405.10317v2 | null |
2024-05-16 | Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model | Zheng Gu et.al. | 2405.10316v1 | null |
2024-05-16 | CAT3D: Create Anything in 3D with Multi-View Diffusion Models | Ruiqi Gao et.al. | 2405.10314v1 | null |
2024-05-16 | Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks | João Bordalo et.al. | 2405.10122v1 | null |
2024-05-16 | Spurious reconstruction from brain activity | Ken Shirakawa et.al. | 2405.10078v1 | null |
2024-05-16 | Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution | Xingjian Wang et.al. | 2405.10014v1 | null |
2024-05-16 | VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing | Binghui Chen et.al. | 2405.09985v1 | null |
2024-05-16 | Language-Oriented Semantic Latent Representation for Image Transmission | Giordano Cicchetti et.al. | 2405.09976v1 | link |
2024-05-16 | Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models | Ziyu Wang et.al. | 2405.09901v1 | link |
2024-05-16 | DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection | Yuhao Sun et.al. | 2405.09882v1 | link |
2024-05-16 | Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion | Xinyang Li et.al. | 2405.09874v1 | null |
2024-05-16 | Rethinking Multi-User Semantic Communications with Deep Generative Models | Eleonora Grassucci et.al. | 2405.09866v1 | null |
2024-05-16 | MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis | Joseph Cho et.al. | 2405.09806v1 | null |
2024-05-15 | A Survey of Generative Techniques for Spatial-Temporal Data Mining | Qianru Zhang et.al. | 2405.09592v1 | null |
2024-05-16 | MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer | Chengyu Wu et.al. | 2405.09539v2 | link |
2024-05-15 | Diffusion-based Contrastive Learning for Sequential Recommendation | Ziqiang Cui et.al. | 2405.09369v1 | null |
2024-05-15 | Dance Any Beat: Blending Beats with Visuals in Dance Video Generation | Xuanchen Wang et.al. | 2405.09266v1 | null |
2024-05-15 | SOEDiff: Efficient Distillation for Small Object Editing | Qihe Pan et.al. | 2405.09114v1 | null |
2024-05-15 | RSHazeDiff: A Unified Fourier-aware Diffusion Model for Remote Sensing Image Dehazing | Jiamei Xiong et.al. | 2405.09083v1 | link |
2024-05-17 | Naturalistic Music Decoding from EEG Data via Latent Diffusion Models | Emilian Postolache et.al. | 2405.09062v2 | null |
2024-05-15 | Response Matching for generating materials and molecules | Bingqing Cheng et.al. | 2405.09057v1 | null |
2024-05-15 | CTS: A Consistency-Based Medical Image Segmentation Model | Kejia Zhang et.al. | 2405.09056v1 | link |
2024-05-14 | Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models | Bingdong Li et.al. | 2405.08674v1 | null |
2024-05-14 | Towards Multi-Task Generative-AI Edge Services with an Attention-based Diffusion DRL Approach | Yaju Liu et.al. | 2405.08328v1 | null |
2024-05-14 | Compositional Text-to-Image Generation with Dense Blob Representations | Weili Nie et.al. | 2405.08246v1 | null |
2024-05-13 | Infinite Texture: Text-guided High Resolution Diffusion Texture Synthesis | Yifan Wang et.al. | 2405.08210v1 | null |
2024-05-13 | Do Bayesian imaging methods report trustworthy probabilities? | David Y. W. Thong et.al. | 2405.08179v1 | null |
2024-05-13 | DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation | Ziang Cao et.al. | 2405.08055v1 | link |
2024-05-13 | Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning | Wenqi Dong et.al. | 2405.08054v1 | null |
2024-05-11 | Diff-ETS: Learning a Diffusion Probabilistic Model for Electromyography-to-Speech Conversion | Zhao Ren et.al. | 2405.08021v1 | null |
2024-05-13 | Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data | Mahdi Morafah et.al. | 2405.07925v1 | null |
2024-05-13 | CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models | Nick Stracke et.al. | 2405.07913v1 | null |
2024-05-13 | SAR Image Synthesis with Diffusion Models | Denisa Qosja et.al. | 2405.07776v1 | null |
2024-05-13 | CDFormer:When Degradation Prediction Embraces Diffusion Model for Blind Image Super-Resolution | Qingguo Liu et.al. | 2405.07648v1 | link |
2024-05-13 | De novo antibody design with SE(3) diffusion | Daniel Cutting et.al. | 2405.07622v1 | null |
2024-05-13 | Reducing Risk for Assistive Reinforcement Learning Policies with Diffusion Models | Andrii Tytarenko et.al. | 2405.07603v1 | null |
2024-05-13 | PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator | Hanshu Yan et.al. | 2405.07510v1 | link |
2024-05-13 | GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting | Haodong Chen et.al. | 2405.07472v1 | null |
2024-05-12 | Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning | Masane Fuchi et.al. | 2405.07288v1 | link |
2024-05-12 | Modeling Pedestrian Intrinsic Uncertainty for Multimodal Stochastic Trajectory Prediction via Energy Plan Denoising | Yao Liu et.al. | 2405.07164v1 | null |
2024-05-12 | Stable Signature is Unstable: Removing Image Watermark from Diffusion Models | Yuepeng Hu et.al. | 2405.07145v1 | null |
2024-05-11 | Diffusion models as probabilistic neural operators for recovering unobserved states of dynamical systems | Katsiaryna Haitsiukevich et.al. | 2405.07097v1 | null |
2024-05-11 | Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior | Ce Wang et.al. | 2405.07044v1 | link |
2024-05-11 | Non-confusing Generation of Customized Concepts in Diffusion Models | Wang Lin et.al. | 2405.06914v1 | null |
2024-05-10 | Self-Consistent Recursive Diffusion Bridge for Medical Image Translation | Fuat Arslan et.al. | 2405.06789v1 | link |
2024-05-10 | Shape Conditioned Human Motion Generation with Diffusion Model | Kebing Xue et.al. | 2405.06778v1 | null |
2024-05-10 | OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation | Jinwei Lin et.al. | 2405.06547v1 | link |
2024-05-14 | SketchDream: Sketch-based Text-to-3D Generation and Editing | Feng-Lin Liu et.al. | 2405.06461v2 | null |
2024-05-10 | PUMA: margin-based data pruning | Javier Maroto et.al. | 2405.06298v1 | null |
2024-05-10 | Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging | Zhuchen Shao et.al. | 2405.06175v1 | null |
2024-05-09 | Distilling Diffusion Models into Conditional GANs | Minguk Kang et.al. | 2405.05967v1 | null |
2024-05-09 | Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask | Zineb Senane et.al. | 2405.05959v1 | link |
2024-05-09 | Frame Interpolation with Consecutive Brownian Bridge Diffusion | Zonglin Lyu et.al. | 2405.05953v1 | null |
2024-05-09 | Composable Part-Based Manipulation | Weiyu Liu et.al. | 2405.05876v1 | null |
2024-05-09 | Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control | Gunshi Gupta et.al. | 2405.05852v1 | link |
2024-05-09 | Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models | Zhe Ma et.al. | 2405.05846v1 | null |
2024-05-09 | MSDiff: Multi-Scale Diffusion Model for Ultra-Sparse View CT Reconstruction | Pinhuang Tan et.al. | 2405.05814v1 | null |
2024-05-10 | MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation | Yuxiang Wei et.al. | 2405.05806v2 | link |
2024-05-09 | DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation | Sitian Shen et.al. | 2405.05800v1 | null |
2024-05-09 | Sequential Amodal Segmentation via Cumulative Occlusion Learning | Jiayang Ao et.al. | 2405.05791v1 | null |
2024-05-09 | DP-MDM: Detail-Preserving MR Reconstruction via Multiple Diffusion Models | Mengxiao Geng et.al. | 2405.05763v1 | null |
2024-05-09 | LatentColorization: Latent Diffusion-Based Speaker Video Colorization | Rory Ward et.al. | 2405.05707v1 | null |
2024-05-09 | StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework | Yiheng Huang et.al. | 2405.05691v1 | null |
2024-05-09 | SubGDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning | Jiying Zhang et.al. | 2405.05665v1 | null |
2024-05-09 | AI in Your Toolbox: A Plugin for Generating Renderings from 3D Models | Mingming Wang et.al. | 2405.05627v1 | null |
2024-05-09 | Denoising Diffusion Delensing Delight: Reconstructing the Non-Gaussian CMB Lensing Potential with Diffusion Models | Thomas Flöss et.al. | 2405.05598v1 | link |
2024-05-09 | Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft | Debabrata Pal et.al. | 2405.05574v1 | null |
2024-05-09 | A Survey on Personalized Content Synthesis with Diffusion Models | Xulu Zhang et.al. | 2405.05538v1 | null |
2024-05-08 | Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo | Nayantara Mudur et.al. | 2405.05255v1 | link |
2024-05-08 | Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models | Hongjie Wang et.al. | 2405.05252v1 | null |
2024-05-08 | Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation | Jonas Kohler et.al. | 2405.05224v1 | null |
2024-05-08 | FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models | Jinglin Xu et.al. | 2405.05216v1 | link |
2024-05-08 | An anti-noise seismic inversion method based on diffusion model | Yingtian Liu et.al. | 2405.05026v1 | null |
2024-05-08 | Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI | Keqiang Fan et.al. | 2405.04974v1 | null |
2024-05-08 | Empowering Wireless Networks with Artificial Intelligence Generated Graph | Jiacheng Wang et.al. | 2405.04907v1 | null |
2024-05-08 | Fast LiDAR Upsampling using Conditional Diffusion Models | Sander Elias Magnussen Helgesen et.al. | 2405.04889v1 | null |
2024-05-08 | FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation | Xuehai He et.al. | 2405.04834v1 | null |
2024-05-08 | Variational Schrödinger Diffusion Models | Wei Deng et.al. | 2405.04795v1 | null |
2024-05-07 | Remote Diffusion | Kunal Sunil Kasodekar et.al. | 2405.04717v1 | null |
2024-05-07 | TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model | Yongming Zhang et.al. | 2405.04675v1 | null |
2024-05-07 | Tactile-Augmented Radiance Fields | Yiming Dou et.al. | 2405.04534v1 | link |
2024-05-07 | Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing | Yi Zuo et.al. | 2405.04496v1 | null |
2024-05-07 | CloudDiff: Super-resolution ensemble retrieval of cloud properties for all day using the generative diffusion model | Haixia Xiao et.al. | 2405.04483v1 | null |
2024-05-07 | Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos | Junyi Ma et.al. | 2405.04370v1 | null |
2024-05-07 | Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation | Jihyun Kim et.al. | 2405.04356v1 | null |
2024-05-08 | Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer | Zhuoyi Yang et.al. | 2405.04312v2 | link |
2024-05-07 | BUDDy: Single-Channel Blind Unsupervised Dereverberation with Diffusion Models | Eloi Moliner et.al. | 2405.04272v1 | null |
2024-05-07 | Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models | Fan Bao et.al. | 2405.04233v1 | null |
2024-05-07 | Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model | Joo Young Choi et.al. | 2405.03958v1 | null |
2024-05-06 | MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View | Emmanuelle Bourigault et.al. | 2405.03894v1 | null |
2024-05-06 | MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference Optimization | Massimiliano Pappa et.al. | 2405.03803v1 | null |
2024-05-06 | Synthetic Data from Diffusion Models Improve Drug Discovery Prediction | Bing Hu et.al. | 2405.03799v1 | null |
2024-05-06 | GraphSL: An Open-Source Library for Graph Source Localization Approaches and Benchmark Datasets | Junxiang Wang et.al. | 2405.03724v1 | link |
2024-05-06 | Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models | Ludwig Winkler et.al. | 2405.03549v1 | null |
2024-05-06 | CCDM: Continuous Conditional Diffusion Models for Image Generation | Xin Ding et.al. | 2405.03546v1 | link |
2024-05-06 | LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model | Haowen Sun et.al. | 2405.03485v1 | link |
2024-05-06 | Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond | Jiuxiang Gu et.al. | 2405.03251v1 | null |
2024-05-06 | Hyperbolic Geometric Latent Diffusion Model for Graph Generation | Xingcheng Fu et.al. | 2405.03188v1 | link |
2024-05-06 | DeepMpMRI: Tensor-decomposition Regularized Learning for Fast and High-Fidelity Multi-Parametric Microstructural MR Imaging | Wenxin Fan et.al. | 2405.03159v1 | null |
2024-05-06 | Video Diffusion Models: A Survey | Andrew Melnik et.al. | 2405.03150v1 | null |
2024-05-06 | AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding | Tao Liu et.al. | 2405.03121v1 | link |
2024-05-05 | Matten: Video Generation with Mamba-Attention | Yu Gao et.al. | 2405.03025v1 | null |
2024-05-05 | Exploring Text-based Realistic Building Facades Editing Applicaiton | Jing Wang et.al. | 2405.02967v1 | null |
2024-05-05 | Efficient Text-driven Motion Generation via Latent Consistency Training | Mengxian Hu et.al. | 2405.02791v1 | null |
2024-05-04 | DiffuseTrace: A Transparent and Flexible Watermarking Scheme for Latent Diffusion Model | Liangqi Lei et.al. | 2405.02696v1 | null |
2024-05-08 | Functional Imaging Constrained Diffusion for Brain PET Synthesis from Structural MRI | Minhui Yu et.al. | 2405.02504v2 | null |
2024-05-03 | Continuous Learned Primal Dual | Christina Runkel et.al. | 2405.02478v1 | null |
2024-05-03 | CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding | Kaiyuan Chen et.al. | 2405.02384v1 | null |
2024-05-03 | DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos | Wen-Hsuan Chu et.al. | 2405.02280v1 | null |
2024-05-03 | Multi-grid reaction-diffusion master equation: applications to morphogen gradient modelling | Radek Erban et.al. | 2405.02117v1 | null |
2024-05-03 | DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model | Peijin Jia et.al. | 2405.02008v1 | null |
2024-05-03 | Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition | Yichun Tai et.al. | 2405.01872v1 | null |
2024-05-03 | Creation of Novel Soft Robot Designs using Generative AI | Wee Kiat Chan et.al. | 2405.01824v1 | null |
2024-05-03 | Report on the AAPM Grand Challenge on deep generative modeling for learning medical image statistics | Rucha Deshpande et.al. | 2405.01822v1 | null |
2024-05-02 | Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model | Zongyang Du et.al. | 2405.01730v1 | null |
2024-05-02 | Long Tail Image Generation Through Feature Space Augmentation and Iterated Learning | Rafael Elberg et.al. | 2405.01705v1 | link |
2024-05-02 | LocInv: Localization-aware Inversion for Text-Guided Image Editing | Chuanming Tang et.al. | 2405.01496v1 | link |
2024-05-02 | Navigating Heterogeneity and Privacy in One-Shot Federated Learning with Diffusion Models | Matias Mendieta et.al. | 2405.01494v1 | null |
2024-05-02 | Statistical algorithms for low-frequency diffusion data: A PDE approach | Matteo Giordano et.al. | 2405.01372v1 | link |
2024-05-02 | DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines | Ye Tian et.al. | 2405.01248v1 | null |
2024-05-02 | Automated Virtual Product Placement and Assessment in Images using Diffusion Models | Mohammad Mahmudul Alam et.al. | 2405.01130v1 | null |
2024-05-02 | Part-aware Shape Generation with Latent 3D Diffusion of Neural Voxel Fields | Yuhang Huang et.al. | 2405.00998v1 | null |
2024-05-02 | Generative manufacturing systems using diffusion models and ChatGPT | Xingyu Li et.al. | 2405.00958v1 | null |
2024-05-02 | EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion | Guangyao Zhai et.al. | 2405.00915v1 | null |
2024-05-01 | SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models | Burak Can Biner et.al. | 2405.00878v1 | null |
2024-05-01 | Guided Conditional Diffusion Classifier (ConDiff) for Enhanced Prediction of Infection in Diabetic Foot Ulcers | Palawat Busaranuvong et.al. | 2405.00858v1 | null |
2024-05-01 | ADM: Accelerated Diffusion Model via Estimated Priors for Robust Motion Prediction under Uncertainties | Jiahui Li et.al. | 2405.00797v1 | link |
2024-05-01 | Obtaining Favorable Layouts for Multiple Object Generation | Barak Battash et.al. | 2405.00791v1 | null |
2024-05-01 | Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models | Xiaoshi Wu et.al. | 2405.00760v1 | null |
2024-05-01 | TexSliders: Diffusion-Based Texture Editing in CLIP Space | Julia Guerrero-Viu et.al. | 2405.00672v1 | null |
2024-05-01 | RGB | Zheng Zeng et.al. | 2405.00666v1 | null |
2024-05-01 | Deep Metric Learning-Based Out-of-Distribution Detection with Synthetic Outlier Exposure | Assefa Seyoum Wahd et.al. | 2405.00631v1 | null |
2024-05-01 | Lane Segmentation Refinement with Diffusion Models | Antonio Ruiz et.al. | 2405.00620v1 | null |
2024-05-01 | Pricing and delta computation in jump-diffusion models with stochastic intensity by Malliavin calculus | Ayub Ahmadi et.al. | 2405.00473v1 | null |
2024-05-01 | Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable | Haozhe Liu et.al. | 2405.00466v1 | null |
2024-05-01 | Detail-Enhancing Framework for Reference-Based Image Super-Resolution | Zihan Wang et.al. | 2405.00431v1 | null |
2024-05-01 | Streamlining Image Editing with Layered Diffusion Brushes | Peyman Gholami et.al. | 2405.00313v1 | null |
2024-05-02 | An Unstructured Mesh Reaction-Drift-Diffusion Master Equation with Reversible Reactions | Samuel A. Isaacson et.al. | 2405.00283v2 | null |
2024-05-01 | ASAM: Boosting Segment Anything Model with Adversarial Tuning | Bo Li et.al. | 2405.00256v1 | link |
2024-04-30 | Semantically Consistent Video Inpainting with Conditional Diffusion Models | Dylan Green et.al. | 2405.00251v1 | null |
2024-04-30 | IgCONDA-PET: Implicitly-Guided Counterfactual Diffusion for Detecting Anomalies in PET Images | Shadab Ahamed et.al. | 2405.00239v1 | link |
2024-04-30 | SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound | Haohe Liu et.al. | 2405.00233v1 | null |
2024-04-30 | Target-Specific De Novo Peptide Binder Design with DiffPepBuilder | Fanhao Wang et.al. | 2405.00128v1 | null |
2024-04-30 | MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model | Wenxun Dai et.al. | 2404.19759v1 | null |
2024-04-30 | Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting | Paul Engstler et.al. | 2404.19758v1 | null |
2024-04-30 | Mixed Continuous and Categorical Flow Matching for 3D De Novo Molecule Generation | Ian Dunn et.al. | 2404.19739v1 | link |
2024-04-30 | X-Diffusion: Generating Detailed 3D MRI Volumes From a Single Image Using Cross-Sectional Diffusion Models | Emmanuelle Bourigault et.al. | 2404.19604v1 | null |
2024-04-30 | MicroDreamer: Zero-shot 3D Generation in | Luxi Chen et.al. | 2404.19525v1 | link |
2024-04-30 | TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models | Teng Zhou et.al. | 2404.19475v1 | null |
2024-04-30 | Probing Unlearned Diffusion Models: A Transferable Adversarial Attack Perspective | Xiaoxuan Han et.al. | 2404.19382v1 | link |
2024-04-30 | Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model | Wentao Lei et.al. | 2404.19277v1 | null |
2024-04-30 | DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets | Xiaoyu Huang et.al. | 2404.19264v1 | null |
2024-04-30 | CONTUNER: Singing Voice Beautifying with Pitch and Expressiveness Condition | Jianzong Wang et.al. | 2404.19187v1 | null |
2024-04-29 | Stylus: Automatic Adapter Selection for Diffusion Models | Michael Luo et.al. | 2404.18928v1 | null |
2024-04-29 | TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation | Junhao Cheng et.al. | 2404.18919v1 | link |
2024-04-29 | Learning general Gaussian mixtures with efficient score matching | Sitan Chen et.al. | 2404.18893v1 | null |
2024-04-29 | A Survey on Diffusion Models for Time Series and Spatio-Temporal Data | Yiyuan Yang et.al. | 2404.18886v1 | link |
2024-04-29 | Learning Mixtures of Gaussians Using Diffusion Models | Khashayar Gatmiry et.al. | 2404.18869v1 | null |
2024-04-29 | Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior | Zhiyuan Li et.al. | 2404.18820v1 | null |
2024-04-29 | Bootstrap 3D Reconstructed Scenes from 3D Gaussian Splatting | Yifei Gao et.al. | 2404.18669v1 | null |
2024-04-29 | FlexiFilm: Long Video Generation with Flexible Conditions | Yichen Ouyang et.al. | 2404.18620v1 | link |
2024-04-29 | Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting | Tianyidan Xie et.al. | 2404.18598v1 | null |
2024-04-26 | FashionSD-X: Multimodal Fashion Garment Synthesis using Latent Diffusion | Abhishek Kumar Singh et.al. | 2404.18591v1 | null |
2024-05-01 | U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models | Song Mei et.al. | 2404.18444v2 | null |
2024-04-28 | Fisher Information Improved Training-Free Conditional Diffusion Model | Kaiyu Song et.al. | 2404.18252v1 | null |
2024-04-28 | Paint by Inpaint: Learning to Add Image Objects by Removing Them First | Navve Wasserman et.al. | 2404.18212v1 | link |
2024-04-28 | Generative AI for Visualization: State of the Art and Future Directions | Yilin Ye et.al. | 2404.18144v1 | null |
2024-04-28 | Generative AI for Low-Carbon Artificial Intelligence of Things | Jinbo Wen et.al. | 2404.18077v1 | null |
2024-04-28 | Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model | Xiaolong Li et.al. | 2404.18065v1 | null |
2024-04-28 | Exposing Text-Image Inconsistency Using Diffusion Models | Mingzhen Huang et.al. | 2404.18033v1 | link |
2024-04-30 | Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching | Robert Denkert et.al. | 2404.17939v2 | null |
2024-04-27 | Unsupervised Anomaly Detection via Masked Diffusion Posterior Sampling | Di Wu et.al. | 2404.17900v1 | null |
2024-04-27 | DPER: Diffusion Prior Driven Neural Representation for Limited Angle and Sparse View CT Reconstruction | Chenhe Du et.al. | 2404.17890v1 | null |
2024-04-27 | Diffusion-Aided Joint Source Channel Coding For High Realism Wireless Image Transmission | Mingyu Yang et.al. | 2404.17736v1 | null |
2024-04-27 | Causal Diffusion Autoencoders: Toward Counterfactual Generation via Diffusion Probabilistic Models | Aneesh Komanduri et.al. | 2404.17735v1 | null |
2024-04-26 | Stocking and Harvesting Effects in Advection-Reaction-Diffusion Model: Exploring Decoupled Algorithms and Analysis | Mayesha Sharmim Tisha et.al. | 2404.17702v1 | null |
2024-04-26 | MaPa: Text-driven Photorealistic Material Painting for 3D Shapes | Shangzhan Zhang et.al. | 2404.17569v1 | null |
2024-04-26 | Chemotaxis-inspired PDE model for airborne infectious disease transmission: analysis and simulations | Pierluigi Colli et.al. | 2404.17506v1 | null |
2024-04-26 | Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation | Seungwook Kim et.al. | 2404.17419v1 | null |
2024-04-29 | MV-VTON: Multi-View Virtual Try-On with Diffusion Models | Haoyu Wang et.al. | 2404.17364v2 | link |
2024-04-26 | Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution using Conditional Diffusion Model | Yushen Xu et.al. | 2404.17357v1 | null |
2024-04-26 | Trinity Detector: text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection | Jiawei Song et.al. | 2404.17254v1 | null |
2024-04-26 | Few-shot Calligraphy Style Learning | Fangda Chen et.al. | 2404.17199v1 | link |
2024-04-25 | CyNetDiff -- A Python Library for Accelerated Implementation of Network Diffusion Models | Eliot W. Robson et.al. | 2404.17059v1 | link |
2024-04-25 | Universal fragmentation in annihilation reactions with constrained kinetics | Enrique Rozas Garcia et.al. | 2404.16950v1 | null |
2024-04-25 | Inferring solid-state diffusivity in lithium-ion battery active materials: improving upon the classical GITT method | A. Emir Gumrukcuoglu et.al. | 2404.16658v1 | null |
2024-04-29 | MuseumMaker: Continual Style Customization without Catastrophic Forgetting | Chenxi Liu et.al. | 2404.16612v2 | null |
2024-04-29 | Conditional Distribution Modelling for Few-Shot Image Synthesis with Diffusion Models | Parul Gupta et.al. | 2404.16556v2 | null |
2024-04-25 | DiffSeg: A Segmentation Model for Skin Lesions Based on Diffusion Difference | Zhihao Shuai et.al. | 2404.16474v1 | null |
2024-04-25 | TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models | Haomiao Ni et.al. | 2404.16306v1 | null |
2024-04-25 | CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions | Haoyuan Li et.al. | 2404.16302v1 | link |
2024-04-25 | One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns | Arman Maesumi et.al. | 2404.16292v1 | null |
2024-04-24 | Editable Image Elements for Controllable Synthesis | Jiteng Mu et.al. | 2404.16029v1 | null |
2024-04-24 | RetinaRegNet: A Versatile Approach for Retinal Image Registration | Vishal Balaji Sivaraman et.al. | 2404.16017v1 | link |
2024-04-24 | MYCloth: Towards Intelligent and Interactive Online T-Shirt Customization based on User's Preference | Yexin Liu et.al. | 2404.15801v1 | null |
2024-04-24 | MotionMaster: Training-free Camera Motion Transfer For Video Generation | Teng Hu et.al. | 2404.15789v1 | null |
2024-04-24 | Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations | Kaiwen Xue et.al. | 2404.15766v1 | link |
2024-04-24 | DeepFeatureX Net: Deep Features eXtractors based Network for discriminating synthetic from real images | Orazio Pontorno et.al. | 2404.15697v1 | null |
2024-04-24 | Generative Diffusion Model (GDM) for Optimization of Wi-Fi Networks | Tie Liu et.al. | 2404.15684v1 | null |
2024-04-24 | AnoFPDM: Anomaly Segmentation with Forward Process of Diffusion Models for Brain MRI | Yiming Che et.al. | 2404.15683v1 | null |
2024-04-27 | CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models | Qinghe Wang et.al. | 2404.15677v2 | link |
2024-04-24 | Optimizing OOD Detection in Molecular Graphs: A Novel Approach with Diffusion Models | Xu Shen et.al. | 2404.15625v1 | null |
2024-04-26 | A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution | Zhixiong Yang et.al. | 2404.15620v2 | link |
2024-04-23 | ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning | Weifeng Chen et.al. | 2404.15449v1 | null |
2024-04-23 | GLoD: Composing Global Contexts and Local Details in Image Generation | Moyuru Yamada et.al. | 2404.15447v1 | null |
2024-04-23 | ControlTraj: Controllable Trajectory Generation with Topology-Constrained Diffusion Model | Yuanshao Zhu et.al. | 2404.15380v1 | null |
2024-04-23 | Heat flow, log-concavity, and Lipschitz transport maps | Giovanni Brigati et.al. | 2404.15205v1 | null |
2024-04-23 | CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method | Mingbao Lin et.al. | 2404.15141v1 | link |
2024-04-23 | Taming Diffusion Probabilistic Models for Character Control | Rui Chen et.al. | 2404.15121v1 | null |
2024-04-23 | Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models | Jingyao Xu et.al. | 2404.15081v1 | null |
2024-04-23 | Music Style Transfer With Diffusion Model | Hong Huang et.al. | 2404.14771v1 | null |
2024-04-23 | Gradient Guidance for Diffusion Models: An Optimization Perspective | Yingqing Guo et.al. | 2404.14743v1 | null |
2024-04-25 | FlashSpeech: Efficient Zero-Shot Speech Synthesis | Zhen Ye et.al. | 2404.14700v3 | null |
2024-04-23 | DreamPBR: Text-driven Generation of High-resolution SVBRDF with Multi-modal Guidance | Linxuan Xin et.al. | 2404.14676v1 | null |
2024-04-22 | UVMap-ID: A Controllable and Personalized UV Map Generative Model | Weijie Wang et.al. | 2404.14568v1 | link |
2024-04-22 | Align Your Steps: Optimizing Sampling Schedules in Diffusion Models | Amirmojtaba Sabour et.al. | 2404.14507v1 | null |
2024-04-22 | Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses | Inhee Lee et.al. | 2404.14410v1 | null |
2024-04-22 | GeoDiffuser: Geometry-Based Image Editing with Diffusion Models | Rahul Sajnani et.al. | 2404.14403v1 | null |
2024-04-22 | TAVGBench: Benchmarking Text to Audible-Video Generation | Yuxin Mao et.al. | 2404.14381v1 | link |
2024-04-22 | Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion | Alexander Shmakov et.al. | 2404.14332v1 | null |
2024-04-22 | X-Ray: A Sequential 3D Representation for Generation | Tao Hu et.al. | 2404.14329v1 | null |
2024-04-22 | Collaborative Filtering Based on Diffusion Models: Unveiling the Potential of High-Order Connectivity | Yu Hou et.al. | 2404.14240v1 | link |
2024-04-22 | MultiBooth: Towards Generating All Your Concepts in an Image from Text | Chenyang Zhu et.al. | 2404.14239v1 | link |
2024-04-22 | Face2Face: Label-driven Facial Retouching Restoration | Guanhua Zhao et.al. | 2404.14177v1 | null |
2024-04-22 | FLDM-VTON: Faithful Latent Diffusion Model for Virtual Try-on | Chenhui Wang et.al. | 2404.14162v1 | null |
2024-04-22 | Generative Artificial Intelligence Assisted Wireless Sensing: Human Flow Detection in Practical Communication Environments | Jiacheng Wang et.al. | 2404.14140v1 | null |
2024-04-23 | RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification | Hai Ci et.al. | 2404.14055v2 | link |
2024-04-22 | RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance | Chengrui Wang et.al. | 2404.13984v1 | null |
2024-04-24 | MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets | Zeyu Li et.al. | 2404.13923v2 | null |
2024-04-23 | Accelerating Image Generation with Sub-path Linear Approximation Model | Chen Xu et.al. | 2404.13903v2 | null |
2024-04-22 | Towards Better Text-to-Image Generation Alignment via Attention Modulation | Yihang Wu et.al. | 2404.13899v1 | null |
2024-04-23 | Decoherence of a charged Brownian particle in a magnetic field: an analysis of the roles of coupling via position and momentum variables | Suraka Bhattacharjee et.al. | 2404.13883v2 | null |
2024-04-21 | Universal Fingerprint Generation: Controllable Diffusion Model with Multimodal Conditions | Steven A. Grosz et.al. | 2404.13791v1 | null |
2024-04-21 | Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control | Maria Mihaela Trusca et.al. | 2404.13766v1 | null |
2024-04-21 | A Splice Method for Local-to-Nonlocal Coupling of Weak Forms | Shuai Jiang et.al. | 2404.13744v1 | null |
2024-04-21 | Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models | Vitali Petsiuk et.al. | 2404.13706v1 | null |
2024-04-21 | Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis | Yuxi Ren et.al. | 2404.13686v1 | null |
2024-04-21 | An Integrated Communication and Computing Scheme for Wi-Fi Networks based on Generative AI and Reinforcement Learning | Xinyang Du et.al. | 2404.13598v1 | null |
2024-04-21 | Motion-aware Latent Diffusion Models for Video Frame Interpolation | Zhilin Huang et.al. | 2404.13534v1 | null |
2024-04-21 | Reliable Model Watermarking: Defending Against Theft without Compromising on Evasion | Hongyu Zhu et.al. | 2404.13518v1 | null |
2024-04-21 | ODE-DPS: ODE-based Diffusion Posterior Sampling for Inverse Problems in Partial Differential Equation | Enze Jiang et.al. | 2404.13496v1 | null |
2024-04-21 | Accelerating the Generation of Molecular Conformations with Progressive Distillation of Equivariant Latent Diffusion Models | Romain Lacombe et.al. | 2404.13491v1 | link |
2024-04-20 | Music Consistency Models | Zhengcong Fei et.al. | 2404.13358v1 | null |
2024-04-20 | Generating Daylight-driven Architectural Design via Diffusion Models | Pengzhi Li et.al. | 2404.13353v1 | null |
2024-04-20 | Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than We Think | Haotian Xue et.al. | 2404.13320v1 | link |
2024-04-20 | Latent Schrödinger Bridge Diffusion Model for Generative Learning | Yuling Jiao et.al. | 2404.13309v1 | null |
2024-04-20 | PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition | Xi Fang et.al. | 2404.13299v1 | null |
2024-04-20 | Optimal Control of a Sub-diffusion Model using Dirichlet-Neumann and Neumann-Neumann Waveform Relaxation Algorithms | Soura Sana et.al. | 2404.13283v1 | null |
2024-04-20 | A Massive MIMO Sampling Detection Strategy Based on Denoising Diffusion Model | Lanxin He et.al. | 2404.13281v1 | null |
2024-04-20 | FilterPrompt: Guiding Image Transfer in Diffusion Models | Xi Wang et.al. | 2404.13263v1 | null |
2024-04-19 | DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading | Man M. Ho et.al. | 2404.13097v1 | null |
2024-04-19 | Analysis of Classifier-Free Guidance Weight Schedulers | Xi Wang et.al. | 2404.13040v1 | null |
2024-04-19 | RadRotator: 3D Rotation of Radiographs with Diffusion Models | Pouria Rouzrokh et.al. | 2404.13000v1 | null |
2024-04-19 | Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics | Xiaofei Wang et.al. | 2404.12973v1 | null |
2024-04-19 | Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling | Grigory Bartosh et.al. | 2404.12940v1 | null |
2024-04-19 | Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models | Konstantinos Vilouras et.al. | 2404.12920v1 | null |
2024-04-19 | Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images | Santosh et.al. | 2404.12908v1 | link |
2024-04-19 | ConCLVD: Controllable Chinese Landscape Video Generation via Diffusion Model | Dingming Liu et.al. | 2404.12903v1 | null |
2024-04-19 | Training-and-prompt-free General Painterly Harmonization Using Image-wise Attention Sharing | Teng-Fang Hsiao et.al. | 2404.12900v1 | link |
2024-04-19 | MCM: Multi-condition Motion Synthesis Framework | Zeyu Ling et.al. | 2404.12886v1 | null |
2024-04-19 | Detecting Out-Of-Distribution Earth Observation Images with Diffusion Models | Georges Le Bellier et.al. | 2404.12667v1 | null |
2024-04-19 | F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation | Man M. Ho et.al. | 2404.12650v1 | null |
2024-04-19 | Dragtraffic: A Non-Expert Interactive and Point-Based Controllable Traffic Scene Generation Framework | Sheng Wang et.al. | 2404.12624v1 | null |
2024-04-19 | Rethinking Clothes Changing Person ReID: Conflicts, Synthesis, and Optimization | Junjie Li et.al. | 2404.12611v1 | null |
2024-04-18 | GenVideo: One-shot Target-image and Shape Aware Video Editing using T2I Diffusion Models | Sai Sree Harsha et.al. | 2404.12541v1 | null |
2024-04-18 | G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis | Yufei Ye et.al. | 2404.12383v1 | null |
2024-04-18 | Learning the Domain Specific Inverse NUFFT for Accelerated Spiral MRI using Diffusion Models | Trevor J. Chan et.al. | 2404.12361v1 | null |
2024-04-18 | AniClipart: Clipart Animation with Text-to-Video Priors | Ronghuan Wu et.al. | 2404.12347v1 | null |
2024-04-18 | Guided Discrete Diffusion for Electronic Health Record Generation | Zixiang Chen et.al. | 2404.12314v1 | null |
2024-04-18 | StyleBooth: Image Style Editing with Multimodal Instruction | Zhen Han et.al. | 2404.12154v1 | link |
2024-04-18 | LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights | Thibault Castells et.al. | 2404.11936v1 | null |
2024-04-18 | FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models | Wei Wu et.al. | 2404.11895v1 | null |
2024-04-17 | Prompt-Driven Feature Diffusion for Open-World Semi-Supervised Learning | Marzi Heidari et.al. | 2404.11795v1 | null |
2024-04-17 | Diffusion Schrödinger Bridge Models for High-Quality MR-to-CT Synthesis for Head and Neck Proton Treatment Planning | Muheng Li et.al. | 2404.11741v1 | null |
2024-04-17 | Factorized Diffusion: Perceptual Illusions by Noise Decomposition | Daniel Geng et.al. | 2404.11615v1 | null |
2024-04-17 | IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination | Xi Chen et.al. | 2404.11593v1 | null |
2024-04-17 | Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding | Zezhong Fan et.al. | 2404.11589v1 | null |
2024-04-17 | MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | Kuan-Chieh Wang et.al. | 2404.11565v1 | null |
2024-04-17 | Predicting Long-horizon Futures by Conditioning on Geometry and Time | Tarasha Khurana et.al. | 2404.11554v1 | null |
2024-04-17 | SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening | Yu Zhong et.al. | 2404.11537v1 | null |
2024-04-17 | Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt | Zhanjie Zhang et.al. | 2404.11474v1 | link |
2024-04-17 | Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption | Buzhen Huang et.al. | 2404.11291v1 | link |
2024-04-17 | Optical Image-to-Image Translation Using Denoising Diffusion Models: Heterogeneous Change Detection as a Use Case | João Gabriel Vinholi et.al. | 2404.11243v1 | null |
2024-04-17 | RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models | Han Huang et.al. | 2404.11199v1 | link |
2024-04-19 | LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models | Dingkun Zhang et.al. | 2404.11098v3 | null |
2024-04-16 | Molecular relaxation by reverse diffusion with time step prediction | Khaled Kahouli et.al. | 2404.10935v1 | link |
2024-04-16 | RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting | Ashkan Mirzaei et.al. | 2404.10765v1 | null |
2024-04-16 | LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? | Yuchi Wang et.al. | 2404.10763v1 | link |
2024-04-19 | GazeHTA: End-to-end Gaze Target Detection with Head-Target Association | Zhi-Yi Lin et.al. | 2404.10718v2 | null |
2024-04-16 | Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution | Yutao Yuan et.al. | 2404.10688v1 | link |
2024-04-16 | Generating Human Interaction Motions in Scenes with Text Control | Hongwei Yi et.al. | 2404.10685v1 | null |
2024-04-16 | StyleCity: Large-Scale 3D Urban Scenes Stylization with Vision-and-Text Reference via Progressive Optimization | Yingshu Chen et.al. | 2404.10681v1 | null |
2024-04-18 | Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay | Jinmei Liu et.al. | 2404.10662v2 | link |
2024-04-16 | Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences | Seungwook Kim et.al. | 2404.10603v1 | null |
2024-04-17 | Do Counterfactual Examples Complicate Adversarial Training? | Eric Yeats et.al. | 2404.10588v2 | null |
2024-04-17 | AAVDiff: Experimental Validation of Enhanced Viability and Diversity in Recombinant Adeno-Associated Virus (AAV) Capsids through Diffusion Generation | Lijun Liu et.al. | 2404.10573v2 | null |
2024-04-16 | A bridge between spatial and first-passage properties of continuous and discrete time stochastic processes: from hard walls to absorbing boundary conditions | Mathis Guéneau et.al. | 2404.10537v1 | null |
2024-04-16 | Four-hour thunderstorm nowcasting using deep diffusion models of satellite | Kuai Dai et.al. | 2404.10512v1 | null |
2024-04-16 | SparseDM: Toward Sparse Efficient Diffusion Models | Kafeng Wang et.al. | 2404.10445v1 | null |
2024-04-16 | Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior | Yiqian Wu et.al. | 2404.10394v1 | null |
2024-04-16 | Generating Counterfactual Trajectories with Latent Diffusion Models for Concept Discovery | Payal Varshney et.al. | 2404.10356v1 | null |
2024-04-18 | Efficiently Adversarial Examples Generation for Visual-Language Models under Targeted Transfer Scenarios using Diffusion Models | Qi Guo et.al. | 2404.10335v2 | null |
2024-04-17 | OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model | Runyi Li et.al. | 2404.10312v2 | null |
2024-04-16 | EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion | Cindy Le et.al. | 2404.10279v1 | null |
2024-04-16 | OneActor: Consistent Character Generation via Cluster-Conditioned Guidance | Jiahao Wang et.al. | 2404.10267v1 | null |
2024-04-16 | Diffusion assisted image reconstruction in optoacoustic tomography | M. G. González et.al. | 2404.10239v1 | null |
2024-04-15 | Salient Object-Aware Background Generation using Text-Guided Diffusion Models | Amir Erfan Eshratifar et.al. | 2404.10157v1 | link |
2024-04-15 | Taming Latent Diffusion Model for Neural Radiance Field Inpainting | Chieh Hubert Lin et.al. | 2404.09995v1 | null |
2024-04-15 | in2IN: Leveraging individual Information to Generate Human INteractions | Pablo Ruiz Ponce et.al. | 2404.09988v1 | null |
2024-04-15 | MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models | Nithin Gopalakrishnan Nair et.al. | 2404.09977v1 | null |
2024-04-15 | Diffscaler: Enhancing the Generative Prowess of Diffusion Transformers | Nithin Gopalakrishnan Nair et.al. | 2404.09976v1 | null |
2024-04-15 | Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model | Han Lin et.al. | 2404.09967v1 | null |
2024-04-16 | Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization | Navonil Majumder et.al. | 2404.09956v2 | link |
2024-04-15 | A Diffusion-based Data Generator for Training Object Recognition Models in Ultra-Range Distance | Eran Bamani et.al. | 2404.09846v1 | null |
2024-04-17 | Digging into contrastive learning for robust depth estimation with diffusion models | Jiyuan Wang et.al. | 2404.09831v2 | null |
2024-04-15 | Equipping Diffusion Models with Differentiable Spatial Entropy for Low-Light Image Enhancement | Wenyi Lian et.al. | 2404.09735v1 | link |
2024-04-15 | Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models | Ziwei Luo et.al. | 2404.09732v1 | link |
2024-04-15 | All-in-one simulation-based inference | Manuel Gloeckler et.al. | 2404.09636v1 | link |
2024-04-15 | TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models | Haojun Sun et.al. | 2404.09532v1 | null |
2024-04-15 | Magic Clothing: Controllable Garment-Driven Image Synthesis | Weifeng Chen et.al. | 2404.09512v1 | link |
2024-04-15 | PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI | Yandan Yang et.al. | 2404.09465v1 | null |
2024-04-15 | Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models | Peifei Zhu et.al. | 2404.09401v1 | null |
2024-04-14 | Fault Detection in Mobile Networks Using Diffusion Models | Mohamad Nabeel et.al. | 2404.09240v1 | null |
2024-04-14 | DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling | Xuening Yuan et.al. | 2404.09227v1 | null |
2024-04-16 | LoopAnimate: Loopable Salient Object Animation | Fanyi Wang et.al. | 2404.09172v2 | null |
2024-04-14 | RF-Diffusion: Radio Signal Generation via Time-Frequency Diffusion | Guoxuan Chi et.al. | 2404.09140v1 | link |
2024-04-13 | Rethinking Iterative Stereo Matching from Diffusion Bridge Model Perspective | Yuguang Shi et.al. | 2404.09051v1 | null |
2024-04-13 | Theoretical research on generative diffusion models: an overview | Melike Nur Yeğin et.al. | 2404.09016v1 | null |
2024-04-13 | Multimodal Cross-Document Event Coreference Resolution Using Linear Semantic Transfer and Mixed-Modality Ensembles | Abhijnan Nath et.al. | 2404.08949v1 | link |
2024-04-13 | Enforcing Paraphrase Generation via Controllable Latent Diffusion | Wei Zou et.al. | 2404.08938v1 | link |
2024-04-17 | Diffusion Models Meet Remote Sensing: Principles, Methods, and Perspectives | Yidan Liu et.al. | 2404.08926v2 | null |
2024-04-13 | ChangeAnywhere: Sample Generation for Remote Sensing Change Detection via Semantic Latent Diffusion Model | Kai Tang et.al. | 2404.08892v1 | null |
2024-04-12 | Semantic Approach to Quantifying the Consistency of Diffusion Model Image Generation | Brinnae Bent et.al. | 2404.08799v1 | null |
2024-04-12 | Diffusion-Based Joint Temperature and Precipitation Emulation of Earth System Models | Katie Christensen et.al. | 2404.08797v1 | null |
2024-04-12 | Lossy Image Compression with Foundation Diffusion Models | Lucas Relic et.al. | 2404.08580v1 | null |
2024-04-12 | PiRD: Physics-informed Residual Diffusion for Flow Field Reconstruction | Siming Shan et.al. | 2404.08412v1 | null |
2024-04-12 | Struggle with Adversarial Defense? Try Diffusion | Yujie Li et.al. | 2404.08273v1 | null |
2024-04-12 | Balanced Mixed-Type Tabular Data Synthesis with Diffusion Models | Zeyu Yang et.al. | 2404.08254v1 | null |
2024-04-12 | Interest Maximization in Social Networks | Rahul Kumar Gautam et.al. | 2404.08236v1 | null |
2024-04-11 | ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback | Ming Li et.al. | 2404.07987v1 | null |
2024-04-11 | Taming Stable Diffusion for Text to 360° Panorama Image Generation | Cheng Zhang et.al. | 2404.07949v1 | link |
2024-04-11 | Adaptive Hyperbolic-cross-space Mapped Jacobi Method on Unbounded Domains with Applications to Solving Multidimensional Spatiotemporal Integrodifferential Equations | Yunhong Deng et.al. | 2404.07844v1 | null |
2024-04-11 | ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model | Lifan Jiang et.al. | 2404.07773v1 | null |
2024-04-11 | An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization | Minshuo Chen et.al. | 2404.07771v1 | null |
2024-04-11 | Joint Conditional Diffusion Model for Image Restoration with Mixed Degradations | Yufeng Yue et.al. | 2404.07770v1 | null |
2024-04-11 | Diffusing in Someone Else's Shoes: Robotic Perspective Taking with Diffusion | Josua Spisak et.al. | 2404.07735v1 | null |
2024-04-11 | Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models | Tuomas Kynkäänniemi et.al. | 2404.07724v1 | null |
2024-04-11 | Implicit and Explicit Language Guidance for Diffusion-based Visual Perception | Hefeng Wang et.al. | 2404.07600v1 | null |
2024-04-11 | ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation | Stanislav Frolov et.al. | 2404.07564v1 | null |
2024-04-11 | Effects of phase separation on extinction times in population models | Janik Schüttler et.al. | 2404.07563v1 | null |
2024-04-11 | CAT: Contrastive Adapter Training for Personalized Image Generation | Jae Wan Park et.al. | 2404.07554v1 | link |
2024-04-10 | Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models | Yasi Zhang et.al. | 2404.07389v1 | null |
2024-04-10 | GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models | Zewei Zhang et.al. | 2404.07206v1 | null |
2024-04-10 | RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion | Jaidev Shriram et.al. | 2404.07199v1 | null |
2024-04-14 | InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models | Jiale Xu et.al. | 2404.07191v2 | link |
2024-04-10 | Move Anything with Layered Scene Diffusion | Jiawei Ren et.al. | 2404.07178v1 | null |
2024-04-10 | Diffusion-based inpainting of incomplete Euclidean distance matrices of trajectories generated by a fractional Brownian motion | Alexander Lobashev et.al. | 2404.07029v1 | link |
2024-04-10 | DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting | Shijie Zhou et.al. | 2404.06903v1 | null |
2024-04-10 | Fine color guidance in diffusion models and its application to image compression at extremely low bitrates | Tom Bordin et.al. | 2404.06865v1 | null |
2024-04-10 | UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion | Junsheng Zhou et.al. | 2404.06851v1 | null |
2024-04-10 | Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer | Yanqi Ge et.al. | 2404.06835v1 | null |
2024-04-10 | Zero-shot Point Cloud Completion Via 2D Priors | Tianxin Huang et.al. | 2404.06814v1 | null |
2024-04-10 | Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior | Fan Lu et.al. | 2404.06780v1 | null |
2024-04-10 | DiffusionDialog: A Diffusion Model for Diverse Dialog Generation with Latent Space | Jianxiang Xiang et.al. | 2404.06760v1 | null |
2024-04-11 | Disguised Copyright Infringement of Latent Diffusion Models | Yiwei Lu et.al. | 2404.06737v2 | null |
2024-04-10 | Efficient Denoising using Score Embedding in Score-based Diffusion Models | Andrew S. Na et.al. | 2404.06661v1 | null |
2024-04-09 | Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation | Luca Barsellotti et.al. | 2404.06542v1 | null |
2024-04-09 | GeoDirDock: Guiding Docking Along Geodesic Paths | Raúl Miñán et.al. | 2404.06481v1 | null |
2024-04-09 | Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion | Fan Yang et.al. | 2404.06429v1 | null |
2024-04-09 | ZeST: Zero-Shot Material Transfer from a Single Image | Ta-Ying Cheng et.al. | 2404.06425v1 | null |
2024-04-09 | Policy-Guided Diffusion | Matthew Thomas Jackson et.al. | 2404.06356v1 | link |
2024-04-09 | Quantum State Generation with Structure-Preserving Diffusion Model | Yuchen Zhu et.al. | 2404.06336v1 | null |
2024-04-09 | DiffHarmony: Latent Diffusion Model Meets Image Harmonization | Pengfei Zhou et.al. | 2404.06139v1 | null |
2024-04-09 | Hash3D: Training-free Acceleration for 3D Generation | Xingyi Yang et.al. | 2404.06091v1 | link |
2024-04-09 | Diffusion-Based Point Cloud Super-Resolution for mmWave Radar Data | Kai Luan et.al. | 2404.06012v1 | null |
2024-04-13 | Tackling Structural Hallucination in Image Translation with Local Diffusion | Seunghoi Kim et.al. | 2404.05980v2 | null |
2024-04-09 | Map Optical Properties to Subwavelength Structures Directly via a Diffusion Model | Shijie Rao et.al. | 2404.05959v1 | null |
2024-04-08 | MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Kunpeng Song et.al. | 2404.05674v1 | null |
2024-04-08 | YaART: Yet Another ART Rendering Technology | Sergey Kastryulin et.al. | 2404.05666v1 | null |
2024-04-08 | BinaryDM: Towards Accurate Binarization of Diffusion Model | Xingyu Zheng et.al. | 2404.05662v1 | link |
2024-04-08 | Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model | Jichang Yang et.al. | 2404.05648v1 | null |
2024-04-08 | Learning a Category-level Object Pose Estimator without Pose Annotations | Fengrui Tian et.al. | 2404.05626v1 | null |
2024-04-08 | UniFL: Improve Stable Diffusion via Unified Feedback Learning | Jiacheng Zhang et.al. | 2404.05595v1 | null |
2024-04-08 | Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models | Saman Motamed et.al. | 2404.05519v1 | null |
2024-04-08 | Taming Transformers for Realistic Lidar Point Cloud Generation | Hamed Haghighi et.al. | 2404.05505v1 | link |
2024-04-08 | Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance | Dazhong Shen et.al. | 2404.05384v1 | link |
2024-04-08 | Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt | Zhiqi Huang et.al. | 2404.05331v1 | null |
2024-04-08 | Text-to-Image Synthesis for Any Artistic Styles: Advancements in Personalized Artistic Image Generation via Subdivision and Dual Binding | Junseo Park et.al. | 2404.05256v1 | null |
2024-04-08 | DiffCJK: Conditional Diffusion Model for High-Quality and Wide-coverage CJK Character Generation | Yingtao Tian et.al. | 2404.05212v1 | null |
2024-04-07 | Context-dependent Causality (the Non-Nonotonic Case) | Nir Billfeld et.al. | 2404.05021v1 | null |
2024-04-07 | Generative downscaling of PDE solvers with physics-guided diffusion models | Yulong Lu et.al. | 2404.05009v1 | link |
2024-04-07 | Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models | Zijin Yang et.al. | 2404.04956v1 | null |
2024-04-07 | Regularized Conditional Diffusion Model for Multi-Task Preference Alignment | Xudong Yu et.al. | 2404.04920v1 | null |
2024-04-07 | Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder | Yiyang Ma et.al. | 2404.04916v1 | null |
2024-04-07 | ShoeModel: Learning to Wear on the User-specified Shoes via Diffusion Model | Binghui Chen et.al. | 2404.04833v1 | null |
2024-04-07 | Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving | Jinlong Li et.al. | 2404.04804v1 | null |
2024-04-07 | Rethinking Diffusion Model for Multi-Contrast MRI Super-Resolution | Guangyuan Li et.al. | 2404.04785v1 | link |
2024-04-06 | InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization | Xiefan Guo et.al. | 2404.04650v1 | link |
2024-04-06 | DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation | Duy-Tho Le et.al. | 2404.04629v1 | null |
2024-04-11 | Diffusion Time-step Curriculum for One Image to 3D Generation | Xuanyu Yi et.al. | 2404.04562v2 | link |
2024-04-06 | BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion | Gwanghyun Kim et.al. | 2404.04544v1 | null |
2024-04-06 | DATENeRF: Depth-Aware Text-based Editing of NeRFs | Sara Rojas et.al. | 2404.04526v1 | null |
2024-04-06 | Latent-based Diffusion Model for Long-tailed Recognition | Pengxiao Han et.al. | 2404.04517v1 | null |
2024-04-06 | Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models | Zhengcong Fei et.al. | 2404.04478v1 | link |
2024-04-06 | Aligning Diffusion Models by Optimizing Human Utility | Shufan Li et.al. | 2404.04465v1 | null |
2024-04-05 | Pixel-wise RL on Diffusion Models: Reinforcement Learning from Rich Feedback | Mo Kordzanganeh et.al. | 2404.04356v1 | null |
2024-04-05 | Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models | Sangwon Jang et.al. | 2404.04243v1 | null |
2024-04-05 | ToolEENet: Tool Affordance 6D Pose Estimation | Yunlong Wang et.al. | 2404.04193v1 | null |
2024-04-05 | Dynamic Prompt Optimizing for Text-to-Image Generation | Wenyi Mo et.al. | 2404.04095v1 | link |
2024-04-05 | Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation | Mingyuan Zhou et.al. | 2404.04057v1 | null |
2024-04-05 | Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models | Gihyun Kwon et.al. | 2404.03913v1 | null |
2024-04-04 | Bi-level Guided Diffusion Models for Zero-Shot Medical Imaging Inverse Problems | Hossein Askari et.al. | 2404.03706v1 | null |
2024-04-04 | Mitigating analytical variability in fMRI results with style transfer | Elodie Germani et.al. | 2404.03703v1 | null |
2024-04-04 | MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation | Hanzhe Hu et.al. | 2404.03656v1 | null |
2024-04-04 | CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching | Dongzhi Jiang et.al. | 2404.03653v1 | link |
2024-04-04 | The More You See in 2D, the More You Perceive in 3D | Xinyang Han et.al. | 2404.03652v1 | null |
2024-04-04 | DiffBody: Human Body Restoration by Imagining with Generative Diffusion Prior | Yiming Zhang et.al. | 2404.03642v1 | null |
2024-04-04 | LCM-Lookahead for Encoder-based Text-to-Image Personalization | Rinon Gal et.al. | 2404.03620v1 | null |
2024-04-04 | DiffDet4SAR: Diffusion-based Aircraft Target Detection Network for SAR Images | Zhou Jie et.al. | 2404.03595v1 | link |
2024-04-04 | PointInfinity: Resolution-Invariant Point Diffusion Models | Zixuan Huang et.al. | 2404.03566v1 | null |
2024-04-04 | Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models | Siyuan Mei et.al. | 2404.03541v1 | null |
2024-04-04 | A Directional Diffusion Graph Transformer for Recommendation | Zixuan Yi et.al. | 2404.03326v1 | null |
2024-04-04 | SiloFuse: Cross-silo Synthetic Data Generation with Latent Tabular Diffusion Models | Aditya Shankar et.al. | 2404.03299v1 | null |
2024-04-04 | Future-Proofing Class Incremental Learning | Quentin Jodelet et.al. | 2404.03200v1 | null |
2024-04-04 | HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud | Wencan Cheng et.al. | 2404.03159v1 | link |
2024-04-04 | DreamWalk: Style Space Exploration using Diffusion Guidance | Michelle Shu et.al. | 2404.03145v1 | null |
2024-04-04 | Diverse and Tailored Image Generation for Zero-shot Multi-label Classification | Kaixin Zhang et.al. | 2404.03144v1 | null |
2024-04-04 | The Diffusive Ultrasound Modulated Bioluminescence Tomography with Partial Data and Uncertain Optical Parameters | Tianyu Yang et.al. | 2404.03124v1 | null |
2024-04-03 | Many-to-many Image Generation with Auto-regressive Diffusion Models | Ying Shen et.al. | 2404.03109v1 | null |
2024-04-03 | Computing macroscopic reaction rates in reaction-diffusion systems using Monte Carlo simulations | Mohamed Swailem et.al. | 2404.03089v1 | null |
2024-04-03 | ASAP: Interpretable Analysis and Summarization of AI-generated Image Patterns at Scale | Jinbin Huang et.al. | 2404.02990v1 | null |
2024-04-03 | Deep Generative Models through the Lens of the Manifold Hypothesis: A Survey and New Connections | Gabriel Loaiza-Ganem et.al. | 2404.02954v1 | null |
2024-04-02 | Jailbreaking Prompt Attack: A Controllable Adversarial Attack against Diffusion Models | Jiachen Ma et.al. | 2404.02928v1 | null |
2024-04-03 | LidarDM: Generative LiDAR Simulation in a Generated World | Vlas Zyrianov et.al. | 2404.02903v1 | link |
2024-04-03 | Fast Diffusion Model For Seismic Data Noise Attenuation | Junheng Peng et.al. | 2404.02767v1 | null |
2024-04-03 | Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models | Wentian Zhang et.al. | 2404.02747v1 | link |
2024-04-03 | Deep Privacy Funnel Model: From a Discriminative to a Generative Approach with an Application to Face Recognition | Behrooz Razeghi et.al. | 2404.02696v1 | null |
2024-04-03 | Diffexplainer: Towards Cross-modal Global Explanations with Diffusion Models | Matteo Pennisi et.al. | 2404.02618v1 | null |
2024-04-03 | A Unified Editing Method for Co-Speech Gesture Generation via Diffusion Inversion | Zeyu Zhao et.al. | 2404.02411v1 | null |
2024-04-03 | Enhancing Diffusion-based Point Cloud Generation with Smoothness Constraint | Yukun Li et.al. | 2404.02396v1 | null |
2024-04-02 | Semantic Augmentation in Images using Language | Sahiti Yerramilli et.al. | 2404.02353v1 | null |
2024-04-02 | Heat Death of Generative Models in Closed-Loop Learning | Matteo Marchi et.al. | 2404.02325v1 | null |
2024-04-02 | APEX: Ambidextrous Dual-Arm Robotic Manipulation Using Collision-Free Generative Diffusion Models | Apan Dastider et.al. | 2404.02284v1 | null |
2024-04-08 | Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better | Enshu Liu et.al. | 2404.02241v2 | link |
2024-04-02 | Diffusion | Zeyu Yang et.al. | 2404.02148v1 | link |
2024-04-02 | WcDT: World-centric Diffusion Transformer for Traffic Scene Generation | Chen Yang et.al. | 2404.02082v1 | link |
2024-04-03 | AUTODIFF: Autoregressive Diffusion Modeling for Structure-based Drug Design | Xinze Li et.al. | 2404.02003v2 | null |
2024-04-07 | Bi-LORA: A Vision-Language Approach for Synthetic Image Detection | Mamadou Keita et.al. | 2404.01959v2 | null |
2024-04-02 | Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model | Xu He et.al. | 2404.01862v1 | link |
2024-04-02 | Upsample Guidance: Scale Up Diffusion Models without Training | Juno Hwang et.al. | 2404.01709v1 | null |
2024-04-05 | FashionEngine: Interactive Generation and Editing of 3D Clothed Humans | Tao Hu et.al. | 2404.01655v2 | null |
2024-04-02 | Diffusion Deepfake | Chaitali Bhattacharyya et.al. | 2404.01579v1 | null |
2024-04-01 | Prior Frequency Guided Diffusion Model for Limited Angle (LA)-CBCT Reconstruction | Jiacheng Xie et.al. | 2404.01448v1 | null |
2024-04-01 | DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery | Yixuan Zhu et.al. | 2404.01424v1 | link |
2024-04-01 | Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data | Matthias Gerstgrasser et.al. | 2404.01413v1 | null |
2024-04-01 | Bigger is not Always Better: Scaling Properties of Latent Diffusion Models | Kangfu Mei et.al. | 2404.01367v1 | null |
2024-04-01 | MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space | Armand Comas-Massagué et.al. | 2404.01296v1 | null |
2024-04-01 | CosmicMan: A Text-to-Image Foundation Model for Humans | Shikai Li et.al. | 2404.01294v1 | null |
2024-04-01 | Measuring Style Similarity in Diffusion Models | Gowthami Somepalli et.al. | 2404.01292v1 | link |
2024-04-01 | A Unified and Interpretable Emotion Representation and Expression Generation | Reni Paskaleva et.al. | 2404.01243v1 | null |
2024-04-02 | StructLDM: Structured Latent Diffusion for 3D Human Generation | Tao Hu et.al. | 2404.01241v2 | null |
2024-04-01 | Video Interpolation with Diffusion Models | Siddhant Jain et.al. | 2404.01203v1 | null |
2024-04-01 | Uncovering the Text Embedding in Text-to-Image Diffusion Models | Hu Yu et.al. | 2404.01154v1 | null |
2024-04-01 | UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models | Zihan Guan et.al. | 2404.01101v1 | null |
2024-04-01 | Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On | Xu Yang et.al. | 2404.01089v1 | null |
2024-04-01 | PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation | Yunze Liu et.al. | 2404.01081v1 | null |
2024-04-01 | Towards Memorization-Free Diffusion Models | Chen Chen et.al. | 2404.00922v1 | null |
2024-04-01 | The long-time behavior of solutions of a three-component reaction-diffusion model for the population dynamics of farmers and hunter-gatherers: the different motility case | Dongyuan Xiao et.al. | 2404.00907v1 | null |
2024-04-01 | Model-Agnostic Human Preference Inversion in Diffusion Models | Jeeyung Kim et.al. | 2404.00879v1 | null |
2024-04-01 | TryOn-Adapter: Efficient Fine-Grained Clothing Identity Adaptation for High-Fidelity Virtual Try-On | Jiazheng Xing et.al. | 2404.00878v1 | link |
2024-04-01 | DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF | Jie Long Lee et.al. | 2404.00874v1 | null |
2024-04-01 | Generating Content for HDR Deghosting from Frequency View | Tao Hu et.al. | 2404.00849v1 | null |
2024-04-01 | Nonlinear ensemble filtering with diffusion models: Application to the surface quasi-geostrophic dynamics | Feng Bao et.al. | 2404.00844v1 | null |
2024-03-31 | Towards Realistic Scene Generation with LiDAR Diffusion Models | Haoxi Ran et.al. | 2404.00815v1 | link |
2024-03-31 | Unknown Prompt, the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization | Mainak Singha et.al. | 2404.00710v1 | null |
2024-03-31 | DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion | Chunyang Bi et.al. | 2404.00661v1 | null |
2024-03-31 | CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models | Xiang Li et.al. | 2404.00569v1 | link |
2024-04-02 | Text2HOI: Text-guided 3D Motion Generation for Hand-Object Interaction | Junuk Cha et.al. | 2404.00562v2 | null |
2024-03-31 | Creating synthetic energy meter data using conditional diffusion and building metadata | Chun Fu et.al. | 2404.00525v1 | null |
2024-03-30 | Denoising Monte Carlo Renders With Diffusion Models | Vaibhav Vavilala et.al. | 2404.00491v1 | null |
2024-03-30 | DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans | Akash Sengupta et.al. | 2404.00485v1 | null |
2024-03-30 | Score-Based Diffusion Models for Photoacoustic Tomography Image Reconstruction | Sreemanti Dey et.al. | 2404.00471v1 | null |
2024-03-30 | Joint Pedestrian Trajectory Prediction through Posterior Sampling | Haotian Lin et.al. | 2404.00237v1 | null |
2024-03-30 | Grid Diffusion Models for Text-to-Video Generation | Taegyeong Lee et.al. | 2404.00234v1 | null |
2024-03-30 | Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space | Zheling Meng et.al. | 2404.00230v1 | null |
2024-03-29 | FetalDiffusion: Pose-Controllable 3D Fetal MRI Synthesis with Conditional Diffusion Model | Molin Zhang et.al. | 2404.00132v1 | null |
2024-04-02 | GDA: Generalized Diffusion for Robust Test-time Adaptation | Yun-Yun Tsai et.al. | 2404.00095v2 | null |
2024-03-29 | Relation Rectification in Diffusion Model | Yinwei Wu et.al. | 2403.20249v1 | null |
2024-03-29 | Motion Inversion for Video Customization | Luozhou Wang et.al. | 2403.20193v1 | null |
2024-03-29 | FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models | Barbara Toniella Corradini et.al. | 2403.20105v1 | null |
2024-03-29 | SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior | Zhongrui Yu et.al. | 2403.20079v1 | null |
2024-03-29 | Probing solar modulation analytic models with cosmic ray periodic spectra | Wei-Cheng Long et.al. | 2403.20038v1 | null |
2024-04-01 | Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting | Haipeng Liu et.al. | 2403.19898v2 | link |
2024-03-28 | Vision-Language Synthetic Data Enhances Echocardiography Downstream Tasks | Pooria Ashrafian et.al. | 2403.19880v1 | link |
2024-03-28 | ShapeFusion: A 3D diffusion model for localized shape editing | Rolandos Alexandros Potamias et.al. | 2403.19773v1 | null |
2024-03-28 | MIST: Mitigating Intersectional Bias with Disentangled Cross-Attention Editing in Text-to-Image Diffusion Models | Hidir Yesiltepe et.al. | 2403.19738v1 | null |
2024-03-28 | Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond | Katherine Xu et.al. | 2403.19653v1 | link |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652v1 | null |
2024-03-28 | GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | Yusuf Dalva et.al. | 2403.19645v1 | null |
2024-03-28 | In the driver's mind: modeling the dynamics of human overtaking decisions in interactions with oncoming automated vehicles | Samir H. A. Mohammad et.al. | 2403.19637v1 | null |
2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang et.al. | 2403.19600v1 | link |
2024-03-28 | Frame by Familiar Frame: Understanding Replication in Video Diffusion Models | Aimon Rahman et.al. | 2403.19593v1 | null |
2024-03-28 | Impact of Resin Molecular Weight on Drying Kinetics and Sag of Coatings | Marola W. Issa et.al. | 2403.19544v1 | null |
2024-03-28 | Debiasing Cardiac Imaging with Controlled Latent Diffusion Models | Grzegorz Skorupko et.al. | 2403.19508v1 | link |
2024-03-28 | Burst Super-Resolution with Diffusion Models for Improving Perceptual Quality | Kyotaro Tokoro et.al. | 2403.19428v1 | link |
2024-03-28 | Imperceptible Protection against Style Imitation from Diffusion Models | Namhyuk Ahn et.al. | 2403.19254v1 | null |
2024-03-28 | RecDiffusion: Rectangling for Image Stitching with Diffusion Models | Tianhao Zhou et.al. | 2403.19164v1 | link |
2024-03-28 | MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation | Seyeon Kim et.al. | 2403.19144v1 | link |
2024-03-28 | QNCD: Quantization Noise Correction for Diffusion Models | Huanpeng Chu et.al. | 2403.19140v1 | link |
2024-03-30 | Egocentric Scene-aware Human Trajectory Prediction | Weizhuo Wang et.al. | 2403.19026v2 | null |
2024-03-27 | TextCraftor: Your Text Encoder Can be Image Quality Controller | Yanyu Li et.al. | 2403.18978v1 | null |
2024-03-27 | CPR: Retrieval Augmented Generation for Copyright Protection | Aditya Golatkar et.al. | 2403.18920v1 | null |
2024-03-27 | A Geometric Explanation of the Likelihood OOD Detection Paradox | Hamidreza Kamkari et.al. | 2403.18910v1 | link |
2024-03-27 | ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion | Daniel Winter et.al. | 2403.18818v1 | null |
2024-04-01 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Suraj Patni et.al. | 2403.18807v3 | link |
2024-03-27 | Object Pose Estimation via the Aggregation of Diffusion Features | Tianfu Wang et.al. | 2403.18791v1 | link |
2024-03-27 | ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Chenshuang Zhang et.al. | 2403.18775v1 | link |
2024-03-27 | A Diffusion-Based Generative Equalizer for Music Restoration | Eloi Moliner et.al. | 2403.18636v1 | link |
2024-03-27 | HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions | Hao Xu et.al. | 2403.18575v1 | link |
2024-03-27 | Artifact Reduction in 3D and 4D Cone-beam Computed Tomography Images with Deep Learning -- A Review | Mohammadreza Amirian et.al. | 2403.18565v1 | null |
2024-03-27 | CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection | Jiayi Zhu et.al. | 2403.18554v1 | null |
2024-03-27 | CT-3DFlow : Leveraging 3D Normalizing Flows for Unsupervised Detection of Pathological Pulmonary CT scans | Aissam Djahnine et.al. | 2403.18514v1 | null |
2024-03-27 | Synthesizing EEG Signals from Event-Related Potential Paradigms with Conditional Diffusion Models | Guido Klein et.al. | 2403.18486v1 | null |
2024-03-27 | DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis | Zhongxi Chen et.al. | 2403.18471v1 | link |
2024-03-27 | DiffStyler: Diffusion-based Localized Image Style Transfer | Shaoxu Li et.al. | 2403.18461v1 | null |
2024-03-27 | SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model | Inhwan Bae et.al. | 2403.18452v1 | link |
2024-03-27 | U-Sketch: An Efficient Approach for Sketch to Image Diffusion Models | Ilias Mitsouras et.al. | 2403.18425v1 | null |
2024-03-27 | ECNet: Effective Controllable Text-to-Image Diffusion Models | Sicheng Li et.al. | 2403.18417v1 | null |
2024-03-27 | Ship in Sight: Diffusion Models for Ship-Image Super Resolution | Luigi Sigillo et.al. | 2403.18370v1 | link |
2024-03-27 | DODA: Diffusion for Object-detection Domain Adaptation in Agriculture | Shuai Xiang et.al. | 2403.18334v1 | link |
2024-03-27 | RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation | Yang Tian et.al. | 2403.18259v1 | null |
2024-03-27 | NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation | Jingyang Huo et.al. | 2403.18211v1 | null |
2024-03-28 | Oh! We Freeze: Improving Quantized Knowledge Distillation via Signal Propagation Analysis for Large Language Models | Kartikeya Bhardwaj et.al. | 2403.18159v2 | null |
2024-03-26 | Tutorial on Diffusion Models for Imaging and Vision | Stanley H. Chan et.al. | 2403.18103v1 | null |
2024-03-26 | Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance | Zan Wang et.al. | 2403.18036v1 | link |
2024-03-30 | Bidirectional Consistency Models | Liangchen Li et.al. | 2403.18035v2 | null |
2024-03-26 | Mixing Artificial and Natural Intelligence: From Statistical Mechanics to AI and Back to Turbulence | Michael et.al. | 2403.17993v1 | null |
2024-03-26 | AID: Attention Interpolation of Text-to-Image Diffusion | Qiyuan He et.al. | 2403.17924v1 | link |
2024-03-26 | Boosting Diffusion Models with Moving Average Sampling in Frequency Domain | Yurui Qian et.al. | 2403.17870v1 | null |
2024-03-26 | DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions | Sammy Christen et.al. | 2403.17827v1 | null |
2024-03-26 | Annotated Biomedical Video Generation using Denoising Diffusion Probabilistic Models and Flow Fields | Rüveyda Yilmaz et.al. | 2403.17808v1 | null |
2024-03-26 | GenesisTex: Adapting Image Denoising Diffusion to Texture Space | Chenjian Gao et.al. | 2403.17782v1 | null |
2024-03-26 | CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation | Yongrui Yu et.al. | 2403.17770v1 | null |
2024-03-26 | AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation | Huawei Wei et.al. | 2403.17694v1 | link |
2024-03-26 | Manifold-Guided Lyapunov Control with Diffusion Models | Amartya Mukherjee et.al. | 2403.17692v1 | null |
2024-03-26 | Not All Similarities Are Created Equal: Leveraging Data-Driven Biases to Inform GenAI Copyright Disputes | Uri Hacohen et.al. | 2403.17691v1 | null |
2024-03-26 | DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation | Qilin Wang et.al. | 2403.17664v1 | null |
2024-03-26 | AniArtAvatar: Animatable 3D Art Avatar from a Single Image | Shaoxu Li et.al. | 2403.17631v1 | null |
2024-03-26 | DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images | Chuhan Jiao et.al. | 2403.17477v1 | null |
2024-03-26 | LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection | Yunpeng Luo et.al. | 2403.17465v1 | null |
2024-03-26 | Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model | Runmin Dong et.al. | 2403.17460v1 | link |
2024-03-26 | InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion | Jihyun Lee et.al. | 2403.17422v1 | null |
2024-03-26 | A framework to identify supercritical and subcritical Turing bifurcations: Case study of a system sustaining cubic and quadratic autocatalysis | Deepak Kumar et.al. | 2403.17386v1 | null |
2024-03-26 | Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance | Donghoon Ahn et.al. | 2403.17377v1 | null |
2024-03-25 | Diffusion-based Negative Sampling on Graphs for Link Prediction | Trung-Kien Nguyen et.al. | 2403.17259v1 | link |
2024-03-25 | Latency-Aware Generative Semantic Communications with Pre-Trained Diffusion Models | Li Qiao et.al. | 2403.17256v1 | null |
2024-03-25 | DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment | Stella Bounareli et.al. | 2403.17217v1 | null |
2024-03-25 | AnimateMe: 4D Facial Expressions via Diffusion Models | Dimitrios Gerogiannis et.al. | 2403.17213v1 | null |
2024-03-25 | Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions | Stefan Andreas Baumann et.al. | 2403.17064v1 | link |
2024-03-25 | Provably Robust Score-Based Diffusion Posterior Sampling for Plug-and-Play Image Reconstruction | Xingyu Xu et.al. | 2403.17042v1 | null |
2024-03-25 | Invertible Diffusion Models for Compressed Sensing | Bin Chen et.al. | 2403.17006v1 | null |
2024-03-25 | TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models | Zhongwei Zhang et.al. | 2403.17005v1 | null |
2024-03-25 | SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer | Rui Zhu et.al. | 2403.17004v1 | null |
2024-03-25 | VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation | Yang Chen et.al. | 2403.17001v1 | null |
2024-03-25 | Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution | Zhikai Chen et.al. | 2403.17000v1 | null |
2024-03-25 | Comp4D: LLM-Guided Compositional 4D Scene Generation | Dejia Xu et.al. | 2403.16993v1 | null |
2024-03-25 | Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation | Omer Dahary et.al. | 2403.16990v1 | null |
2024-03-25 | Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance | Jingyuan Zhu et.al. | 2403.16954v1 | null |
2024-03-25 | Multiple-Source Localization from a Single-Snapshot Observation Using Graph Bayesian Optimization | Zonghan Zhang et.al. | 2403.16818v1 | link |
2024-03-25 | Exploiting Priors from 3D Diffusion Models for RGB-Based One-Shot View Planning | Sicong Pan et.al. | 2403.16803v1 | null |
2024-03-25 | Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases | Sophie Starck et.al. | 2403.16776v1 | null |
2024-03-25 | Improving Diffusion Models's Data-Corruption Resistance using Scheduled Pseudo-Huber Loss | Artem Khrapov et.al. | 2403.16728v1 | null |
2024-03-25 | SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions | Yuda Song et.al. | 2403.16627v1 | null |
2024-03-25 | SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation | Aysim Toker et.al. | 2403.16605v1 | null |
2024-03-25 | Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization | Xiangxin Zhou et.al. | 2403.16576v1 | null |
2024-03-25 | An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models | Zizhao Hu et.al. | 2403.16530v1 | null |
2024-03-25 | Let Real Images be as a Judger, Spotting Fake Images Synthesized with Generative Models | Ziyou Liang et.al. | 2403.16513v1 | null |
2024-03-25 | Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework | Ziyao Huang et.al. | 2403.16510v1 | link |
2024-03-25 | Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation | Sanyam Lakhanpal et.al. | 2403.16422v1 | null |
2024-03-25 | FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models | Lin Zhao et.al. | 2403.16379v1 | null |
2024-03-24 | Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis | Atefeh Khoshkhahtinat et.al. | 2403.16258v1 | null |
2024-03-24 | Skull-to-Face: Anatomy-Guided 3D Facial Reconstruction and Editing | Yongqing Liang et.al. | 2403.16207v1 | null |
2024-03-24 | Diffusion Model is a Good Pose Estimator from 3D RF-Vision | Junqiao Fan et.al. | 2403.16198v1 | null |
2024-03-24 | Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery | Siddharth Tourani et.al. | 2403.16194v1 | link |
2024-03-26 | Gaze-guided Hand-Object Interaction Synthesis: Benchmark and Method | Jie Tian et.al. | 2403.16169v2 | null |
2024-03-24 | Robust Diffusion Models for Adversarial Purification | Guang Lin et.al. | 2403.16067v1 | null |
2024-03-24 | A Unified Module for Accelerating STABLE-DIFFUSION: LCM-LORA | Ayush Thakur et.al. | 2403.16024v1 | null |
2024-03-23 | Feature Manipulation for DDPM based Change Detection | Zhenglin Li et.al. | 2403.15943v1 | null |
2024-03-26 | X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention | You Xie et.al. | 2403.15931v2 | null |
2024-03-23 | Diffusion-based Aesthetic QR Code Generation via Scanning-Robust Perceptual Guidance | Jia-Wei Liao et.al. | 2403.15878v1 | link |
2024-03-23 | In-Context Matting | He Guo et.al. | 2403.15789v1 | null |
2024-03-23 | Time-dependent localized patterns in a predator-prey model | Fahad Al Saadi et.al. | 2403.15788v1 | null |
2024-03-23 | BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion | Jia Wei et.al. | 2403.15766v1 | null |
2024-03-22 | An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes Using Pre-Trained Text-to-Image Models | Zhengyi Zhao et.al. | 2403.15559v1 | null |
2024-03-22 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | Hanrong Ye et.al. | 2403.15389v1 | null |
2024-03-22 | Ultrasound Imaging based on the Variance of a Diffusion Restoration Model | Yuxin Zhang et.al. | 2403.15316v1 | null |
2024-03-22 | Controlled Training Data Generation with Diffusion Models | Teresa Yeo et.al. | 2403.15309v1 | null |
2024-03-22 | Spectral Motion Alignment for Video Motion Transfer using Diffusion Models | Geon Yeong Park et.al. | 2403.15249v1 | null |
2024-03-22 | Shadow Generation for Composite Image Using Diffusion model | Qingyang Liu et.al. | 2403.15234v1 | link |
2024-03-22 | MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration | Zhichao Wei et.al. | 2403.15059v1 | null |
2024-03-22 | Toward Tiny and High-quality Facial Makeup with Data Amplify Learning | Qiaoqiao Jin et.al. | 2403.15033v1 | null |
2024-03-22 | Dynamics of a memory-based diffusion model with spatial heterogeneity and nonlinear boundary condition | Quanli Ji et.al. | 2403.14969v1 | null |
2024-03-22 | DreamFlow: High-Quality Text-to-3D Generation by Approximating Probability Flow | Kyungmin Lee et.al. | 2403.14966v1 | null |
2024-03-22 | CLIP-VQDiffusion : Language Free Training of Text To Image generation using CLIP and vector quantized diffusion model | Seungdae Han et.al. | 2403.14944v1 | null |
2024-03-22 | STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians | Yifei Zeng et.al. | 2403.14939v1 | null |
2024-03-21 | Osmosis: RGBD Diffusion Prior for Underwater Image Restoration | Opher Bar Nathan et.al. | 2403.14837v1 | null |
2024-03-25 | Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing | Alberto Baldrati et.al. | 2403.14828v2 | link |
2024-03-21 | Latent Diffusion Models for Attribute-Preserving Image Anonymization | Luca Piano et.al. | 2403.14790v1 | null |
2024-03-21 | Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance | Shenhao Zhu et.al. | 2403.14781v1 | null |
2024-03-21 | StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text | Roberto Henschel et.al. | 2403.14773v1 | link |
2024-03-21 | Open Knowledge Base Canonicalization with Multi-task Learning | Bingchen Liu et.al. | 2403.14733v1 | null |
2024-03-21 | GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Yinghao Xu et.al. | 2403.14621v1 | link |
2024-03-21 | DreamReward: Text-to-3D Generation with Human Preference | Junliang Ye et.al. | 2403.14613v1 | null |
2024-03-21 | ReNoise: Real Image Inversion Through Iterative Noising | Daniel Garibi et.al. | 2403.14602v1 | null |
2024-03-21 | Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting | Alicia Durrer et.al. | 2403.14499v1 | link |
2024-03-21 | Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation | Mathias Öttl et.al. | 2403.14429v1 | null |
2024-03-21 | DP-RDM: Adapting Diffusion Models to Private Domains Without Fine-Tuning | Jonathan Lebensold et.al. | 2403.14421v1 | null |
2024-03-21 | Physics-Informed Diffusion Models | Jan-Hendrik Bastek et.al. | 2403.14404v1 | null |
2024-03-21 | Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models | Pablo Marcos-Manchón et.al. | 2403.14291v1 | link |
2024-03-21 | Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation | Francesco Di Felice et.al. | 2403.14279v1 | null |
2024-03-21 | Diffusion Models with Ensembled Structure-Based Anomaly Scoring for Unsupervised Anomaly Detection | Finn Behrendt et.al. | 2403.14262v1 | link |
2024-03-21 | Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition | Sihyun Yu et.al. | 2403.14148v1 | null |
2024-03-21 | Protein Conformation Generation via Force-Guided SE(3) Diffusion Models | Yan Wang et.al. | 2403.14088v1 | null |
2024-03-21 | QSMDiff: Unsupervised 3D Diffusion Models for Quantitative Susceptibility Mapping | Zhuang Xiong et.al. | 2403.14070v1 | null |
2024-03-21 | LeFusion: Synthesizing Myocardial Pathology on Cardiac MRI via Lesion-Focus Diffusion Models | Hantao Zhang et.al. | 2403.14066v1 | null |
2024-03-21 | DiffSTOCK: Probabilistic relational Stock Market Predictions using Diffusion Models | Divyanshu Daiya et.al. | 2403.14063v1 | null |
2024-03-20 | Enhancing Fingerprint Image Synthesis with GANs, Diffusion Models, and Style Transfer Techniques | W. Tang et.al. | 2403.13916v1 | null |
2024-03-20 | Towards Learning Contrast Kinetics with Multi-Condition Latent Diffusion Models | Richard Osuala et.al. | 2403.13890v1 | link |
2024-03-20 | Editing Massive Concepts in Text-to-Image Diffusion Models | Tianwei Xiong et.al. | 2403.13807v1 | link |
2024-03-20 | ZigMa: Zigzag Mamba Diffusion Model | Vincent Tao Hu et.al. | 2403.13802v1 | null |
2024-03-20 | TimeRewind: Rewinding Time with Image-and-Events Video Diffusion | Jingxi Chen et.al. | 2403.13800v1 | null |
2024-03-20 | DepthFM: Fast Monocular Depth Estimation with Flow Matching | Ming Gui et.al. | 2403.13788v1 | null |
2024-03-20 | Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation | Fu-Yun Wang et.al. | 2403.13745v1 | link |
2024-03-20 | DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance | Zixuan Wang et.al. | 2403.13667v1 | link |
2024-03-20 | ZoDi: Zero-Shot Domain Adaptation with Diffusion-Based Image Transfer | Hiroki Azuma et.al. | 2403.13652v1 | null |
2024-03-20 | ReGround: Improving Textual and Spatial Grounding at No Cost | Yuseung Lee et.al. | 2403.13589v1 | null |
2024-03-20 | Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing | Hangeol Chang et.al. | 2403.13551v1 | null |
2024-03-20 | Compress3D: a Compressed Latent Space for 3D Generation from a Single Image | Bowen Zhang et.al. | 2403.13524v1 | null |
2024-03-20 | VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis | Yumeng Li et.al. | 2403.13501v1 | null |
2024-03-20 | Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion | Lucas Nunes et.al. | 2403.13470v1 | link |
2024-03-22 | S2DM: Sector-Shaped Diffusion Models for Video Generation | Haoran Lang et.al. | 2403.13408v2 | null |
2024-03-20 | IIDM: Image-to-Image Diffusion Model for Semantic Image Synthesis | Feng Liu et.al. | 2403.13378v1 | link |
2024-03-24 | AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation | Jingkun An et.al. | 2403.13352v2 | null |
2024-03-21 | LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment | Peishan Cong et.al. | 2403.13307v2 | null |
2024-03-20 | DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception | Yibo Wang et.al. | 2403.13304v1 | null |
2024-03-20 | Building Optimal Neural Architectures using Interpretable Knowledge | Keith G. Mills et.al. | 2403.13293v1 | link |
2024-03-20 | Beyond Skeletons: Integrative Latent Mapping for Coherent 4D Sequence Generation | Qitong Yang et.al. | 2403.13238v1 | null |
2024-03-20 | A Contact Model based on Denoising Diffusion to Learn Variable Impedance Control for Contact-rich Manipulation | Masashi Okada et.al. | 2403.13221v1 | null |
2024-03-20 | Diffusion Model for Data-Driven Black-Box Optimization | Zihao Li et.al. | 2403.13219v1 | null |
2024-03-19 | Depth-guided NeRF Training via Earth Mover's Distance | Anita Rau et.al. | 2403.13206v1 | null |
2024-03-19 | Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos | Hadi Alzayer et.al. | 2403.13044v1 | null |
2024-03-19 | FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Linjiang Huang et.al. | 2403.12963v1 | link |
2024-03-19 | FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation | Shuai Yang et.al. | 2403.12962v1 | link |
2024-03-19 | Zero-Reference Low-Light Enhancement via Physical Quadruple Priors | Wenjing Wang et.al. | 2403.12933v1 | null |
2024-03-19 | Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model | Jiajie Yang et.al. | 2403.12915v1 | null |
2024-03-19 | D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation | Jun Yamada et.al. | 2403.12861v1 | null |
2024-03-19 | Generative Enhancement for 3D Medical Images | Lingting Zhu et.al. | 2403.12852v1 | link |
2024-03-19 | Compositional 3D Scene Synthesis with Scene Graph Guided Layout-Shape Generation | Yao Wei et.al. | 2403.12848v1 | null |
2024-03-19 | DreamDA: Generative Data Augmentation with Diffusion Models | Yunxiang Fu et.al. | 2403.12803v1 | link |
2024-03-19 | WaveFace: Authentic Face Restoration with Efficient Frequency Recovery | Yunqi Miao et.al. | 2403.12760v1 | null |
2024-03-19 | Towards Controllable Face Generation with Semantic Latent Diffusion Models | Alex Ergasti et.al. | 2403.12743v1 | link |
2024-03-19 | AnimateDiff-Lightning: Cross-Model Diffusion Distillation | Shanchuan Lin et.al. | 2403.12706v1 | null |
2024-03-19 | Tuning-Free Image Customization with Image and Text Guidance | Pengzhi Li et.al. | 2403.12658v1 | null |
2024-03-19 | LASPA: Latent Spatial Alignment for Fast Training-free Single Image Editing | Yazeed Alharbi et.al. | 2403.12585v1 | null |
2024-03-19 | Generalized Consistency Trajectory Models for Image Manipulation | Beomsu Kim et.al. | 2403.12510v1 | link |
2024-03-19 | SC-Diff: 3D Shape Completion with Latent Diffusion Models | Juan D. Galvis et.al. | 2403.12470v1 | null |
2024-03-19 | Do Generated Data Always Help Contrastive Learning? | Yifei Wang et.al. | 2403.12448v1 | link |
2024-03-19 | Precise-Physics Driven Text-to-3D Generation | Qingshan Xu et.al. | 2403.12438v1 | null |
2024-03-19 | ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance | Yongwei Chen et.al. | 2403.12409v1 | null |
2024-03-19 | Understanding Training-free Diffusion Guidance: Mechanisms and Limitations | Yifei Shen et.al. | 2403.12404v1 | null |
2024-03-19 | OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation | Junhao Cai et.al. | 2403.12396v1 | null |
2024-03-18 | Removing Undesirable Concepts in Text-to-Image Generative Models with Learnable Prompts | Anh Bui et.al. | 2403.12326v1 | null |
2024-03-18 | Synthetic Image Generation in Cyber Influence Operations: An Emergent Threat? | Melanie Mathys et.al. | 2403.12207v1 | null |
2024-03-18 | Latent CLAP Loss for Better Foley Sound Synthesis | Tornike Karchkhadze et.al. | 2403.12182v1 | null |
2024-03-18 | Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection | Ali Karami et.al. | 2403.12172v1 | null |
2024-03-18 | Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation | Zixin Zhu et.al. | 2403.12042v1 | link |
2024-03-19 | MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control | Enshen Zhou et.al. | 2403.12037v2 | link |
2024-03-18 | One-Step Image Translation with Text-to-Image Models | Gaurav Parmar et.al. | 2403.12036v1 | link |
2024-03-18 | VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models | Junlin Han et.al. | 2403.12034v1 | null |
2024-03-19 | Generic 3D Diffusion Adapter Using Controlled Multi-View Editing | Hansheng Chen et.al. | 2403.12032v2 | null |
2024-03-18 | LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation | Yushi Lan et.al. | 2403.12019v1 | null |
2024-03-18 | Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation | Axel Sauer et.al. | 2403.12015v1 | null |
2024-03-18 | GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image | Xiao Fu et.al. | 2403.12013v1 | null |
2024-03-18 | HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data | Mengqi Zhang et.al. | 2403.12011v1 | null |
2024-03-18 | VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model | Qi Zuo et.al. | 2403.12010v1 | null |
2024-03-18 | SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion | Vikram Voleti et.al. | 2403.12008v1 | null |
2024-03-18 | SceneSense: Diffusion Models for 3D Occupancy Synthesis from Partial Observation | Alec Reed et.al. | 2403.11985v1 | null |
2024-03-18 | Diffusion Denoising as a Certified Defense against Clean-label Poisoning | Sanghyun Hong et.al. | 2403.11981v1 | null |
2024-03-18 | Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory | Hengyu Fu et.al. | 2403.11968v1 | null |
2024-03-18 | LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model | Runhui Huang et.al. | 2403.11929v1 | null |
2024-03-18 | Dual-Energy Cone-Beam CT Using Two Complementary Limited-Angle Scans with A Projection-Consistent Diffusion Model | Junbo Peng et.al. | 2403.11890v1 | null |
2024-03-18 | SuperLoRA: Parameter-Efficient Unified Adaptation of Multi-Layer Attention Modules | Xiangyu Chen et.al. | 2403.11887v1 | null |
2024-03-18 | IDF-CR: Iterative Diffusion Process for Divide-and-Conquer Cloud Removal in Remote-sensing Images | Meilin Wang et.al. | 2403.11870v1 | null |
2024-03-18 | Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm | Yi Wu et.al. | 2403.11781v1 | null |
2024-03-18 | Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models | Emilian Postolache et.al. | 2403.11706v1 | link |
2024-03-19 | Urban Scene Diffusion through Semantic Occupancy Map | Junge Zhang et.al. | 2403.11697v2 | null |
2024-03-18 | Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | Julia Wolleb et.al. | 2403.11667v1 | null |
2024-03-18 | Arc2Face: A Foundation Model of Human Faces | Foivos Paraperas Papantoniou et.al. | 2403.11641v1 | null |
2024-03-18 | LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models | Yang Yang et.al. | 2403.11627v1 | link |
2024-03-18 | CRS-Diff: Controllable Generative Remote Sensing Foundation Model | Datao Tang et.al. | 2403.11614v1 | null |
2024-03-18 | EffiVED: Efficient Video Editing via Text-instruction Diffusion Models | Zhenghao Zhang et.al. | 2403.11568v1 | null |
2024-03-18 | EchoReel: Enhancing Action Generation of Existing Video Diffusion Models | Jianzhi Liu et.al. | 2403.11535v1 | link |
2024-03-18 | Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors | Ruicheng Wang et.al. | 2403.11503v1 | null |
2024-03-18 | SeisFusion: Constrained Diffusion Model with Input Guidance for 3D Seismic Data Interpolation and Reconstruction | Shuang Wang et.al. | 2403.11482v1 | link |
2024-03-18 | ALDM-Grasping: Diffusion-aided Zero-Shot Sim-to-Real Transfer for Robot Grasping | Yiwei Li et.al. | 2403.11459v1 | null |
2024-03-18 | CasSR: Activating Image Power for Real-World Image Super-Resolution | Haolan Chen et.al. | 2403.11451v1 | null |
2024-03-18 | VmambaIR: Visual State Space Model for Image Restoration | Yuan Shi et.al. | 2403.11423v1 | link |
2024-03-18 | DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation | Jeongsol Kim et.al. | 2403.11415v1 | null |
2024-03-18 | Divide-and-Conquer Posterior Sampling for Denoising Diffusion Priors | Yazid Janati et.al. | 2403.11407v1 | null |
2024-03-17 | StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining | Tushar Kataria et.al. | 2403.11340v1 | null |
2024-03-17 | Fast Personalized Text-to-Image Syntheses With Attention Injection | Yuxuan Zhang et.al. | 2403.11284v1 | null |
2024-03-17 | Understanding Diffusion Models by Feynman's Path Integral | Yuji Hirono et.al. | 2403.11262v1 | null |
2024-03-17 | THOR: Text to Human-Object Interaction Diffusion via Relation Intervention | Qianyang Wu et.al. | 2403.11208v1 | null |
2024-03-17 | MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation | Yasufumi Kawano et.al. | 2403.11194v1 | link |
2024-03-17 | Artifact Feature Purification for Cross-domain Detection of AI-generated Images | Zheling Meng et.al. | 2403.11172v1 | null |
2024-03-17 | CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion | Xiaoyu Wu et.al. | 2403.11162v1 | null |
2024-03-17 | Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model | Dian Zheng et.al. | 2403.11157v1 | link |
2024-03-17 | Omni-Recon: Towards General-Purpose Neural Radiance Fields for Versatile 3D Applications | Yonggan Fu et.al. | 2403.11131v1 | null |
2024-03-17 | 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models | Yongtao Ge et.al. | 2403.11111v1 | null |
2024-03-17 | Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models | Ruibin Li et.al. | 2403.11105v1 | link |
2024-03-19 | Zippo: Zipping Color and Transparency Distributions into a Single Diffusion Model | Kangyang Xie et.al. | 2403.11077v2 | null |
2024-03-17 | Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention | Jie Ren et.al. | 2403.11052v1 | null |
2024-03-16 | Reward Guided Latent Consistency Distillation | Jiachen Li et.al. | 2403.11027v1 | null |
2024-03-16 | OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models | Zhe Kong et.al. | 2403.10983v1 | link |
2024-03-16 | Ctrl123: Consistent Novel View Synthesis via Closed-Loop Transcription | Hongxiang Zhao et.al. | 2403.10953v1 | null |
2024-03-19 | Efficient Diffusion-Driven Corruption Editor for Test-Time Adaptation | Yeongtak Oh et.al. | 2403.10911v2 | null |
2024-03-19 | Urban Sound Propagation: a Benchmark for 1-Step Generative Modeling of Complex Physical Systems | Martin Spitznagel et.al. | 2403.10904v2 | null |
2024-03-16 | A Watermark-Conditioned Diffusion Model for IP Protection | Rui Min et.al. | 2403.10893v1 | null |
2024-03-16 | stMCDI: Masked Conditional Diffusion Model with Graph Neural Network for Spatial Transcriptomics Data Imputation | Xiaoyu Li et.al. | 2403.10863v1 | null |
2024-03-16 | MicroDiffusion: Implicit Representation-Guided Diffusion for 3D Reconstruction from Limited 2D Microscopy Projections | Mude Hui et.al. | 2403.10815v1 | link |
2024-03-16 | Efficient Trajectory Forecasting and Generation with Conditional Flow Matching | Sean Ye et.al. | 2403.10809v1 | null |
2024-03-16 | Speech-driven Personalized Gesture Synthetics: Harnessing Automatic Fuzzy Feature Inference | Fan Zhang et.al. | 2403.10805v1 | null |
2024-03-16 | Diffusion-Reinforcement Learning Hierarchical Motion Planning in Adversarial Multi-agent Games | Zixuan Wu et.al. | 2403.10794v1 | link |
2024-03-16 | ContourDiff: Unpaired Image Translation with Contour-Guided Diffusion Models | Yuwen Chen et.al. | 2403.10786v1 | null |
2024-03-15 | Giving a Hand to Diffusion Models: a Two-Stage Approach to Improving Conditional Human Image Generation | Anton Pelykh et.al. | 2403.10731v1 | null |
2024-03-15 | Debiasing with Diffusion: Probabilistic reconstruction of Dark Matter fields from galaxies with CAMELS | Victoria Ono et.al. | 2403.10648v1 | null |
2024-03-15 | LightIt: Illumination Modeling and Control for Diffusion Models | Peter Kocsis et.al. | 2403.10615v1 | null |
2024-03-15 | Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives | Ronghui Li et.al. | 2403.10518v1 | link |
2024-03-15 | Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding | Pengkun Liu et.al. | 2403.10395v1 | link |
2024-03-15 | Denoising Task Difficulty-based Curriculum for Training Diffusion Models | Jin-Young Kim et.al. | 2403.10348v1 | null |
2024-03-15 | Optimal Control of Stationary Doubly Diffusive Flows on Two and Three Dimensional Bounded Lipschitz Domains: Numerical Analysis | Jai Tushar et.al. | 2403.10282v1 | null |
2024-03-15 | Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder | Jinseok Kim et.al. | 2403.10255v1 | null |
2024-03-15 | FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model | Qijun Feng et.al. | 2403.10242v1 | null |
2024-03-15 | BlindDiff: Empowering Degradation Modelling in Diffusion Models for Blind Image Super-Resolution | Feng Li et.al. | 2403.10211v1 | link |
2024-03-15 | Spectral CT Two-step and One-step Material Decomposition using Diffusion Posterior Sampling | Corentin Vazia et.al. | 2403.10183v1 | null |
2024-03-15 | Animate Your Motion: Turning Still Images into Dynamic Videos | Mingxiao Li et.al. | 2403.10179v1 | null |
2024-03-15 | Being heterogeneous is disadvantageous: Brownian non-Gaussian searches | Vittoria Sposini et.al. | 2403.10138v1 | null |
2024-03-15 | DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration | Nan Gao et.al. | 2403.10098v1 | null |
2024-03-15 | RangeLDM: Fast Realistic LiDAR Point Cloud Generation | Qianjiang Hu et.al. | 2403.10094v1 | null |
2024-03-15 | SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model | Tao Wu et.al. | 2403.10044v1 | null |
2024-03-15 | ST-LDM: A Universal Framework for Text-Grounded Object Generation in Real Images | Xiangtian Xue et.al. | 2403.10004v1 | null |
2024-03-15 | Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting | Zhiqi Li et.al. | 2403.09981v1 | null |
2024-03-14 | ProMark: Proactive Diffusion Watermarking for Causal Attribution | Vishal Asnani et.al. | 2403.09914v1 | null |
2024-03-14 | DTG: Diffusion-based Trajectory Generation for Mapless Global Navigation | Jing Liang et.al. | 2403.09900v1 | null |
2024-03-14 | SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior | Huan-ang Gao et.al. | 2403.09638v1 | null |
2024-03-14 | 3D-VLA: A 3D Vision-Language-Action Generative World Model | Haoyu Zhen et.al. | 2403.09631v1 | null |
2024-03-14 | Generalized Predictive Model for Autonomous Driving | Jiazhi Yang et.al. | 2403.09630v1 | link |
2024-03-14 | Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation | Fangfu Liu et.al. | 2403.09625v1 | null |
2024-03-14 | Score-Guided Diffusion for 3D Human Recovery | Anastasis Stathopoulos et.al. | 2403.09623v1 | link |
2024-03-14 | Explore In-Context Segmentation via Latent Diffusion Models | Chaoyang Wang et.al. | 2403.09616v1 | null |
2024-03-14 | MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | Zunnan Xu et.al. | 2403.09471v1 | null |
2024-03-14 | Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing | Wonjun Kang et.al. | 2403.09468v1 | link |
2024-03-14 | Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk | Zhangheng Li et.al. | 2403.09450v1 | link |
2024-03-14 | 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation | Frank Zhang et.al. | 2403.09439v1 | null |
2024-03-14 | LM2D: Lyrics- and Music-Driven Dance Synthesis | Wenjie Yin et.al. | 2403.09407v1 | null |
2024-03-14 | Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction | Hanyu Chen et.al. | 2403.09355v1 | null |
2024-03-14 | HeadEvolver: Text to Head Avatars via Locally Learnable Mesh Deformation | Duotun Wang et.al. | 2403.09326v1 | null |
2024-03-14 | Regularity and trend to equilibrium for a non-local advection-diffusion model of active particles | Luca Alasio et.al. | 2403.09282v1 | null |
2024-03-14 | XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model | Anees Ur Rehman Hashmi et.al. | 2403.09240v1 | null |
2024-03-14 | Intention-driven Ego-to-Exo Video Generation | Hongchen Luo et.al. | 2403.09194v1 | null |
2024-03-14 | Intention-aware Denoising Diffusion Model for Trajectory Prediction | Chen Liu et.al. | 2403.09190v1 | null |
2024-03-14 | Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts | Byeongjun Park et.al. | 2403.09176v1 | link |
2024-03-14 | Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior | Cheng Chen et.al. | 2403.09140v1 | null |
2024-03-14 | Rethinking Referring Object Removal | Xiangtian Xue et.al. | 2403.09128v1 | null |
2024-03-14 | StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control | Jaerin Lee et.al. | 2403.09055v1 | link |
2024-03-13 | Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images | Giuseppe Cartella et.al. | 2403.08933v1 | link |
2024-03-13 | Envision3D: One Image to 3D with Anchor Views Interpolation | Yatian Pang et.al. | 2403.08902v1 | link |
2024-03-13 | Federated Data Model | Xiao Chen et.al. | 2403.08887v1 | null |
2024-03-13 | NoiseDiffusion: Correcting Noise for Image Interpolation with Diffusion Models beyond Spherical Linear Interpolation | PengFei Zheng et.al. | 2403.08840v1 | null |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764v1 | null |
2024-03-13 | Spatiotemporal Diffusion Model with Paired Sampling for Accelerated Cardiac Cine MRI | Shihan Qiu et.al. | 2403.08758v1 | null |
2024-03-13 | Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI | Shihan Qiu et.al. | 2403.08749v1 | null |
2024-03-14 | GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing | Jing Wu et.al. | 2403.08733v2 | null |
2024-03-13 | Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data | Asad Aali et.al. | 2403.08728v1 | link |
2024-03-13 | Data Augmentation in Human-Centric Vision | Wentao Jiang et.al. | 2403.08650v1 | null |
2024-03-13 | ActionDiffusion: An Action-aware Diffusion Model for Procedure Planning in Instructional Videos | Lei Shi et.al. | 2403.08591v1 | null |
2024-03-13 | Federated Knowledge Graph Unlearning via Diffusion Model | Bingchen Liu et.al. | 2403.08554v1 | null |
2024-03-13 | Model Will Tell: Training Membership Inference for Diffusion Models | Xiaomeng Fu et.al. | 2403.08487v1 | null |
2024-03-13 | MD-Dose: A Diffusion Model based on the Mamba for Radiotherapy Dose Prediction | Linjie Fu et.al. | 2403.08479v1 | link |
2024-03-13 | An Analysis of Human Alignment of Latent Diffusion Models | Lorenz Linhardt et.al. | 2403.08469v1 | null |
2024-03-13 | Diffusion Models with Implicit Guidance for Medical Anomaly Detection | Cosmin I. Bercea et.al. | 2403.08464v1 | link |
2024-03-13 | Towards Dense and Accurate Radar Perception Via Efficient Cross-Modal Diffusion Model | Ruibin Zhang et.al. | 2403.08460v1 | null |
2024-03-13 | PFStorer: Personalized Face Restoration and Super-Resolution | Tuomas Varanka et.al. | 2403.08436v1 | null |
2024-03-13 | Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification | Shuhan Li et.al. | 2403.08407v1 | null |
2024-03-13 | Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models | Pengze Zhang et.al. | 2403.08381v1 | link |
2024-03-13 | Mitigate Target-level Insensitivity of Infrared Small Target Detection via Posterior Distribution Modeling | Haoqing Li et.al. | 2403.08380v1 | link |
2024-03-13 | VIGFace: Virtual Identity Generation Model for Face Image Synthesis | Minsoo Kim et.al. | 2403.08277v1 | null |
2024-03-13 | Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models | Jian Lin et.al. | 2403.08266v1 | null |
2024-03-13 | Make Me Happier: Evoking Emotions Through Image Diffusion Models | Qing Lin et.al. | 2403.08255v1 | null |
2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Shihao Zhao et.al. | 2403.07860v1 | link |
2024-03-12 | Quantifying and Mitigating Privacy Risks for Tabular Generative Models | Chaoyi Zhu et.al. | 2403.07842v1 | null |
2024-03-12 | MPCPA: Multi-Center Privacy Computing with Predictions Aggregation based on Denoising Diffusion Probabilistic Model | Guibo Luo et.al. | 2403.07838v1 | null |
2024-03-13 | SemCity: Semantic Scene Generation with Triplane Diffusion | Jumin Lee et.al. | 2403.07773v2 | link |
2024-03-12 | Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model | Yuxuan Zhang et.al. | 2403.07764v1 | null |
2024-03-12 | SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces | Yuta Oshima et.al. | 2403.07711v1 | link |
2024-03-12 | Visual Privacy Auditing with Diffusion Models | Kristian Schwethelm et.al. | 2403.07588v1 | null |
2024-03-12 | D4D: An RGBD diffusion model to boost monocular depth estimation | L. Papa et.al. | 2403.07516v1 | link |
2024-03-12 | Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation | Likun Li et.al. | 2403.07500v1 | null |
2024-03-12 | Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models | Phuong Dam et.al. | 2403.07371v1 | null |
2024-03-12 | Efficient Diffusion Model for Image Restoration by Residual Shifting | Zongsheng Yue et.al. | 2403.07319v1 | link |
2024-03-12 | It's All About Your Sketch: Democratising Sketch Control in Diffusion Models | Subhadeep Koley et.al. | 2403.07234v1 | link |
2024-03-12 | Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers | Subhadeep Koley et.al. | 2403.07214v1 | null |
2024-03-11 | 3M-Diffusion: Latent Multi-Modal Diffusion for Text-Guided Generation of Molecular Graphs | Huaisheng Zhu et.al. | 2403.07179v1 | null |
2024-03-11 | One Category One Prompt: Dataset Distillation using Diffusion Models | Ali Abbasi et.al. | 2403.07142v1 | null |
2024-03-11 | BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion | Xuan Ju et.al. | 2403.06976v1 | link |
2024-03-11 | Bayesian Diffusion Models for 3D Shape Reconstruction | Haiyang Xu et.al. | 2403.06973v1 | null |
2024-03-11 | POD-ROM methods: from a finite set of snapshots to continuous-in-time approximations | Bosco Garcia-Archilla et.al. | 2403.06967v1 | null |
2024-03-11 | SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data | Jialu Li et.al. | 2403.06952v1 | null |
2024-03-12 | DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | Tianhao Qi et.al. | 2403.06951v2 | null |
2024-03-11 | Conditional Score-Based Diffusion Model for Cortical Thickness Trajectory Prediction | Qing Xiao et.al. | 2403.06940v1 | null |
2024-03-11 | Estimation of parameters and local times in a discretely observed threshold diffusion model | Sara Mazzonetto et.al. | 2403.06858v1 | null |
2024-03-11 | Multistep Consistency Models | Jonathan Heek et.al. | 2403.06807v1 | null |
2024-03-11 | Distribution-Aware Data Expansion with Diffusion Models | Haowei Zhu et.al. | 2403.06741v1 | link |
2024-03-11 | V3D: Video Diffusion Models are Effective 3D Generators | Zilong Chen et.al. | 2403.06738v1 | link |
2024-03-11 | Active Generation for Image Classification | Tao Huang et.al. | 2403.06517v1 | null |
2024-03-11 | Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning | Woojung Han et.al. | 2403.06516v1 | null |
2024-03-11 | Incorporating Improved Sinusoidal Threshold-based Semi-supervised Method and Diffusion Models for Osteoporosis Diagnosis | Wenchi Ke et.al. | 2403.06498v1 | null |
2024-03-11 | Are you sure? Modelling Drivers' Confidence Judgments in Left-Turn Gap Acceptance Decisions | Arkady Zgonnikov et.al. | 2403.06496v1 | null |
2024-03-13 | Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation | Guangyang Wu et.al. | 2403.06452v2 | null |
2024-03-11 | DivCon: Divide and Conquer for Progressive Text-to-Image Generation | Yuhao Jia et.al. | 2403.06400v1 | link |
2024-03-13 | FSViewFusion: Few-Shots View Generation of Novel Objects | Rukhshanda Hussain et.al. | 2403.06394v2 | null |
2024-03-11 | Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models | Yang Zhang et.al. | 2403.06381v1 | null |
2024-03-12 | Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style | Shuai Tan et.al. | 2403.06365v2 | null |
2024-03-10 | Transferable Reinforcement Learning via Generalized Occupancy Models | Chuning Zhu et.al. | 2403.06328v1 | null |
2024-03-10 | Spectral Diffusion Posterior Sampling for Synergistic Reconstruction in Spectral Computed Tomography | Corentin Vazia et.al. | 2403.06308v1 | null |
2024-03-12 | Fine-tuning of diffusion models via stochastic control: entropy regularization and beyond | Wenpin Tang et.al. | 2403.06279v2 | null |
2024-03-10 | FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing | Youyuan Zhang et.al. | 2403.06269v1 | null |
2024-03-10 | DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation | Xiaobin Hu et.al. | 2403.06168v1 | null |
2024-03-10 | Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation | Paweł A. Pierzchlewicz et.al. | 2403.06164v1 | link |
2024-03-10 | MACE: Mass Concept Erasure in Diffusion Models | Shilin Lu et.al. | 2403.06135v1 | link |
2024-03-10 | VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models | Wenhao Wang et.al. | 2403.06098v1 | link |
2024-03-10 | Diffusion Models Trained with Large Data Are Transferable Visual Models | Guangkai Xu et.al. | 2403.06090v1 | null |
2024-03-10 | Implicit Image-to-Image Schrodinger Bridge for CT Super-Resolution and Denoising | Yuang Wang et.al. | 2403.06069v1 | null |
2024-03-12 | Decoupled Data Consistency with Diffusion Purification for Image Restoration | Xiang Li et.al. | 2403.06054v2 | null |
2024-03-09 | CoNFiLD: Conditional Neural Field Latent Diffusion Model Generating Spatiotemporal Turbulence | Pan Du et.al. | 2403.05940v1 | null |
2024-03-12 | SEMRes-DDPM: Residual Network Based Diffusion Modelling Applied to Imbalanced Data | Ming Zheng et.al. | 2403.05918v2 | null |
2024-03-09 | Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines | Michael Toker et.al. | 2403.05846v1 | null |
2024-03-12 | An Audio-textual Diffusion Model For Converting Speech Signals Into Ultrasound Tongue Imaging Data | Yudong Yang et.al. | 2403.05820v2 | null |
2024-03-09 | Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution | Junxiong Lin et.al. | 2403.05808v1 | null |
2024-03-09 | Privacy-Preserving Diffusion Model Using Homomorphic Encryption | Yaojian Chen et.al. | 2403.05794v1 | null |
2024-03-09 | Large Generative Model Assisted 3D Semantic Communication | Feibo Jiang et.al. | 2403.05783v1 | null |
2024-03-09 | MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process | Xinyao Fan et.al. | 2403.05751v1 | link |
2024-03-08 | Non-robustness of diffusion estimates on networks with measurement error | Arun G. Chandrasekhar et.al. | 2403.05704v1 | null |
2024-03-08 | Audio-Synchronized Visual Animation | Lin Zhang et.al. | 2403.05659v1 | null |
2024-03-08 | VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Yabo Zhang et.al. | 2403.05438v1 | link |
2024-03-08 | DiffSF: Diffusion Models for Scene Flow Estimation | Yushan Zhang et.al. | 2403.05327v1 | null |
2024-03-08 | Noise Level Adaptive Diffusion Model for Robust Reconstruction of Accelerated MRI | Shoujin Huang et.al. | 2403.05245v1 | null |
2024-03-08 | Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation | Junyan Wang et.al. | 2403.05239v1 | null |
2024-03-08 | Denoising Autoregressive Representation Learning | Yazhe Li et.al. | 2403.05196v1 | null |
2024-03-08 | DiffuLT: How to Make Diffusion Model Useful for Long-tail Recognition | Jie Shao et.al. | 2403.05170v1 | null |
2024-03-08 | GSEdit: Efficient Text-Guided Editing of 3D Objects via Gaussian Splatting | Francesco Palandra et.al. | 2403.05154v1 | null |
2024-03-08 | Improving Diffusion Models for Virtual Try-on | Yisol Choi et.al. | 2403.05139v1 | null |
2024-03-08 | ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment | Xiwei Hu et.al. | 2403.05135v1 | null |
2024-03-08 | CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion | Wendi Zheng et.al. | 2403.05121v1 | null |
2024-03-08 | Face2Diffusion for Fast and Editable Face Personalization | Kaede Shiohara et.al. | 2403.05094v1 | link |
2024-03-08 | Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile | Seokjun Lee et.al. | 2403.05093v1 | null |
2024-03-08 | Improving Diffusion-Based Generative Models via Approximated Optimal Transport | Daegyu Kim et.al. | 2403.05069v1 | null |
2024-03-08 | XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution | Yunpeng Qu et.al. | 2403.05049v1 | null |
2024-03-08 | BjTT: A Large-scale Multimodal Dataset for Traffic Prediction | Chengyang Zhang et.al. | 2403.05029v1 | link |
2024-03-08 | InstructGIE: Towards Generalizable Image Editing | Zichong Meng et.al. | 2403.05018v1 | null |
2024-03-08 | DiffClass: Diffusion-Based Class Incremental Learning | Zichong Meng et.al. | 2403.05016v1 | null |
2024-03-08 | RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction | Peng Liu et.al. | 2403.05010v1 | link |
2024-03-08 | StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models | Lezhong Wang et.al. | 2403.04965v1 | null |
2024-03-07 | AFreeCA: Annotation-Free Counting for All | Adriano D'Alessandro et.al. | 2403.04943v1 | null |
2024-03-07 | An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control | Aosong Feng et.al. | 2403.04880v1 | null |
2024-03-07 | ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes | Hashmat Shadab Malik et.al. | 2403.04701v1 | null |
2024-03-07 | Delving into the Trajectory Long-tail Distribution for Muti-object Tracking | Sijia Chen et.al. | 2403.04700v1 | link |
2024-03-07 | PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | Junsong Chen et.al. | 2403.04692v1 | null |
2024-03-08 | Pix2Gif: Motion-Guided Diffusion for GIF Generation | Hitesh Kandala et.al. | 2403.04634v2 | null |
2024-03-07 | A Domain Translation Framework with an Adversarial Denoising Diffusion Model to Generate Synthetic Datasets of Echocardiography Images | Cristiana Tiago et.al. | 2403.04612v1 | null |
2024-03-07 | Anatomy-Guided Surface Diffusion Model for Alzheimer's Disease Normative Modeling | Jianwei Zhang et.al. | 2403.04531v1 | null |
2024-03-07 | Effect of turbulent diffusion in modeling anaerobic digestion | Jeremy Z. Yan et.al. | 2403.04457v1 | null |
2024-03-07 | Disentangled Diffusion-Based 3D Human Pose Estimation with Hierarchical Spatial and Temporal Denoiser | Qingyuan Cai et.al. | 2403.04444v1 | null |
2024-03-07 | StableDrag: Stable Dragging for Point-based Image Editing | Yutao Cui et.al. | 2403.04437v1 | null |
2024-03-07 | On-demand Quantization for Green Federated Generative Diffusion in Mobile Edge Networks | Bingkun Lai et.al. | 2403.04430v1 | null |
2024-03-07 | Controllable Generation with Text-to-Image Diffusion Models: A Survey | Pu Cao et.al. | 2403.04279v1 | link |
2024-03-06 | PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement | Zhijie Wang et.al. | 2403.04014v1 | link |
2024-03-06 | GUIDE: Guidance-based Incremental Learning with Diffusion Models | Bartosz Cywiński et.al. | 2403.03938v1 | link |
2024-03-06 | Latent Dataset Distillation with Diffusion Models | Brian B. Moser et.al. | 2403.03881v1 | null |
2024-03-06 | Accelerating Convergence of Score-Based Diffusion Models, Provably | Gen Li et.al. | 2403.03852v1 | null |
2024-03-06 | Diffusion on language model embeddings for protein sequence generation | Viacheslav Meshchaninov et.al. | 2403.03726v1 | null |
2024-03-06 | Efficient Search and Learning for Agile Locomotion on Stepping Stones | Adithya Kumar Chinnakkonda Ravi et.al. | 2403.03639v1 | null |
2024-03-06 | Diffusion-based Generative Prior for Low-Complexity MIMO Channel Estimation | Benedikt Fesl et.al. | 2403.03545v1 | link |
2024-03-06 | NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging | Takahiro Shirakawa et.al. | 2403.03485v1 | null |
2024-03-06 | FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion | Hao Wang et.al. | 2403.03463v1 | null |
2024-03-06 | Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing | Bingyan Liu et.al. | 2403.03431v1 | null |
2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Patrick Esser et.al. | 2403.03206v1 | null |
2024-03-05 | MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets | Hossein Aboutalebi et.al. | 2403.03194v1 | null |
2024-03-05 | NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models | Zeqian Ju et.al. | 2403.03100v1 | null |
2024-03-05 | Global N-body Simulation of Gap Edge Structures Created by Perturbations from a Small Satellite Embedded in Saturn's Rings | Naoya Torii et.al. | 2403.03012v1 | null |
2024-03-05 | Cross-Domain Image Conversion by CycleDM | Sho Shimotsumagari et.al. | 2403.02919v1 | null |
2024-03-05 | MMoFusion: Multi-modal Co-Speech Motion Generation with Diffusion Model | Sen Wang et.al. | 2403.02905v1 | null |
2024-03-05 | Enhancing the Rate-Distortion-Perception Flexibility of Learned Image Codecs with Conditional Diffusion Decoders | Daniele Mari et.al. | 2403.02887v1 | null |
2024-03-05 | Zero-LED: Zero-Reference Lighting Estimation Diffusion Model for Low-Light Image Enhancement | Jinhong He et.al. | 2403.02879v1 | null |
2024-03-05 | Scalable Continuous-time Diffusion Framework for Network Inference and Influence Estimation | Keke Huang et.al. | 2403.02867v1 | null |
2024-03-05 | Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation | Weijie Li et.al. | 2403.02827v1 | null |
2024-03-05 | Fast, Scale-Adaptive, and Uncertainty-Aware Downscaling of Earth System Model Fields with Generative Foundation Models | Philipp Hess et.al. | 2403.02774v1 | null |
2024-03-05 | Few-shot Learner Parameterization by Diffusion Time-steps | Zhongqi Yue et.al. | 2403.02649v1 | null |
2024-03-05 | Semantic Human Mesh Reconstruction with Textures | Xiaoyu Zhan et.al. | 2403.02561v1 | null |
2024-03-05 | Updating the Minimum Information about CLinical Artificial Intelligence (MI-CLAIM) checklist for generative modeling research | Brenda Y. Miao et.al. | 2403.02558v1 | link |
2024-03-06 | UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control | Xuweiyi Chen et.al. | 2403.02332v3 | link |
2024-03-04 | 3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors | Fangzhou Hong et.al. | 2403.02234v1 | link |
2024-03-04 | DragTex: Generative Point-Based Texture Editing on 3D Mesh | Yudi Zhang et.al. | 2403.02217v1 | null |
2024-03-04 | ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models | Jiaxiang Cheng et.al. | 2403.02084v1 | link |
2024-03-04 | FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio | Chao Xu et.al. | 2403.01901v1 | link |
2024-03-04 | ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models | Lukas Höllein et.al. | 2403.01807v1 | link |
2024-03-07 | OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Yuhao Xu et.al. | 2403.01779v2 | link |
2024-03-04 | Differentially Private Synthetic Data via Foundation Model APIs 2: Text | Chulin Xie et.al. | 2403.01749v1 | link |
2024-03-04 | Soft-constrained Schrodinger Bridge: a Stochastic Control Approach | Jhanvi Garg et.al. | 2403.01717v1 | null |
2024-03-04 | HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances | Supreeth Narasimhaswamy et.al. | 2403.01693v1 | null |
2024-03-07 | Reaction-diffusion models of biological invasion: Open source computational tools, key concepts and analysis | Matthew J Simpson et.al. | 2403.01667v4 | link |
2024-03-03 | Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models | Yuchen Wu et.al. | 2403.01639v1 | null |
2024-03-03 | Critical windows: non-asymptotic theory for feature emergence in diffusion models | Marvin Li et.al. | 2403.01633v1 | null |
2024-03-03 | Neural Graph Generator: Feature-Conditioned Graph Generation using Latent Diffusion Models | Iakovos Evdaimon et.al. | 2403.01535v1 | link |
2024-03-03 | SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation | Hongjian Liu et.al. | 2403.01505v1 | null |
2024-03-03 | Learning A Physical-aware Diffusion Model Based on Transformer for Underwater Image Enhancement | Chen Zhao et.al. | 2403.01497v1 | null |
2024-03-03 | Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection | Sam Dauncey et.al. | 2403.01485v1 | null |
2024-03-02 | DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction | Junwen Xiong et.al. | 2403.01226v1 | null |
2024-03-02 | TCIG: Two-Stage Controlled Image Generation with Quality Enhancement through Diffusion | Salaheldin Mohamed et.al. | 2403.01212v1 | null |
2024-03-02 | Training Unbiased Diffusion Models From Biased Dataset | Yeongmin Kim et.al. | 2403.01189v1 | link |
2024-03-02 | Volume diffusion modelling of a sheared granular gas | Duncan Dockar et.al. | 2403.01188v1 | null |
2024-03-02 | Text-guided Explorable Image Super-resolution | Kanchana Vaishnavi Gandikota et.al. | 2403.01124v1 | null |
2024-03-02 | Face Swap via Diffusion Model | Feifei Wang et.al. | 2403.01108v1 | null |
2024-03-01 | A time-stepping deep gradient flow method for option pricing in (rough) diffusion models | Antonis Papapantoleon et.al. | 2403.00746v1 | null |
2024-03-01 | Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks | Yuhao Liu et.al. | 2403.00644v1 | null |
2024-03-01 | Improving Explicit Spatial Relationships in Text-to-Image Generation through an Automatically Derived Dataset | Ander Salaberria et.al. | 2403.00587v1 | link |
2024-03-01 | Rethinking cluster-conditioned diffusion models | Nikolas Adaloglou et.al. | 2403.00570v1 | null |
2024-03-01 | Waves, patterns and bifurcations: a tutorial review on the vertebrate segmentation clock | Paul François et.al. | 2403.00457v1 | null |
2024-03-01 | An Ordinal Diffusion Model for Generating Medical Images with Different Severity Levels | Shumpei Takezaki et.al. | 2403.00452v1 | null |
2024-03-01 | LoMOE: Localized Multi-Object Editing via Multi-Diffusion | Goirik Chakrabarty et.al. | 2403.00437v1 | null |
2024-03-01 | Abductive Ego-View Accident Video Understanding for Safe Driving Perception | Jianwu Fang et.al. | 2403.00436v1 | null |
2024-03-01 | HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation | Zhiying Leng et.al. | 2403.00372v1 | null |
2024-03-05 | Robust Policy Learning via Offline Skill Diffusion | Woo Kyung Kim et.al. | 2403.00225v2 | null |
2024-02-29 | DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Muyang Li et.al. | 2402.19481v1 | null |
2024-02-29 | Towards Generalizable Tumor Synthesis | Qi Chen et.al. | 2402.19470v1 | null |
2024-02-29 | Listening to the Noise: Blind Denoising with Gibbs Diffusion | David Heurtel-Depeiges et.al. | 2402.19455v1 | link |
2024-02-29 | Structure Preserving Diffusion Models | Haoye Lu et.al. | 2402.19369v1 | null |
2024-02-29 | A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation | Hanxi Li et.al. | 2402.19330v1 | null |
2024-02-29 | DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly | Gianluca Scarpellini et.al. | 2402.19302v1 | link |
2024-02-29 | TEncDM: Understanding the Properties of Diffusion Model in the Space of Language Model Encodings | Alexander Shabalin et.al. | 2402.19097v1 | null |
2024-03-01 | Graph Convolutional Neural Networks for Automated Echocardiography View Recognition: A Holistic Approach | Sarina Thomas et.al. | 2402.19062v2 | null |
2024-02-29 | WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis | Paul Friedrich et.al. | 2402.19043v1 | link |
2024-02-29 | Generating, Reconstructing, and Representing Discrete and Continuous Data: Generalized Diffusion with Learnable Encoding-Decoding | Guangyi Liu et.al. | 2402.19009v1 | null |
2024-02-29 | ViewFusion: Towards Multi-View Consistency via Interpolated Denoising | Xianghui Yang et.al. | 2402.18842v1 | link |
2024-03-03 | Extended Flow Matching: a Method of Conditional Generation with Generalized Continuity Equation | Noboru Isobe et.al. | 2402.18839v2 | null |
2024-02-29 | A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D | Xiaohan Fei et.al. | 2402.18780v1 | null |
2024-03-04 | Exploring Privacy and Fairness Risks in Sharing Diffusion Models: An Adversarial Perspective | Xinjian Luo et.al. | 2402.18607v2 | null |
2024-02-28 | Logarithmic Sobolev Inequalities for Bounded Domains and Applications to Drift-Diffusion Equations | Elie Abdo et.al. | 2402.18572v1 | null |
2024-02-28 | Dynamical Regimes of Diffusion Models | Giulio Biroli et.al. | 2402.18491v1 | null |
2024-02-28 | Deep Confident Steps to New Pockets: Strategies for Docking Generalization | Gabriele Corso et.al. | 2402.18396v1 | link |
2024-02-28 | Objective and Interpretable Breast Cosmesis Evaluation with Attention Guided Denoising Diffusion Anomaly Detection Model | Sangjoon Park et.al. | 2402.18362v1 | null |
2024-02-28 | FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes | Ziying Pan et.al. | 2402.18331v1 | link |
2024-02-28 | Balancing Act: Distribution-Guided Debiasing in Diffusion Models | Rishubh Parihar et.al. | 2402.18206v1 | null |
2024-02-28 | Diffusion-based Neural Network Weights Generation | Bedionita Soro et.al. | 2402.18153v1 | null |
2024-02-28 | Context-aware Talking Face Video Generation | Meidai Xuanyuan et.al. | 2402.18092v1 | null |
2024-02-28 | Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis | Yanzuo Lu et.al. | 2402.18078v1 | link |
2024-03-05 | SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model | Bin Cao et.al. | 2402.18068v2 | null |
2024-02-28 | Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints | Lingkai Kong et.al. | 2402.18012v1 | null |
2024-03-01 | Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning | Zeyang Liu et.al. | 2402.17978v2 | null |
2024-02-27 | Box It to Bind It: Unified Layout Control and Attribute Binding in T2I Diffusion Models | Ashkan Taghipour et.al. | 2402.17910v1 | null |
2024-02-27 | Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning | Xiaoyu Zhang et.al. | 2402.17768v1 | null |
2024-03-04 | Structure-Guided Adversarial Training of Diffusion Models | Ling Yang et.al. | 2402.17563v2 | null |
2024-02-27 | Diffusion Model-Based Image Editing: A Survey | Yi Huang et.al. | 2402.17525v1 | link |
2024-02-27 | Label-Noise Robust Diffusion Models | Byeonghu Na et.al. | 2402.17517v1 | link |
2024-02-27 | EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | Linrui Tian et.al. | 2402.17485v1 | null |
2024-02-28 | DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models | Shyam Marjit et.al. | 2402.17412v2 | null |
2024-02-27 | Generative diffusion model for surface structure discovery | Nikolaj Rønne et.al. | 2402.17404v1 | null |
2024-02-27 | Denoising Diffusion Models for Inpainting of Healthy Brain Tissue | Alicia Durrer et.al. | 2402.17307v1 | null |
2024-02-27 | DivAvatar: Diverse 3D Avatar Generation with a Single Prompt | Weijing Tao et.al. | 2402.17292v1 | null |
2024-02-27 | Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network | Zhaoyang Wang et.al. | 2402.17285v1 | null |
2024-02-29 | DiFashion: Towards Personalized Outfit Generation and Recommendation | Yiyan Xu et.al. | 2402.17279v2 | null |
2024-02-27 | One-Shot Structure-Aware Stylized Image Synthesis | Hansam Cho et.al. | 2402.17275v1 | null |
2024-02-27 | Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation | Daiqing Li et.al. | 2402.17245v1 | null |
2024-02-28 | CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization | Hao-Yang Peng et.al. | 2402.17214v2 | null |
2024-02-27 | Generative Learning for Forecasting the Dynamics of Complex Systems | Han Gao et.al. | 2402.17157v1 | null |
2024-02-27 | TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation | Lin Zongying et.al. | 2402.17156v1 | link |
2024-02-27 | SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution | Chengcheng Wang et.al. | 2402.17133v1 | link |
2024-03-01 | Transparent Image Layer Diffusion using Latent Transparency | Lvmin Zhang et.al. | 2402.17113v3 | link |
2024-03-01 | Renormalization Group flow, Optimal Transport and Diffusion-based Generative Model | Artan Sheshmani et.al. | 2402.17090v2 | null |
2024-02-26 | A Phase Transition in Diffusion Models Reveals the Hierarchical Nature of Data | Antonio Sclocchi et.al. | 2402.16991v1 | null |
2024-02-25 | Diffusion Posterior Proximal Sampling for Image Restoration | Hongjie Wu et.al. | 2402.16907v1 | null |
2024-02-26 | Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing | Ling Yang et.al. | 2402.16627v1 | link |
2024-02-27 | Stochastic Conditional Diffusion Models for Semantic Image Synthesis | Juyeon Ko et.al. | 2402.16506v2 | null |
2024-02-26 | Outline-Guided Object Inpainting with Diffusion Models | Markus Pobitzer et.al. | 2402.16421v1 | null |
2024-02-26 | Placing Objects in Context via Inpainting for Out-of-distribution Segmentation | Pau de Jorge et.al. | 2402.16392v1 | link |
2024-02-26 | Generative AI in Vision: A Survey on Models, Metrics and Applications | Gaurav Raut et.al. | 2402.16369v1 | null |
2024-02-27 | Feedback Efficient Online Fine-Tuning of Diffusion Models | Masatoshi Uehara et.al. | 2402.16359v2 | null |
2024-02-26 | Graph Diffusion Policy Optimization | Yijing Liu et.al. | 2402.16302v1 | link |
2024-02-25 | Photon-counting CT using a Conditional Diffusion Model for Super-resolution and Texture-preservation | Christopher Wiedeman et.al. | 2402.16212v1 | null |
2024-02-25 | Towards Efficient Quantum Hybrid Diffusion Models | Francesca De Falco et.al. | 2402.16147v1 | null |
2024-02-25 | Cinematographic Camera Diffusion Model | Hongda Jiang et.al. | 2402.16143v1 | null |
2024-02-25 | Behavioral Refinement via Interpolant-based Policy Diffusion | Kaiqi Chen et.al. | 2402.16075v1 | null |
2024-02-24 | HIR-Diff: Unsupervised Hyperspectral Image Restoration Via Improved Diffusion Models | Li Pang et.al. | 2402.15865v1 | link |
2024-02-23 | Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions | Kaihong Zhang et.al. | 2402.15602v1 | null |
2024-02-21 | The Bass diffusion model: agent-based implementation on arbitrary networks | L. Di Lucchio et.al. | 2402.15528v1 | null |
2024-02-23 | Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Chun-Hsiao Yeh et.al. | 2402.15504v1 | link |
2024-02-23 | ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation | Yi Zhang et.al. | 2402.15429v1 | link |
2024-02-23 | Let's Rectify Step by Step: Improving Aspect-based Sentiment Analysis with Diffusion Models | Shunyu Liu et.al. | 2402.15289v1 | link |
2024-02-23 | Weak Reproductive Solutions for a Convection-Diffusion Model Describing a Binary Alloy Solidification Processes | Blanca Climent-Ezquerra et.al. | 2402.15221v1 | null |
2024-02-23 | Label-efficient Multi-organ Segmentation Method with Diffusion Model | Yongzhi Huang et.al. | 2402.15216v1 | null |
2024-02-23 | Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control | Masatoshi Uehara et.al. | 2402.15194v1 | null |
2024-02-23 | Dynamics-Guided Diffusion Model for Robot Manipulator Design | Xiaomeng Xu et.al. | 2402.15038v1 | null |
2024-02-22 | Cameras as Rays: Pose Estimation via Ray Diffusion | Jason Y. Zhang et.al. | 2402.14817v1 | null |
2024-02-22 | Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models | Yixuan Ren et.al. | 2402.14780v1 | null |
2024-02-22 | Debiasing Text-to-Image Diffusion Models | Ruifei He et.al. | 2402.14577v1 | null |
2024-02-22 | Model-Based Reinforcement Learning Control of Reaction-Diffusion Problems | Christina Schenk et.al. | 2402.14446v1 | null |
2024-02-22 | Large-Scale Actionless Video Pre-Training via Discrete Diffusion for Efficient Policy Learning | Haoran He et.al. | 2402.14407v1 | null |
2024-02-22 | Diffusion Model Based Visual Compensation Guidance and Visual Difference Analysis for No-Reference Image Quality Assessment | Zhaoyang Wang et.al. | 2402.14401v1 | null |
2024-02-22 | Typographic Text Generation with Off-the-Shelf Diffusion Model | KhayTze Peong et.al. | 2402.14314v1 | null |
2024-02-22 | Font Style Interpolation with Diffusion Models | Tetta Kondo et.al. | 2402.14311v1 | null |
2024-02-23 | Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion | Yujia Huang et.al. | 2402.14285v2 | link |
2024-02-22 | MVD | Xin-Yang Zheng et.al. | 2402.14253v1 | null |
2024-02-21 | T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching | Zizheng Pan et.al. | 2402.14167v1 | link |
2024-02-21 | Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate | Yuchen Liang et.al. | 2402.13901v1 | null |
2024-02-21 | NeuralDiffuser: Controllable fMRI Reconstruction with Primary Visual Feature Guided Diffusion | Haoyu Li et.al. | 2402.13809v1 | null |
2024-02-26 | Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Jiayu Chen et.al. | 2402.13777v4 | null |
2024-02-21 | Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion | Lianghu Guo et.al. | 2402.13776v1 | null |
2024-02-21 | Music Style Transfer with Time-Varying Inversion of Diffusion Models | Sifei Li et.al. | 2402.13763v1 | null |
2024-02-21 | SRNDiff: Short-term Rainfall Nowcasting with Condition Diffusion Model | Xudong Ling et.al. | 2402.13737v1 | null |
2024-02-21 | Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation | Kihong Kim et.al. | 2402.13729v1 | null |
2024-02-21 | Flexible Physical Camouflage Generation Based on a Differential Approach | Yang Li et.al. | 2402.13575v1 | null |
2024-02-21 | ToDo: Token Downsampling for Efficient Generation of High-Resolution Images | Ethan Smith et.al. | 2402.13573v1 | null |
2024-02-21 | Generative AI for Secure Physical Layer Communications: A Survey | Changyuan Zhao et.al. | 2402.13553v1 | null |
2024-02-21 | DiffPLF: A Conditional Diffusion Model for Probabilistic Forecasting of EV Charging Load | Siyang Li et.al. | 2402.13548v1 | link |
2024-02-21 | Contrastive Prompts Improve Disentanglement in Text-to-Image Diffusion Models | Chen Wu et.al. | 2402.13490v1 | null |
2024-02-20 | Layout-to-Image Generation with Localized Descriptions using ControlNet with Cross-Attention Control | Denis Lukovnikov et.al. | 2402.13404v1 | null |
2024-02-20 | The Uncanny Valley: A Comprehensive Analysis of Diffusion Models | Karam Ghanem et.al. | 2402.13369v1 | null |
2024-02-20 | Neural Network Diffusion | Kai Wang et.al. | 2402.13144v1 | link |
2024-02-20 | Text-Guided Molecule Generation with Diffusion Language Model | Haisong Gong et.al. | 2402.13040v1 | link |
2024-02-21 | Visual Style Prompting with Swapping Self-Attention | Jaeseok Jeong et.al. | 2402.12974v2 | null |
2024-02-20 | CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection | Sohail Ahmed Khan et.al. | 2402.12927v1 | null |
2024-02-20 | RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models | Xinchen Zhang et.al. | 2402.12908v1 | link |
2024-02-20 | Two-stage Rainfall-Forecasting Diffusion Model | XuDong Ling et.al. | 2402.12779v1 | link |
2024-02-20 | MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion | Sen Li et.al. | 2402.12741v1 | link |
2024-02-20 | Diffusion Posterior Sampling is Computationally Intractable | Shivam Gupta et.al. | 2402.12727v1 | null |
2024-02-20 | MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction | Shitao Tang et.al. | 2402.12712v1 | null |
2024-02-20 | SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion | Liumeng Xue et.al. | 2402.12660v1 | link |
2024-02-20 | DiffusionNOCS: Managing Symmetry and Uncertainty in Sim2Real Multi-Modal Category-level Pose Estimation | Takuya Ikeda et.al. | 2402.12647v1 | null |
2024-02-19 | Hierarchical Bayes Approach to Personalized Federated Unsupervised Learning | Kaan Ozkara et.al. | 2402.12537v1 | null |
2024-02-22 | Improving Deep Generative Models on Many-To-One Image-to-Image Translation | Sagar Saxena et.al. | 2402.12531v2 | null |
2024-02-19 | On the Semantic Latent Space of Diffusion-Based Text-to-Speech Models | Miri Varshavsky Hassid et.al. | 2402.12423v1 | null |
2024-02-19 | FiT: Flexible Vision Transformer for Diffusion Model | Zeyu Lu et.al. | 2402.12376v1 | link |
2024-02-19 | Synthetic location trajectory generation using categorical diffusion models | Simon Dirmeier et.al. | 2402.12242v1 | link |
2024-02-19 | Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training | Leo Hyun Park et.al. | 2402.12187v1 | null |
2024-02-19 | Human Video Translation via Query Warping | Haiming Zhu et.al. | 2402.12099v1 | null |
2024-02-19 | Direct Consistency Optimization for Compositional Text-to-Image Personalization | Kyungmin Lee et.al. | 2402.12004v1 | null |
2024-02-19 | Privacy-Preserving Low-Rank Adaptation for Latent Diffusion Models | Zihao Luo et.al. | 2402.11989v1 | link |
2024-02-19 | DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation | Chong Zeng et.al. | 2402.11929v1 | null |
2024-02-20 | A Generative Pre-Training Framework for Spatio-Temporal Graph Transfer Learning | Yuan Yuan et.al. | 2402.11922v2 | link |
2024-02-19 | ComFusion: Personalized Subject Generation in Multiple Specific Scenes From Single Image | Yan Hong et.al. | 2402.11849v1 | null |
2024-02-19 | UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models | Yihua Zhang et.al. | 2402.11846v1 | link |
2024-02-19 | WildFake: A Large-scale Challenging Dataset for AI-Generated Images Detection | Yan Hong et.al. | 2402.11843v1 | null |
2024-02-19 | Statistical Test for Generated Hypotheses by Diffusion Models | Teruyuki Katsuoka et.al. | 2402.11789v1 | null |
2024-02-19 | Towards Theoretical Understandings of Self-Consuming Generative Models | Shi Fu et.al. | 2402.11778v1 | null |
2024-02-18 | SDiT: Spiking Diffusion Model with Transformer | Shu Yang et.al. | 2402.11588v1 | null |
2024-02-18 | CaloGraph: Graph-based diffusion model for fast shower generation in calorimeters with irregular geometry | Dmitrii Kobylianskii et.al. | 2402.11575v1 | null |
2024-02-18 | Temporal Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation | Yakun Chen et.al. | 2402.11558v1 | null |
2024-02-18 | Visual Concept-driven Image Generation with Text-to-Image Diffusion Model | Tanzila Rahman et.al. | 2402.11487v1 | null |
2024-02-17 | Partial Ly | Georg Wolschin et.al. | 2402.11320v1 | null |
2024-02-17 | TC-DiffRecon: Texture coordination MRI reconstruction method based on diffusion model and modified MF-UNet method | Chenyan Zhang et.al. | 2402.11274v1 | link |
2024-02-17 | DiffPoint: Single and Multi-view Point Cloud Reconstruction with ViT Based Diffusion Model | Yu Feng et.al. | 2402.11241v1 | null |
2024-02-16 | 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Tsung-Wei Ke et.al. | 2402.10885v1 | null |
2024-02-16 | Training Class-Imbalanced Diffusion Model Via Overlap Optimization | Divin Yan et.al. | 2402.10821v1 | link |
2024-02-16 | VATr++: Choose Your Words Wisely for Handwritten Text Generation | Bram Vanherle et.al. | 2402.10798v1 | null |
2024-02-16 | Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion Model with Large Language Models for Machine Translation | Hongbin Na et.al. | 2402.10699v1 | null |
2024-02-16 | Generative AI and Attentive User Interfaces: Five Strategies to Enhance Take-Over Quality in Automated Driving | Patrick Ebel et.al. | 2402.10664v1 | null |
2024-02-16 | Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model | Xiangyu Zhang et.al. | 2402.10642v1 | null |
2024-02-16 | U | Ziqi Gao et.al. | 2402.10609v1 | null |
2024-02-16 | A maximum likelihood estimation of Lévy-driven stochastic systems for univariate and multivariate time series of observations | Babak M. S. Arani et.al. | 2402.10608v1 | null |
2024-02-16 | Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation | Lanqing Guo et.al. | 2402.10491v1 | link |
2024-02-16 | Explaining generative diffusion models via visual analysis for interpretable decision-making process | Ji-Hoon Park et.al. | 2402.10404v1 | null |
2024-02-20 | GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting | Chen Yang et.al. | 2402.10259v2 | link |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210v1 | null |
2024-02-19 | Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Rui Yang et.al. | 2402.10207v2 | null |
2024-02-20 | Radio-astronomical Image Reconstruction with Conditional Denoising Diffusion Model | Mariia Drozdova et.al. | 2402.10204v2 | link |
2024-02-15 | Classification Diffusion Models | Shahar Yadin et.al. | 2402.10095v1 | null |
2024-02-15 | Diffusion Models Meet Contextual Bandits with Large Action Spaces | Imad Aouali et.al. | 2402.10028v1 | null |
2024-02-16 | Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion | Hila Manor et.al. | 2402.10009v2 | null |
2024-02-15 | Accelerating Parallel Sampling of Diffusion Models | Zhiwei Tang et.al. | 2402.09970v1 | null |
2024-02-15 | Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation | Junjie Shentu et.al. | 2402.09966v1 | link |
2024-02-15 | Lester: rotoscope animation through video object segmentation and tracking | Ruben Tous et.al. | 2402.09883v1 | link |
2024-02-15 | Diffusion Models for Audio Restoration | Jean-Marie Lemercier et.al. | 2402.09821v1 | null |
2024-02-15 | DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization | Jisu Nam et.al. | 2402.09812v1 | link |
2024-02-15 | Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement | Tao Yang et.al. | 2402.09712v1 | null |
2024-02-12 | Rolling Diffusion Models | David Ruhe et.al. | 2402.09470v1 | null |
2024-02-14 | Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | Pengfei Zhou et.al. | 2402.09242v1 | link |
2024-02-14 | Semi-Supervised Diffusion Model for Brain Age Prediction | Ayodeji Ijishakin et.al. | 2402.09137v1 | null |
2024-02-14 | L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects | Yutaro Yamada et.al. | 2402.09052v1 | null |
2024-02-14 | Extreme Video Compression with Pre-trained Diffusion Models | Bohan Li et.al. | 2402.08934v1 | link |
2024-02-14 | The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes | Myeongseob Ko et.al. | 2402.08922v1 | null |
2024-02-13 | Percolating transition to turbulence without puffs or bands | Sébastien Gomé et.al. | 2402.08829v1 | null |
2024-02-13 | LDTrack: Dynamic People Tracking by Service Robots using Diffusion Models | Angus Fung et.al. | 2402.08774v1 | null |
2024-02-13 | Towards the Detection of AI-Synthesized Human Face Images | Yuhang Lu et.al. | 2402.08750v1 | null |
2024-02-13 | PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models | Fei Deng et.al. | 2402.08714v1 | null |
2024-02-13 | Zero Shot Molecular Generation via Similarity Kernels | Rokas Elijošius et.al. | 2402.08708v1 | link |
2024-02-13 | Chain Reaction of Ideas: Can Radioactive Decay Predict Technological Innovation? | Guilherme S. Y. Giardini et.al. | 2402.08681v1 | null |
2024-02-13 | Target Score Matching | Valentin De Bortoli et.al. | 2402.08667v1 | null |
2024-02-13 | Learning Continuous 3D Words for Text-to-Image Generation | Ta-Ying Cheng et.al. | 2402.08654v1 | null |
2024-02-14 | Denoising Diffusion Restoration Tackles Forward and Inverse Problems for the Laplace Operator | Amartya Mukherjee et.al. | 2402.08563v2 | null |
2024-02-13 | Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases | Ziyi Zhang et.al. | 2402.08552v1 | null |
2024-02-13 | A Dense Reward View on Aligning Text-to-Image Diffusion with Preference | Shentao Yang et.al. | 2402.08265v1 | link |
2024-02-13 | Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature Generation | AprilPyone MaungMaung et.al. | 2402.08200v1 | null |
2024-02-14 | Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization | Hongrui Chen et.al. | 2402.08095v2 | null |
2024-02-12 | Nearest Neighbour Score Estimators for Diffusion Generative Models | Matthew Niedoba et.al. | 2402.08018v1 | null |
2024-02-12 | Towards a mathematical theory for consistency training in diffusion models | Gen Li et.al. | 2402.07802v1 | null |
2024-02-12 | Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models | Jiacheng Ye et.al. | 2402.07754v1 | null |
2024-02-12 | Cosmology at the Field Level with Probabilistic Machine Learning | Adam Rouhiainen et.al. | 2402.07694v1 | null |
2024-02-12 | Trustworthy SR: Resolving Ambiguity in Image Super-resolution via Diffusion Models and Human Feedback | Cansu Korkmaz et.al. | 2402.07597v1 | null |
2024-02-12 | Score-based Diffusion Models via Stochastic Differential Equations -- a Technical Tutorial | Wenpin Tang et.al. | 2402.07487v1 | null |
2024-02-13 | SALAD: Smart AI Language Assistant Daily | Ragib Amin Nihal et.al. | 2402.07431v2 | null |
2024-02-12 | Diff-RNTraj: A Structure-aware Diffusion Model for Road Network-constrained Trajectory Generation | Tonglong Wei et.al. | 2402.07369v1 | null |
2024-02-15 | Re-DiffiNet: Modeling discrepancies loss in tumor segmentation using diffusion models | Tianyi Ren et.al. | 2402.07354v3 | null |
2024-02-11 | Stitching Sub-Trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL | Sungyoon Kim et.al. | 2402.07226v1 | link |
2024-02-13 | Towards Fast Stochastic Sampling in Diffusion Generative Models | Kushagra Pandey et.al. | 2402.07211v2 | null |
2024-02-10 | Synthesizing CTA Image Data for Type-B Aortic Dissection using Stable Diffusion Models | Ayman Abaid et.al. | 2402.06969v1 | null |
2024-02-09 | Towards Principled Assessment of Tabular Data Synthesis Algorithms | Yuntao Du et.al. | 2402.06806v1 | link |
2024-02-08 | Social Physics Informed Diffusion Model for Crowd Simulation | Hongyi Chen et.al. | 2402.06680v1 | link |
2024-02-06 | Weather Prediction with Diffusion Guided by Realistic Forecast Processes | Zhanxiang Hua et.al. | 2402.06666v1 | null |
2024-02-09 | Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following | Brian Yang et.al. | 2402.06559v1 | null |
2024-02-15 | Sequential Flow Straightening for Generative Modeling | Jongmin Yoon et.al. | 2402.06461v2 | null |
2024-02-09 | ControlUDA: Controllable Diffusion-assisted Unsupervised Domain Adaptation for Cross-Weather Semantic Segmentation | Fengyi Shen et.al. | 2402.06446v1 | null |
2024-02-09 | Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation | Peter Hönig et.al. | 2402.06436v1 | null |
2024-02-09 | Particle Denoising Diffusion Sampler | Angus Phillips et.al. | 2402.06320v1 | link |
2024-02-09 | Controllable seismic velocity synthesis using generative diffusion models | Fu Wang et.al. | 2402.06277v1 | null |
2024-02-09 | MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models | Yixiao Zhang et.al. | 2402.06178v1 | null |
2024-02-08 | CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using Score-Based Diffusion Models | Maitreya Suin et.al. | 2402.06106v1 | null |
2024-02-08 | Animated Stickers: Bringing Stickers to Life with Video Diffusion | David Yan et.al. | 2402.06088v1 | null |
2024-02-08 | DiscDiff: Latent Diffusion Model for DNA Sequence Generation | Zehui Li et.al. | 2402.06079v1 | null |
2024-02-08 | InstaGen: Enhancing Object Detection by Training on Synthetic Dataset | Chengjian Feng et.al. | 2402.05937v1 | null |
2024-02-08 | Time Series Diffusion in the Frequency Domain | Jonathan Crabbé et.al. | 2402.05933v1 | link |
2024-02-08 | AvatarMMC: 3D Head Avatar Generation and Editing with Multi-Modal Conditioning | Wamiq Reyaz Para et.al. | 2402.05803v1 | null |
2024-02-08 | DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer | Zhiyuan Ma et.al. | 2402.05712v1 | link |
2024-02-08 | Scalable Diffusion Models with State Space Backbone | Zhengcong Fei et.al. | 2402.05608v1 | link |
2024-02-08 | Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models | Senmao Li et.al. | 2402.05375v1 | link |
2024-02-08 | Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model | Junghun Cha et.al. | 2402.05350v1 | null |
2024-02-07 | SPAD : Spatially Aware Multiview Diffusers | Yash Kant et.al. | 2402.05235v1 | null |
2024-02-09 | Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models | Nicholas Konz et.al. | 2402.05210v2 | link |
2024-02-07 | | Maitreya Patel et.al. | 2402.05195v1 | null |
2024-02-13 | On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling | Marcin Sendera et.al. | 2402.05098v2 | link |
2024-02-07 | NITO: Neural Implicit Fields for Resolution-free Topology Optimization | Amin Heyrani Nobari et.al. | 2402.05073v1 | null |
2024-02-07 | LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation | Jiaxiang Tang et.al. | 2402.05054v1 | null |
2024-02-07 | Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design | Andrew Campbell et.al. | 2402.04997v1 | link |
2024-02-07 | Blue noise for diffusion models | Xingchang Huang et.al. | 2402.04930v1 | null |
2024-02-07 | Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation | Shivang Chopra et.al. | 2402.04929v1 | null |
2024-02-07 | Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints | Jian Chen et.al. | 2402.04754v1 | link |
2024-02-07 | Cortical Surface Diffusion Generative Models | Zhenshan Xie et.al. | 2402.04753v1 | null |
2024-02-07 | EvoSeed: Unveiling the Threat on Deep Neural Networks with Real-World Illusions | Shashank Kotyan et.al. | 2402.04699v1 | link |
2024-02-07 | Noise Map Guidance: Inversion with Spatial Context for Real Image Editing | Hansam Cho et.al. | 2402.04625v1 | link |
2024-02-07 | BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception | Aniket Roy et.al. | 2402.04541v1 | link |
2024-02-07 | Text2Street: Controllable Text-to-image Generation for Street Views | Jinming Su et.al. | 2402.04504v1 | null |
2024-02-06 | Fine-Tuned Language Models Generate Stable Inorganic Materials as Text | Nate Gruver et.al. | 2402.04379v1 | link |
2024-02-06 | Bidirectional Autoregressive Diffusion Model for Dance Generation | Canyu Zhang et.al. | 2402.04356v1 | null |
2024-02-06 | Polyp-DDPM: Diffusion-Based Semantic Polyp Synthesis for Enhanced Segmentation | Zolnamar Dorjsembe et.al. | 2402.04031v1 | link |
2024-02-06 | Space Group Constrained Crystal Generation | Rui Jiao et.al. | 2402.03992v1 | null |
2024-02-06 | Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting | Yiming Xu et.al. | 2402.03981v1 | null |
2024-02-03 | IMUSIC: IMU-based Facial Expression Capture | Youjia Wang et.al. | 2402.03944v1 | null |
2024-02-06 | EscherNet: A Generative Model for Scalable View Synthesis | Xin Kong et.al. | 2402.03908v1 | null |
2024-02-06 | On gauge freedom, conservativity and intrinsic dimensionality estimation in diffusion models | Christian Horvat et.al. | 2402.03845v1 | null |
2024-02-06 | SDEMG: Score-based Diffusion Model for Surface Electromyographic Signal Denoising | Yu-Tung Liu et.al. | 2402.03808v1 | link |
2024-02-06 | FoolSDEdit: Deceptively Steering Your Edits Towards Targeted Attribute-aware Distribution | Qi Zhou et.al. | 2402.03705v1 | null |
2024-02-06 | Improving and Unifying Discrete&Continuous-time Discrete Denoising Diffusion | Lingxiao Zhao et.al. | 2402.03701v1 | null |
2024-02-06 | Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation | Lingxiao Zhao et.al. | 2402.03687v1 | null |
2024-02-06 | QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Haoxuan Wang et.al. | 2402.03666v1 | null |
2024-02-11 | Diffusion World Model | Zihan Ding et.al. | 2402.03570v2 | null |
2024-02-05 | Projected Generative Diffusion Models for Constraint Satisfaction | Jacob K Christopher et.al. | 2402.03559v1 | null |
2024-02-05 | AnaMoDiff: 2D Analogical Motion Diffusion via Disentangled Denoising | Maham Tanveer et.al. | 2402.03549v1 | null |
2024-02-05 | Hyper-Diffusion: Estimating Epistemic and Aleatoric Uncertainty with a Single Model | Matthew A. Chan et.al. | 2402.03478v1 | null |
2024-02-05 | Denoising Diffusion via Image-Based Rendering | Titas Anciukevicius et.al. | 2402.03445v1 | null |
2024-02-05 | Do Diffusion Models Learn Semantically Meaningful and Efficient Representations? | Qiyao Liang et.al. | 2402.03305v1 | null |
2024-02-07 | Zero-shot Object-Level OOD Detection with Context-Aware Inpainting | Quang-Huy Nguyen et.al. | 2402.03292v2 | null |
2024-02-05 | InstanceDiffusion: Instance-level Control for Image Generation | Xudong Wang et.al. | 2402.03290v1 | link |
2024-02-06 | Organic or Diffused: Can We Distinguish Human Art from AI-generated Images? | Anna Yoo Jeong Ha et.al. | 2402.03214v2 | null |
2024-02-05 | Light and Optimal Schrödinger Bridge Matching | Nikita Gushchin et.al. | 2402.03207v1 | link |
2024-02-05 | Guidance with Spherical Gaussian Constraint for Conditional Diffusion | Lingxiao Yang et.al. | 2402.03201v1 | null |
2024-02-05 | Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion | Shiyuan Yang et.al. | 2402.03162v1 | null |
2024-02-05 | PFDM: Parser-Free Virtual Try-on via Diffusion Model | Yunfang Niu et.al. | 2402.03047v1 | null |
2024-02-05 | Diffusive Gibbs Sampling | Wenlin Chen et.al. | 2402.03008v1 | null |
2024-02-05 | DexDiffuser: Generating Dexterous Grasps with Diffusion Models | Zehang Weng et.al. | 2402.02989v1 | null |
2024-02-05 | Retrieval-Augmented Score Distillation for Text-to-3D Generation | Junyoung Seo et.al. | 2402.02972v1 | null |
2024-02-05 | ViewFusion: Learning Composable Diffusion Models for Novel View Synthesis | Bernard Spiegl et.al. | 2402.02906v1 | link |
2024-02-05 | SynthVision -- Harnessing Minimal Input for Maximal Output in Computer Vision Models using Synthetic Image data | Yudara Kularathne et.al. | 2402.02826v1 | null |
2024-02-05 | Extreme Two-View Geometry From Object Poses with Diffusion Models | Yujing Sun et.al. | 2402.02800v1 | link |
2024-02-06 | Contrastive Diffuser: Planning Towards High Return States via Contrastive Learning | Yixiang Shan et.al. | 2402.02772v2 | null |
2024-02-05 | DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models | Yang Sui et.al. | 2402.02739v1 | null |
2024-02-04 | DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | Chong Mou et.al. | 2402.02583v1 | link |
2024-02-04 | Latent Graph Diffusion: A Unified Framework for Generation and Prediction on Graphs | Zhou Cai et.al. | 2402.02518v1 | null |
2024-02-04 | PoCo: Policy Composition from and for Heterogeneous Robot Learning | Lirui Wang et.al. | 2402.02511v1 | null |
2024-02-04 | PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal | Tao Wang et.al. | 2402.02374v1 | link |
2024-02-07 | Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models | Fangzhao Zhang et.al. | 2402.02347v2 | link |
2024-02-04 | Closed-Loop Unsupervised Representation Disentanglement with | Xin Jin et.al. | 2402.02346v1 | null |
2024-02-04 | Your Diffusion Model is Secretly a Certifiably Robust Classifier | Huanran Chen et.al. | 2402.02316v1 | null |
2024-02-03 | Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance | Xinyu Peng et.al. | 2402.02149v1 | link |
2024-02-03 | Risk-Sensitive Diffusion: Learning the Underlying Distribution from Noisy Samples | Yangming Li et.al. | 2402.02081v1 | null |
2024-02-03 | DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication | Yanjun Liu et.al. | 2402.02060v1 | null |
2024-02-03 | GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning | Yaning Zhang et.al. | 2402.02003v1 | null |
2024-02-06 | Analyzing Neural Network-Based Generative Diffusion Models through Convex Optimization | Fangzhao Zhang et.al. | 2402.01965v2 | null |
2024-02-02 | Robust Inverse Graphics via Probabilistic Inference | Tuan Anh Le et.al. | 2402.01915v1 | null |
2024-02-06 | Carthago Delenda Est: Co-opetitive Indirect Information Diffusion Model for Influence Operations on Online Social Media | Jwen Fai Low et.al. | 2402.01905v2 | null |
2024-02-02 | Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models | Justin Blalock et.al. | 2402.01877v1 | null |
2024-02-01 | Plug-and-Play image restoration with Stochastic deNOising REgularization | Marien Renaud et.al. | 2402.01779v1 | link |
2024-02-02 | NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties | Jingyuan Sun et.al. | 2402.01590v1 | null |
2024-02-02 | Boximator: Generating Rich and Controllable Motions for Video Synthesis | Jiawei Wang et.al. | 2402.01566v1 | null |
2024-02-02 | Cross-view Masked Diffusion Transformers for Person Image Synthesis | Trung X. Pham et.al. | 2402.01516v1 | null |
2024-02-02 | Conditioning non-linear and infinite-dimensional diffusion processes | Elizabeth Louise Baker et.al. | 2402.01434v1 | null |
2024-02-02 | Bass Accompaniment Generation via Latent Diffusion | Marco Pasini et.al. | 2402.01412v1 | null |
2024-02-02 | Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors | Dingcheng Yang et.al. | 2402.01369v1 | link |
2024-02-02 | Unsupervised Generation of Pseudo Normal PET from MRI with Diffusion Model for Epileptic Focus Localization | Wentao Chen et.al. | 2402.01191v1 | null |
2024-02-01 | Unconditional Latent Diffusion Models Memorize Patient Imaging Data | Salman Ul Hassan Dar et.al. | 2402.01054v1 | null |
2024-02-01 | pop-cosmos: A comprehensive picture of the galaxy population from COSMOS data | Justin Alsing et.al. | 2402.00935v1 | null |
2024-02-01 | Data-Space Validation of High-Dimensional Models by Comparing Sample Quantiles | Stephen Thorp et.al. | 2402.00930v1 | null |
2024-02-01 | ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields | Jiahua Dong et.al. | 2402.00864v1 | link |
2024-02-01 | An Analysis of the Variance of Diffusion-based Speech Enhancement | Bunlong Lay et.al. | 2402.00811v1 | null |
2024-02-01 | Distilling Conditional Diffusion Models for Offline Reinforcement Learning through Trajectory Stitching | Shangzhe Li et.al. | 2402.00807v1 | null |
2024-02-01 | AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning | Fu-Yun Wang et.al. | 2402.00769v1 | link |
2024-01-31 | SeFi-IDE: Semantic-Fidelity Identity Embedding for Personalized Diffusion-Based Generation | Yang Li et.al. | 2402.00631v1 | null |
2024-02-01 | Cylindrically symmetric diffusion model for relativistic heavy-ion collisions | Johannes Hoelck et.al. | 2402.00628v1 | null |
2024-02-01 | CapHuman: Capture Your Moments in Parallel Universes | Chao Liang et.al. | 2402.00627v1 | link |
2024-02-01 | Masked Conditional Diffusion Model for Enhancing Deepfake Detection | Tiewen Chen et.al. | 2402.00541v1 | null |
2024-02-01 | Energetic Particles in the Central Starburst, Disc, and Halo of NGC253 | Yoel Rephaeli et.al. | 2402.00523v1 | null |
2024-02-01 | LRDif: Diffusion Models for Under-Display Camera Emotion Recognition | Zhifeng Wang et.al. | 2402.00250v1 | null |
2024-02-02 | SuperDiff: Diffusion Models for Conditional Generation of Hypothetical New Families of Superconductors | Samuel Yuan et.al. | 2402.00198v2 | null |
2024-01-31 | Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators | Daniel Geng et.al. | 2401.18085v1 | null |
2024-01-31 | Ljusternik-Schnirelmann eigenvalues for the fractional | Julian Fernandez Bonder et.al. | 2401.18041v1 | null |
2024-01-31 | Diagnosing the particle transport mechanism in the pulsar halo via X-ray observations | Qi-Zuo Wu et.al. | 2401.17982v1 | null |
2024-01-31 | Convergence Analysis for General Probability Flow ODEs of Diffusion Models in Wasserstein Distances | Xuefeng Gao et.al. | 2401.17958v1 | null |
2024-01-31 | AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error | Jonas Ricker et.al. | 2401.17879v1 | null |
2024-01-31 | Drift Diffusion Model to understand (mis)information sharing dynamic in complex networks | Lucila G. Alvarez-Zuzek et.al. | 2401.17846v1 | null |
2024-01-31 | A new class of efficient high order semi-Lagrangian IMEX discontinuous Galerkin methods on staggered unstructured meshes | M. Tavelli et.al. | 2401.17806v1 | null |
2024-01-31 | Dance-to-Music Generation with Encoder-based Textual Inversion of Diffusion Models | Sifei Li et.al. | 2401.17800v1 | null |
2024-01-31 | Image Anything: Towards Reasoning-coherent and Training-free Multi-modal Image Generation | Yuanhuiyi Lyu et.al. | 2401.17664v1 | null |
2024-01-31 | Spatial-and-Frequency-aware Restoration method for Images based on Diffusion Models | Kyungsung Lee et.al. | 2401.17629v1 | null |
2024-01-31 | Topology-Aware Latent Diffusion for 3D Shape Generation | Jiangbei Hu et.al. | 2401.17603v1 | null |
2024-01-31 | Head and Neck Tumor Segmentation from [18F]F-FDG PET/CT Images Based on 3D Diffusion Model | Yafei Dong et.al. | 2401.17593v1 | null |
2024-01-31 | Task-Oriented Diffusion Model Compression | Geonung Kim et.al. | 2401.17547v1 | null |
2024-01-31 | Enhancing Score-Based Sampling Methods with Ensembles | Tobias Bischoff et.al. | 2401.17539v1 | null |
2024-01-30 | You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation | Mehdi Noroozi et.al. | 2401.17258v1 | null |
2024-02-03 | ContactGen: Contact-Guided Interactive 3D Human Generation for Partners | Dongjun Gu et.al. | 2401.17212v2 | null |
2024-01-30 | Transfer Learning for Text Diffusion Models | Kehang Han et.al. | 2401.17181v1 | null |
2024-01-30 | PlantoGraphy: Incorporating Iterative Design Process into Generative Artificial Intelligence for Landscape Rendering | Rong Huang et.al. | 2401.17120v1 | null |
2024-01-30 | Local modification of subdiffusion by initial Fickian diffusion: Multiscale modeling, analysis and computation | Xiangcheng Zheng et.al. | 2401.16885v1 | null |
2024-01-30 | A Literature Review on Fetus Brain Motion Correction in MRI | Haoran Zhang et.al. | 2401.16782v1 | null |
2024-01-30 | BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion | Yonghao Yu et.al. | 2401.16764v1 | null |
2024-01-30 | Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization | Henglei Lv et.al. | 2401.16762v1 | null |
2024-01-30 | Diffusion model for relational inference | Shuhan Zheng et.al. | 2401.16755v1 | null |
2024-01-29 | Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors | Shiyin Dong et.al. | 2401.16459v1 | null |
2024-01-29 | Using multiple Dirac delta points to describe inhomogeneous flux density over a cell boundary in a single-cell diffusion model | Qiyao Peng et.al. | 2401.16261v1 | null |
2024-01-29 | Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models | Zhongjie Duan et.al. | 2401.16224v1 | null |
2024-01-29 | Spatial-Aware Latent Initialization for Controllable Image Generation | Wenqiang Sun et.al. | 2401.16157v1 | null |
2024-01-29 | DMCE: Diffusion Model Channel Enhancer for Multi-User Semantic Communication Systems | Youcheng Zeng et.al. | 2401.16017v1 | null |
2024-01-31 | Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling | Xiaoyu Shi et.al. | 2401.15977v2 | null |
2024-01-29 | EmoDM: A Diffusion Model for Evolutionary Multi-objective Optimization | Xueming Yan et.al. | 2401.15931v1 | null |
2024-01-28 | Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding | Jianxiang Lu et.al. | 2401.15708v1 | null |
2024-01-30 | Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance | Qingcheng Zhao et.al. | 2401.15687v2 | null |
2024-01-28 | CPDM: Content-Preserving Diffusion Model for Underwater Image Enhancement | Xiaowen Shi et.al. | 2401.15649v1 | null |
2024-01-28 | FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models | Feihong He et.al. | 2401.15636v1 | null |
2024-01-28 | Generative AI-enabled Blockchain Networks: Fundamentals, Applications, and Case Study | Cong T. Nguyen et.al. | 2401.15625v1 | null |
2024-01-28 | Diffusion-based graph generative methods | Hongyang Chen et.al. | 2401.15617v1 | link |
2024-01-28 | Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization | Yinbin Han et.al. | 2401.15604v1 | null |
2024-01-28 | BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry | Xiang Xu et.al. | 2401.15563v1 | null |
2024-01-31 | Wind speed super-resolution and validation: from ERA5 to CERRA via diffusion models | Fabio Merizzi et.al. | 2401.15469v2 | null |
2024-01-27 | A Survey on Data Augmentation in Large Model Era | Yue Zhou et.al. | 2401.15422v1 | link |
2024-01-27 | GEM: Boost Simple Network for Glass Surface Segmentation via Segment Anything Model and Data Synthesis | Jing Hao et.al. | 2401.15282v1 | link |
2024-01-26 | Annotated Hands for Generative Models | Yue Yang et.al. | 2401.15075v1 | link |
2024-01-26 | Text Image Inpainting via Global Structure-Guided Diffusion Models | Shipeng Zhu et.al. | 2401.14832v1 | link |
2024-01-25 | Opposite variations for pore pressure on and off the fault during simulated earthquakes in the laboratory | Dong Liu et.al. | 2401.14506v1 | null |
2024-01-24 | No Longer Trending on Artstation: Prompt Analysis of Generative AI Art | Jon McCormack et.al. | 2401.14425v1 | null |
2024-01-25 | Deconstructing Denoising Diffusion Models for Self-Supervised Learning | Xinlei Chen et.al. | 2401.14404v1 | null |
2024-01-25 | pix2gestalt: Amodal Segmentation by Synthesizing Wholes | Ege Ozguroglu et.al. | 2401.14398v1 | link |
2024-01-25 | UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models | Timo Kapsalis et.al. | 2401.14379v1 | null |
2024-01-27 | Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation | Minglin Chen et.al. | 2401.14257v2 | null |
2024-01-26 | Image Synthesis with Graph Conditioning: CLIP-Guided Diffusion Models for Scene Graphs | Rameshwar Mishra et.al. | 2401.14111v2 | null |
2024-01-30 | CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion | Nisha Huang et.al. | 2401.14066v2 | null |
2024-01-25 | Diffusion-based Data Augmentation for Object Counting Problems | Zhen Wang et.al. | 2401.13992v1 | null |
2024-01-25 | BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models | Senthil Purushwalkam et.al. | 2401.13974v1 | null |
2024-01-25 | StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models | Yalong Bai et.al. | 2401.13942v1 | null |
2024-01-24 | Inverse Molecular Design with Multi-Conditional Diffusion Guidance | Gang Liu et.al. | 2401.13858v1 | link |
2024-01-24 | Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All | Mehmet Saygin Seyfioglu et.al. | 2401.13795v1 | null |
2024-01-24 | Guided Diffusion for Fast Inverse Design of Density-based Mechanical Metamaterials | Yanyan Yang et.al. | 2401.13570v1 | null |
2024-01-25 | UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion | Wei Li et.al. | 2401.13388v2 | null |
2024-01-31 | Generative Design of Crystal Structures by Point Cloud Representations and Diffusion Model | Zhelin Li et.al. | 2401.13192v2 | link |
2024-01-24 | Towards Multi-domain Face Landmark Detection with Synthetic Data from Diffusion model | Yuanming Li et.al. | 2401.13191v1 | null |
2024-01-24 | Compositional Generative Inverse Design | Tailin Wu et.al. | 2401.13171v1 | link |
2024-01-24 | Choose Your Diffusion: Efficient and flexible ways to accelerate the diffusion model in fast high energy physics simulation | Cheng Jiang et.al. | 2401.13162v1 | null |
2024-01-23 | GALA: Generating Animatable Layered Assets from a Single Scan | Taeksoo Kim et.al. | 2401.12979v1 | null |
2024-01-24 | Zero-Shot Learning for the Primitives of 3D Affordance in General Objects | Hyeonwoo Kim et.al. | 2401.12978v2 | null |
2024-01-23 | Lumiere: A Space-Time Diffusion Model for Video Generation | Omer Bar-Tal et.al. | 2401.12945v1 | null |
2024-01-23 | UniHDA: Towards Universal Hybrid Domain Adaptation of Image Generators | Hengjia Li et.al. | 2401.12596v1 | null |
2024-01-23 | ToDA: Target-oriented Diffusion Attacker against Recommendation System | Xiaohao Liu et.al. | 2401.12578v1 | null |
2024-01-23 | DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations | Dogyun Park et.al. | 2401.12517v1 | null |
2024-01-20 | Large-scale Reinforcement Learning for Diffusion Models | Yinan Zhang et.al. | 2401.12244v1 | null |
2024-01-22 | DITTO: Diffusion Inference-Time T-Optimization for Music Generation | Zachary Novack et.al. | 2401.12179v1 | null |
2024-01-22 | Single-View 3D Human Digitalization with Large Reconstruction Models | Zhenzhen Weng et.al. | 2401.12175v1 | null |
2024-01-22 | Feature Denoising Diffusion Model for Blind Image Quality Assessment | Xudong Li et.al. | 2401.11949v1 | null |
2024-01-22 | EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models | Koichi Namekata et.al. | 2401.11739v1 | null |
2024-01-22 | Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs | Ling Yang et.al. | 2401.11708v1 | link |
2024-01-21 | Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Katherine Crowson et.al. | 2401.11605v1 | link |
2024-01-20 | Diffusion Model Conditioning on Gaussian Mixture Model and Negative Gaussian Mixture Gradient | Weiguo Lu et.al. | 2401.11261v1 | null |
2024-01-20 | Product-Level Try-on: Characteristics-preserving Try-on with Realistic Clothes Shading and Wrinkles | Yanlong Zang et.al. | 2401.11239v1 | null |
2024-01-24 | MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation | Nhat M. Hoang et.al. | 2401.11115v3 | null |
2024-01-20 | UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures | Mingyuan Zhou et.al. | 2401.11078v1 | null |
2024-01-20 | Make-A-Shape: a Ten-Million-scale 3D Shape Model | Ka-Hei Hui et.al. | 2401.11067v1 | null |
2024-01-17 | A New Creative Generation Pipeline for Click-Through Rate with Stable Diffusion Model | Hao Yang et.al. | 2401.10934v1 | link |
2024-01-19 | Synthesizing Moving People with 3D Control | Boyi Li et.al. | 2401.10889v1 | null |
2024-01-19 | ActAnywhere: Subject-Aware Video Background Generation | Boxiao Pan et.al. | 2401.10822v1 | null |
2024-01-19 | From Market Saturation to Social Reinforcement: Understanding the Impact of Non-Linearity in Information Diffusion Models | Tobias Friedrich et.al. | 2401.10818v1 | null |
2024-01-19 | Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Zuoyue Li et.al. | 2401.10786v1 | null |
2024-01-19 | Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model | Yinan Zheng et.al. | 2401.10700v1 | link |
2024-01-19 | MAEDiff: Masked Autoencoder-enhanced Diffusion Models for Unsupervised Anomaly Detection in Brain Images | Rui Xu et.al. | 2401.10561v1 | null |
2024-01-18 | Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution | Xin Yuan et.al. | 2401.10404v1 | null |
2024-01-18 | A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Wouter Van Gansbeke et.al. | 2401.10227v1 | link |
2024-01-22 | Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation | Changgu Chen et.al. | 2401.10150v3 | null |
2024-01-18 | DiffusionGPT: LLM-Driven Text-to-Image Generation System | Jie Qin et.al. | 2401.10061v1 | null |
2024-01-18 | CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects | Zhao Wang et.al. | 2401.09962v1 | null |
2024-01-18 | BlenDA: Domain Adaptive Object Detection through diffusion-based blending | Tzuhsuan Huang et.al. | 2401.09921v1 | null |
2024-01-18 | Exploring Latent Cross-Channel Embedding for Accurate 3D Human Pose Reconstruction in a Diffusion Framework | Junkun Jiang et.al. | 2401.09836v1 | null |
2024-01-18 | Wavelet-Guided Acceleration of Text Inversion in Diffusion-Based Image Editing | Gwanhyeong Koo et.al. | 2401.09794v1 | null |
2024-01-18 | Image Translation as Diffusion Visual Programmers | Cheng Han et.al. | 2401.09742v1 | null |
2024-01-17 | Total fraction of drug released from diffusion-controlled delivery systems with binding reactions | Elliot J. Carr et.al. | 2401.09644v1 | null |
2024-01-17 | Efficient generative adversarial networks using linear additive-attention Transformers | Emilio Morales-Juarez et.al. | 2401.09596v1 | link |
2024-01-17 | TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion | Yu-Ying Yeh et.al. | 2401.09416v1 | null |
2024-01-17 | Vlogger: Make Your Dream A Vlog | Shaobin Zhuang et.al. | 2401.09414v1 | link |
2024-01-17 | On the | Mireille Bossy et.al. | 2401.09338v1 | null |
2024-01-17 | Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery | Jia Jia et.al. | 2401.09325v1 | null |
2024-01-17 | T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis | Yoonjin Chung et.al. | 2401.09294v1 | null |
2024-01-17 | Training-Free Semantic Video Composition via Pre-trained Diffusion Model | Jiaqi Guo et.al. | 2401.09195v1 | null |
2024-01-17 | Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior | Zike Wu et.al. | 2401.09050v1 | null |
2024-01-17 | Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis | Jonghyun Lee et.al. | 2401.09048v1 | link |
2024-01-17 | VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | Haoxin Chen et.al. | 2401.09047v1 | link |
2024-01-21 | Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation | Tong Xie et.al. | 2401.09031v2 | null |
2024-01-17 | 3D Human Pose Analysis via Diffusion Synthesis | Haorui Ji et.al. | 2401.08930v1 | null |
2024-01-16 | Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Yumeng Li et.al. | 2401.08815v1 | link |
2024-01-16 | Fixed Point Diffusion Models | Xingjian Bai et.al. | 2401.08741v1 | null |
2024-01-16 | SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers | Nanye Ma et.al. | 2401.08740v1 | link |
2024-01-18 | NODI: Out-Of-Distribution Detection with Noise from Diffusion | Jingqiu Zhou et.al. | 2401.08689v2 | null |
2024-01-16 | RoHM: Robust Human Motion Reconstruction via Diffusion | Siwei Zhang et.al. | 2401.08570v1 | null |
2024-01-16 | Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation | Mathis Petrovich et.al. | 2401.08559v1 | null |
2024-01-16 | Modeling Spoof Noise by De-spoofing Diffusion and its Application in Face Anti-spoofing | Bin Zhang et.al. | 2401.08275v1 | null |
2024-01-16 | Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization | Chongzhi Zhang et.al. | 2401.08232v1 | null |
2024-01-16 | Photonic Modes Prediction via Multi-Modal Diffusion Model | Jinyang Sun et.al. | 2401.08199v1 | null |
2024-01-16 | Key-point Guided Deformable Image Manipulation Using Diffusion Model | Seok-Hwan Oh et.al. | 2401.08178v1 | null |
2024-01-23 | SpecSTG: A Fast Spectral Diffusion Framework for Probabilistic Spatio-Temporal Traffic Forecasting | Lequan Lin et.al. | 2401.08119v2 | null |
2024-01-16 | DIFFRENT: A Diffusion Model for Recording Environment Transfer of Speech | Jaekwon Im et.al. | 2401.08102v1 | null |
2024-01-16 | EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model | Bingyuan Zhang et.al. | 2401.08049v1 | null |
2024-01-16 | Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities | Xu Yan et.al. | 2401.08045v1 | link |
2024-01-15 | Regularity in diffusion models with gradient activation | Damião Araújo et.al. | 2401.07979v1 | null |
2024-01-15 | HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation | Antoine Mercier et.al. | 2401.07727v1 | null |
2024-01-15 | Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks | Siyu Zou et.al. | 2401.07709v1 | null |
2024-01-15 | Multifractal-spectral features enhance classification of anomalous diffusion | Henrik Seckler et.al. | 2401.07646v1 | null |
2024-01-15 | InstantID: Zero-shot Identity-Preserving Generation in Seconds | Qixun Wang et.al. | 2401.07519v1 | link |
2024-01-15 | Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation | Yuanchen Ju et.al. | 2401.07487v1 | null |
2024-01-20 | Hierarchical Fashion Design with Multi-stage Diffusion Models | Zhifeng Xie et.al. | 2401.07450v3 | null |
2024-01-14 | A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models | Namjoon Suh et.al. | 2401.07187v1 | null |
2024-01-13 | Exploring Adversarial Attacks against Latent Diffusion Model from the Perspective of Adversarial Transferability | Junxi Chen et.al. | 2401.07087v1 | null |
2024-01-13 | Quantum Denoising Diffusion Models | Michael Kölle et.al. | 2401.07049v1 | null |
2024-01-13 | Quantum Generative Diffusion Model | Chuangtao Chen et.al. | 2401.07039v1 | null |
2024-01-17 | Denoising Diffusion Recommender Model | Jujia Zhao et.al. | 2401.06982v2 | null |
2024-01-12 | A deep implicit-explicit minimizing movement method for option pricing in jump-diffusion models | Emmanuil H. Georgoulis et.al. | 2401.06740v1 | null |
2024-01-12 | Decoupling Pixel Flipping and Occlusion Strategy for Consistent XAI Benchmarks | Stefan Blücher et.al. | 2401.06654v1 | link |
2024-01-17 | Adversarial Examples are Misaligned in Diffusion Model Manifolds | Peter Lorenz et.al. | 2401.06637v3 | null |
2024-01-12 | Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking | Wei Cao et.al. | 2401.06614v1 | null |
2024-01-12 | 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model | Qian Wang et.al. | 2401.06578v1 | null |
2024-01-12 | RotationDrag: Point-based Image Editing with Rotated Diffusion Features | Minxing Luo et.al. | 2401.06442v1 | link |
2024-01-12 | Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering | Chang Yu et.al. | 2401.06345v1 | null |
2024-01-11 | Frequency-Time Diffusion with Neural Cellular Automata | John Kalkhof et.al. | 2401.06291v1 | null |
2024-01-11 | Demystifying Variational Diffusion Models | Fabio De Sousa Ribeiro et.al. | 2401.06281v1 | null |
2024-01-11 | Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications | Yuwen Xiong et.al. | 2401.06197v1 | link |
2024-01-11 | TriNeRFLet: A Wavelet Based Multiscale Triplane NeRF Representation | Rajaei Khatib et.al. | 2401.06191v1 | null |
2024-01-11 | E | Yifan Gong et.al. | 2401.06127v1 | null |
2024-01-11 | DiffDA: a diffusion model for weather-scale data assimilation | Langwen Huang et.al. | 2401.05932v1 | null |
2024-01-11 | Efficient Image Deblurring Networks based on Diffusion Models | Kang Chen et.al. | 2401.05907v1 | link |
2024-01-11 | HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models | Hanzhang Wang et.al. | 2401.05870v1 | null |
2024-01-11 | EraseDiff: Erasing Data Influence in Diffusion Models | Jing Wu et.al. | 2401.05779v1 | null |
2024-01-10 | Diffusion Priors for Dynamic View Synthesis from Monocular Videos | Chaoyang Wang et.al. | 2401.05583v1 | null |
2024-01-10 | From Pampas to Pixels: Fine-Tuning Diffusion Models for Gaúcho Heritage | Marcellus Amadeus et.al. | 2401.05520v1 | null |
2024-01-10 | InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes | Mohamad Shahbazi et.al. | 2401.05335v1 | null |
2024-01-10 | Score Distillation Sampling with Learned Manifold Corrective | Thiemo Alldieck et.al. | 2401.05293v1 | null |
2024-01-10 | PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | Junsong Chen et.al. | 2401.05252v1 | link |
2024-01-05 | Tailoring Frictional Properties of Surfaces Using Diffusion Models | Even Marius Nordhagen et.al. | 2401.05206v1 | null |
2024-01-10 | Derm-T2IM: Harnessing Synthetic Skin Lesion Data via Stable Diffusion Models for Enhanced Skin Disease Classification using ViT and CNN | Muhammad Ali Farooq et.al. | 2401.05159v1 | null |
2024-01-13 | CrossDiff: Exploring Self-Supervised Representation of Pansharpening via Cross-Predictive Diffusion Model | Yinghui Xing et.al. | 2401.05153v2 | null |
2024-01-10 | SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image | Jiayuan Tian et.al. | 2401.05093v1 | null |
2024-01-10 | A novel bond-based nonlocal diffusion model with matrix-valued coefficients in non-divergence form and its collocation discretization | Lili Ju et.al. | 2401.04973v1 | null |
2024-01-09 | Transmission-eigenchannel velocity and diffusion | Azriel Z. Genack et.al. | 2401.04818v1 | null |
2024-01-09 | DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation | Junming Chen et.al. | 2401.04747v1 | null |
2024-01-09 | Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation | Xiyi Chen et.al. | 2401.04728v1 | null |
2024-01-09 | Efficient estimation for ergodic diffusion processes sampled at high frequency | Michael Sørensen et.al. | 2401.04689v1 | null |
2024-01-09 | EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models | Jingyuan Yang et.al. | 2401.04608v1 | null |
2024-01-09 | Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models | Xuewen Liu et.al. | 2401.04585v1 | null |
2024-01-09 | MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation | Weimin Wang et.al. | 2401.04468v1 | null |
2024-01-09 | D3AD: Dynamic Denoising Diffusion Probabilistic Model for Anomaly Detection | Justin Tebbe et.al. | 2401.04463v1 | null |
2024-01-09 | SonicVisionLM: Playing Sound with Vision Language Models | Zhifeng Xie et.al. | 2401.04394v1 | null |
2024-01-09 | Representative Feature Extraction During Diffusion Process for Sketch Extraction with One Example | Kwan Yun et.al. | 2401.04362v1 | null |
2024-01-09 | Memory-Efficient Personalization using Quantized Diffusion Model | Hyogon Ryu et.al. | 2401.04339v1 | null |
2024-01-08 | FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation | Yang Liu et.al. | 2401.04283v1 | null |
2024-01-08 | Robust Image Watermarking using Stable Diffusion | Lijun Zhang et.al. | 2401.04247v1 | null |
2024-01-07 | The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline | Haonan Wang et.al. | 2401.04136v1 | null |
2024-01-08 | scDiffusion: conditional generation of high-quality single-cell data using diffusion model | Erpai Luo et.al. | 2401.03968v1 | link |
2024-01-08 | D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose Refinement | Danqi Yan et.al. | 2401.03914v1 | null |
2024-01-08 | DDM-Lag : A Diffusion-based Decision-making Model for Autonomous Vehicles with Lagrangian Safety Enhancement | Jiaqi Liu et.al. | 2401.03629v1 | null |
2024-01-09 | ROIC-DM: Robust Text Inference and Classification via Diffusion Model | Shilong Yuan et.al. | 2401.03514v2 | null |
2024-01-07 | Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness | Sicheng Yang et.al. | 2401.03476v1 | null |
2024-01-07 | Deep Learning-based Image and Video Inpainting: A Survey | Weize Quan et.al. | 2401.03395v1 | null |
2024-01-06 | Reflected Schrödinger Bridge for Constrained Generative Modeling | Wei Deng et.al. | 2401.03228v1 | null |
2024-01-06 | MirrorDiffusion: Stabilizing Diffusion Process in Zero-shot Image Translation by Prompts Redescription and Beyond | Yupei Lin et.al. | 2401.03221v1 | null |
2024-01-09 | Fair Sampling in Diffusion Models through Switching Mechanism | Yujin Choi et.al. | 2401.03140v2 | link |
2024-01-05 | Latte: Latent Diffusion Transformer for Video Generation | Xin Ma et.al. | 2401.03048v1 | link |
2024-01-05 | The Rise of Diffusion Models in Time-Series Forecasting | Caspar Meijer et.al. | 2401.03006v1 | link |
2024-01-08 | Uncovering the human motion pattern: Pattern Memory-based Diffusion Model for Trajectory Prediction | Yuxin Yang et.al. | 2401.02916v2 | null |
2024-01-05 | Plug-in Diffusion Model for Sequential Recommendation | Haokai Ma et.al. | 2401.02913v1 | link |
2024-01-05 | Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors | Top Piriyakulkij et.al. | 2401.02739v1 | null |
2024-01-05 | Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation | Can Xu et.al. | 2401.02683v1 | null |
2024-01-04 | Comprehensive Exploration of Synthetic Data Generation: A Survey | André Bauer et.al. | 2401.02524v1 | null |
2024-01-04 | VASE: Object-Centric Appearance and Shape Manipulation of Real Videos | Elia Peruzzo et.al. | 2401.02473v1 | null |
2024-01-04 | Bring Metric Functions into Diffusion Models | Jie An et.al. | 2401.02414v1 | null |
2024-01-06 | GUESS:GradUally Enriching SyntheSis for Text-Driven Human Motion Generation | Xuehao Gao et.al. | 2401.02142v2 | null |
2024-01-04 | Preserving Image Properties Through Initializations in Diffusion Models | Jeffrey Zhang et.al. | 2401.02097v1 | null |
2024-01-04 | Energy based diffusion generator for efficient sampling of Boltzmann distributions | Yan Wang et.al. | 2401.02080v1 | null |
2024-01-09 | DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection | Yunfan Ye et.al. | 2401.02032v2 | link |
2024-01-04 | Improving Diffusion-Based Image Synthesis with Context Prediction | Ling Yang et.al. | 2401.02015v1 | null |
2024-01-03 | Instruct-Imagen: Image Generation with Multi-modal Instruction | Hexiang Hu et.al. | 2401.01952v1 | null |
2024-01-03 | Can We Generate Realistic Hands Only Using Convolution? | Mehran Hosseini et.al. | 2401.01951v1 | null |
2024-01-03 | Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions | David Junhao Zhang et.al. | 2401.01827v1 | link |
2024-01-03 | DiffYOLO: Object Detection for Anti-Noise via YOLO and Diffusion Models | Yichen Liu et.al. | 2401.01659v1 | null |
2024-01-03 | SIGNeRF: Scene Integrated Generation for Neural Radiance Fields | Jan-Niklas Dihlmann et.al. | 2401.01647v1 | null |
2024-01-03 | S | Yixuan Wang et.al. | 2401.01520v1 | link |
2024-01-02 | ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text | Dingkun Yan et.al. | 2401.01456v1 | link |
2024-01-02 | VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics | Ammar A. Siddiqui et.al. | 2401.01414v1 | null |
2024-01-01 | DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition | Parul Gupta et.al. | 2401.01387v1 | null |
2024-01-02 | VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM | Fuchen Long et.al. | 2401.01256v1 | null |
2024-01-02 | Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation | Renshuai Liu et.al. | 2401.01207v1 | null |
2024-01-02 | A comparative study of resistivity models for simulations of magnetic reconnection in the solar atmosphere. II. Plasmoid formation | Øystein Håvard Færder et.al. | 2401.01177v1 | null |
2024-01-02 | Joint Generative Modeling of Scene Graphs and Images via Diffusion Models | Bicheng Xu et.al. | 2401.01130v1 | null |
2024-01-02 | Robust single-particle cryo-EM image denoising and restoration | Jing Zhang et.al. | 2401.01097v1 | null |
2024-01-02 | Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation | Jinlong Xue et.al. | 2401.01044v1 | link |
2023-12-30 | Improving the Stability of Diffusion Models for Content Consistent Super-Resolution | Lingchen Sun et.al. | 2401.00877v1 | link |
2023-12-30 | FlashVideo: A Framework for Swift Inference in Text-to-Video Generation | Bin Lei et.al. | 2401.00869v1 | null |
2024-01-01 | DiffMorph: Text-less Image Morphing with Diffusion Models | Shounak Chatterjee et.al. | 2401.00739v1 | null |
2024-01-01 | Diffusion Models, Image Super-Resolution And Everything: A Survey | Brian B. Moser et.al. | 2401.00736v1 | null |
2024-01-02 | GD^2-NeRF: Generative Detail Compensation via GAN and Diffusion for One-shot Generalizable Neural Radiance Fields | Xiao Pan et.al. | 2401.00616v2 | null |
2024-01-03 | Diff-PCR: Diffusion-Based Correspondence Searching in Doubly Stochastic Matrix Space for Point Cloud Registration | Qianliang Wu et.al. | 2401.00436v2 | null |
2023-12-31 | SynCDR : Training Cross Domain Retrieval Models with Synthetic Data | Samarth Mishra et.al. | 2401.00420v1 | link |
2023-12-31 | Controllable Safety-Critical Closed-loop Traffic Simulation via Guided Diffusion | Wei-Jer Chang et.al. | 2401.00391v1 | null |
2023-12-30 | Probing the Limits and Capabilities of Diffusion Models for the Anatomic Editing of Digital Twins | Karim Kadry et.al. | 2401.00247v1 | null |
2023-12-30 | Inpaint4DNeRF: Promptable Spatio-Temporal NeRF Inpainting with Generative Diffusion Models | Han Jiang et.al. | 2401.00208v1 | null |
2024-01-03 | Diffusion Model with Perceptual Loss | Shanchuan Lin et.al. | 2401.00110v2 | null |
2023-12-29 | Generating Enhanced Negatives for Training Language-Based Object Detectors | Shiyu Zhao et.al. | 2401.00094v1 | null |
2024-01-02 | 6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation | Li Xu et.al. | 2401.00029v2 | null |
2023-12-29 | FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis | Feng Liang et.al. | 2312.17681v1 | null |
2023-12-29 | Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models | Kay Liu et.al. | 2312.17679v1 | link |
2023-12-29 | Leveraging Open-Vocabulary Diffusion to Camouflaged Instance Segmentation | Tuan-Anh Vu et.al. | 2312.17505v1 | null |
2023-12-28 | Classifier-free graph diffusion for molecular property targeting | Matteo Ninniri et.al. | 2312.17397v1 | null |
2023-12-28 | iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views | Chin-Hsuan Wu et.al. | 2312.17250v1 | link |
2023-12-28 | Personalized Restoration via Dual-Pivot Tuning | Pradyumna Chari et.al. | 2312.17234v1 | null |
2023-12-28 | 4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency | Yuyang Yin et.al. | 2312.17225v1 | null |
2023-12-28 | Restoration by Generation with Constrained Priors | Zheng Ding et.al. | 2312.17161v1 | null |
2023-12-28 | DiffKG: Knowledge Graph Diffusion Model for Recommendation | Yangqin Jiang et.al. | 2312.16890v1 | link |
2023-12-29 | DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors | Biwen Lei et.al. | 2312.16837v2 | null |
2023-12-27 | I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models | Xun Guo et.al. | 2312.16693v1 | null |
2023-12-27 | Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection | Huan Liu et.al. | 2312.16649v1 | null |
2023-12-27 | Image Restoration by Denoising Diffusion Models with Iteratively Preconditioned Guidance | Tomer Garber et.al. | 2312.16519v1 | null |
2023-12-29 | PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion | Guansong Lu et.al. | 2312.16486v2 | null |
2024-01-03 | SVGDreamer: Text Guided SVG Generation with Diffusion Model | Ximing Xing et.al. | 2312.16476v2 | null |
2023-12-27 | Natural Adversarial Patch Generation Method Based on Latent Diffusion Model | Xianyi Chen et.al. | 2312.16401v1 | null |
2023-12-24 | Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks | Christian Simon et.al. | 2312.16218v1 | null |
2023-12-23 | Iterative Prompt Relabeling for diffusion model with RLDF | Jiaxin Ge et.al. | 2312.16204v1 | null |
2023-12-26 | One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications | Mengyao Lyu et.al. | 2312.16145v1 | null |
2023-12-26 | Compositional Search of Stable Crystalline Structures in Multi-Component Alloys Using Generative Diffusion Models | Grzegorz Kaszuba et.al. | 2312.16073v1 | null |
2023-12-26 | HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D | Sangmin Woo et.al. | 2312.15980v1 | link |
2023-12-26 | Semantic Guidance Tuning for Text-To-Image Diffusion Models | Hyun Kang et.al. | 2312.15964v1 | null |
2023-12-26 | Implied volatility (also) is path-dependent | Hervé Andrès et.al. | 2312.15950v1 | link |
2023-12-26 | EnchantDance: Unveiling the Potential of Music-Driven Dance Movement | Bo Han et.al. | 2312.15946v1 | link |
2023-12-26 | Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection | Songmin Dai et.al. | 2312.15911v1 | null |
2023-12-26 | Cross Initialization for Personalized Text-to-Image Generation | Lianyu Pang et.al. | 2312.15905v1 | link |
2024-01-02 | Adversarial Item Promotion on Visually-Aware Recommender Systems by Guided Diffusion | Lijian Chen et.al. | 2312.15826v3 | null |
2023-12-28 | High-Fidelity Diffusion-based Image Editing | Chen Hou et.al. | 2312.15707v2 | null |
2023-12-25 | A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation | Yongkang Wang et.al. | 2312.15665v1 | link |
2023-12-25 | Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation | Bingzhi Liu et.al. | 2312.15628v1 | null |
2023-12-25 | Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models | Haiwei Xue et.al. | 2312.15567v1 | null |
2023-12-27 | A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization | Jinchao Zhu et.al. | 2312.15516v2 | null |
2023-12-24 | Diffusion-EXR: Controllable Review Generation for Explainable Recommendation via Diffusion Models | Ling Li et.al. | 2312.15490v1 | null |
2023-12-24 | A Two-stage Personalized Virtual Try-on Framework with Shape Control and Texture Guidance | Shufang Zhang et.al. | 2312.15480v1 | null |
2023-12-23 | Prompt-Propose-Verify: A Reliable Hand-Object-Interaction Data Generation Framework using Foundational Models | Gurusha Juneja et.al. | 2312.15247v1 | null |
2023-12-23 | CaLDiff: Camera Localization in NeRF via Pose Diffusion | Rashik Shrestha et.al. | 2312.15242v1 | null |
2023-12-23 | Majority-based Preference Diffusion on Social Networks | Ahad N. Zehmakan et.al. | 2312.15140v1 | null |
2023-12-22 | Spectrally Decomposed Diffusion Models for Generative Turbulence Recovery | Mohammed Sardar et.al. | 2312.15029v1 | null |
2023-12-22 | FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Mingyuan Zhang et.al. | 2312.15004v1 | null |
2023-12-22 | Emage: Non-Autoregressive Text-to-Image Generation | Zhangyin Feng et.al. | 2312.14988v1 | null |
2023-12-21 | Diffusion Models for Generative Artificial Intelligence: An Introduction for Applied Mathematicians | Catherine F. Higham et.al. | 2312.14977v1 | null |
2023-12-21 | Gaussian Harmony: Attaining Fairness in Diffusion-based Face Generation Models | Basudha Pal et.al. | 2312.14976v1 | null |
2023-12-22 | MACS: Mass Conditioned 3D Hand and Object Motion Synthesis | Soshi Shimada et.al. | 2312.14929v1 | null |
2023-12-22 | BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction | Honghao Fu et.al. | 2312.14871v1 | null |
2023-12-22 | Neural-network-based regularization methods for inverse problems in imaging | Andreas Habring et.al. | 2312.14849v1 | null |
2023-12-22 | Dreaming of Electrical Waves: Generative Modeling of Cardiac Excitation Waves using Diffusion Models | Tanish Baranwal et.al. | 2312.14830v1 | null |
2023-12-22 | Neural network models for preferential concentration of particles in two-dimensional turbulence | Thibault Maurel-Oujia et.al. | 2312.14829v1 | null |
2023-12-22 | Plan, Posture and Go: Towards Open-World Text-to-Motion Generation | Jinpeng Liu et.al. | 2312.14828v1 | null |
2023-12-22 | Harnessing Diffusion Models for Visual Perception with Meta Prompts | Qiang Wan et.al. | 2312.14733v1 | link |
2023-12-22 | FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection | Dongmei Zhang et.al. | 2312.14465v1 | null |
2023-12-22 | Generative AI Beyond LLMs: System Implications of Multi-Modal Generation | Alicia Golden et.al. | 2312.14385v1 | null |
2023-12-21 | Single-Cell RNA-seq Synthesis with Latent Diffusion Model | Yixuan Wang et.al. | 2312.14220v1 | null |
2023-12-21 | DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models | Brian Nlong Zhao et.al. | 2312.14216v1 | null |
2023-12-21 | Diffusion Reward: Learning Rewards via Conditional Video Diffusion | Tao Huang et.al. | 2312.14134v1 | null |
2023-12-21 | Neural Point Cloud Diffusion for Disentangled 3D Shape and Appearance Generation | Philipp Schröppel et.al. | 2312.14124v1 | link |
2023-12-25 | HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models | Hayk Manukyan et.al. | 2312.14091v2 | link |
2023-12-21 | Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning | Desai Xie et.al. | 2312.13980v1 | null |
2023-12-22 | Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models | Xianfang Zeng et.al. | 2312.13913v2 | link |
2023-12-20 | Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis | Bichen Wu et.al. | 2312.13834v1 | null |
2023-12-21 | Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models | Huan Ling et.al. | 2312.13763v1 | null |
2023-12-21 | Free-Editor: Zero-shot Text-driven 3D Scene Editing | Nazmul Karim et.al. | 2312.13663v1 | null |
2023-12-21 | Diff-Oracle: Diffusion Model for Oracle Character Generation with Controllable Styles and Contents | Jing Li et.al. | 2312.13631v1 | null |
2023-12-21 | Navigating the Structured What-If Spaces: Counterfactual Generation via Structured Diffusion | Nishtha Madaan et.al. | 2312.13616v1 | null |
2023-12-21 | Front stability of infinitely steep travelling waves in population biology | Matthew J Simpson et.al. | 2312.13601v1 | link |
2023-12-20 | Unlocking Pre-trained Image Backbones for Semantic Image Synthesis | Tariq Berrada et.al. | 2312.13314v1 | null |
2023-12-20 | Generate E-commerce Product Background by Integrating Category Commonality and Personalized Style | Haohan Wang et.al. | 2312.13309v1 | null |
2023-12-20 | Not All Steps are Equal: Efficient Generation with Progressive Diffusion Models | Wenhao Li et.al. | 2312.13307v1 | null |
2023-12-27 | Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting | Junwu Zhang et.al. | 2312.13271v3 | link |
2023-12-20 | Conditional Image Generation with Pretrained Generative Model | Rajesh Shrestha et.al. | 2312.13253v1 | null |
2023-12-20 | Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model | Saurabh Saxena et.al. | 2312.13252v1 | null |
2023-12-20 | Diffusion Models With Learned Adaptive Noise | Subham Sekhar Sahoo et.al. | 2312.13236v1 | link |
2023-12-22 | DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis | Yuming Gu et.al. | 2312.13016v3 | link |
2023-12-21 | RadEdit: stress-testing biomedical vision models via diffusion image editing | Fernando Pérez-García et.al. | 2312.12865v2 | null |
2023-12-20 | ReCo-Diff: Explore Retinex-Based Condition Strategy in Diffusion Model for Low-Light Image Enhancement | Yuhui Wu et.al. | 2312.12826v1 | null |
2023-12-20 | All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models | Seunghoo Hong et.al. | 2312.12807v1 | null |
2023-12-21 | AMD:Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion | Beibei Jing et.al. | 2312.12763v2 | null |
2023-12-20 | How Good Are Deep Generative Models for Solving Inverse Problems? | Shichong Peng et.al. | 2312.12691v1 | null |
2023-12-19 | Surf-CDM: Score-Based Surface Cold-Diffusion Model For Medical Image Segmentation | Fahim Ahmed Zaman et.al. | 2312.12649v1 | null |
2023-12-19 | Fixed-point Inversion for Text-to-image diffusion models | Barak Meiri et.al. | 2312.12540v1 | null |
2023-12-19 | StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation | Akio Kodaira et.al. | 2312.12491v1 | link |
2023-12-19 | InstructVideo: Instructing Video Diffusion Models with Human Feedback | Hangjie Yuan et.al. | 2312.12490v1 | null |
2023-12-19 | Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models | Angela Castillo et.al. | 2312.12487v1 | null |
2023-12-19 | Atlantis: Enabling Underwater Depth Estimation with Stable Diffusion | Fan Zhang et.al. | 2312.12471v1 | link |
2023-12-19 | MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers | Haoyu Ma et.al. | 2312.12468v1 | null |
2023-12-19 | On Inference Stability for Diffusion Models | Viet Nguyen et.al. | 2312.12431v1 | link |
2023-12-19 | Scene-Conditional 3D Object Stylization and Composition | Jinghao Zhou et.al. | 2312.12419v1 | null |
2023-12-19 | Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models | Shweta Mahajan et.al. | 2312.12416v1 | null |
2023-12-19 | Travelling pulses on three spatial scales in a Klausmeier-type vegetation-autotoxicity model | Paul Carter et.al. | 2312.12277v1 | null |
2023-12-19 | Intrinsic Image Diffusion for Single-view Material Estimation | Peter Kocsis et.al. | 2312.12274v1 | null |
2023-12-19 | Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model | Lingjun Zhang et.al. | 2312.12232v1 | link |
2023-12-19 | HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback | Gaoge Han et.al. | 2312.12227v1 | null |
2023-12-19 | FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning | Zhenhua Yang et.al. | 2312.12142v1 | link |
2023-12-19 | GazeMoDiff: Gaze-guided Diffusion Model for Stochastic Human Motion Prediction | Haodong Yan et.al. | 2312.12090v1 | null |
2023-12-19 | Learning Subject-Aware Cropping by Outpainting Professional Photos | James Hong et.al. | 2312.12080v1 | null |
2023-12-19 | Resource-efficient Generative Mobile Edge Networks in 6G Era: Fundamentals, Framework and Case Study | Bingkun Lai et.al. | 2312.12063v1 | null |
2023-12-19 | Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method | Jiachun Pan et.al. | 2312.12030v1 | null |
2023-12-19 | Diffusing More Objects for Semi-Supervised Domain Adaptation with Less Labeling | Leander van den Heuvel et.al. | 2312.12000v1 | null |
2023-12-19 | Optimizing Diffusion Noise Can Serve As Universal Motion Priors | Korrawe Karunratanakul et.al. | 2312.11994v1 | null |
2023-12-19 | Extending intraday solar forecast horizons with deep generative models | Alberto Carpentieri et.al. | 2312.11966v1 | link |
2023-12-19 | Text-Image Conditioned Diffusion for Consistent Text-to-3D Generation | Yuze He et.al. | 2312.11774v1 | null |
2023-12-18 | Learning a Diffusion Model Policy from Rewards via Q-Score Matching | Michael Psenka et.al. | 2312.11752v1 | null |
2023-12-18 | Unified framework for diffusion generative models in SO(3): applications in computer vision and astrophysics | Yesukhei Jagvaral et.al. | 2312.11707v1 | null |
2023-12-18 | HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles | Vanessa Sklyarova et.al. | 2312.11666v1 | null |
2023-12-18 | SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution | Zhixuan Liang et.al. | 2312.11598v1 | null |
2023-12-18 | TIP: Text-Driven Image Processing with Semantic and Restoration Instructions | Chenyang Qi et.al. | 2312.11595v1 | null |
2023-12-15 | Iterative Motion Editing with Natural Language | Purvi Goel et.al. | 2312.11538v1 | null |
2023-12-15 | Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior | Nan Huang et.al. | 2312.11535v1 | null |
2023-12-18 | VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder | Zhicong Tang et.al. | 2312.11459v1 | link |
2023-12-18 | PolyDiff: Generating 3D Polygonal Meshes with Diffusion Models | Antonio Alliegro et.al. | 2312.11417v1 | null |
2023-12-21 | MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance | Qi Mao et.al. | 2312.11396v2 | null |
2023-12-18 | SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing | Zeyinzi Jiang et.al. | 2312.11392v1 | null |
2023-12-18 | Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model | Decheng Liu et.al. | 2312.11285v1 | link |
2023-12-18 | GraspLDM: Generative 6-DoF Grasp Synthesis using Latent Diffusion Models | Kuldeep R Barad et.al. | 2312.11243v1 | null |
2023-12-18 | Multi-scale Reconstruction of Turbulent Rotating Flows with Generative Diffusion Models | Tianyi Li et.al. | 2312.11121v1 | null |
2023-12-20 | DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models | Jiachen Zhou et.al. | 2312.11057v2 | link |
2023-12-18 | Realistic Human Motion Generation with Cross-Diffusion Models | Zeping Ren et.al. | 2312.10993v1 | null |
2023-12-18 | Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model | Zhenyu Xie et.al. | 2312.10960v1 | null |
2023-12-20 | A novel diffusion recommendation algorithm based on multi-scale cnn and residual lstm | Yong Niu et.al. | 2312.10885v2 | null |
2023-12-17 | Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models | Nikita Starodubcev et.al. | 2312.10835v1 | link |
2023-12-17 | CogCartoon: Towards Practical Story Visualization | Zhongyang Zhu et.al. | 2312.10718v1 | null |
2023-12-19 | VidToMe: Video Token Merging for Zero-Shot Video Editing | Xirui Li et.al. | 2312.10656v2 | null |
2023-12-16 | VecFusion: Vector Font Generation with Diffusion | Vikas Thamizharasan et.al. | 2312.10540v1 | null |
2023-12-16 | A Unified Filter Method for Jointly Estimating State and Parameters of Stochastic Dynamical Systems via the Ensemble Score Filter | Feng Bao et.al. | 2312.10503v1 | null |
2023-12-16 | Continuous Diffusion for Mixed-Type Tabular Data | Markus Mueller et.al. | 2312.10431v1 | null |
2023-12-16 | Lecture Notes in Probabilistic Diffusion Models | Inga Strümke et.al. | 2312.10393v1 | null |
2023-12-16 | Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge | Conghan Yue et.al. | 2312.10299v1 | null |
2023-12-15 | Two simple criterion to prove the existence of patterns in reaction-diffusion models of two components | Francisco J. Vielma-Leal et.al. | 2312.10231v1 | null |
2023-12-15 | Tell Me What You See: Text-Guided Real-World Image Denoising | Erez Yosef et.al. | 2312.10191v1 | null |
2023-12-19 | Improving new physics searches with diffusion models for event observables and jet constituents | Debajyoti Sengupta et.al. | 2312.10130v2 | null |
2023-12-15 | MVHuman: Tailoring 2D Diffusion with Multi-view Sampling For Realistic 3D Human Generation | Suyi Jiang et.al. | 2312.10120v1 | null |
2023-12-15 | Plasticine3D: Non-rigid 3D editting with text guidance | Yige Chen et.al. | 2312.10111v1 | null |
2023-12-15 | Latent Diffusion Models with Image-Derived Annotations for Enhanced AI-Assisted Cancer Diagnosis in Histopathology | Pedro Osorio et.al. | 2312.09792v1 | null |
2023-12-15 | DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models | Yifeng Ma et.al. | 2312.09767v1 | null |
2023-12-19 | PPFM: Image denoising in photon-counting CT using single-step posterior sampling Poisson flow generative models | Dennis Hein et.al. | 2312.09754v2 | link |
2023-12-15 | Positivity and global existence for nonlocal advection-diffusion models of interacting populations | Valeria Giunta et.al. | 2312.09692v1 | null |
2023-12-15 | Exploring the Feasibility of Generating Realistic 3D Models of Endangered Species Using DreamGaussian: An Analysis of Elevation Angle's Impact on Model Generation | Selcuk Anil Karatopak et.al. | 2312.09682v1 | null |
2023-12-15 | Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models | Senmao Li et.al. | 2312.09608v1 | link |
2023-12-15 | Single PW takes a shortcut to compound PW in US imaging | Zhiqiang Li et.al. | 2312.09514v1 | null |
2023-12-15 | Fast Sampling generative model for Ultrasound image reconstruction | Hengrong Lan et.al. | 2312.09510v1 | null |
2023-12-18 | Unbiasing Enhanced Sampling on a High-dimensional Free Energy Surface with Deep Generative Model | Yikai Liu et.al. | 2312.09404v2 | null |
2023-12-14 | LatentEditor: Text Driven Local Editing of 3D Scenes | Umar Khalid et.al. | 2312.09313v1 | link |
2023-12-14 | LIME: Localized Image Editing via Attention Regularization in Diffusion Models | Enis Simsar et.al. | 2312.09256v1 | null |
2023-12-14 | FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection | Hongsuk Choi et.al. | 2312.09252v1 | null |
2023-12-14 | Single Mesh Diffusion Models with Field Latents for Texture Generation | Thomas W. Mitchel et.al. | 2312.09250v1 | null |
2023-12-14 | A framework for conditional diffusion modelling with applications in motif scaffolding for protein design | Kieran Didi et.al. | 2312.09236v1 | null |
2023-12-14 | Mosaic-SDF for 3D Generative Models | Lior Yariv et.al. | 2312.09222v1 | null |
2023-12-14 | Fast Sampling via De-randomization for Discrete Diffusion Models | Zixiang Chen et.al. | 2312.09193v1 | null |
2023-12-14 | Improving Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architectures | Huijie Zhang et.al. | 2312.09181v1 | null |
2023-12-14 | DiffusionLight: Light Probes for Free by Painting a Chrome Ball | Pakkapon Phongthawee et.al. | 2312.09168v1 | link |
2023-12-14 | Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers | Zi-Xin Zou et.al. | 2312.09147v1 | null |
2023-12-14 | VideoLCM: Video Latent Consistency Model | Xiang Wang et.al. | 2312.09109v1 | null |
2023-12-14 | PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion | Ying-Tian Liu et.al. | 2312.09069v1 | null |
2023-12-14 | Brain Diffuser with Hierarchical Transformer for MCI Causality Analysis | Qiankun Zuo et.al. | 2312.09022v1 | null |
2023-12-18 | OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers | Han Liang et.al. | 2312.08985v2 | null |
2023-12-14 | Motion Flow Matching for Human Motion Synthesis and Editing | Vincent Tao Hu et.al. | 2312.08895v1 | null |
2023-12-14 | VaLID: Variable-Length Input Diffusion for Novel View Synthesis | Shijie Li et.al. | 2312.08892v1 | null |
2023-12-13 | SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance | Yuanyou Xu et.al. | 2312.08889v1 | null |
2023-12-15 | SpeedUpNet: A Plug-and-Play Hyper-Network for Accelerating Text-to-Image Diffusion Models | Weilong Chai et.al. | 2312.08887v2 | null |
2023-12-13 | Diffusion-based Blind Text Image Super-Resolution | Yuzhe Zhang et.al. | 2312.08886v1 | null |
2023-12-13 | SceneWiz3D: Towards Text-guided 3D Scene Composition | Qihang Zhang et.al. | 2312.08885v1 | null |
2023-12-13 | Semantic-Driven Initial Image Construction for Guided Image Synthesis in Diffusion Model | Jiafeng Mao et.al. | 2312.08872v1 | null |
2023-12-14 | Diffusion-C: Unveiling the Generative Challenges of Diffusion Models through Corrupted Data | Keywoong Bae et.al. | 2312.08843v1 | null |
2023-12-14 | Speeding up Photoacoustic Imaging using Diffusion Models | Irem Loc et.al. | 2312.08834v1 | link |
2023-12-14 | Guided Diffusion from Self-Supervised Diffusion Features | Vincent Tao Hu et.al. | 2312.08825v1 | null |
2023-12-14 | Reconstruction of Sound Field through Diffusion Models | Federico Miotello et.al. | 2312.08821v1 | null |
2023-12-14 | Local Conditional Controlling for Text-to-Image Diffusion Models | Yibo Zhao et.al. | 2312.08768v1 | link |
2023-12-14 | UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation | Zexiang Liu et.al. | 2312.08754v1 | null |
2023-12-17 | DreamDrone | Hanyang Kong et.al. | 2312.08746v2 | null |
2023-12-14 | GOEnFusion: Gradient Origin Encodings for 3D Forward Diffusion Models | Animesh Karnewar et.al. | 2312.08744v1 | null |
2023-12-14 | Joint2Human: High-quality 3D Human Generation via Compact Spherical Embedding of 3D Joints | Muxin Zhang et.al. | 2312.08591v1 | null |
2023-12-13 | NViST: In the Wild New View Synthesis from a Single Image with Transformers | Wonbong Jang et.al. | 2312.08568v1 | null |
2023-12-13 | Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models | Liangchen Song et.al. | 2312.08563v1 | null |
2023-12-13 | World Models via Policy-Guided Trajectory Diffusion | Marc Rigter et.al. | 2312.08533v1 | link |
2023-12-13 | PerMod: Perceptually Grounded Voice Modification with Latent Diffusion Models | Robin Netzorg et.al. | 2312.08494v1 | null |
2023-12-13 | FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models | Shivangi Aneja et.al. | 2312.08459v1 | null |
2023-12-13 | PhenDiff: Revealing Invisible Phenotypes with Conditional Diffusion Models | Anis Bourou et.al. | 2312.08290v1 | link |
2023-12-13 | Black-box Membership Inference Attacks against Fine-tuned Diffusion Models | Yan Pang et.al. | 2312.08207v1 | null |
2023-12-13 | Concept-centric Personalization with Large-scale Diffusion Priors | Pu Cao et.al. | 2312.08195v1 | link |
2023-12-13 | ** | Maxwell X. Cai et.al. | 2312.08153v1 | null |
2023-12-13 | Clockwork Diffusion: Efficient Generation With Model-Step Distillation | Amirhossein Habibian et.al. | 2312.08128v1 | null |
2023-12-13 | Knowledge-Aware Artifact Image Synthesis with LLM-Enhanced Prompting and Multi-Source Supervision | Shengguang Wu et.al. | 2312.08056v1 | null |
2023-12-14 | Compositional Inversion for Stable Diffusion Models | Xu-Lu Zhang et.al. | 2312.08048v2 | link |
2023-12-13 | AdapEdit: Spatio-Temporal Guided Adaptive Editing Algorithm for Text-Based Continuity-Sensitive Image Editing | Zhiyuan Ma et.al. | 2312.08019v1 | link |
2023-12-13 | Time Series Diffusion Method: A Denoising Diffusion Probabilistic Model for Vibration Signal Generation | Haiming Yi et.al. | 2312.07981v1 | null |
2023-12-13 | LMD: Faster Image Reconstruction with Latent Masking Diffusion | Zhiyuan Ma et.al. | 2312.07971v1 | link |
2023-12-13 | Semantic-aware Data Augmentation for Text-to-image Synthesis | Zhaorui Tan et.al. | 2312.07951v1 | null |
2023-12-13 | BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics | Wenqian Zhang et.al. | 2312.07937v1 | null |
2023-12-13 | SimAC: A Simple Anti-Customization Method against Text-to-Image Synthesis of Diffusion Models | Feifei Wang et.al. | 2312.07865v1 | null |
2023-12-13 | Diffusion Models Enable Zero-Shot Pose Estimation for Lower-Limb Prosthetic Users | Tianxun Zhou et.al. | 2312.07854v1 | null |
2023-12-14 | Noise in the reverse process improves the approximation capabilities of diffusion models | Karthik Elamvazhuthi et.al. | 2312.07851v2 | null |
2023-12-13 | Stable Rivers: A Case Study in the Application of Text-to-Image Generative Models for Earth Sciences | C Kupferschmidt et.al. | 2312.07833v1 | null |
2023-12-12 | Brain-optimized inference improves reconstructions of fMRI brain activity | Reese Kneeland et.al. | 2312.07705v1 | null |
2023-12-12 | FreeInit: Bridging Initialization Gap in Video Diffusion Models | Tianxing Wu et.al. | 2312.07537v1 | link |
2023-12-12 | FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition | Sicheng Mo et.al. | 2312.07536v1 | null |
2023-12-12 | Cosmological Field Emulation and Parameter Inference with Diffusion Models | Nayantara Mudur et.al. | 2312.07534v1 | null |
2023-12-12 | MinD-3D: Reconstruct High-quality 3D objects in Human Brain | Jianxiong Gao et.al. | 2312.07485v1 | null |
2023-12-12 | DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing | Kaiwen Zhang et.al. | 2312.07409v1 | null |
2023-12-12 | Boosting Latent Diffusion with Flow Matching | Johannes S. Fischer et.al. | 2312.07360v1 | link |
2023-12-12 | Learned representation-guided diffusion models for large-image generation | Alexandros Graikos et.al. | 2312.07330v1 | null |
2023-12-12 | GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos | Tomáš Souček et.al. | 2312.07322v1 | link |
2023-12-12 | Scalable Motion Style Transfer with Constrained Diffusion Generation | Wenjie Yin et.al. | 2312.07311v1 | null |
2023-12-12 | A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models | Enshu Liu et.al. | 2312.07243v1 | null |
2023-12-12 | Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation | Shentong Mo et.al. | 2312.07231v1 | null |
2023-12-12 | Equivariant Flow Matching with Hybrid Probability Transport | Yuxuan Song et.al. | 2312.07168v1 | null |
2023-12-12 | Text2AC-Zero: Consistent Synthesis of Animated Characters using 2D Diffusion | Abdelrahman Eldesokey et.al. | 2312.07133v1 | null |
2023-12-12 | Generating High-Resolution Regional Precipitation Using Conditional Diffusion Model | Naufal Shidqi et.al. | 2312.07112v1 | null |
2023-12-12 | Template Free Reconstruction of Human-object Interaction with Procedural Interaction Generation | Xianghui Xie et.al. | 2312.07063v1 | null |
2023-12-12 | Diff-OP3D: Bridging 2D Diffusion for Open Pose 3D Zero-Shot Classification | Weiguang Zhao et.al. | 2312.07039v1 | null |
2023-12-12 | Noise Distribution Decomposition based Multi-Agent Distributional Reinforcement Learning | Wei Geng et.al. | 2312.07025v1 | null |
2023-12-12 | On the notion of Hallucinations from the lens of Bias and Validity in Synthetic CXR Images | Gauri Bhardwaj et.al. | 2312.06979v1 | null |
2023-12-12 | CCM: Adding Conditional Controls to Text-to-Image Consistency Models | Jie Xiao et.al. | 2312.06971v1 | null |
2023-12-12 | LoRA-Enhanced Distillation on Guided Diffusion Models | Pareesa Ameneh Golnari et.al. | 2312.06899v1 | null |
2023-12-11 | Relightful Harmonization: Lighting-aware Portrait Background Replacement | Mengwei Ren et.al. | 2312.06886v1 | null |
2023-12-11 | Adversarial Estimation of Topological Dimension with Harmonic Score Maps | Eric Yeats et.al. | 2312.06869v1 | null |
2023-12-11 | SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models | Yuzhou Huang et.al. | 2312.06739v1 | link |
2023-12-11 | InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following | Shufan Li et.al. | 2312.06738v1 | link |
2023-12-11 | EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion | Zehuan Huang et.al. | 2312.06725v1 | null |
2023-12-11 | CAD: Photorealistic 3D Generation via Adversarial Distillation | Ziyu Wan et.al. | 2312.06663v1 | null |
2023-12-11 | Photorealistic Video Generation with Diffusion Models | Agrim Gupta et.al. | 2312.06662v1 | null |
2023-12-11 | UpFusion: Novel View Diffusion from Unposed Sparse View Observations | Bharath Raj Nagoor Kani et.al. | 2312.06661v1 | null |
2023-12-11 | Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior | Fangfu Liu et.al. | 2312.06655v1 | link |
2023-12-11 | Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution | Shangchen Zhou et.al. | 2312.06640v1 | null |
2023-12-11 | DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection | Haoyang He et.al. | 2312.06607v1 | link |
2023-12-11 | ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models | Denis Zavadski et.al. | 2312.06573v1 | link |
2023-12-11 | HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models | Xiaogang Peng et.al. | 2312.06553v1 | null |
2023-12-11 | STDiff: Spatio-temporal Diffusion for Continuous Stochastic Video Prediction | Xi Ye et.al. | 2312.06486v1 | link |
2023-12-11 | Semantic Image Synthesis for Abdominal CT | Yan Zhuang et.al. | 2312.06453v1 | null |
2023-12-11 | DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior | Tianyu Huang et.al. | 2312.06439v1 | null |
2023-12-11 | DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers | Aaron Mir et.al. | 2312.06400v1 | null |
2023-12-11 | PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization | Xu Peng et.al. | 2312.06354v1 | null |
2023-12-12 | DiffAIL: Diffusion Adversarial Imitation Learning | Bingzheng Wang et.al. | 2312.06348v2 | link |
2023-12-11 | Compensation Sampling for Improved Convergence in Diffusion Models | Hui Lu et.al. | 2312.06285v1 | null |
2023-12-11 | UIEDP:Underwater Image Enhancement with Diffusion Prior | Dazhao Du et.al. | 2312.06240v1 | null |
2023-12-11 | The Journey, Not the Destination: How Data Guides Diffusion Models | Kristian Georgiev et.al. | 2312.06205v1 | null |
2023-12-11 | Offloading and Quality Control for AI Generated Content Services in Edge Computing Networks | Yitong Wang et.al. | 2312.06203v1 | null |
2023-12-11 | Optimized View and Geometry Distillation from Multi-view Diffuser | Youjia Zhang et.al. | 2312.06198v1 | null |
2023-12-11 | SP-DiffDose: A Conditional Diffusion Model for Radiation Dose Prediction Based on Multi-Scale Fusion of Anatomical Structures, Guided by SwinTransformer and Projector | Linjie Fu et.al. | 2312.06187v1 | null |
2023-12-11 | ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank | Zhanjie Zhang et.al. | 2312.06135v1 | null |
2023-12-11 | Probabilistic Precipitation Downscaling with Optical Flow-Guided Diffusion | Prakhar Srivastava et.al. | 2312.06071v1 | null |
2023-12-11 | PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration | Yue Wu et.al. | 2312.06063v1 | null |
2023-12-11 | CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image Diffusion Models | Tuna Han Salih Meral et.al. | 2312.06059v1 | null |
2023-12-10 | Correcting Diffusion Generation through Resampling | Yujian Liu et.al. | 2312.06038v1 | link |
2023-12-10 | A Note on the Convergence of Denoising Diffusion Probabilistic Models | Sokhna Diarra Mbacke et.al. | 2312.05989v1 | null |
2023-12-10 | InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models | Jiun Tian Hoe et.al. | 2312.05849v1 | null |
2023-12-10 | Toward Open-ended Embodied Tasks Solving | William Wei Wang et.al. | 2312.05822v1 | null |
2023-12-10 | HumanCoser: Layered 3D Human Generation via Semantic-Aware Diffusion Model | Yi Wang et.al. | 2312.05804v1 | null |
2023-12-10 | AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model | Teng Hu et.al. | 2312.05767v1 | null |
2023-12-09 | Iterative Token Evaluation and Refinement for Real-World Super-Resolution | Chaofeng Chen et.al. | 2312.05616v1 | link |
2023-12-09 | Generative AI for Physical Layer Communications: A Survey | Nguyen Van Huynh et.al. | 2312.05594v1 | null |
2023-12-09 | DPoser: Diffusion Model as Robust 3D Human Pose Prior | Junzhe Lu et.al. | 2312.05541v1 | link |
2023-12-09 | BARET : Balanced Attention based Real image Editing driven by Target-text Inversion | Yuming Qiao et.al. | 2312.05482v1 | null |
2023-12-09 | Spectroscopy-Guided Discovery of Three-Dimensional Structures of Disordered Materials with Diffusion Models | Hyuna Kwon et.al. | 2312.05472v1 | link |
2023-12-09 | Identifying and Mitigating Model Failures through Few-shot CLIP-aided Diffusion Generation | Atoosa Chegini et.al. | 2312.05464v1 | null |
2023-12-09 | Efficient Quantization Strategies for Latent Diffusion Models | Yuewei Yang et.al. | 2312.05431v1 | null |
2023-12-08 | CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling | Ruihan Yang et.al. | 2312.05412v1 | null |
2023-12-08 | NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions in Diffusion Models | Yusuf Dalva et.al. | 2312.05390v1 | null |
2023-12-08 | Cross Domain Generative Augmentation: Domain Generalization with Latent Diffusion Models | Sobhan Hemati et.al. | 2312.05387v1 | null |
2023-12-08 | MotionCrafter: One-Shot Motion Customization of Diffusion Models | Yuxin Zhang et.al. | 2312.05288v1 | link |
2023-12-08 | KBFormer: A Diffusion Model for Structured Entity Completion | Ouail Kitouni et.al. | 2312.05253v1 | null |
2023-12-08 | SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation | Thuan Hoang Nguyen et.al. | 2312.05239v1 | null |
2023-12-08 | Membership Inference Attacks on Diffusion Models via Quantile Regression | Shuai Tang et.al. | 2312.05140v1 | null |
2023-12-11 | DreaMoving: A Human Video Generation Framework based on Diffusion Models | Mengyang Feng et.al. | 2312.05107v2 | null |
2023-12-08 | SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control | Jaskirat Singh et.al. | 2312.05039v1 | null |
2023-12-07 | Customizing Motion in Text-to-Video Diffusion Models | Joanna Materzynska et.al. | 2312.04966v1 | null |
2023-12-07 | Inversion-Free Image Editing with Natural Language | Sihan Xu et.al. | 2312.04965v1 | null |
2023-12-08 | UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models | Yiming Zhao et.al. | 2312.04884v1 | link |
2023-12-08 | MVDD: Multi-View Depth Diffusion Models | Zhen Wang et.al. | 2312.04875v1 | null |
2023-12-08 | HandDiffuse: Generative Controllers for Two-Hand Interactions via Diffusion Models | Pei Lin et.al. | 2312.04867v1 | null |
2023-12-08 | Learn to Optimize Denoising Scores for 3D Generation: A Unified and Improved Diffusion Prior on NeRF and 3D Gaussian Splatting | Xiaofeng Yang et.al. | 2312.04820v1 | null |
2023-12-08 | A Unified Particle-Based Solver for Non-Newtonian Behaviors Simulation | Chunlei Li et.al. | 2312.04814v1 | null |
2023-12-08 | RS-Corrector: Correcting the Racial Stereotypes in Latent Diffusion Models | Yue Jiang et.al. | 2312.04810v1 | null |
2023-12-08 | RL Dreams: Policy Gradient Optimization for Score Distillation based 3D Generation | Aradhya N. Mathur et.al. | 2312.04806v1 | null |
2023-12-08 | MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean Diffusion Model | Kaiyu Song et.al. | 2312.04802v1 | null |
2023-12-08 | Reality's Canvas, Language's Brush: Crafting 3D Avatars from Monocular Video | Yuchen Rao et.al. | 2312.04784v1 | null |
2023-12-08 | Fine-Tuning InstructPix2Pix for Advanced Image Colorization | Zifeng An et.al. | 2312.04780v1 | null |
2023-12-07 | Diffence: Fencing Membership Privacy With Diffusion Models | Yuefeng Peng et.al. | 2312.04692v1 | null |
2023-12-07 | ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations | Maitreya Patel et.al. | 2312.04655v1 | null |
2023-12-07 | NeuSD: Surface Completion with Multi-View Text-to-Image Diffusion | Savva Ignatyev et.al. | 2312.04654v1 | null |
2023-12-07 | Gen2Det: Generate to Detect | Saksham Suri et.al. | 2312.04566v1 | null |
2023-12-07 | NeRFiller: Completing Scenes via Generative 3D Inpainting | Ethan Weber et.al. | 2312.04560v1 | null |
2023-12-07 | PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation | Zhaoxi Chen et.al. | 2312.04559v1 | link |
2023-12-07 | GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | Shoufa Chen et.al. | 2312.04557v1 | null |
2023-12-07 | Generating Illustrated Instructions | Sachit Menon et.al. | 2312.04552v1 | null |
2023-12-07 | PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play | Lili Chen et.al. | 2312.04549v1 | null |
2023-12-07 | Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance | Yuto Enyo et.al. | 2312.04529v1 | null |
2023-12-07 | RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models | Ozgur Kara et.al. | 2312.04524v1 | link |
2023-12-07 | Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | Zhiwu Qing et.al. | 2312.04483v1 | null |
2023-12-07 | Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion | Kiran Chhatre et.al. | 2312.04466v1 | link |
2023-12-07 | FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models | Stathis Galanakis et.al. | 2312.04465v1 | null |
2023-12-07 | DreamVideo: Composing Your Dream Videos with Customized Subject and Motion | Yujie Wei et.al. | 2312.04433v1 | null |
2023-12-07 | Approximate Caching for Efficiently Serving Diffusion Models | Shubham Agarwal et.al. | 2312.04429v1 | null |
2023-12-07 | Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views | Yabo Chen et.al. | 2312.04424v1 | null |
2023-12-07 | Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models | Jiayi Guo et.al. | 2312.04410v1 | link |
2023-12-07 | Adversarial Denoising Diffusion Model for Unsupervised Anomaly Detection | Jongmin Yu et.al. | 2312.04382v1 | null |
2023-12-07 | Generating Multiphase Fluid Configurations in Fractures using Diffusion Models | Jaehong Chung et.al. | 2312.04375v1 | null |
2023-12-07 | Investigating the Design Space of Diffusion Models for Speech Enhancement | Philippe Gonzalez et.al. | 2312.04370v1 | null |
2023-12-07 | Improved Efficient Two-Stage Denoising Diffusion Power System Measurement Recovery Against False Data Injection Attacks and Data Losses | Jianhua Pei et.al. | 2312.04346v1 | null |
2023-12-07 | Multi-View Unsupervised Image Generation with Cross Attention Guidance | Llukman Cerkezi et.al. | 2312.04337v1 | null |
2023-12-07 | iDesigner: A High-Resolution and Complex-Prompt Following Text-to-Image Diffusion Model for Interior Design | Ruyi Gan et.al. | 2312.04326v1 | null |
2023-12-07 | Guided Reconstruction with Conditioned Diffusion Models for Unsupervised Anomaly Detection in Brain MRIs | Finn Behrendt et.al. | 2312.04215v1 | link |
2023-12-07 | Diffusing Colors: Image Colorization with Text Guided Diffusion | Nir Zabari et.al. | 2312.04145v1 | null |
2023-12-07 | DiffusionPhase: Motion Diffusion in Frequency Domain | Weilin Wan et.al. | 2312.04036v1 | null |
2023-12-07 | KOALA: Self-Attention Matters in Knowledge Distillation of Latent Diffusion Models for Memory-Efficient and Fast Image Synthesis | Youngwan Lee et.al. | 2312.04005v1 | null |
2023-12-07 | Stable diffusion for Data Augmentation in COCO and Weed Datasets | Boyang Deng et.al. | 2312.03996v1 | null |
2023-12-06 | Adapting HouseDiffusion for conditional Floor Plan generation on Modified Swiss Dwellings dataset | Emanuel Kuhn et.al. | 2312.03938v1 | null |
2023-12-06 | Controllable Human-Object Interaction Synthesis | Jiaman Li et.al. | 2312.03913v1 | null |
2023-12-06 | Inpaint3D: 3D Scene Content Generation using 2D Inpainting Diffusion | Kira Prabhu et.al. | 2312.03869v1 | null |
2023-12-06 | Diffusion Illusions: Hiding Images in Plain Sight | Ryan Burgert et.al. | 2312.03817v1 | null |
2023-12-06 | AVID: Any-Length Video Inpainting with Diffusion Model | Zhixing Zhang et.al. | 2312.03816v1 | link |
2023-12-06 | XCube ( | Xuanchi Ren et.al. | 2312.03806v1 | null |
2023-12-06 | AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation | Xinzhou Wang et.al. | 2312.03795v1 | null |
2023-12-06 | AnimateZero: Video Diffusion Models are Zero-Shot Image Animators | Jiwen Yu et.al. | 2312.03793v1 | link |
2023-12-06 | FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability | Linze Li et.al. | 2312.03775v1 | null |
2023-12-08 | Self-conditioned Image Generation via Generating Representations | Tianhong Li et.al. | 2312.03701v2 | link |
2023-12-06 | Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication | Ali Naseh et.al. | 2312.03692v1 | null |
2023-12-06 | WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on | xujie zhang et.al. | 2312.03667v1 | null |
2023-12-06 | TokenCompose: Grounding Diffusion with Token-level Supervision | Zirui Wang et.al. | 2312.03626v1 | link |
2023-12-06 | DreamComposer: Controllable 3D Object Generation via Multi-View Conditions | Yunhan Yang et.al. | 2312.03611v1 | null |
2023-12-06 | DiffusionSat: A Generative Foundation Model for Satellite Imagery | Samar Khanna et.al. | 2312.03606v1 | null |
2023-12-06 | MMM: Generative Masked Motion Model | Ekkasit Pinyoanuntapong et.al. | 2312.03596v1 | null |
2023-12-06 | Personalized Face Inpainting with Diffusion Models by Parallel Visual Attention | Jianjin Xu et.al. | 2312.03556v1 | null |
2023-12-06 | FoodFusion: A Latent Diffusion Model for Realistic Food Image Generation | Olivia Markham et.al. | 2312.03540v1 | null |
2023-12-06 | FRDiff: Feature Reuse for Exquisite Zero-shot Acceleration of Diffusion Models | Junhyuk So et.al. | 2312.03517v1 | null |
2023-12-06 | Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis | Zehua Chen et.al. | 2312.03491v1 | null |
2023-12-06 | F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis | Sitong Su et.al. | 2312.03459v1 | null |
2023-12-06 | Generalized Contrastive Divergence: Joint Training of Energy-Based Model and Diffusion Model through Inverse Reinforcement Learning | Sangwoong Yoon et.al. | 2312.03397v1 | null |
2023-12-06 | Diffused Task-Agnostic Milestone Planner | Mineui Hong et.al. | 2312.03395v1 | null |
2023-12-06 | DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction | Yanlong Li et.al. | 2312.03298v1 | null |
2023-12-06 | Cache Me if You Can: Accelerating Diffusion Models through Block Caching | Felix Wimbauer et.al. | 2312.03209v1 | null |
2023-12-05 | ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet | Soon Yau Cheong et.al. | 2312.03154v1 | null |
2023-12-05 | DiffusionPCR: Diffusion Models for Robust Multi-Step Point Cloud Registration | Zhi Chen et.al. | 2312.03053v1 | null |
2023-12-05 | DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control | Yuru Jia et.al. | 2312.03048v1 | null |
2023-12-05 | MagicStick: Controllable Video Editing via Control Handle Transformations | Yue Ma et.al. | 2312.03047v1 | link |
2023-12-05 | Customization Assistant for Text-to-image Generation | Yufan Zhou et.al. | 2312.03045v1 | null |
2023-12-05 | DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance | Cong Wang et.al. | 2312.03018v1 | null |
2023-12-05 | Alchemist: Parametric Control of Material Properties with Diffusion Models | Prafull Sharma et.al. | 2312.02970v1 | null |
2023-12-05 | AmbiGen: Generating Ambigrams from Pre-trained Diffusion Model | Boheng Zhao et.al. | 2312.02967v1 | null |
2023-12-05 | Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection | Cheng-Ju Ho et.al. | 2312.02966v1 | link |
2023-12-05 | A Diffusion Model of Dynamic Participant Inflow Management | Baris Ata et.al. | 2312.02927v1 | null |
2023-12-05 | Deterministic Guidance Diffusion Model for Probabilistic Weather Forecasting | Donggeun Yoon et.al. | 2312.02819v1 | link |
2023-12-05 | BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models | Fengyuan Shi et.al. | 2312.02813v1 | null |
2023-12-05 | Generating Fine-Grained Human Motions Using ChatGPT-Refined Descriptions | Xu Shi et.al. | 2312.02772v1 | null |
2023-12-05 | Neural Sign Actors: A diffusion model for 3D sign language production from text | Vasileios Baltatzis et.al. | 2312.02702v1 | null |
2023-12-05 | Analyzing and Improving the Training Dynamics of Diffusion Models | Tero Karras et.al. | 2312.02696v1 | null |
2023-12-05 | Diffusion-Based Speech Enhancement in Matched and Mismatched Conditions Using a Heun-Based Sampler | Philippe Gonzalez et.al. | 2312.02683v1 | null |
2023-12-05 | TPA3D: Triplane Attention for Fast Text-to-3D Generation | Hong-En Chen et.al. | 2312.02647v1 | null |
2023-12-05 | Diffusion Noise Feature: Accurate and Fast Generated Image Detection | Yichi Zhang et.al. | 2312.02625v1 | null |
2023-12-05 | Projection Regret: Reducing Background Bias for Novelty Detection via Diffusion Models | Sungik Choi et.al. | 2312.02615v1 | null |
2023-12-05 | GeNIe: Generative Hard Negative Images Through Diffusion | Soroush Abbasi Koohpayegani et.al. | 2312.02548v1 | link |
2023-12-05 | Retrieving Conditions from Reference Images for Diffusion Models | Haoran Tang et.al. | 2312.02521v1 | null |
2023-12-05 | Creative Agents: Empowering Agents with Imagination for Creative Tasks | Chi Zhang et.al. | 2312.02519v1 | link |
2023-12-05 | Orthogonal Adaptation for Modular Customization of Diffusion Models | Ryan Po et.al. | 2312.02432v1 | null |
2023-12-05 | Towards Granularity-adjusted Pixel-level Semantic Annotation | Rohit Kundu et.al. | 2312.02420v1 | null |
2023-12-04 | EMDM: Efficient Motion Diffusion Model for Fast, High-Quality Motion Generation | Wenyang Zhou et.al. | 2312.02256v1 | null |
2023-12-04 | Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images | Zhuoran Yu et.al. | 2312.02253v1 | null |
2023-12-04 | Conditional Variational Diffusion Models | Gabriel della Maggiora et.al. | 2312.02246v1 | null |
2023-12-04 | X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model | Lingmin Ran et.al. | 2312.02238v1 | null |
2023-12-03 | Slice3D: Multi-Slice, Occlusion-Revealing, Single View 3D Reconstruction | Yizhi Wang et.al. | 2312.02221v1 | null |
2023-12-03 | DragVideo: Interactive Drag-style Video Editing | Yufan Deng et.al. | 2312.02216v1 | link |
2023-12-03 | Portrait Diffusion: Training-free Face Stylization with Chain-of-Painting | Jin Liu et.al. | 2312.02212v1 | link |
2023-12-02 | ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation | Peng Wang et.al. | 2312.02201v1 | null |
2023-12-02 | Exploiting Diffusion Priors for All-in-One Image Restoration | Yuanbiao Gou et.al. | 2312.02197v1 | null |
2023-12-04 | Latent Feature-Guided Diffusion Models for Shadow Removal | Kangfu Mei et.al. | 2312.02156v1 | null |
2023-12-04 | Readout Guidance: Learning Control from Diffusion Features | Grace Luo et.al. | 2312.02150v1 | null |
2023-12-04 | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Bingxin Ke et.al. | 2312.02145v1 | link |
2023-12-04 | DiffiT: Diffusion Vision Transformers for Image Generation | Ali Hatamizadeh et.al. | 2312.02139v1 | link |
2023-12-04 | Stochastic Optimal Control Matching | Carles Domingo-Enrich et.al. | 2312.02027v1 | null |
2023-12-04 | UniGS: Unified Representation for Image Generation and Segmentation | Lu Qi et.al. | 2312.01985v1 | link |
2023-12-04 | Generalization by Adaptation: Diffusion-Based Domain Extension for Domain-Generalized Semantic Segmentation | Joshua Niemeijer et.al. | 2312.01850v1 | null |
2023-12-04 | Collaborative Neural Painting | Nicola Dall'Asen et.al. | 2312.01800v1 | null |
2023-12-04 | Open-DDVM: A Reproduction and Extension of Diffusion Model for Optical Flow Estimation | Qiaole Dong et.al. | 2312.01746v1 | link |
2023-12-04 | Fully Spiking Denoising Diffusion Implicit Models | Ryo Watanabe et.al. | 2312.01742v1 | null |
2023-12-04 | StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On | Jeongho Kim et.al. | 2312.01725v1 | link |
2023-12-04 | ResEnsemble-DDPM: Residual Denoising Diffusion Probabilistic Models for Ensemble Learning | Shi Zhenning et.al. | 2312.01682v1 | null |
2023-12-03 | CalliPaint: Chinese Calligraphy Inpainting with Diffusion Model | Qisheng Liao et.al. | 2312.01536v1 | null |
2023-12-03 | CityGen: Infinite and Controllable 3D City Layout Generation | Jie Deng et.al. | 2312.01508v1 | null |
2023-12-03 | Existence of finite time blow-up in Keller-Segel system | Federico Buseghin et.al. | 2312.01475v1 | null |
2023-12-03 | Distilling Functional Rearrangement Priors from Large Models | Yiming Zeng et.al. | 2312.01474v1 | null |
2023-12-03 | Diffusion Posterior Sampling for Nonlinear CT Reconstruction | Shudong Li et.al. | 2312.01464v1 | null |
2023-12-03 | Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models | Shengqu Cai et.al. | 2312.01409v1 | null |
2023-12-03 | Improving In-Context Learning in Diffusion Models with Visual Context-Modulated Prompts | Tianqi Chen et.al. | 2312.01408v1 | null |
2023-12-03 | ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models | Jeong-gi Kwak et.al. | 2312.01305v1 | null |
2023-12-03 | Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series | Ying Liu et.al. | 2312.01294v1 | null |
2023-12-02 | PAC Privacy Preserving Diffusion Models | Qipan Xu et.al. | 2312.01201v1 | null |
2023-12-02 | Ultra-Resolution Cascaded Diffusion Model for Gigapixel Image Synthesis in Histopathology | Sarah Cechnicka et.al. | 2312.01152v1 | null |
2023-12-02 | ControlDreamer: Stylized 3D Generation with Multi-View ControlNet | Yeongtak Oh et.al. | 2312.01129v1 | null |
2023-12-02 | Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty | Cheng-Fu Yang et.al. | 2312.01097v1 | link |
2023-12-02 | Taming Latent Diffusion Models to See in the Dark | Qiang Wen et.al. | 2312.01027v1 | null |
2023-12-01 | Consistent Mesh Diffusion | Julian Knodt et.al. | 2312.00971v1 | null |
2023-12-01 | Enhancing Diffusion Models with 3D Perspective Geometry Constraints | Rishi Upadhyay et.al. | 2312.00944v1 | null |
2023-12-01 | Assessment of the Flamelet Generated Manifold method with preferential diffusion modelling for the prediction of partially premixed hydrogen flames | Eduardo Javier Pérez-Sánchez et.al. | 2312.00929v1 | null |
2023-12-01 | 3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing | Balamurugan Thambiraja et.al. | 2312.00870v1 | null |
2023-12-01 | DeepCache: Accelerating Diffusion Models for Free | Xinyin Ma et.al. | 2312.00858v1 | link |
2023-12-01 | Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution | Xi Yang et.al. | 2312.00853v1 | null |
2023-12-01 | Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion | Litu Rout et.al. | 2312.00852v1 | null |
2023-12-01 | VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models | Hyeonho Jeong et.al. | 2312.00845v1 | null |
2023-11-30 | Lasagna: Layered Score Distillation for Disentangled Object Relighting | Dina Bashkirova et.al. | 2312.00833v1 | null |
2023-11-30 | Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples | Phillip Howard et.al. | 2312.00825v1 | null |
2023-12-01 | TrackDiffusion: Multi-object Tracking Data Generation via Diffusion Models | Pengxiang Li et.al. | 2312.00651v1 | null |
2023-11-30 | LucidDreaming: Controllable Object-Centric 3D Generation | Zhaoning Wang et.al. | 2312.00588v1 | null |
2023-12-01 | Text-Guided 3D Face Synthesis -- From Generation to Editing | Yunjie Wu et.al. | 2312.00375v1 | null |
2023-11-30 | DREAM: Diffusion Rectification and Estimation-Adaptive Models | Jinxin Zhou et.al. | 2312.00210v1 | null |
2023-11-30 | S2ST: Image-to-Image Translation in the Seed Space of Latent Diffusion | Or Greenberg et.al. | 2312.00116v1 | null |
2023-11-30 | Fast ODE-based Sampling for Diffusion Models in Around 5 Steps | Zhenyu Zhou et.al. | 2312.00094v1 | null |
2023-11-30 | GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs | Gege Gao et.al. | 2312.00093v1 | null |
2023-11-30 | Generative Artificial Intelligence in Learning Analytics: Contextualising Opportunities and Challenges through the Learning Analytics Cycle | Lixiang Yan et.al. | 2312.00087v1 | null |
2023-11-30 | X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation | Yiwei Ma et.al. | 2312.00085v1 | null |
2023-11-30 | Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion? | Zhengyue Zhao et.al. | 2312.00084v1 | null |
2023-11-30 | HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models | Zhonghao Wang et.al. | 2312.00079v1 | null |
2023-11-29 | Unsupervised Keypoints from Pretrained Diffusion Models | Eric Hedlin et.al. | 2312.00065v1 | null |
2023-11-30 | VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Zhen Xing et.al. | 2311.18837v1 | null |
2023-11-30 | ART | Wenming Weng et.al. | 2311.18834v1 | null |
2023-11-30 | Exploiting Diffusion Prior for Generalizable Pixel-Level Semantic Prediction | Hsin-Ying Lee et.al. | 2311.18832v1 | link |
2023-11-30 | MotionEditor: Editing Video Motion via Content-Aware Diffusion | Shuyuan Tu et.al. | 2311.18830v1 | link |
2023-11-30 | MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | Yanhui Wang et.al. | 2311.18829v1 | null |
2023-12-05 | One-step Diffusion with Distribution Matching Distillation | Tianwei Yin et.al. | 2311.18828v3 | null |
2023-11-30 | ElasticDiffusion: Training-free Arbitrary Size Image Generation | Moayed Haji-Ali et.al. | 2311.18822v1 | link |
2023-11-30 | Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters | James Seale Smith et.al. | 2311.18763v1 | null |
2023-11-30 | Detailed Human-Centric Text Description-Driven Large Scene Synthesis | Gwanghyun Kim et.al. | 2311.18654v1 | null |
2023-11-30 | Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing | Hyelin Nam et.al. | 2311.18608v1 | null |
2023-11-30 | DifAugGAN: A Practical Diffusion-style Data Augmentation for GAN-based Single Image Super-resolution | Axi Niu et.al. | 2311.18508v1 | null |
2023-11-30 | Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis | Zipeng Qi et.al. | 2311.18435v1 | null |
2023-11-30 | CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model | Jianhao Zeng et.al. | 2311.18405v1 | link |
2023-11-30 | Age Effects on Decision-Making, Drift Diffusion Model | Zahra Kavian et.al. | 2311.18376v1 | null |
2023-11-30 | Prompt-Based Exemplar Super-Compression and Regeneration for Class-Incremental Learning | Ruxiao Duan et.al. | 2311.18266v1 | link |
2023-11-30 | Diffusion Models Without Attention | Jing Nathan Yan et.al. | 2311.18257v1 | null |
2023-11-30 | SMaRt: Improving GANs with Score Matching Regularity | Mengfei Xia et.al. | 2311.18208v1 | null |
2023-11-30 | HiPA: Enabling One-Step Text-to-Image Diffusion Models via High-Frequency-Promoting Adaptation | Yifan Zhang et.al. | 2311.18158v1 | null |
2023-11-29 | Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing | Piper Wolters et.al. | 2311.18082v1 | link |
2023-11-29 | DiffGEPCI: 3D MRI Synthesis from mGRE Signals using 2.5D Diffusion Model | Yuyang Hu et.al. | 2311.18073v1 | null |
2023-11-29 | Turn Down the Noise: Leveraging Diffusion Models for Test-time Adaptation via Pseudo-label Ensembling | Mrigank Raman et.al. | 2311.18071v1 | null |
2023-11-29 | GELDA: A generative language annotation framework to reveal visual biases in datasets | Krish Kabra et.al. | 2311.18064v1 | null |
2023-11-29 | Echoes in the Noise: Posterior Samples of Faint Galaxy Surface Brightness Profiles with Score-Based Likelihoods and Priors | Alexandre Adam et.al. | 2311.18002v1 | null |
2023-11-29 | 4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling | Sherwin Bahmani et.al. | 2311.17984v1 | null |
2023-12-01 | GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation | Baorui Ma et.al. | 2311.17971v2 | link |
2023-11-29 | HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting | Wenquan Lu et.al. | 2311.17957v1 | link |
2023-11-29 | C3Net: Compound Conditioned ControlNet for Multimodal Content Generation | Juntao Zhang et.al. | 2311.17951v1 | null |
2023-11-28 | Unlocking Spatial Comprehension in Text-to-Image Diffusion Models | Mohammad Mahdi Derakhshani et.al. | 2311.17937v1 | null |
2023-11-30 | Do text-free diffusion models learn discriminative visual representations? | Soumik Mukhopadhyay et.al. | 2311.17921v2 | link |
2023-11-29 | Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models | Daniel Geng et.al. | 2311.17919v1 | null |
2023-11-29 | AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text | Jianfeng Zhang et.al. | 2311.17917v1 | null |
2023-11-29 | CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting | Alexander Vilesov et.al. | 2311.17907v1 | null |
2023-11-29 | SODA: Bottleneck Diffusion Models for Representation Learning | Drew A. Hudson et.al. | 2311.17901v1 | null |
2023-11-29 | Leveraging Graph Diffusion Models for Network Refinement Tasks | Puja Trivedi et.al. | 2311.17856v1 | null |
2023-11-30 | SPiC-E: Structural Priors in 3D Diffusion Models using Cross-Entity Attention | Etai Sella et.al. | 2311.17834v2 | null |
2023-11-29 | Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers | Chi-Pin Huang et.al. | 2311.17717v1 | null |
2023-11-29 | Fair Text-to-Image Diffusion via Fair Mapping | Jia Li et.al. | 2311.17695v1 | null |
2023-11-29 | AnyLens: A Generative Diffusion Model with Any Rendering Lens | Andrey Voynov et.al. | 2311.17609v1 | null |
2023-11-29 | Query-Relevant Images Jailbreak Large Multi-Modal Models | Xin Liu et.al. | 2311.17600v1 | null |
2023-11-29 | Smooth Video Synthesis with Noise Constraints on Diffusion Models for One-shot Video Tuning | Liang Peng et.al. | 2311.17536v1 | link |
2023-11-29 | HiDiffusion: Unlocking High-Resolution Creativity and Efficiency in Low-Resolution Trained Diffusion Models | Shen Zhang et.al. | 2311.17528v1 | null |
2023-11-29 | MMA-Diffusion: MultiModal Attack on Diffusion Models | Yijun Yang et.al. | 2311.17516v1 | null |
2023-11-29 | When StyleGAN Meets Stable Diffusion: a | Xiaoming Li et.al. | 2311.17461v1 | link |
2023-11-29 | DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Diffusion Model | Jiuming Liu et.al. | 2311.17456v1 | null |
2023-11-29 | Wireless Network Digital Twin for 6G: Generative AI as A Key Enabler | Zhenyu Tao et.al. | 2311.17451v1 | null |
2023-12-01 | VideoAssembler: Identity-Consistent Video Generation with Reference Entities using Diffusion Model | Haoyu Zhao et.al. | 2311.17338v2 | null |
2023-11-28 | Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation | Hang Li et.al. | 2311.17216v1 | null |
2023-11-28 | A point cloud approach to generative modeling for galaxy surveys at the field level | Carolina Cuesta-Lazaro et.al. | 2311.17141v1 | link |
2023-11-28 | Generative Models: What do they know? Do they know things? Let's find out! | Xiaodan Du et.al. | 2311.17137v1 | null |
2023-11-28 | Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis | Xiaohui Chen et.al. | 2311.17126v1 | null |
2023-11-28 | ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis | Xiangjun Gao et.al. | 2311.17123v1 | null |
2023-11-28 | Generative Data Augmentation Improves Scribble-supervised Semantic Segmentation | Jacob Schnell et.al. | 2311.17121v1 | null |
2023-11-28 | Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation | Li Hu et.al. | 2311.17117v1 | null |
2023-11-28 | Robust Diffusion GAN using Semi-Unbalanced Optimal Transport | Quan Dao et.al. | 2311.17101v1 | null |
2023-11-28 | PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation | Jian Ma et.al. | 2311.17086v1 | link |
2023-11-28 | DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling | Linqi Zhou et.al. | 2311.17082v1 | link |
2023-11-28 | Material Palette: Extraction of Materials from a Single Image | Ivan Lopes et.al. | 2311.17060v1 | null |
2023-11-28 | DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models | Tsun-Hsuan Wang et.al. | 2311.17053v1 | null |
2023-11-28 | Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models | Zhengming Yu et.al. | 2311.17050v1 | null |
2023-11-28 | Adversarial Diffusion Distillation | Axel Sauer et.al. | 2311.17042v1 | link |
2023-11-28 | Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer | Danah Yatim et.al. | 2311.17009v1 | null |
2023-11-30 | Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following | Yutong Feng et.al. | 2311.17002v2 | null |
2023-11-28 | COLE: A Hierarchical Generation Framework for Graphic Design | Peidong Jia et.al. | 2311.16974v1 | null |
2023-11-28 | HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion | Jingbo Zhang et.al. | 2311.16961v1 | null |
2023-11-28 | SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models | Yuwei Guo et.al. | 2311.16933v1 | null |
2023-11-28 | RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D | Lingteng Qiu et.al. | 2311.16918v1 | null |
2023-11-28 | On the existence of optimal multi-valued decoders and their accuracy bounds for undersampled inverse problems | Nina Maria Gottschling et.al. | 2311.16898v1 | null |
2023-11-28 | Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration | Chen Zhao et.al. | 2311.16845v1 | null |
2023-11-28 | As-Plausible-As-Possible: Plausibility-Aware Mesh Deformation Using 2D Diffusion Priors | Seungwoo Yoo et.al. | 2311.16739v1 | null |
2023-11-28 | LEDITS++: Limitless Image Editing using Text-to-Image Models | Manuel Brack et.al. | 2311.16711v1 | null |
2023-11-28 | MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices | Yang Zhao et.al. | 2311.16567v1 | null |
2023-11-28 | DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser | Peng Chen et.al. | 2311.16565v1 | null |
2023-11-28 | Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models | Ling Fu et.al. | 2311.16555v1 | null |
2023-11-28 | Federated Learning with Diffusion Models for Privacy-Sensitive Vision Tasks | Ye Lin Tun et.al. | 2311.16538v1 | null |
2023-11-27 | SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution | Rongyuan Wu et.al. | 2311.16518v1 | link |
2023-11-27 | LFSRDiff: Light Field Image Super-Resolution via Diffusion Models | Wentao Chao et.al. | 2311.16517v1 | link |
2023-11-27 | Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach | Ayush K. Rai et.al. | 2311.16514v1 | null |
2023-11-27 | CoSeR: Bridging Image and Language for Cognitive Super-Resolution | Haoze Sun et.al. | 2311.16512v1 | null |
2023-11-28 | Exploring Straighter Trajectories of Flow Matching with Diffusion Guidance | Siyu Xing et.al. | 2311.16507v1 | null |
2023-11-27 | TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models | Yushi Huang et.al. | 2311.16503v1 | null |
2023-11-27 | Deceptive-Human: Prompt-to-NeRF 3D Human Generation with 3D-Consistent Synthetic Images | Shiu-hong Kao et.al. | 2311.16499v1 | null |
2023-11-27 | MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model | Zhongcong Xu et.al. | 2311.16498v1 | null |
2023-11-28 | Efficient Multimodal Diffusion Models Using Joint Data Infilling with Partially Shared U-Net | Zizhao Hu et.al. | 2311.16488v1 | null |
2023-11-28 | TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering | Jingye Chen et.al. | 2311.16465v1 | null |
2023-11-28 | Manifold Preserving Guided Diffusion | Yutong He et.al. | 2311.16424v1 | null |
2023-11-29 | ChatTraffic: Text-to-Traffic Generation via Diffusion Model | Chengyang Zhang et.al. | 2311.16203v2 | link |
2023-11-27 | Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for Molecule Generation | Ameya Daigavane et.al. | 2311.16199v1 | null |
2023-11-27 | Test-time Adaptation of Discriminative Models via Diffusion Generative Feedback | Mihir Prabhudesai et.al. | 2311.16102v1 | link |
2023-11-27 | Self-correcting LLM-controlled Diffusion Models | Tsung-Han Wu et.al. | 2311.16090v1 | null |
2023-11-27 | DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization | Zhaoyang Xia et.al. | 2311.16060v1 | link |
2023-11-27 | Exploring Attribute Variations in Style-based GANs using Diffusion Models | Rishubh Parihar et.al. | 2311.16052v1 | null |
2023-11-27 | GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions | Jiemin Fang et.al. | 2311.16037v1 | null |
2023-11-27 | Closing the ODE-SDE gap in score-based diffusion models through the Fokker-Planck equation | Teo Deveney et.al. | 2311.15996v1 | null |
2023-11-27 | DiffAnt: Diffusion Models for Action Anticipation | Zeyun Zhong et.al. | 2311.15991v1 | null |
2023-11-27 | Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion | Yuanxun Lu et.al. | 2311.15980v1 | null |
2023-11-27 | Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models | Claudio Rota et.al. | 2311.15908v1 | link |
2023-11-27 | InterControl: Generate Human Motion Interactions by Controlling Every Joint | Zhenzhi Wang et.al. | 2311.15864v1 | link |
2023-11-27 | SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion | Hsuan-I Ho et.al. | 2311.15855v1 | null |
2023-11-27 | FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax | Yu Lu et.al. | 2311.15813v1 | null |
2023-11-27 | Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation | Biao Gong et.al. | 2311.15773v1 | null |
2023-11-27 | One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls | Minghui Hu et.al. | 2311.15744v1 | null |
2023-11-27 | SceneDM: Scene-level Multi-agent Trajectory Generation with Consistent Diffusion Models | Zhiming Guo et.al. | 2311.15736v1 | null |
2023-11-27 | Regularization by Texts for Latent Diffusion Inverse Solvers | Jeongsol Kim et.al. | 2311.15658v1 | null |
2023-11-27 | Enhancing Diffusion Models with Text-Encoder Reinforcement Learning | Chaofeng Chen et.al. | 2311.15657v1 | link |
2023-11-27 | ET3D: Efficient Text-to-3D Generation via Multi-View Distillation | Yiming Chen et.al. | 2311.15561v1 | null |
2023-11-27 | Instruct2Attack: Language-Guided Semantic Adversarial Attacks | Jiang Liu et.al. | 2311.15551v1 | null |
2023-11-27 | Efficient Dataset Distillation via Minimax Diffusion | Jianyang Gu et.al. | 2311.15529v1 | link |
2023-11-27 | AerialBooth: Mutual Information Guidance for Text Controlled Aerial View Synthesis from a Single Image | Divya Kothandaraman et.al. | 2311.15478v1 | null |
2023-11-26 | DISYRE: Diffusion-Inspired SYnthetic REstoration for Unsupervised Anomaly Detection | Sergio Naval Marimont et.al. | 2311.15453v1 | null |
2023-11-26 | Quantum Diffusion Models | Andrea Cacioppo et.al. | 2311.15444v1 | null |
2023-11-26 | Functional Diffusion | Biao Zhang et.al. | 2311.15435v1 | null |
2023-11-26 | Wired Perspectives: Multi-View Wire Art Embraces Generative AI | Zhiyu Qu et.al. | 2311.15421v1 | null |
2023-11-26 | Flow-Guided Diffusion for Video Inpainting | Bohai Gu et.al. | 2311.15368v1 | link |
2023-11-26 | BS-Diff: Effective Bone Suppression Using Conditional Diffusion Models from Chest X-Ray Images | Zhanghao Chen et.al. | 2311.15328v1 | null |
2023-11-26 | Learning Coarse Propagators in Parareal Algorithm | Bangti Jin et.al. | 2311.15320v1 | null |
2023-11-25 | Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets | Andreas Blattmann et.al. | 2311.15127v1 | link |
2023-11-25 | Leveraging Diffusion Perturbations for Measuring Fairness in Computer Vision | Nicholas Lui et.al. | 2311.15108v1 | null |
2023-11-25 | InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser | Xing Cui et.al. | 2311.15040v1 | null |
2023-11-25 | Point Cloud Pre-training with Diffusion Models | Xiao Zheng et.al. | 2311.14960v1 | null |
2023-11-25 | FreePIH: Training-Free Painterly Image Harmonization with Diffusion Model | Ruibin Li et.al. | 2311.14926v1 | null |
2023-11-25 | GBD-TS: Goal-based Pedestrian Trajectory Prediction with Diffusion using Tree Sampling Algorithm | Ge Sun et.al. | 2311.14922v1 | null |
2023-11-25 | Resfusion: Prior Residual Noise embedded Denoising Diffusion Probabilistic Models | Shi Zhenning et.al. | 2311.14900v1 | null |
2023-11-24 | Geometric theory on large-scale and local determination of density dependence of a recovering large carnivore population | Yunyi Shen et.al. | 2311.14815v1 | null |
2023-11-24 | AdaDiff: Adaptive Step Selection for Fast Diffusion | Hui Zhang et.al. | 2311.14768v1 | null |
2023-11-24 | CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization | Ruoyu Zhao et.al. | 2311.14631v1 | null |
2023-11-24 | Animate124: Animating One Image to 4D Dynamic Scene | Yuyang Zhao et.al. | 2311.14603v1 | null |
2023-11-24 | ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model | Eslam Mohamed Bakr et.al. | 2311.14542v1 | null |
2023-11-24 | GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting | Yiwen Chen et.al. | 2311.14521v1 | null |
2023-11-27 | MVControl: Adding Conditional Control to Multi-view Diffusion for Controllable Text-to-3D Generation | Zhiqi Li et.al. | 2311.14494v2 | null |
2023-11-24 | Joint Diffusion: Mutual Consistency-Driven Diffusion Model for PET-MRI Co-Reconstruction | Taofeng Xie et.al. | 2311.14473v1 | null |
2023-11-24 | Highly Detailed and Temporal Consistent Video Stylization via Synchronized Multi-Frame Diffusion | Minshan Xie et.al. | 2311.14343v1 | null |
2023-11-24 | Decouple Content and Motion for Conditional Image-to-Video Generation | Cuifeng Shen et.al. | 2311.14294v1 | null |
2023-11-24 | Paragraph-to-Image Generation with Information-Enriched Diffusion Model | Weijia Wu et.al. | 2311.14284v1 | link |
2023-11-24 | Image Super-Resolution with Text Prompt Diffusion | Zheng Chen et.al. | 2311.14282v1 | link |
2023-11-24 | Latent Diffusion Prior Enhanced Deep Unfolding for Spectral Image Reconstruction | Zongliang Wu et.al. | 2311.14280v1 | null |
2023-11-23 | HACD: Hand-Aware Conditional Diffusion for Monocular Hand-Held Object Reconstruction | Bowen Fu et.al. | 2311.14189v1 | null |
2023-11-23 | ACT: Adversarial Consistency Models | Fei Kong et.al. | 2311.14097v1 | null |
2023-11-23 | RetroDiff: Retrosynthesis as Multi-stage Distribution Interpolation | Yiming Wang et.al. | 2311.14077v1 | null |
2023-11-23 | Continual Learning of Diffusion Models with Generative Distillation | Sergi Masip et.al. | 2311.14028v1 | link |
2023-11-23 | Touring sampling with pushforward maps | Vivien Cabannes et.al. | 2311.13845v1 | null |
2023-11-23 | Adversarial defense based on distribution transfer | Jiahao Chen et.al. | 2311.13841v1 | null |
2023-11-23 | Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models | Saman Motamed et.al. | 2311.13833v1 | null |
2023-11-23 | Posterior Distillation Sampling | Juil Koo et.al. | 2311.13831v1 | null |
2023-11-23 | Sample-Efficient Training for Diffusion | Shivam Gupta et.al. | 2311.13745v1 | null |
2023-11-22 | A Somewhat Robust Image Watermark against Diffusion-based Editing Models | Mingtian Tan et.al. | 2311.13713v1 | null |
2023-11-22 | Masked Conditional Diffusion Models for Image Analysis with Application to Radiographic Diagnosis of Infant Abuse | Shaoju Wu et.al. | 2311.13688v1 | null |
2023-11-22 | Diffusion models meet image counter-forensics | Matías Tailanian et.al. | 2311.13629v1 | link |
2023-11-22 | TDiffDe: A Truncated Diffusion Model for Remote Sensing Hyperspectral Image Denoising | Jiang He et.al. | 2311.13622v1 | null |
2023-11-21 | Breathing Life Into Sketches Using Text-to-Video Priors | Rinon Gal et.al. | 2311.13608v1 | null |
2023-11-22 | WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space | Katja Schwarz et.al. | 2311.13570v1 | null |
2023-11-22 | **ADriver-I: A General Wo |