- NUST MISIS / SBER AI / AIRI
- Moscow, Russia
- UTC +03:00
- levnovitskiy@gmail.com
- https://t.me/leffffffffffff
- in/lev-novitskiy-022289261
- https://t.me/mlball_days
Highlights
- Pro
Starred repositories
[NeurIPS 2024] Boosting the performance of consistency models with PCM!
[CVPR2024] Official PyTorch implementation of "Contrastive Denoising Score(CDS) for Text-guided Latent Diffusion Image Editing"
My take on E(n) Equivariant Graph Neural Networks
Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
KandinskyVideo — multilingual end-to-end text2video latent diffusion model
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Official PyTorch implementation for the paper Minimizing Trajectory Curvature of ODE-based Generative Models, ICML 2023
The codebase of our paper "Improving the Training of Rectified Flows"
Light and Optimal Schrödinger Bridge Matching (ICML 2024) official PyTorch implementation
PyTorch Implementation of Diffusion Schrödinger Bridge Matching
Text-to-Music Generation with Rectified Flow Transformers
PyTorch implementation of Real-ESRGAN model
Model for watermark classification implemented with PyTorch
Code for "Aligning Optimization Trajectories with Diffusion Models for Constrained Design Generation" @ NeurIPS 2023
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
[IJCAI-24] Spatial-Temporal-Decoupled Masked Pre-training for Spatiotemporal Forecasting
Official PyTorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)
Scaling Diffusion Transformers with Mixture of Experts
[CVPR2024] Official implementation of High-fidelity Person-centric Subject-to-Image Synthesis.
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
Official GitHub Repo for NeurIPS 2024 Paper Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
Audio Prompt Adapter: Unleashing music editing abilities for text-to-music with lightweight finetuning [ISMIR 2024]
VoiceLDM: Text-to-Speech with Environmental Context
FMBoost: Boosting Latent Diffusion with Flow Matching (ECCV 2024 Oral)
Zero-shot Image-to-Image Translation [SIGGRAPH 2023]