Organizations

@instantX-research


Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 2,751 253 Updated Sep 25, 2024

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

HTML 314 16 Updated Sep 25, 2024

Unofficial implementation of upsample guidance

Python 7 Updated Jul 29, 2024

InstantUnify: Integrates Multimodal LLM into Diffusion Models 🔥

31 Updated Aug 8, 2024

SigLIP-based Aesthetic Score Predictor

Python 128 1 Updated May 28, 2024

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 238 21 Updated Dec 28, 2023

CSGO: Content-Style Composition in Text-to-Image Generation 🔥

Jupyter Notebook 229 5 Updated Sep 5, 2024

More suitable IP-Adapter for the DiT architecture

26 Updated Jul 5, 2024

InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation 🔥

Python 39 2 Updated Jul 17, 2024

Enjoy the magic of Diffusion models!

Python 6,400 574 Updated Sep 30, 2024

A generative speech model for daily dialogue.

Python 31,249 3,386 Updated Sep 21, 2024

Official implementation of FIFO-Diffusion: Generating Infinite Videos from Text without Training (NeurIPS 2024)

Python 353 24 Updated Sep 25, 2024
Python 315 22 Updated May 27, 2024

More relighting!

Python 4,962 335 Updated Jun 27, 2024
Jupyter Notebook 33 2 Updated May 28, 2024

InstantID-ROME: Improved Identity-Preserving Generation in Seconds 🔥

181 1 Updated May 7, 2024

Give the gift of rendering text to SDXL

9 Updated Apr 21, 2024

Official implementation of Magic Clothing: Controllable Garment-Driven Image Synthesis

Python 1,377 139 Updated Jul 29, 2024

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥

Jupyter Notebook 1,622 102 Updated Sep 18, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,720 454 Updated May 3, 2024

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Python 1,062 54 Updated Jul 17, 2024

Official Code for Stable Cascade

Jupyter Notebook 6,528 530 Updated Jul 25, 2024

[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"

Python 451 19 Updated Aug 16, 2024

An API wrapper for Discord written in Python.

Python 14,766 3,748 Updated Sep 23, 2024

🔥 StableIdentity: Inserting Anybody into Anywhere at First Sight

Python 251 8 Updated Mar 22, 2024

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 6,840 525 Updated Jul 17, 2024

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models (CVPR 2024)

Python 171 8 Updated Mar 29, 2024

Official implementations for paper: Anydoor: zero-shot object-level image customization

Python 3,948 358 Updated Apr 8, 2024