Stars
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Image anomaly detection benchmark in industrial manufacturing
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Official implementations for paper: Zero-shot Image Editing with Reference Imitation
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
A general fine-tuning kit geared toward Stable Diffusion 2.1, Stable Diffusion 3, DeepFloyd, and SDXL.
Easily compute clip embeddings and build a clip retrieval system with them
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024)
Datasets for industrial surface-inspection
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
An open source, layer-based web interface for Collage Diffusion - use a familiar Photoshop-like interface and let the AI harmonize the details.
Official implementations for paper: Anydoor: zero-shot object-level image customization
Paper list and datasets for industrial image anomaly/defect detection (continuously updated).
[CVPR24] CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
A checklist for incorporation so you can get back to building your product, fundraising, etc.
Official PyTorch implementation code for realizing the technical part of Mixture of All Intelligence (MoAI) to improve performance of numerous zero-shot vision language tasks. (Under Review)
Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
Scenic: A Jax Library for Computer Vision Research and Beyond
MoVA: Adapting Mixture of Vision Experts to Multimodal Context
CoreNet: A library for training deep neural networks
Build a chatbot powered by LlamaIndex that augments GPT 3.5 with the contents of the Streamlit docs (or your own data).