Lists (22)
A4V_CVPR2024
Algorithm_Interview
Awesome_lists
Diffusion
Distill
Foundation_Model
GPT
Image_translation
label_tools
Mamba
Mamba_track
MMBigModel
Object_detection
SAM
SAM_based_trackanything
Self-supervised_transformers
Semantic_segmentation
small_object_tracking
Trackanypoint
vid2vid
Video_generation
VIS
Stars
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model approaching GPT-4V performance.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Mixture-of-Experts for Large Vision-Language Models
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Simple code demos of classic AIGC models, plus a compilation of blogs and papers on classic AIGC models.
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
[Official Repo] A Survey on Vision Mamba: Models, Applications and Challenges
SAM with text prompt
Improving Mamba performance on video understanding tasks
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
The official repo for the paper "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Lumina-T2X is a unified framework for Text to Any Modality Generation
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
[PRCV-2024] State Space Model based Frame-Event Tracking
This project applies the DeepLabV3+ semantic segmentation architecture to the Agriculture-Vision dataset. We focus on improving the architecture's performance by solving the …
This repo contains the code to reproduce our results in CVPR21 Challenge on Agriculture-Vision.
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
VMamba: Visual State Space Models; code is based on Mamba
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
[Arxiv] A Survey on Video Diffusion Models
Official Repo for PosSAM: Panoptic Open-vocabulary Segment Anything