[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model approaching GPT-4V performance.

Python 4,213 318 Updated Jul 12, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 4,365 332 Updated May 28, 2024

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 2,707 191 Updated May 21, 2024

Mixture-of-Experts for Large Vision-Language Models

Python 1,850 113 Updated May 15, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,224 1,986 Updated Jul 14, 2024

Simple code demos of classic AIGC models, plus a compilation of blogs and papers on classic AIGC models.

Python 45 3 Updated Jul 17, 2024

18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Jupyter Notebook 56,734 29,153 Updated Jul 17, 2024

[Official Repo] A Survey on Vision Mamba: Models, Applications and Challenges

315 20 Updated Jul 16, 2024

SAM with text prompt

Jupyter Notebook 1,389 146 Updated Jun 6, 2024

Improving Mamba performance on video understanding tasks

Python 20 2 Updated Jul 5, 2024

This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!

Jupyter Notebook 4,522 471 Updated Jan 29, 2024

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Jupyter Notebook 1,997 145 Updated Jun 6, 2024

The official repo for the paper "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"

Python 105 7 Updated Jul 15, 2024

Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything

Python 862 46 Updated Jun 28, 2024

This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision

Python 49 1 Updated Jun 17, 2024

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Python 2,795 217 Updated Jul 16, 2024

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 1,902 80 Updated Jul 17, 2024

[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications

528 30 Updated Jul 17, 2024

[PRCV-2024] State Space Model based Frame-Event Tracking

Python 17 1 Updated Jul 18, 2024

This project focuses on using the semantic segmentation deep learning architecture DeepLabV3+ on the Agriculture-Vision dataset. We focus on improving the architecture's performance by solving the …

Jupyter Notebook 10 2 Updated Nov 26, 2022
Python 6 Updated Jun 21, 2021

This repo contains the code to reproduce our results in CVPR21 Challenge on Agriculture-Vision.

Python 6 1 Updated Jan 3, 2022

Agriculture Vision Workshop 2022

Python 6 Updated Jun 18, 2023

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Python 2,613 162 Updated Jul 15, 2024

VMamba: Visual State Space Models; code is based on Mamba

Python 1,872 98 Updated Jul 16, 2024

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Python 1,271 128 Updated Dec 8, 2023

CVPR24

Python 25 2 Updated Jun 14, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 10,928 976 Updated Jul 18, 2024

[Arxiv] A Survey on Video Diffusion Models

1,544 76 Updated Jul 2, 2024

Official Repo for PosSAM: Panoptic Open-vocabulary Segment Anything

38 Updated Apr 7, 2024