OpenGVLab

General Vision Team of Shanghai AI Laboratory

Welcome to OpenGVLab! 👋

We are a research group from Shanghai AI Lab focused on vision-centric AI research. The "GV" in our name, OpenGVLab, stands for general vision: a general understanding of vision, so that little effort is needed to adapt to new vision-based tasks.

We develop model architectures and release pre-trained foundation models to the community to motivate further research in this area. We have made promising progress in general vision AI, with 109 SOTA results 🚀. In 2022, our open-sourced foundation models achieved 65.5 mAP on the COCO object detection benchmark and 91.1% Top-1 accuracy on Kinetics-400, landmark results for AI vision 👀 tasks in image 🖼️ and video 📹 understanding.

Building on these solid vision foundations, we have expanded into multi-modality models and generative AI (in partnership with Vchitect). We aim to empower individuals and businesses by offering a higher starting point for developing vision-based AI products and lessening the burden of building an AI model from scratch.

Branches: Alpha (exploring the latest advances in vision+language research) and uni-medical (focused on medical AI)

Follow us: Twitter (X) · 🤗 Hugging Face · Medium · WeChat · Zhihu

Pinned

  1. InternVL Public

    [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable, open-source multimodal dialogue model approaching the performance of GPT-4V.

    Python · 3.7k stars · 284 forks

  2. InternVideo Public

    Video Foundation Models & Data for Multimodal Understanding

    Python · 1.1k stars · 71 forks

  3. DCNv4 Public

    [CVPR 2024] Deformable Convolution v4

    Python · 403 stars · 25 forks

  4. Ask-Anything Public

    [CVPR 2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as miniGPT4, StableLM, and MOSS.

    Python · 2.8k stars · 229 forks

  5. LLaMA-Adapter Public

    [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

    Python · 5.6k stars · 362 forks

  6. OmniQuant Public

    [ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

    Python · 611 stars · 48 forks

Repositories

Showing 10 of 64 repositories
  • GUI-Odyssey Public

    GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 201 apps, and 1.4K app combos.

    Python · 31 stars · 2 forks · 0 open issues · 0 open PRs · Updated Jun 24, 2024
  • InternImage Public

    [CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

    Python · 2,390 stars · MIT license · 232 forks · 173 open issues · 6 open PRs · Updated Jun 23, 2024
  • Instruct2Act Public

    Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

    Python · 294 stars · 18 forks · 0 open issues · 0 open PRs · Updated Jun 23, 2024
  • Ask-Anything Public

    [CVPR 2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as miniGPT4, StableLM, and MOSS.

    Python · 2,832 stars · MIT license · 228 forks · 76 open issues · 4 open PRs · Updated Jun 22, 2024
  • InternVL Public

    [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable, open-source multimodal dialogue model approaching the performance of GPT-4V.

    Python · 3,707 stars · MIT license · 284 forks · 137 open issues · 2 open PRs · Updated Jun 20, 2024
  • MM-NIAH Public

    This is the official implementation of the paper "Needle In A Multimodal Haystack"

    Python · 40 stars · 3 forks · 1 open issue · 0 open PRs · Updated Jun 20, 2024
  • Hulk Public

    An official implementation of "Hulk: A Universal Knowledge Translator for Human-Centric Tasks"

    Python · 64 stars · MIT license · 2 forks · 3 open issues · 0 open PRs · Updated Jun 19, 2024
  • SAM-Med2D Public

    Official implementation of SAM-Med2D

    Jupyter Notebook · 788 stars · Apache-2.0 license · 74 forks · 40 open issues · 1 open PR · Updated Jun 18, 2024
  • all-seeing Public

    [ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

    Python · 406 stars · 13 forks · 6 open issues · 0 open PRs · Updated Jun 18, 2024
  • PhyBench Public

    The official repo of PhyBench

    6 stars · 1 fork · 0 open issues · 0 open PRs · Updated Jun 18, 2024