  • Microsoft Research Asia
  • Beijing, China


[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 6,843 525 Updated Jul 17, 2024

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 3,439 283 Updated Aug 14, 2024

Podcasts 🎧 Programming, design, vlogs, music, interviews, blogs...

1,951 107 Updated Oct 6, 2023

[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation

Python 39 1 Updated Sep 24, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 11,393 976 Updated Oct 5, 2024
Jupyter Notebook 979 125 Updated Sep 24, 2024
Python 2,493 304 Updated May 19, 2024
Python 6,057 453 Updated Oct 4, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,475 140 Updated Oct 4, 2024

An open source implementation of CLIP.

Python 9,940 959 Updated Aug 19, 2024

Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…

Python 3,706 318 Updated Oct 7, 2024

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 2,634 166 Updated Sep 27, 2024

✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM

Python 811 41 Updated Oct 6, 2024

[ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset

79 1 Updated Jul 12, 2024

State-of-the-art bilingual open-sourced Math reasoning LLMs.

Python 415 25 Updated Jul 25, 2024

A series of math-specific large language models of our Qwen2 series.

Python 525 46 Updated Sep 18, 2024

🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)

Python 4,790 477 Updated Sep 25, 2024

Whisper with Medusa heads

Python 787 49 Updated Sep 30, 2024

A cross-platform reimplementation of Notepad++

C++ 9,063 551 Updated Sep 20, 2024

Notepad++ official repository

C++ 22,723 4,585 Updated Oct 5, 2024

A compilation of the best multi-agent papers

189 14 Updated Oct 4, 2024

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 12,587 938 Updated Oct 6, 2024
Jupyter Notebook 30 1 Updated Oct 7, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 68,988 8,121 Updated Sep 30, 2024

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 3,811 303 Updated Sep 29, 2024

Fast and memory-efficient exact attention

Python 13,657 1,252 Updated Oct 7, 2024

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 1,140 133 Updated Oct 3, 2024
Python 17 2 Updated Jul 5, 2024

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Python 2,781 261 Updated Sep 26, 2024