- Microsoft Research Asia
- Beijing, China
Stars
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
An open source implementation of CLIP.
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
[ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset
State-of-the-art bilingual open-sourced Math reasoning LLMs.
A series of math-specific large language models of our Qwen2 series.
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
A cross-platform reimplementation of Notepad++
Notepad++ official repository
A compilation of the best multi-agent papers
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Robust Speech Recognition via Large-Scale Weak Supervision
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Fast and memory-efficient exact attention
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family