MVIG-RHOS - Shanghai
https://scholar.google.com/citations?user=fhDk2-wAAAAJ&hl=zh-CN

Stars
[CVPR 2024] Official implementation of the paper "Towards Versatile Human-Human Interaction Analysis"
A comprehensive list of papers applying large language/multi-modal models to Robotics/RL, with code and related websites
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.
Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP)
A collection of useful datasets for robotics and computer vision
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
The official repo of Qwen (通义千问), the chat & pretrained large language models proposed by Alibaba Cloud.
Single Image to 3D using Cross-Domain Diffusion for 3D Generation
A fast, clean, responsive Hugo theme.
AcadHomepage: A Modern and Responsive Academic Personal Homepage
Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
Papers and Datasets about Point Cloud.
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
Official implementation of ECCV 2024 paper: Take A Step Back: Rethinking the Two Stages in Visual Reasoning
A lightweight tool to smooth scrolling and set scroll direction independently for your mouse on macOS, making your scroll wheel feel as smooth as a trackpad
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
Tooling for the Common Objects In 3D dataset.
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark, CVPR 2019, Oral
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model