Cherishnoobs

🎯

less is more.

TeaQwQTea Cherishnoobs

research is research

26 followers · 32 following

https://cherishnoobs.github.io/Chenglong-Chu.github.io/

Achievements

Highlights

Pro

Lists (1)

✨ Inspiration

Starred repositories

showlab / MovieSeq

[ECCV2024] Learning Video Context as Interleaved Multimodal Sequences

Jupyter Notebook 27 Updated Oct 1, 2024

showlab / Show-o

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 882 39 Updated Sep 30, 2024

bklieger-groq / g1

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 3,416 315 Updated Oct 1, 2024

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 4,860 397 Updated Oct 2, 2024

aimagelab / meshed-memory-transformer

Meshed-Memory Transformer for Image Captioning. CVPR 2020

Python 516 136 Updated Dec 21, 2022

kirill-vish / Beyond-INet

Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"

Python 94 5 Updated Sep 11, 2024

OpenBMB / MiniCPM

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 6,961 439 Updated Sep 28, 2024

Doragd / Algorithm-Practice-in-Industry

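A collection of articles on industry practice in search, recommendation, advertising, user growth, and related areas (sources: Zhihu, Datafuntalk, and technical WeChat official accounts)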

Python 2,266 288 Updated Oct 5, 2024

dogehhh / ReCLIP

PyTorch implementation of the CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation

Python 17 Updated Aug 18, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,456 138 Updated Oct 4, 2024

ntegrals / aura-voice

Aura is like Siri, but in your browser. An AI voice assistant optimized for low latency responses.

TypeScript 520 46 Updated Aug 26, 2024

gusye1234 / nano-graphrag

A simple, easy-to-hack GraphRAG implementation

Python 862 90 Updated Oct 1, 2024

mc-lan / ClearCLIP

[ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference

Python 41 1 Updated Aug 21, 2024

SakanaAI / AI-Scientist

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 7,695 1,045 Updated Sep 10, 2024

ZhangXu0963 / NPC

The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024.

Python 16 2 Updated May 13, 2024

zjysteven / VLM-Visualizer

Visualizing the attention of vision-language models

Jupyter Notebook 40 3 Updated Aug 7, 2024

SooLab / CGFormer

The official PyTorch implementation of the CVPR 2023 paper "Contrastive Grouping with Transformer for Referring Image Segmentation".

Python 41 3 Updated Apr 17, 2024

baaivision / DIVA

Diffusion Feedback Helps CLIP See Better

Python 205 11 Updated Aug 24, 2024

LAION-AI / CLIP_benchmark

CLIP-like model evaluation

Jupyter Notebook 592 75 Updated Aug 16, 2024

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 826 41 Updated Sep 27, 2024

anuragajay / decision-diffuser

Python 283 42 Updated May 1, 2023

CleanDiffuserTeam / CleanDiffuser

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Jupyter Notebook 317 28 Updated Sep 28, 2024

merveenoyan / siglip

Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗

Jupyter Notebook 130 10 Updated Jan 10, 2024

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 11,325 969 Updated Oct 5, 2024

ExplainableML / Vision_by_Language

[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"

Python 37 2 Updated Jul 4, 2024

showlab / UniVTG

[ICCV2023] UniVTG: Towards Unified Video-Language Temporal Grounding

Python 315 28 Updated May 8, 2024

zhjohnchan / M3AE

[MICCAI-2022] This is the official implementation of Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training.

Python 112 10 Updated Sep 16, 2022

hila-chefer / Transformer-MM-Explainability

[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…

Jupyter Notebook 781 107 Updated Aug 24, 2023

dvlab-research / Mr-Ben

This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"

Python 41 Updated Sep 26, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

29,153 1,599 Updated Aug 1, 2024

Starred topics

Code quality

Algorithm