Cherishnoobs
🎯
less is more.

Highlights

  • Pro

Organizations

@cczu-osa


Starred repositories

[ECCV2024] Learning Video Context as Interleaved Multimodal Sequences

Jupyter Notebook 27 Updated Oct 1, 2024

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 882 39 Updated Sep 30, 2024

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 3,416 315 Updated Oct 1, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 4,860 397 Updated Oct 2, 2024

Meshed-Memory Transformer for Image Captioning. CVPR 2020

Python 516 136 Updated Dec 21, 2022

Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"

Python 94 5 Updated Sep 11, 2024

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 6,961 439 Updated Sep 28, 2024

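A collection of industry practice articles on search, recommendation, advertising, user growth, and more (sources: Zhihu, Datafuntalk, technical WeChat official accounts)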

Python 2,266 288 Updated Oct 5, 2024

PyTorch implementation of the CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation

Python 17 Updated Aug 18, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,456 138 Updated Oct 4, 2024

Aura is like Siri, but in your browser. An AI voice assistant optimized for low latency responses.

TypeScript 520 46 Updated Aug 26, 2024

A simple, easy-to-hack GraphRAG implementation

Python 862 90 Updated Oct 1, 2024

[ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference

Python 41 1 Updated Aug 21, 2024

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 7,695 1,045 Updated Sep 10, 2024

The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024.

Python 16 2 Updated May 13, 2024

Visualizing the attention of vision-language models

Jupyter Notebook 40 3 Updated Aug 7, 2024

The official PyTorch implementation of the CVPR 2023 paper "Contrastive Grouping with Transformer for Referring Image Segmentation".

Python 41 3 Updated Apr 17, 2024

Diffusion Feedback Helps CLIP See Better

Python 205 11 Updated Aug 24, 2024

CLIP-like model evaluation

Jupyter Notebook 592 75 Updated Aug 16, 2024

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 826 41 Updated Sep 27, 2024

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Jupyter Notebook 317 28 Updated Sep 28, 2024

Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗

Jupyter Notebook 130 10 Updated Jan 10, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 11,325 969 Updated Oct 5, 2024

[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"

Python 37 2 Updated Jul 4, 2024

[ICCV2023] UniVTG: Towards Unified Video-Language Temporal Grounding

Python 315 28 Updated May 8, 2024

[MICCAI-2022] This is the official implementation of Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training.

Python 112 10 Updated Sep 16, 2022

[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…

Jupyter Notebook 781 107 Updated Aug 24, 2023

This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"

Python 41 Updated Sep 26, 2024

LLM101n: Let's build a Storyteller

29,153 1,599 Updated Aug 1, 2024