iamlockelightning

25 followers · 24 following

Stars

baaivision / Emu3

Next-Token Prediction is All You Need

Python · 799 stars · 22 forks · Updated Sep 30, 2024

AIDC-AI / Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Python · 359 stars · 19 forks · Updated Sep 19, 2024

VectorSpaceLab / OmniGen

659 stars · 12 forks · Updated Sep 18, 2024

google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook · 2,251 stars · 147 forks · Updated Aug 23, 2024

X-PLUG / mPLUG-HalOwl

mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating

Python · 76 stars · 2 forks · Updated Jan 29, 2024

facebookresearch / dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Python · 6,250 stars · 905 forks · Updated Jul 3, 2024

bytedance / ibot

iBOT 🤖: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)

Jupyter Notebook · 672 stars · 77 forks · Updated Apr 14, 2022

lucidrains / transfusion-pytorch

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python · 608 stars · 22 forks · Updated Oct 1, 2024

bronyayang / Law_of_Vision_Representation_in_MLLMs

Official implementation of the Law of Vision Representation in MLLMs

Python · 121 stars · 7 forks · Updated Sep 8, 2024

Zeyi-Lin / HivisionIDPhotos

⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool.

Python · 10,568 stars · 1,034 forks · Updated Sep 28, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python · 2,453 stars · 138 forks · Updated Oct 4, 2024

google-research-datasets / cvss

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

179 stars · 13 forks · Updated Aug 26, 2022

apple / ml-slowfast-llava

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Python · 134 stars · 9 forks · Updated Sep 16, 2024

showlab / Show-o

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python · 882 stars · 39 forks · Updated Sep 30, 2024

nttmdlab-nlp / SlideVQA

SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)

Python · 74 stars · 7 forks · Updated Oct 10, 2023

OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python · 12,137 stars · 850 forks · Updated Sep 13, 2024

THUDM / CogVideo

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python · 7,847 stars · 731 forks · Updated Oct 3, 2024

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python · 827 stars · 41 forks · Updated Sep 27, 2024

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python · 13,629 stars · 1,249 forks · Updated Oct 4, 2024

mayubo2333 / MMLongBench-Doc

Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations

Python · 49 stars · 1 fork · Updated Jul 15, 2024

VladM7 / Stack-Solver

Stack Solver is an app for the optimisation of palletizing and shipping items.

C# · 232 stars · 2 forks · Updated Jul 16, 2024

XPoet / picx

🏞️ PicX is an image-hosting tool built on the GitHub API, offering image upload and hosting, image link generation, and a toolbox of common image utilities.

TypeScript · 4,548 stars · 469 forks · Updated Aug 13, 2024

aseprite / aseprite

Animated sprite editor & pixel art tool (Windows, macOS, Linux)

C++ · 28,914 stars · 5,830 forks · Updated Oct 3, 2024

Anima-Lab / MaskDiT

Code for Fast Training of Diffusion Models with Masked Transformers

Python · 356 stars · 14 forks · Updated May 15, 2024

KwaiVGI / LivePortrait

Bring portraits to life!

Python · 12,095 stars · 1,272 forks · Updated Sep 6, 2024

tianyu-z / VCR

Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.

Python · 23 stars · 1 fork · Updated Sep 30, 2024

GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Python · 652 stars · 36 forks · Updated Aug 5, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python · 1,704 stars · 112 forks · Updated Sep 19, 2024

apple / ml-4m

4M: Massively Multimodal Masked Modeling

Python · 1,568 stars · 90 forks · Updated Jul 17, 2024

LTH14 / mage

A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

Python · 507 stars · 26 forks · Updated Mar 10, 2023
