Stars
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
GPT4V-level open-source multi-modal model based on Llama3-8B
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Hackable and optimized Transformers building blocks, supporting a composable construction.
A high-throughput and memory-efficient inference and serving engine for LLMs
A state-of-the-art-level open visual language model | Multimodal pre-trained model
A convenient and user-friendly anime-style image data processing library that integrates various advanced anime-style image processing models
A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment
Set prompts for divided regions
AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI
A Gradio web UI for Large Language Models.
The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Robust Speech Recognition via Large-Scale Weak Supervision
Nightly release of ControlNet 1.1
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.