- Singapore
- in/bryan-siow
Stars
IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
This is the repo for our new project Highly Accurate Dichotomous Image Segmentation
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
Official implementation of ⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Low-code framework for building custom LLMs, neural networks, and other AI models
High-Resolution 3D Human Digitization from A Single Image.
Schedule-Free Optimization in PyTorch
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
Sequence Parallel Attention for Long Context LLM Model Training and Inference
Reference implementation of Megalodon 7B model
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
Miscellaneous utility functions, decorators, and modules related to PyTorch and Accelerate to help speed up implementation of new AI research
A simple but complete full-attention transformer with a set of promising experimental features from various papers
YaRN: Efficient Context Window Extension of Large Language Models
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHX…
A Comparative Framework for Multimodal Recommender Systems
Latte: Latent Diffusion Transformer for Video Generation.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.