LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,342 390 Updated Sep 28, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 11,714 2,447 Updated Oct 4, 2024

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream d…

Python 459 30 Updated Oct 3, 2024

Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…

Python 1,142 173 Updated Sep 10, 2024

Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"

Python 341 31 Updated Feb 24, 2024

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Cuda 263 21 Updated Jul 2, 2024

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 240 27 Updated Sep 27, 2024

A primitive library for neural network

C++ 1,277 215 Updated Aug 18, 2024

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python 217 21 Updated Aug 27, 2024

[IEEE T-PAMI] Awesome BEV perception research and cookbook for audiences of all levels in autonomous driving

Python 1,180 100 Updated Jan 6, 2024
Python 70 5 Updated Dec 15, 2023

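《代码随想录》 LeetCode practice guide: a recommended order for working through 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer bewildering! 🔥🔥 Take a look, and you will wish you had found it sooner! 🚀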

Shell 50,937 11,369 Updated Oct 3, 2024

Getting started with large models

16 5 Updated Mar 16, 2024

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Python 1,677 202 Updated Sep 21, 2024

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 1,854 309 Updated Oct 4, 2024

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python 2,540 349 Updated Oct 4, 2024

✨✨Latest Advances on Multimodal Large Language Models

12,016 769 Updated Sep 25, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 21,762 2,105 Updated Aug 9, 2024
Python 1,177 170 Updated Sep 19, 2024

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

Python 315 21 Updated Mar 21, 2024

A PyTorch quantization backend for optimum

Python 781 55 Updated Oct 4, 2024

Public repo for HF blog posts

Jupyter Notebook 2,303 714 Updated Oct 4, 2024

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 5,874 667 Updated Sep 6, 2024

OneDiff: An out-of-the-box acceleration library for diffusion models.

Jupyter Notebook 1,614 99 Updated Sep 20, 2024

Ongoing research training transformer models at scale

Python 10,173 2,288 Updated Oct 4, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 1,215 115 Updated Oct 4, 2024

Comparison of Language Model Inference Engines

184 5 Updated Sep 2, 2024

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

Python 767 36 Updated Jun 27, 2024

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models

Python 5,543 408 Updated Oct 4, 2024

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 25,406 5,263 Updated Oct 4, 2024