Stars
High-Resolution Image Synthesis with Latent Diffusion Models
✨✨Latest Advances on Multimodal Large Language Models
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
RFQuant: Retraining-free Model Quantization via One-Shot Weight-Coupling Learning, CVPR (2024)
[ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models"
A node-based image processing GUI aimed at making chaining image processing tasks easy and customizable. Born as an AI upscaling application, chaiNNer has grown into an extremely flexible and powerful image processing application.
PyTorch code for our paper "2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution"
Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"
SUPIR aims to develop practical algorithms for photo-realistic image restoration in the wild. Our new online demo is also released at suppixel.ai.
Accessible large language models via k-bit quantization for PyTorch.
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Advanced Quantization Algorithm for LLMs. This is official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
This repository contains integer operators on GPUs for PyTorch.
dabnn is an accelerated binary neural network inference framework for mobile platforms
Source and experimental code for Correlation Aware Prune (NeurIPS 2023)
This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.
[ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
nndeploy is an end-to-end model deployment framework. Built on multi-backend inference and DAG-based model deployment, it aims to provide users with a cross-platform, easy-to-use, high-performance model deployment experience.
CVPR2024 - Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary
[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).
ICLR2024: LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection.
Analyze large language model (LLM) inference across computation, storage, transmission, and the hardware roofline model, in a user-friendly interface.