TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization and sparsity. It compresses deep learning models for downstream deployment frame…

Python 345 20 Updated Jul 26, 2024

bytedance / ByteTransformer

Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052

C++ 442 33 Updated Mar 15, 2024

huggingface / diffusion-fast

Faster generation with text-to-image diffusion models.

Python 173 9 Updated May 16, 2024

hnsywangxin / controlnet_stable_tensorrt

stable diffusion, controlnet, tensorrt, accelerate

Python 50 8 Updated Apr 28, 2023

NVIDIA / Stable-Diffusion-WebUI-TensorRT

TensorRT Extension for Stable Diffusion Web UI

Python 1,858 141 Updated Jun 14, 2024

shenlan2017 / TensorRT-ERNIE

Python 5 7 Updated Nov 25, 2023

lllyasviel / stable-diffusion-webui-forge

Python 5,470 550 Updated Jul 27, 2024

arnavdantuluri / StableTriton

The first open-source Triton inference engine for Stable Diffusion, specifically for SDXL

Python 11 1 Updated Nov 27, 2023

kamalkraj / stable-diffusion-tritonserver

Deploy a Stable Diffusion model with ONNX/TensorRT + tritonserver

Jupyter Notebook 118 21 Updated Aug 15, 2023

yuxiaoranyu / stable_diffusion_trt_triton

Python 18 1 Updated Dec 29, 2023

horseee / DeepCache

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

Python 712 32 Updated Jun 27, 2024

chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Python 1,069 60 Updated Jul 16, 2024

cuda-mode / lectures

Material for cuda-mode lectures

Jupyter Notebook 1,964 194 Updated Jun 13, 2024

siliconflow / onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.

Python 1,488 89 Updated Jul 27, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

25,812 1,372 Updated Jul 21, 2024

CoatiSoftware / Sourcetrail

Sourcetrail - free and open-source interactive source explorer

C++ 14,431 1,340 Updated Dec 13, 2021

mindspore-courses / step_into_llm

MindSpore online courses: Step into LLM

Jupyter Notebook 387 82 Updated Jun 14, 2024

dengyecode / T-former_image_inpainting

Python 21 4 Updated Jun 25, 2023

Meta2ML / CloudMask

Cloud mask with Landsat 8 and Sentinel 2.

Python 7 1 Updated May 29, 2022

SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,697 395 Updated Jul 15, 2024

ModelTC / lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,125 182 Updated Jul 25, 2024

microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,785 168 Updated Jul 25, 2024

Tlntin / Qwen-TensorRT-LLM

Python 549 50 Updated Jun 19, 2024

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 7,696 836 Updated Jul 28, 2024

ming053l / DRCT

Accepted by New Trends in Image Restoration and Enhancement workshop (NTIRE), in conjunction with CVPR 2024.

Jupyter Notebook 116 10 Updated Jul 15, 2024

huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 7,436 887 Updated Jul 25, 2024

state-spaces / mamba

Mamba SSM architecture

Python 11,935 995 Updated Jul 24, 2024

Stability-AI / StableCascade

Official Code for Stable Cascade

Jupyter Notebook 6,461 522 Updated Jul 25, 2024

javey-q

Lists (17)

cloud removal

SAR-optical

Competition solutions

GAN

Multimodal

Image inpainting

Reinforcement learning

segmentation

detection

AI art

Low level

NTIRE

Diffusion

Tools

Work

Deployment optimization

LLM

Stars