
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 240 27 Updated Sep 27, 2024

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Python 1,201 138 Updated Jul 12, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,342 390 Updated Sep 28, 2024

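[A commonly used C++ DAG framework] A general-purpose, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies, listed in awesome-cpp. Stars, forks, and discussion are welcome.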

C++ 1,740 318 Updated Oct 3, 2024

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

C++ 8,626 1,659 Updated Sep 27, 2024

Stable Diffusion and Flux in pure C/C++

C++ 3,329 279 Updated Sep 2, 2024

Run Stable Diffusion inference on an Android phone's CPU

Java 138 27 Updated Dec 2, 2023

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Python 322 20 Updated Sep 19, 2024

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 10,614 2,110 Updated Oct 2, 2024

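A good project for campus, autumn, and spring recruitment and for internships: a hands-on guide to building, from scratch, a large-model inference framework that supports LLama.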

C++ 189 37 Updated Oct 4, 2024

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Python 1,625 77 Updated Aug 5, 2024

Official inference repo for FLUX.1 models

Python 14,439 1,039 Updated Oct 3, 2024

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,078 539 Updated May 31, 2024

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 10,434 669 Updated Aug 14, 2024

A universal Stable-Diffusion toolbox

Python 899 75 Updated Sep 17, 2024

Compare different hardware platforms via the Roofline Model for LLM inference tasks.

Jupyter Notebook 73 3 Updated Mar 13, 2024

🎉 Modern CUDA Learn Notes with PyTorch: fp32, fp16, bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

Cuda 1,229 133 Updated Oct 5, 2024

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.

Rust 8,539 422 Updated Oct 4, 2024

Notes covering knowledge and interview questions for large language model (LLM) algorithm (application) engineers

HTML 2,928 339 Updated Aug 19, 2024

How to optimize some algorithms in CUDA.

Cuda 1,479 122 Updated Oct 5, 2024

Ongoing research training transformer models at scale

Python 10,173 2,288 Updated Oct 5, 2024

VideoSys: An easy and efficient system for video generation

Python 1,681 114 Updated Oct 3, 2024

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters

Python 552 48 Updated Sep 28, 2024

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).

Python 417 68 Updated Nov 20, 2023

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,178 328 Updated May 16, 2023

The project for the Model Deployment course at ShenLan College

Python 3 4 Updated Mar 26, 2024

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python 2,540 349 Updated Oct 5, 2024

CUDA Templates for Linear Algebra Subroutines

C++ 5,454 924 Updated Sep 25, 2024

This is a framework to evaluate your stable diffusion model

Python 3 Updated Jul 18, 2024

Kolors Team

Python 3,664 242 Updated Sep 4, 2024