LLM
LLM training code for Databricks foundation models
A high-throughput and memory-efficient inference and serving engine for LLMs
📖 A curated list of awesome LLM inference papers with code, covering TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
Finetuning Large Language Models on One Consumer GPU in Under 4 Bits
Large Language Model Text Generation Inference
Universal LLM Deployment Engine with ML Compilation
Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with 🤗 Hugging Face models and datasets.
Hackable and optimized Transformers building blocks, supporting composable construction.
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MindSpore online courses: Step into LLM
OneDiff: An out-of-the-box acceleration library for diffusion models.