Stars

LLM

21 repositories

LLM training code for Databricks foundation models

Python 3,843 503 Updated Jul 10, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 22,640 3,190 Updated Jul 10, 2024

This repository mainly collects interview questions for large language model (LLM) algorithm engineers.

1,160 90 Updated Mar 31, 2024

📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

1,933 134 Updated Jul 8, 2024

Finetuning Large Language Models on One Consumer GPU in Under 4 Bits

Python 687 74 Updated May 25, 2024

Large Language Model Text Generation Inference

Python 8,362 948 Updated Jul 10, 2024

Universal LLM Deployment Engine with ML Compilation

Python 17,756 1,411 Updated Jul 8, 2024

Python 50 16 Updated Jul 10, 2024

Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.

Python 576 147 Updated Jul 10, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,015 565 Updated Jul 9, 2024

LLM training in simple, raw C/CUDA

Cuda 21,521 2,335 Updated Jul 10, 2024

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Python 2,728 206 Updated Jul 10, 2024

Mamba SSM architecture

Python 11,603 950 Updated Jul 3, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 7,433 802 Updated Jul 10, 2024

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,760 164 Updated Jul 10, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,074 178 Updated Jul 9, 2024

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,647 407 Updated Jul 1, 2024

MindSpore online courses: Step into LLM

Jupyter Notebook 381 82 Updated Jun 14, 2024

LLM101n: Let's build a Storyteller

15,304 731 Updated Jun 28, 2024

OneDiff: An out-of-the-box acceleration library for diffusion models.

Python 1,441 85 Updated Jul 10, 2024